On Monday, the White House held the first of three workshops on big data and privacy, part of a 90-day review of the social and policy impact of big data technologies. Though the workshop focused on the technical elements of big data systems, there were several important takeaways for social justice.
First, it should be possible for analysts to extract useful insights from large, sensitive databases (like those from hospitals, banks, and schools) without compromising individual privacy. This might seem difficult at first, but renowned researcher Cynthia Dwork explained that new mathematical techniques (called differential privacy) make it possible. The techniques could be used to calculate important statistics about large groups of people, while making it difficult to target any particular individual whose information may be contained in the dataset. The Census Bureau was an early adopter of these techniques, using them to publish geographic data about where people live and work for its OnTheMap project.
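The core idea is easy to sketch in code. Differential privacy adds carefully calibrated random noise to an aggregate statistic, so the published number comes out nearly the same whether or not any one person's record is included. Below is a minimal, illustrative sketch of the classic Laplace mechanism in Python; the dataset, query, and epsilon value are invented for the example, not drawn from the workshop:

```python
import numpy as np

def dp_count(records, predicate, epsilon=0.5):
    """Release a differentially private count of records matching `predicate`.

    A counting query has sensitivity 1 (adding or removing one person
    changes the true count by at most 1), so Laplace noise with scale
    1/epsilon suffices for epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical hospital dataset: patient ages.
ages = [34, 71, 52, 68, 45, 80, 29, 66]
print(dp_count(ages, lambda age: age > 65))
```

An analyst can still learn that roughly half the patients are over 65, but the noise makes it hard to infer whether any particular person's record was in the data.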
Another promising area of research involves so-called accountable systems. MIT principal research scientist (and event organizer) Daniel Weitzner encouraged building information systems that are publicly auditable — allowing anyone to ensure data is accessed according to certain rules. (Researchers at Princeton are working on a related idea for accountable algorithms.) Both ideas could bring greater fairness and confidence to important public processes, like “random” airport screenings and public school lotteries.
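One way to make “publicly auditable” concrete is a commit-and-reveal drawing, a common building block in this research area. The sketch below is a generic illustration, not the specific design Weitzner or the Princeton researchers proposed: the lottery operator publishes a cryptographic commitment to a secret seed before applications close, then reveals the seed afterward so anyone can re-run the drawing and verify the result.

```python
import hashlib
import random

def commit(seed: bytes) -> str:
    # Published before the drawing: binds the operator to the seed
    # without revealing it.
    return hashlib.sha256(seed).hexdigest()

def draw(seed: bytes, applicants: list, winners: int) -> list:
    # Deterministic given the seed, so anyone who later learns the
    # seed can reproduce the drawing and audit the outcome.
    rng = random.Random(seed)
    return rng.sample(sorted(applicants), winners)

seed = b"secret-seed-chosen-before-applications-close"
print("commitment:", commit(seed))            # published up front
applicants = ["alice", "bob", "carol", "dan", "erin"]
print("winners:", draw(seed, applicants, 2))  # seed revealed afterward
```

Because the commitment binds the operator to the seed in advance, rigging the result after seeing the applicant pool would require breaking the hash function.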
The last panel addressed the balance between innovation and privacy. “[A] word that I haven’t heard yet is ‘harm’,” said Harvard computer science professor Latanya Sweeney. “We want [companies] to innovate, but [they should also be transparent] because we want to know where the harms are falling.” Carol Rose from the ACLU of Massachusetts added that privacy shouldn’t be something that only rich people can afford. Unfortunately, Sweeney said, this has long been the case, citing the routine use of supermarket discount cards. (Coincidentally, Julia Angwin asked in a New York Times op-ed on Tuesday, “Has Privacy Become a Luxury Good?”)
The remaining two workshops are to be held at UC Berkeley and NYU in the coming weeks, with a greater focus on legal, policy and ethical issues.
It has never been so easy to gather data about ourselves. Companies like Klout purport to measure our online influence, Fitbits track our sleep cycles, and Digifit analyzes the cadence of our steps. The self-tracking movement (also referred to as the “Quantified Self”) hopes all of this information will improve our lives and create helpful new metrics.
Privacy advocates worry about who has access to these troves of personal data — and rightly so. But there are deeper questions too. In her recent post, “Quantify Everything: A Dream of a Feminist Data Future,” Amelia Abreu asks us to consider what these new measurements prompt us to value.
“[B]efore we can talk about our data-driven lives, we need to talk about what is being measured and why, about who is being measured and why,” she writes. Abreu is particularly concerned with lifting up historically marginalized efforts, like caregiving and parenting, in the digital age. “As in other areas of contemporary life, human-relationship data points are rarely emphasized, or fully acknowledged [by the Quantified Self movement].”
Abreu hopes that data mining might eventually be “flexible enough to be genuinely empowering, allowing users to control their own narratives.”
Imagine, workers doing all sorts of labor engaging with their data traces in ways that make their work safer and their efforts better recognized. Rather than seeking to perfect measures and standards of that work through statistical working-over, can we envision workers taking their own data to management to improve working conditions?
This is a vision for data collection of all kinds. We should repeatedly ask law enforcement, insurance companies, our workplaces, and even ourselves what is being measured and why. Privacy isn’t the only thing at stake.
Last week, a broad coalition of civil, human and media rights organizations released Civil Rights Principles for the Era of Big Data, urging companies and the government to develop and use new technologies in ways that will promote equal opportunity and justice.
This is the first time that national civil and human rights organizations have spoken publicly about the importance of privacy and big data for communities of color, women, and other historically disadvantaged groups. The principles address a wide range of issues, including high-tech profiling, fairness in automated decisionmaking, and individual control of data.
Hopefully, the principles will contribute to the policy discussion in D.C. and elsewhere. They come right on the heels of the White House’s “Big Data and the Future of Privacy” review, discussed above.
Signatories included the American Civil Liberties Union; Asian Americans Advancing Justice—AAJC; Center for Media Justice; Color of Change; Common Cause; Free Press; The Leadership Conference on Civil and Human Rights; NAACP; National Council of La Raza; National Hispanic Media Coalition; National Urban League; NOW Foundation; New America Foundation’s Open Technology Institute; and Public Knowledge.
The Oakland City Council significantly curtailed the reach of a federally funded intelligence center today, limiting its activities to the city’s port. The original plan called for deployment of street cameras and microphones in the city.
Y Combinator, an industry-leading seed funder for startups, is being asked to broaden its demographic profile. “For me, it’s an ethical issue,” writes Ellen Chisa. “You might make more money initially by focusing on only one demographic, but what’s the long term implication?”
In the 19 years since New York’s police department pioneered the CompStat crime mapping system, using computerized maps of crime activity to target law enforcement has become a best practice for major U.S. police departments. (The approach has also spread to other municipal services, helping officials identify systematic problems with garbage collection, for example.) But what happens when computers begin flagging people, rather than city blocks, as “high risk”?
In Chicago, a DOJ-funded pilot program is now predicting which residents have the greatest chance of being involved in a homicide (whether as perpetrator or victim). The program maps social ties among neighborhood residents using internal police department records and (through a second, related targeting effort) school attendance records and other educational data. As Chicago Magazine recently explained, even “within [a] high-crime, high-risk neighborhood, there are very different levels of risk.” For example, within one African-American community, a network of “co-offenders” (people arrested together for the same crime) comprised just 4% of the population but suffered almost 40% of the homicides. The links among the group can be represented visually as a social network graph.
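To give a rough sense of the underlying network analysis, here is a hypothetical sketch in Python using the networkx library; the co-arrest records and the hop-distance risk heuristic are invented for illustration and are not the Chicago police department's actual model:

```python
import networkx as nx

# Hypothetical co-arrest records: each pair was arrested together.
co_arrests = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
homicide_victims = {"C"}

G = nx.Graph(co_arrests)

# The research suggests risk travels along social ties: the fewer
# co-arrest hops between you and a homicide victim, the higher the
# model scores your risk.
for person in G.nodes:
    distance = min(
        nx.shortest_path_length(G, person, victim)
        for victim in homicide_victims
    )
    print(person, "is", distance, "co-arrest hop(s) from a homicide victim")
```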
But police have not yet figured out how to use this information effectively.
The March issue of Harper’s magazine tells the story of Davonte Flennoy — a young black man in Chicago, flagged as ultra-high risk by the system, who was placed in a resource-intensive daily mentoring program but was nonetheless killed in a gang shooting. Other interventions have included sending police captains to knock on the doors of those who appear on the computer’s “heat list” of high-risk residents.
As Andrew Papachristos, a Yale sociologist helping lead the Chicago project, explained, “When you look at a geographic map and it says, ‘Here’s a hot spot on the corner of 63rd and Knox,’ you know what to do . . . But when you say, ‘This is a group of people who are in a really high-risk social network,’ it’s not clear exactly how to interpret that for policing.”
Yesterday, President Obama announced new manufacturing innovation institutes aimed at connecting workers, technology, and skills. “[W]e’ve got to make sure we’re on the cutting edge of new manufacturing techniques and technologies,” he said.
Some welcome increased use of technology and automation in the workplace. “The automation doesn’t replace us. It makes us better,” claimed the authors of The Decoded Company in Wired magazine last week. “Far from our workforce fearing automation, we need to embrace it — especially if we focus on designing the technology as a coach.” (The article goes on to describe how UPS software helps drivers find their way.)
But automated workplaces can just as easily create difficult and hostile working environments, as many exposés of high-tech warehouses have shown. For example, Stephen Dallal, who worked at an Amazon warehouse for about six months, reported: “It just got harder and harder. It started with 75 pieces an hour. Then 100 pieces an hour. Then 125 pieces an hour. They just got faster and faster and faster.”
We should hope for technologies that coach and complement human workers. But workplace advocates must be ready to demand them. The alternative is already apparent.
Law enforcement officers in Massachusetts must now get a search warrant before obtaining cell phone records from a carrier to track an individual’s location over a long period of time, the state’s high court ruled last week. The decision is a notable break with the third-party doctrine — an old legal rule that has sapped constitutional privacy protections in the digital age.
The third-party doctrine has roots in 1970s Supreme Court cases. It says we abandon our expectation of privacy, and thus many Fourth Amendment protections, when we give businesses certain kinds of information. For example, concerning bank deposit slips, the Supreme Court ruled in 1976 that “[t]he depositor takes the risk, in revealing his affairs to another, that the information will be conveyed by that person to the Government.”
Since the advent of cloud computing and mobile phones, the power of this reasoning has grown immensely. For example, the case in Massachusetts focused on cell site location information (CSLI). As we go about our day, our mobile phones routinely communicate with cell towers. Due to the growing density of these towers, especially in urban areas, carriers have records of our approximate locations. Are the radio waves constantly emitted by our devices information we “voluntarily” give to a business, like a bank deposit slip?
The Massachusetts court ruled that the Supreme Court’s decades-old third-party doctrine did not apply. Cell phones are “an indispensable part of modern [American] life,” noted the majority. “CSLI is purely a function and product of cellular telephone technology,” and “would be unknown and unknowable to the telephone user in advance.” And, perhaps most fundamentally, “the digital age has altered dramatically the societal landscape from the 1970s.”
It might not be long before the Supreme Court agrees. In 2012, in her concurring opinion in United States v. Jones, Justice Sotomayor wrote that the third-party doctrine is “ill suited to the digital age, in which people . . . disclose the phone numbers that they dial or text to their cellular providers; the URLs that they visit and the e-mail addresses with which they correspond to their Internet service providers; and the books, groceries, and medications they purchase to online retailers.”
Massachusetts joins New Jersey in stepping back from the third-party doctrine on cell phone records. Hopefully, these cases will help shape the future of the debate.
African Americans make up 75 percent of “stop-and-frisk” stops in Newark, New Jersey, according to a report released by the ACLU yesterday. “[O]ur report raises concerns about the high volume of stops, racial disparities in who is getting stopped and the fact that the vast majority of stops appear to be of innocent people,” said Udi Ofer, a co-author. The report also highlights the importance of transparent police department data, which is increasingly necessary to safeguard civil liberties.
For the first time in its history, UC Berkeley saw women outnumber men in an introductory computer science course.
Netflix has begun paying Comcast for direct access to its broadband network, paving the way for smoother streaming of its movies and shows. However, CDT’s David Sohn worries such deals “could develop into a new way for broadband providers to pick winners and losers online, and a new entry barrier for small and emerging competitors who cannot afford a new layer of charges.”
Google is lobbying against state legislation that aims to forbid the wearing of Google Glass while driving, reports Reuters.
Several surveillance technology vendors sued the state of Utah last week, claiming the state’s new ban on automatic license plate reader (ALPR) technology violates their corporate freedom of speech.
The ACLU and others have highlighted how pervasive, automatic tracking of license plates (particularly when records are kept for long periods) can make it far easier and cheaper for the government to build long-term histories of the movements of any vehicle.
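The mechanics behind that concern are simple, which is part of the point. A toy sketch in Python, with invented plate reads, shows how individually unremarkable observations become a time-ordered movement history once aggregated:

```python
from collections import defaultdict

# Hypothetical ALPR reads: (plate, timestamp, camera location).
reads = [
    ("XYZ123", "2014-03-01T08:02", "5th & Main"),
    ("ABC999", "2014-03-01T08:05", "Airport Rd"),
    ("XYZ123", "2014-03-01T12:47", "Clinic Pkwy"),
    ("XYZ123", "2014-03-01T18:30", "Oak St Church"),
]

# Each read in isolation is innocuous; grouped by plate and sorted by
# timestamp, the reads reveal where a vehicle has been all day.
histories = defaultdict(list)
for plate, timestamp, location in reads:
    histories[plate].append((timestamp, location))

for plate, sightings in histories.items():
    print(plate, "->", sorted(sightings))
```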
In their complaint, the vendors challenged the idea that this new tracking is a privacy problem, writing that “the photographic recording of government-mandated public license plates does not infringe any ‘privacy’ interest” differently from when a human being views a license plate directly. In a related motion that seeks to stop the implementation of Utah’s new ban while this lawsuit unfolds, the surveillance firms claim that “unlike social security or credit card numbers, license-plate numbers implicate no privacy interest, which is why the government mandates their disclosure, while it restricts the disclosure of social security numbers and credit card numbers.”
The Utah court has yet to respond to this new challenge.
The cumulative record of a person’s movements is, and should be, more private than the letters on their license plate. But this suit illustrates a key challenge for efforts to ensure the fairness of “big data” systems that turn publicly observable information (such as a person’s license plate) into new insights about people’s lives. Detailed awareness of a person’s daily movements is sensitive information — and the law needs some way to guide its use — regardless of whether the elements from which it is constructed are themselves “private.”
Arguments like those of the ALPR vendors may help to explain a recent shift toward framing these concerns as “big data” issues (reflected, for example, in the White House’s ongoing “big data” review), a term that can more easily encompass discrimination concerns and other uses of data that are arguably not private.
Anil Dash, an entrepreneur and tech policy leader, recently revealed that he spent 2013 following a secret plan. On his widely followed Twitter account, he retweeted — that is, republished to his half-million followers — tweets only from women. He reflects:
Maybe the most surprising thing about this experiment in being judicious about whom I retweet is how little has changed. I just pay a little bit of attention before I tap on the icon in my Twitter app, but it’s been effortless to make the switch, and has gotten me far more “thanks for the retweet!” messages than I used to get.
More broadly, I found the only times I even had to think about it were very male-dominated conversations like the dialogue around an Apple gadget event. Even there, I’d always find women saying the same (or better!) things about the moment whose voices I could amplify instead of the usual suspects. And for the bigger Twitter moments I love, like award shows and cultural events, there are an infinite number of women’s voices to choose from.
One thing that has happened, and I’m not sure if it’s attributable to my change in retweet behavior, is that I’ve been in far more conversations with women, and especially with women of color, on Twitter in the past year. That’s led to me following more women, and has caused a radical shift in how I perceive my time on Twitter, even though its actual substance isn’t that different.
Dash suggests he’ll continue his practice in 2014, and has invited others to do the same.