What does it mean that data mining can reproduce patterns of unfairness? What counts as unlawful discrimination when it comes to an algorithm? What guidance do our laws provide?
These are the questions tackled by Big Data’s Disparate Impact, a recent paper by Solon Barocas and Andrew D. Selbst. The paper begins with the technical fundamentals of data mining, and uses this foundation to analyze the policy questions that lie beyond. It concludes with a clear and unsettling message: “to a large degree, existing law cannot handle these problems.”
I’ve been wanting to preview this piece for Equal Future for some time. Hopefully, this whirlwind tour—greatly simplified for brevity—inspires you to explore the full paper.
What is data mining?
Data mining is the practice of sifting through data for useful patterns and relationships. These patterns, once discovered, are usually incorporated into a “model” that can be used to predict future outcomes.
For example: By analyzing thousands of employees’ job histories, a computer might discover a close correlation between the distance of an individual’s commute and that individual’s tenure at their job. This insight might be incorporated into a job recruiting model and used by an employer to evaluate its applicants.
How can data mining “go wrong?”
Although data mining conjures up images of sophisticated computers, Barocas and Selbst explain that humans guide data mining processes at many steps along the way. Each of these steps is a chance for something to go awry:
People must translate real life questions into problems that a computer can understand. A data mining algorithm must be told what it’s looking for. This process, called determining “target variables,” can have a big impact on how a data mining process performs.
For example: What counts as a “good employee”? Different people will define “good” in many different ways (for example, someone who hits high sales goals, or who has a spotless discipline record, or who stays in the job for many years). This definition will frame the entire data mining process.
People must give a computer data to learn from. If this “training data” is incomplete, inaccurate, or itself biased, then the results of the data mining process are likely to be flawed.
For example: Sorting medical school applicants based on prior admissions decisions can be a prejudiced process if these prior admissions decisions were infected with racial discrimination.
People must decide what kind of information a computer should pay attention to. This process, called “feature selection,” determines which parts of the training data are relevant to consider. It may also be difficult to obtain data that is “sufficiently rich” to permit precise distinctions.
For example: Hiring decisions that significantly weigh the reputation of an applicant’s college or university might exclude “equally competent members of protected classes” if those members “happen to graduate from these colleges or universities at disproportionately low rates.” There may be more precise and accurate features available.
How can seemingly benign data serve as a “proxy” for more sensitive data?
It can be hard to “hide” demographic traits from a data mining process, especially when those traits are strongly tied to the ultimate purpose of the data mining. Here, it’s easiest to skip straight to an example.
For example: Imagine you are trying to build a model that predicts a person’s height. Men tend to be taller than women. Your training data doesn’t contain information about individuals’ sex, but it does include information about individuals’ occupations. A data mining process might learn that preschool teachers (more than 97% of whom are women) tend to be shorter than construction workers (more than 90% of whom are men). This insight simply reflects the fact that each profession is a reliable “proxy” for sex, which is itself correlated with height.
In other words, given a big enough dataset, a data mining process will “determine the extent to which membership in a protected class is relevant to the sought-after trait whether or not [protected class membership] is an input.” There are ways to test for such proxying, but this can be a difficult and involved process.
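To make the height example concrete, here is a minimal Python sketch on made-up data. Everything in it is an assumption for illustration: the function names are hypothetical, the height distributions are invented, and the occupational sex ratios only roughly follow the figures quoted above. The "model" never sees sex as an input, yet it still picks up a height gap between occupations. Comparing the gap within each sex is one simple (and far from complete) way to check whether occupation is acting as a proxy.

```python
import random
from statistics import mean


def simulate_people(n, seed=0):
    """Generate synthetic (occupation, sex, height_cm) records.

    The height figures are illustrative assumptions, not real statistics.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        occupation = rng.choice(["preschool teacher", "construction worker"])
        # Assumed share of women per occupation, roughly as in the text.
        share_women = 0.97 if occupation == "preschool teacher" else 0.10
        sex = "F" if rng.random() < share_women else "M"
        # In this toy world, sex is the only thing that drives height.
        mean_cm = 162 if sex == "F" else 176
        people.append((occupation, sex, rng.gauss(mean_cm, 7)))
    return people


def mean_height(people, occupation, sex=None):
    """Average height for an occupation, optionally restricted to one sex."""
    heights = [h for job, s, h in people
               if job == occupation and (sex is None or s == sex)]
    return mean(heights)


def occupation_gap(people, sex=None):
    """Construction workers' mean height minus preschool teachers' mean height."""
    return (mean_height(people, "construction worker", sex)
            - mean_height(people, "preschool teacher", sex))


if __name__ == "__main__":
    people = simulate_people(20000)
    print("gap ignoring sex:", round(occupation_gap(people), 1))
    print("gap among women: ", round(occupation_gap(people, "F"), 1))
    print("gap among men:   ", round(occupation_gap(people, "M"), 1))
```

On this synthetic data, the gap between occupations should be large when sex is ignored and close to zero within each sex. That pattern suggests the occupation feature is standing in for sex rather than carrying independent information about height.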
How might data mining be harmful even when everything “goes right?”
A perfectly designed data mining process can accurately expose existing unfairness in society. For example, mainstream credit scores are not equally distributed across racial groups. However, there is strong evidence that they are predictive (and not because they proxy for race). These scores reflect the fact that some minorities face a range of unique obstacles that make it more difficult for them to repay loans.
What does the law have to say when data mining has a disparate impact?
The authors conclude that the law is “largely ill equipped to address the discrimination that results from data mining.” They focus their analysis on Title VII—which prohibits employment discrimination based on race, color, religion, sex and national origin—because it expressly allows for “disparate impact” liability when a neutral practice has unfair results. Put simply, Title VII tries to put an end to historical discriminatory trends, while still permitting employers a reasonable amount of discretion in hiring. The crux of the law is the “business necessity” defense, which allows employment practices that are legitimate, useful, and accurate—even if those practices do have a disparate impact.
But the precise contours of disparate impact law remain murky, making it hard to measure what should be allowed when it comes to data mining. When is a factor too strongly correlated with protected status, making it illegitimate to use? How should we compensate for the fact that collected datasets themselves reflect existing real-world biases? Is restricting an algorithm’s inputs really the best way of making sure that its results will be fair?
No one yet has answers to these important questions. Giving concrete meaning to civil rights protections in the context of computerized decisions will require an ongoing exchange between technologists and the civil rights community.
The Senate narrowly failed to advance the USA Freedom Act last night. The bill would have ended bulk phone surveillance by the government. After the vote, Senator Leahy, the bill’s champion, criticized his opponents for “resort[ing] to scare tactics.”
90% of the world’s population over the age of six will have a mobile phone by 2020, projects a report from Ericsson. That would be 6.1 billion smartphone subscriptions globally, up from today’s count of 2.7 billion.
New York City is poised to turn all of its payphones into public Wi-Fi stations, an effort that will be funded entirely through advertising revenue. The consortium behind the project claims it will create “the fastest and largest free municipal Wi-Fi deployment in the world.”
Medium user EricaJoy shares a personal reflection about her experiences working as a black woman in the tech industry. “I don’t need to change to fit within my industry. My industry needs to change to make everyone feel included and accepted,” she concludes.
“In my humble opinion, it’s time we all stopped referring to [terms of service agreements] as ‘privacy’ policies,” notes Micah Sifry on techPresident. “My data would be private if it weren’t collected at all. These are ‘data usage’ policies.” (Fair point.)
The FCC will propose significantly raising the annual cap on spending for school Internet, reports the New York Times. “While the impact on consumers will be small, the impact on children, teachers, local communities and American competitiveness will be great,” says an FCC statement acquired by the Times.
Concerned communities are one step closer to asking the right questions about new surveillance technologies thanks to a new guide from the ACLU of California (ACLUC) called Making Smart Decisions About Surveillance.
The guide provides step-by-step instructions for evaluating and implementing new police tools. It also includes model legislation that would require governments to gather “information, including crime statistics, that help the community assess whether the surveillance technology has been effective at achieving its identified purposes.”
These tips couldn’t come at a better time. According to new research, surveillance technology is spreading throughout California with little oversight. “We found evidence of public debate related to surveillance technology adoption less than 15 percent of the time,” the ACLUC told Ars Technica. “None of the 52 communities with two or more surveillance technologies publicly debated every technology. We found a publicly-available use policy for fewer than one in five surveillance technologies.” These statistics were not easy to come by. “We just spent months digging through the minutes of city council and supervisors requests,” said Nicole Ozer of ACLUC.
As we’ve said many times, limited transparency makes it difficult, if not impossible, to determine whether police technologies are being used effectively and fairly.
The guide is a breath of fresh air: a constructive effort at more open communication. For example, it encourages police departments to proactively reach out to “community groups, including those representing ethnic communities, and local media ….” It also lists questions that both police and citizens should be asking before investing in new technology, including: “What will this technology achieve? How will it be governed? Are we only collecting necessary data?”
We hope this is just the first step toward more frequent, and more productive, conversations with law enforcement about new surveillance tools.
President Obama urged the Federal Communications Commission (FCC) on Monday to treat broadband services like utilities and to prevent them from prioritizing internet traffic in exchange for payment. “The time has come for the FCC to recognize that broadband service is of the same importance [as the traditional telephone system] and must carry the same obligations as so many of the other vital services do,” he said, siding squarely with many consumer groups.
Telecommunications firms immediately disagreed, pledging legal action should the FCC attempt to reclassify them. FCC Chairman Tom Wheeler said his agency would “incorporate the President’s submission into the record of the Open Internet proceeding.”
The President’s proposal urges Wheeler to enact stronger rules than the “hybrid approach” that he was reportedly favoring (which would only reclassify parts of the broadband ecosystem).
The road ahead remains long, bumpy, and unpredictable. Even if the FCC chooses to reclassify, it will face fresh legal challenges. It will also have to consider “forbearance”: the process of deciding which parts of Title II apply to broadband services. “I think it’s a red herring to assume that you have to import all of the regulations that exist for the telephone and apply them to the Internet,” said Corynne McSherry of the Electronic Frontier Foundation. “No one is really pushing for that, and no one thinks that should happen.”
The FCC is likely to delay its new rules until next year. Wheeler said that he plans to “take the time to get the job done correctly,” and emphasized that his agency faces a complex legal puzzle. “[W]hether in the context of a hybrid or reclassification approach, Title II brings with it policy issues that run the gamut from privacy to universal service to the ability of federal agencies to protect consumers, as well as legal issues ranging from the ability of Title II to cover mobile services to the concept of applying forbearance on services under Title II.”
In short, we’ll be hearing about net neutrality for many, many months to come.
A few of the best links from the recent media blitz:
- Did Verizon’s Net Neutrality Win Backfire? (A brief history of how prior legal proceedings have shaped the FCC’s options.)
- A Super-Simple Way to Understand the Net Neutrality Debate
- Net Neutrality Debate: Internet Access and Costs Are Top Issues (A more critical look at Title II.)
Although big data can “advance public health, better our businesses, and improve our society,” we must remember that “all of us, as a society, are startlingly bad at protecting the data of vulnerable communities,” writes Alvaro Bedoya in Slate.
“It’s hard to find someone who doesn’t think that all of the country’s nearly 800,000 officers should be wearing cams, but there are plenty of questions about what the policy should be once those cameras are part of the police uniform,” says Kashmir Hill, reminding us that the cameras are not without their challenges. “Transparency is a weapon, and like any weapon, it can be misused. Police body cam videos can become another way to embarrass and intimidate citizens police interact with.”
“More companies are hiring professionals to help them navigate the waters of data collection and privacy, but the windfall of the privacy professional does not necessarily equate to more privacy for consumers,” reports Ars Technica. This is because privacy professionals “typically focus on minimizing risk to companies from the regulations focused on protecting consumers, not necessarily on improving consumer privacy.”
Google recently announced a promising new tool that might help researchers benefit from big data analysis, while at the same time protecting individuals’ privacy. This concept might sound counterintuitive at first, but it’s based on a solid mathematical foundation called “differential privacy.”
How is this possible? The basic idea is to sprinkle some “randomness” into the data, adding uncertainty as to whether any one value has been modified. This helps to protect privacy, because any individual piece of data (say, a sensitive attribute about a person) could have been changed from its original, true value. But at the same time, a statistician can adjust for these differences, and can still extract useful, accurate insights from the modified data.
[W]hen asking a sensitive question, researchers asked the person to flip a coin. If the coin came up heads, the respondent was told to answer Yes, regardless of the true answer. If the coin came up tails, the respondent was told to tell the truth, Yes or No. Effectively, this meant that any given Yes response was completely deniable by the respondent, preserving the privacy of those that answered Yes. However, because the researcher knew to expect 50% of responses to be Yes (purely by chance, because coin flips are random with a probability of 1/2), she could account for this when counting the total number of Yes responses from the sample.
You can think of differential privacy as a more sophisticated version of randomized response. And although academic researchers have made significant advances in the theory over the past decade, it has rarely been applied in practice.
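The coin-flip survey described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the respondents are simulated, and it shows classic randomized response rather than RAPPOR or any other production system. The key step is the correction: if a fraction p of people would truthfully answer Yes, the expected share of Yes reports is 0.5 + 0.5 × p, so p can be estimated as 2 × (observed Yes share) − 1.

```python
import random


def randomized_response(true_answer, rng):
    """Report an answer under the coin-flip scheme.

    Heads: say Yes no matter what. Tails: tell the truth.
    """
    if rng.random() < 0.5:  # heads
        return True
    return true_answer      # tails


def estimate_true_yes_rate(reports):
    """Undo the coin-flip noise in aggregate.

    Expected Yes share = 0.5 + 0.5 * p, so p = 2 * observed - 1.
    The result is clamped to [0, 1] because small samples can overshoot.
    """
    observed = sum(reports) / len(reports)
    return min(1.0, max(0.0, 2 * observed - 1))


def simulate_survey(n, true_rate, seed=0):
    """Simulate n respondents and return the estimated true Yes rate."""
    rng = random.Random(seed)
    truths = [rng.random() < true_rate for _ in range(n)]
    reports = [randomized_response(t, rng) for t in truths]
    return estimate_true_yes_rate(reports)


if __name__ == "__main__":
    # Assumed scenario: 20% of people would truthfully answer Yes.
    print("estimated Yes rate:", round(simulate_survey(100000, 0.20), 3))
```

No single report reveals much about the person who gave it, since any Yes may have come from the coin. With enough respondents, though, the estimate should land close to the true rate.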
Google’s new project, called RAPPOR, aims to change that by providing an open-source software toolkit to help bring differential privacy techniques into the real world. Today, Google uses RAPPOR in a very limited way: gathering data about how people use its Chrome web browser. But RAPPOR could soon be used—by Google or anyone else—in a wide variety of other big data scenarios.
Even with RAPPOR, applying differential privacy requires substantial statistical expertise. But it’s a step in the right direction, and should help to accelerate the adoption of one of the most promising privacy techniques on the horizon.
Writing in The Atlantic, Karen Levy takes a critical look at several recent technologies that aim to address sexual violence. One, Undercover Colors, is a nail polish that changes color upon contact with the “date rape drug” rohypnol. Its creators bill themselves as “the first fashion company empowering women to prevent sexual assault.” Another, Good2Go, is an “affirmative consent app” designed to create an objective record of consent.
Undercover Colors and Good2Go are technological tattletales. Both are designed to tell the truth about an encounter, with the objectivity and dispassion of a database or a chemical reaction. Tattletale solutions make sense only if we see rape, fundamentally, as a problem of bad data. But thinking about rape this way implies that what we’re most worried about is men being wrongly accused of sexual assault. That the reports women provide aren’t reliable, and should be replaced by something “objective.” These technologies prioritize the creation of that data over any attempt to empower women or to change the norms around sexual violence; they’re rape culture with a technological veneer.
. . .
By looking at sexual assault through a data lens, technologies like these collapse complex experiences into discrete yes-or-no data points. . . . Focusing on data production drives us to think of sexual violence in black-and-white terms—a dangerous oversimplification of a far messier and more nuanced reality. . . . It’s encouraging to see techies trying to address knotty social issues like sexual violence. But if technology is going to intervene for good, it needs to adopt a more nuanced approach—one that appreciates that not every problem can be treated as a data problem.
Senator Mark Udall’s defeat yesterday is a loss for civil liberties, explains Timothy Lee in Vox.
Facebook is paying close attention to how its actions might affect the voting behavior of its users. In a previously little-known experiment, the company found that increasing the number of news stories shown to readers increased civic engagement and voter turnout. And yesterday, it rolled out a highly-visible tool allowing many of its American users to announce that they’d voted. “If past research is any guide, up to a few million more people will head to the polls partly because their Facebook friends encouraged them,” reports Mother Jones.
28% of registered voters have recently used their cell phones to track political news or campaign coverage, a number that has doubled since the 2010 midterm election, reveals a new study by Pew.
“It may be hard to believe, but there are big cities in the US where 30 to 40 percent of residents have no Internet service at all,” reports Ars Technica. Detroit ranked the worst, according to 2013 census data, with 56.9 percent of households lacking a fixed broadband subscription.
“Sneak and peek” warrants, which allow a person’s home to be secretly searched without notice, are authorized under the Patriot Act, an anti-terrorism law. But in the field, these new warrants are now far more often used in drug investigations than in terrorism cases.
Automated license plate readers have yet to pay off for Vermont police. “[E]ven with the millions of scans, the system has not led to many arrests or breakthroughs in major criminal investigations, and it hasn’t led to an increase in the number of tickets written for the offenses the technology is capable of detecting,” reports Vermont Public Radio.
That’s one way to change the subject: Verizon Wireless has launched SugarString, its own new publication on technology and society, run by the company’s marketing department. A recruiting pitch sent to journalists indicates the publication will have a big budget but will not be allowed to write about net neutrality or the NSA’s spying.
We here at Robinson + Yu have just released a new report, Knowing the Score, which provides a guided tour of the complex and changing world of credit scoring. It’s designed to be the “missing manual” for policy professionals seeking to better understand technology’s impact on financial underwriting and marketing.
The word “scoring” is used a lot these days. For example, a widely quoted New York Times story described a new crop of “consumer evaluation or buying-power scores . . . [which are] highly valuable to companies that want—or in some cases, don’t want—to have you as their customer.” A recent report from Privacy International inventoried a variety of “consumer scores”—such as measurements of online social media influence. And industry regulators have acknowledged a “big fuzzy space” between how different kinds of financial assessments are viewed by the law.
We were left with many questions: What are the legal and practical differences between a “credit score” and a “marketing score”? Are credit scoring companies that rely on social networking data reliable? Should new forms of payment information (such as cable and utility bills) be sent to credit bureaus? Can new scoring methods bolster financial inclusion?
Our report addresses all of these questions, providing historical and legal context along the way.
The key takeaways are:
Financial advocates should seriously consider advancing the inclusion of “mainstream alternative data” (such as regular bill payments) into credit files. This new data, which often goes unreported today, could allow credit scores to be calculated for more people, enhancing access to the mainstream financial system. However, the impact of this new payment information on credit scores is hard to analyze without access to proprietary credit bureau data. Thus, we encourage further collaboration and transparency between advocates and industry. We also emphasize that utility payment data carries special risks: it must be reported carefully so as not to interfere with state consumer protection laws.
The predictiveness and fairness of new credit scores that rely on social network data and other nontraditional data sources (including, for example, how quickly a user scrolls through a terms of service document) is not yet proven. We predict that to the extent these new methods are actually adopted, they may struggle to comply with fair lending laws.
Today’s most widely used credit scoring methods (such as the approach used by FICO) are fair in the sense that they accurately reflect individuals’ credit risk across racial groups. Many studies have documented huge differences in average credit scores between racial groups. But the best available evidence, a 2007 study conducted by the Federal Reserve, indicates that mainstream scoring models themselves are not biased–that is to say, they accurately predict individual credit risk, for individuals of all races. This means that racial differences in average credit scores are a map of real, underlying inequalities, rather than a quirk of the scoring system. It also confirms that credit scores can be a powerful yardstick by which to measure the fairness of particular financial products and practices.
Marketing scores, built by credit bureaus from aggregated credit report data, can be used to target advertisements and change the appearance of websites as individuals navigate the web. These marketing scores, computed on a household or block level, segment individuals based on their financial health. They can come within a hair’s breadth of identifying a person, which would subject them to the Fair Credit Reporting Act, but they appear to be operating just outside the scope of that law. Unfortunately, technological constraints make it difficult to understand through outside observation what effect these scores are having. We urge regulators to play a fact-finding role to learn more about how this data is used.
“[T]here has never been more confusion about what the term [anonymity] means,” reports the Wall Street Journal, explaining how new mobile apps often collect more data than meets the eye.
California Highway Patrol officers have been sharing explicit photos of female suspects for years as part of a ‘game,’ says an officer.
New America’s Open Technology Institute just released Data and Discrimination: Collected Essays, which claims that “digitally automated systems may be adding to [discrimination] in new ways.”
“Sneak and peek” warrants, a powerful government authority granted by the Patriot Act, are often used for purposes other than terrorism, explains the EFF.