NTIA’s Big Data Questions Prompt Many Answers

From Issue 1.50 - By Erica Portnoy

In response to a call from the White House’s Big Data Report, the National Telecommunications and Information Administration (NTIA) requested comments on big data and the economy. Its questions focused on enabling innovation, avoiding harm, and the role of regulation.

Responses came in earlier this month. Academics, industry groups, and advocates offered a wide variety of opinions. Here are some common themes:

  • Consider the discriminatory impacts of big data. The Leadership Conference submitted the Civil Rights Principles for the Era of Big Data, which “represent the first time that national civil and human rights organizations have spoken publicly about the importance of privacy and big data for communities of color, women, and other historically disadvantaged groups.” And in a joint statement, The Leadership Conference and the American Civil Liberties Union suggested appointing a “public advocate within the administration… to augment the analysis and consideration of civil rights concerns within the multi-stakeholder process.”

  • Give people more control over data collection. Alvaro Bedoya and David Vladeck of Georgetown Law pointed out that “[w]hile data collection sometimes happens by accident, more often than not, it’s the result of deliberate and often expensive engineering and policy decisions.” Privacy law scholar Chris Hoofnagle wrote that limiting data collection is important because “use regulations are very difficult to enforce.” The Center for Democracy and Technology argued:

Based on the sensitivity of the data, the context in which it is collected, and the necessity of the processing, consumers should in many (if not most) cases have the ability to make decisions about how their personal information will be collected, used, and retained.

  • Ensure that data is used appropriately. Both regulators and industry groups stressed the importance of using data appropriately, but in different ways. For example, FTC Commissioner Maureen Ohlhausen praised the model of the Fair Credit Reporting Act (FCRA) and suggested “it would be useful to explore whether such frameworks, by specifically prohibiting certain clearly impermissible uses of data, could help protect consumers while enabling continued innovation in big data.” Industry groups, on the other hand, sometimes presented a focus on data use as an alternative to regulation. According to the Software & Information Industry Association, existing laws are well suited to “enable a greater focus on responsible use of data….”

  • Protect big data’s utility. Some respondents highlighted the importance of big data for research and the public good. For example, KEI discussed the importance of medical records for health researchers and Reed Elsevier cited the need to “combat identity theft and fraud.” Some trade associations argued that big data’s usefulness would be hampered by regulation. But privacy advocates like the EFF disagreed, warning that “policymakers should be careful and skeptical about claims made for the value of big data, because over-hyping its benefits will likely harm individuals’ privacy.”

  • Pass federal privacy legislation. Many respondents, including the author of the White House’s Consumer Privacy Bill of Rights, the Online Trust Alliance, Microsoft, and leading privacy and civil liberties groups, advocated for comprehensive federal consumer privacy legislation.

  • Recognize that regulatory oversight is essential. Privacy and civil liberties groups argued that industry self-regulation is simply not sufficient. In a statement that departed from the usual industry response, Microsoft agreed, noting that regulatory oversight will be helpful in targeting “improper discrimination against individuals and groups.”

Researchers Look Beyond Flagging to Combat Online Harassment

From Issue 1.50 - By Erica Portnoy

After her father’s death, Zelda Williams was driven from Twitter and Instagram by harassment. Both sites offer tools to help combat abuse: users can “flag” offensive or harassing content, providing some guidance to the sites’ review teams. But researchers Kate Crawford and Tarleton Gillespie argue that flagging is not itself an adequate tool for enforcing community standards.

Flagging is an honest attempt to integrate user input into the process of regulating online communities. Crawford and Gillespie acknowledge this, and recognize that websites enjoy important legal protections with respect to the actions of their users. Already, they note, “social media platforms generally go well beyond what is legally required, forging their own space of responsibility[.]”

But flags are imperfect. Flagging will never be “a direct or uncomplicated representation of community sentiment,” because there are many opinions as to what is appropriate. And flags can themselves be abused: a group with malicious intent can collude to repeatedly flag content that otherwise follows the community guidelines, as when the conservative group “Truth4Time” coordinated attacks against pro-gay Facebook groups.

Ultimately, Crawford and Gillespie are hopeful that sites will experiment with new tools. For example, they highlight a video game that allows its users to evaluate in-game behavior through a collective “tribunal.” And they point out that Wikipedia’s discussion pages “reveal the ongoing contests over knowledge that shape the entries that users read, and can be accessed by Wikipedia editors as well as casual readers.”

Combating online abuse will be a struggle for years to come. But online platforms should never give up searching for new ideas.

Also of Note: August 27, 2014

From Issue 1.50 - By Aaron Rieke
  • Using “markers of socioeconomic disadvantage” to shape sentencing decisions is “deeply unfair, and almost certainly unconstitutional,” argues law professor Sonja B. Starr in the New York Times. “Criminal justice policy should be informed by data, but we should never allow the sterile language of science to obscure questions of justice.”

  • California recently passed a law requiring all new phones to have a “kill switch,” which would allow the user to render the device inoperable if lost or stolen. But the law also gives police the authority to use the feature. “This week’s events in Ferguson, Missouri highlight the risks of abuse all too clearly,” writes Jake Laperruque of the Center for Democracy and Technology, arguing the law might be used to disrupt future protests.

  • Many civil libertarians support the use of police cameras, and resistance from police unions is fading, reports the New York Times.

  • Events in Ferguson have prompted calls for police demilitarization in Washington. President Obama ordered a comprehensive review of the government’s long tradition of outfitting police with military gear, and lawmakers from both ends of the political spectrum are proposing legislation. But change may not be imminent. “No one will have the appetite to vote on this issue two months before an election,” writes Steve King of Policy Mic.

  • FCC Chairman Wheeler recently reiterated his support for municipal broadband, writing in a letter to a Congressman that “many states have enacted laws that place a range of restrictions on communities’ ability to invest in their own future.” But soon after, a high-ranking Republican FCC official argued in a speech that states should retain control.

Scheduling Software Can Leave Employees’ Lives Behind

From Issue 1.49 - By Aaron Rieke

Almost every major retailer and restaurant chain uses software to help schedule its workforce. Software vendors claim that their technologies “eliminate manual scheduling and ensure optimal labor coverage for every shift, every day.” But this automation can come with a heavy human cost.

A recent New York Times story, Working Anything but 9 to 5, told of a woman’s struggle with extreme and fast-changing Starbucks shifts. She was scheduled, for example, to work “until 11 p.m. on Friday, July 4; report again just hours later, at 4 a.m. on Saturday; and start again at 5 a.m. on Sunday.” Experts are concerned that these schedules are “injecting turbulence into parents’ routines and personal relationships, undermining efforts to expand preschool access, driving some mothers out of the work force and redistributing some of the uncertainty of doing business from corporations to families.” The Times also collected more than 300 reader responses, many of which told of similar scheduling stresses.

In some cases, corporate policies are not reflected in scheduling software’s output. For example, Starbucks told the Times that its policy is to give all employees a week’s notice regarding their work hours. But of employees interviewed at 17 Starbucks across the country, only two said they’d received such notice.

A solution requires that enforceable corporate policies be properly reflected in scheduling software. For example, one popular software suite, Kronos, claims that it will strictly adhere to an employer’s “scheduling-related rules and policies.” But what are the applicable rules and policies? How well can they be understood by the software? Can managers override the rules (and when would allowing this discretion be desirable)?
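One way to make such policies enforceable is to encode them as explicit, testable checks over the schedule itself. Below is a minimal sketch in Python: the 10-hour minimum-rest rule is an assumed policy (not any vendor’s default), and the shift times are invented to echo the “11 p.m. Friday, back at 4 a.m. Saturday” schedule described above. Real scheduling suites expose far richer rule engines than this.

```python
from datetime import datetime, timedelta

# Hypothetical shift records as (clock-in, clock-out) pairs, echoing the
# back-to-back schedule described in the Times story.
shifts = [
    (datetime(2014, 7, 4, 16, 0), datetime(2014, 7, 4, 23, 0)),  # Fri 4–11 p.m.
    (datetime(2014, 7, 5, 4, 0), datetime(2014, 7, 5, 12, 0)),   # Sat 4 a.m.–noon
]

MIN_REST = timedelta(hours=10)  # assumed policy threshold

def rest_violations(shifts, min_rest=MIN_REST):
    """Return consecutive shift pairs separated by less than min_rest."""
    ordered = sorted(shifts)
    return [
        (prev, nxt)
        for prev, nxt in zip(ordered, ordered[1:])
        if nxt[0] - prev[1] < min_rest
    ]

# The 11 p.m. -> 4 a.m. turnaround leaves only 5 hours of rest.
violations = rest_violations(shifts)
```

A check like this could run before a schedule is published, flagging violations for a manager instead of silently emitting an infeasible week.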

Scheduling software is just a tool — but it must be well-designed and carefully used. Employers should redouble their efforts to be attentive to the needs of their employees.

In Ferguson, Cameras a Focal Point

From Issue 1.49 - By Aaron Rieke

Military-grade vehicles, assault rifles, and riot gear are not the only technologies owned by the Ferguson police. They also have “a stock of body-worn cameras, but have yet to deploy them to officers,” reports the Wall Street Journal. Hopefully, the cameras will come out of storage soon.

Yesterday, Ferguson city leaders wrote that they were committed to securing (and, we assume, deploying) dash and vest cameras for its police force. This isn’t the first time body-worn cameras have been proposed to help ease tensions between police and the communities they serve. For example, in 2013, a federal district court judge, in ruling the NYPD’s stop-and-frisk tactics unconstitutional, ordered that a body-worn camera program begin in New York.

Body-worn cameras have not yet been systematically studied, but anecdotal evidence is positive. For example, in Rialto, California, the entire police force wears small cameras. In the first year following the cameras’ introduction, “the use of force by officers declined 60%, and citizen complaints against police fell 88%.”

Ferguson police should be using more of the right kind of technology: technology that promotes accountability. And they should stop seizing those same tools from journalists and citizens.

But we must also acknowledge that technology itself will never be enough. Ferguson police have already reportedly removed their badges and nametags and refused to identify themselves to citizens. In the same way, body-worn cameras could be removed or turned off. Ultimately, a range of policies must be established and followed.

Also of Note: August 20, 2014

From Issue 1.49 - By Aaron Rieke
  • A New York City initiative called ClaimStat “seeks to collect and analyze information on the thousands of lawsuits and claims filed each year against the city.” The city hopes these insights will help it reduce the money it spends on settlements and judgments. For example, the approach has already identified that “several precincts in the South Bronx and Central Brooklyn had far more claims filed against their officers than other precincts in the city.”

  • Big data requires a significant amount of “janitorial” work, reports the New York Times. Data scientists “spend from 50 percent to 80 percent of their time mired in this more mundane labor of collecting and preparing unruly digital data, before it can be explored for useful nuggets.”

  • Cryptography can help ensure that bulk surveillance is conducted more fairly, argue two Yale computer scientists. They suggest making data collection processes public — including what data is collected, from whom, and how it was encrypted, stored, searched, and decrypted — and insist that “any surveillance process that collects or handles bulk data or metadata about users not specifically targeted by a warrant must be subject to public review and should use strong encryption to safeguard the privacy of innocent users.”

Can Tech Startups Really Fix Payday Loans?

From Issue 1.48 - By Aaron Rieke

Payday loans aren’t serving communities well. Most are made to borrowers “caught in a revolving door of debt” and renewed so many times that a borrower will pay more in fees than the amount originally borrowed.

Financial regulators are looking for a better way. “As we work to bring needed reforms to the payday market, we want to ensure consumers have access to small-dollar loans that help them get ahead, not push them farther behind,” said CFPB Director Richard Cordray. Can technology help?

Tech startups are clamoring to provide better loans. But proven, stable solutions are still in short supply. Two young companies illustrate different approaches.

LendUp is an online company that calls itself a “socially responsible direct lender.” Like many payday lenders, it imposes triple-digit annual percentage rates (APRs). But it claims its rates are lower thanks in part to “big data” underwriting (which might incorporate social network data and how quickly a user scrolls through its site). The company also claims it is more transparent than other payday lenders and lowers users’ rates over time.

ActiveHours, on the other hand, eschews traditional fees, hoping that the goodwill of its users will keep it afloat. The company doles out small loans to its users reflecting the time that a user has already worked at their job, but in advance of their paycheck. (This approach relies on new verification techniques like geolocation, which helps ensure a user was at their workplace when reporting their time card to ActiveHours.) When payday arrives, ActiveHours attempts to repay itself from the user’s bank account, at which point the user can opt to provide ActiveHours with a tip (or none at all). The company claims it charges “no fees,” except those imposed upon it as “result of a failed transaction….”

For these companies, and others like them, the jury is still out. Many financial advocates are skeptical. And many important questions still need answers:

  • To what extent can nontraditional data sources, such as those touted by LendUp, accurately and fairly predict credit risk? (Thorough public research on this topic is lacking.)
  • Can more altruistic services like ActiveHours survive without fees, or support a large number of users?
  • Does free short-term borrowing actually help individuals resolve long-term financial instability? (“Everyone thinks they’ll use the service ‘just this once,’ yet it becomes such an easy fix that they end up addicted to the easy money,” warns Gail Cunningham of the National Foundation for Credit Counseling.)

The financial technology (or “fintech”) sector is well-funded and full of new ideas. Its companies should be greeted with caution, but not without some hope.

Florida’s Approach to Child Protection Analytics Shows Promise

From Issue 1.48 - By Erica Portnoy

In Florida, predictive analytics could help reduce the risk of child fatalities caused by abuse and neglect. The Florida Department of Children and Families is working with business consulting groups to use analytics software to pinpoint statistically significant risk factors in the Child Death Review Database. It also integrated a variety of professional insights, considering “[i]nput from… several child protection experts in the medical, legal, community-based care, law enforcement, prevention and quality assurance professions.”

Their report highlighted factors that raise or lower risk:

[Chart: Baseline risk factors for all child deaths]

This data will help focus case-worker time and attention where it’s needed the most – like in-home visits, which reduce the risk of death by 90%. These recommendations follow the Florida governor’s announcement that the state will hire more case workers.

Florida’s efforts show that computerized analyses can be used to encourage effective, non-harmful interventions. The project is similar in some ways to Chicago’s heat list, but is far more transparent. The available evidence suggests Florida is taking a careful approach and not acting upon weak correlations tied specifically to ethnicity and gender.

If the findings of the report hold, targeted interventions may prove effective in preventing harm to children. Florida’s example shows that predictive analytics, carefully applied, can yield significant benefits.

Also of Note: August 13, 2014

From Issue 1.48 - By Erica Portnoy
  • The National Telecommunications & Information Administration (NTIA) has received public comments concerning “Big Data and Consumer Privacy in the Internet Economy.” Among these are the Civil Rights Principles for the Era of Big Data, submitted by the Leadership Conference.

  • A draft of Solon Barocas and Andrew Selbst’s award-winning paper “Big Data’s Disparate Impact” is now available. The paper “examines the operation of anti-discrimination law in the realm of data mining and the resulting implications for the law itself.”

  • “Is big data spreading inequality?” asks a recent New York Times feature. Six debaters weighed in, including Seeta Peña Gangadharan of the New America Foundation. “The rise of commercial data profiling is exacerbating existing inequities in society and could turn de facto discrimination into a high-tech enterprise,” she wrote.

  • Ars Technica has a primer on “tech policy problems Congress failed to fix this year.” The list includes email privacy, mass surveillance, and immigration issues.

Big Data Sentencing Could Undermine Fairness, Attorney General Argues

From Issue 1.47 - By David Robinson

Using computerized predictions of future crime to sentence today’s offenders “may inadvertently undermine our efforts to ensure individualized and equal justice,” argued Attorney General Eric Holder last Friday, addressing a Philadelphia meeting of the National Association of Criminal Defense Lawyers. “By basing sentencing decisions on static factors and immutable characteristics – like the defendant’s education level, socioeconomic background, or neighborhood,” such policies “may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society.”

Holder distinguished automated sentencing decisions from automated predictions regarding probation or parole, which he said have “for years, successfully aided in parole boards’ decision-making about candidates for early release. . . Data can also help design paths for federal inmates to lower these risk assessments, and earn their way towards a reduced sentence, based on participation in programs that research shows can dramatically improve the odds of successful reentry. Such evidence-based strategies show promise in allowing us to more effectively reduce recidivism.”

But as Philadelphia’s own experience shows, many of Holder’s sentencing concerns apply with similar force to probation and parole decisions. Philadelphia’s data-driven parole scoring system, developed together with researchers at the University of Pennsylvania and described in a federally-funded technical report, attempts to predict the likelihood of a serious offense after release, and imposes more stringent conditions on the release of inmates judged to be at higher risk. It relies heavily on factors that might be unfair to the individuals being judged:

  • The offender’s home zip code is a factor used to predict recidivism risk. As a result, offenders from the wrong parts of town face greater scrutiny than others when released into supervision.

  • The predictions in Philadelphia’s system are based on “criminal history data for Philadelphia alone,” ignoring offenses committed outside the city limits. As a result, the system is biased toward viewing long-time city residents (whose past contact with the legal system is included in its database) as relatively higher-risk, while artificially underestimating the risk levels for offenders whose past history took place in other jurisdictions.

  • The Philadelphia model uses “charge counts — as opposed to conviction counts — to represent each offender’s criminal history,” contradicting what the report’s authors call “a certain desire to structure supervision around what the offenders were convicted of in court, instead of the offenses that [they] were merely charged . . . with committing.” This puts the matter lightly. Particularly given the culture of plea bargaining, it is not unusual for defendants to be over-charged with crimes more serious than they’ve actually committed.
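A deliberately oversimplified toy score can show how these design choices bias the output. Everything below is invented for illustration — the weights, the zip codes, and the scoring rule bear no relation to the actual Penn model — but the mechanism is the one the report describes: only locally recorded charges are counted, and geography enters directly.

```python
# Toy illustration only: invented weights and hypothetical zip codes,
# NOT the University of Pennsylvania model.

HIGH_RISK_ZIPS = {"19132", "19133"}  # hypothetical "wrong parts of town"

def toy_risk_score(offender):
    # Only charges recorded in the local database are counted, so a
    # long-time resident's full history is visible to the model while
    # an identical history accrued elsewhere is invisible.
    local_charges = [c for c in offender["charges"] if c["city"] == "Philadelphia"]
    score = len(local_charges)
    if offender["zip"] in HIGH_RISK_ZIPS:
        score += 2  # neighborhood penalty, regardless of conduct
    return score

resident = {"zip": "19132",
            "charges": [{"city": "Philadelphia"}, {"city": "Philadelphia"}]}
newcomer = {"zip": "19103",
            "charges": [{"city": "Pittsburgh"}, {"city": "Pittsburgh"}]}

# Identical charge counts, but the long-time resident scores far higher.
```

Two people with the same conduct receive very different scores purely because of where they live and where their records happen to sit.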

The predictive payoff from Philadelphia’s system is real, but far from perfect: less than 10% of the people released are charged with a serious offense within two years, but nearly 40% of those judged “high risk” are charged with one.
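To make “real, but far from perfect” concrete, here is the arithmetic on those two rounded figures:

```python
# Rounded figures as quoted above.
base_rate = 0.10       # share of all released people charged with a serious offense within two years
high_risk_rate = 0.40  # share of those labeled "high risk" who are charged with one

# The label concentrates risk roughly fourfold over the baseline...
lift = high_risk_rate / base_rate

# ...yet 60% of the people flagged "high risk" are never charged
# with a serious offense in that window.
false_flag_share = 1 - high_risk_rate
```

In other words, the system has genuine predictive power, but a majority of the people who bear its most stringent release conditions would not go on to commit a serious offense.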

But even given these imperfections, risk-based assessments perform an important role. As Attorney General Holder observed, they have resulted in “reduced prison populations – and importantly, those reductions are disproportionately impacting men of color.”

Avoiding excessive punishment and treating individuals fairly in light of their individual circumstances are both important goals. And they are both achievable. The social justice community has a crucial role to play in making sure that neither is neglected.