Highlighted Selections from:

From Open Data to Information Justice

DOI: 10.2139/ssrn.2241092

Johnson, Jeffrey Alan, From Open Data to Information Justice (April 8, 2013). Midwest Political Science Association Annual Conference, April 2013 Version 1.1. Available at SSRN: http://ssrn.com/abstract=2241092

p.1: This paper argues for subsuming the question of open data within a larger question of information justice. I show that there are several problems of justice that emerge as a consequence of opening data to full public accessibility, and are generally a consequence of the failure of the open data movement to understand the constructed nature of data. -- Highlighted mar 12, 2014

p.1: This necessitates a theory of information justice. I briefly suggest two complementary directions in which such a theory might be developed: one leading toward moral principles that can be used to evaluate the justness of data practices, and another exploring the practices and structures that a social movement promoting information justice might pursue. -- Highlighted mar 12, 2014

p.2: With a culture of technological neutrality (Johnson 2007) and radical individualism (Walls and Johnson 2011) dominating the open data movement it is exceptionally easy for data scientists and users to accept current data practices and outcomes as natural or inevitable, and to make data use the only moral question of interest. -- Highlighted mar 12, 2014

p.3: Data is, in an important sense, a form of communication between actors that embeds the assumptions and worldview of those actors in what is communicated. It is, like all technologies, a construct, an operationalization of an actor’s concept and reality, interpreting between the physical world and the intellectual structures by which actors understand that world, and embedded in a set of social practices by which it is created, interpreted, and used. -- Highlighted mar 12, 2014

p.4: Data over-represents some, and where those over-representations parallel existing structures of social privilege, it over-represents those already privileged and under-represents those less likely to be part of data producing interactions. -- Highlighted mar 12, 2014

p.4: One well-studied example is the undercount of the decennial United States Census. (Prewitt 2010) Since the problem of undercounting was first quantified in the mid-Twentieth Century, black and Hispanic households have been undercounted at higher rates than nonblack households. The causes of this undercount are myriad: Households are not missed in the census because they are black or Hispanic. They are missed where the Census Bureau’s address file has errors; where the household is made up of unrelated persons; where household members are seldom at home; where there is a low sense of civic responsibility and perhaps an active distrust of the government; where occupants have lived but a short time and will move again; where English is not spoken; where community ties are not strong. (Prewitt 2010, 245) -- Highlighted mar 12, 2014

p.5: What information that will be is not a natural consequence of the interaction but a design choice on the part of the data architects that reflects their purposes, resources, and values. -- Highlighted mar 12, 2014

p.7: This bureaucratic mindset builds data that reflects the bureaucratic values of efficiency and consistency, doing so at the cost of excluding data that cannot be accommodated to those values. Donovan (2012) cites this as an instance of Scott’s (1998) “seeing like a state” in which the local government sought to simplify society by making it legible. The open data system incorporated this value in its choice of what to datize about the moment in which land was transferred. This incorporated a value structure into the data, one that is clearly not neutral in the competition for power. -- Highlighted mar 12, 2014

p.7: Whatever steps are taken to promote fairness in using data that is at its root unjust, the result will almost inevitably be unjust as well. Data is very much a case of “Injustice in, injustice out.” -- Highlighted mar 12, 2014

p.8: Gurstein posits a seven-layer model for promoting effective use of open data that identifies many of the most important complementary structures:

1. Sufficient internet access that data can be accessed by all users.
2. Computers and software that can read and analyze the data.
3. Computer skills sufficient to use them to read and analyze data.
4. Content and formatting that allows use at a variety of levels of computer skill and linguistic ability.
5. Interpretation and sense-making skills, including both data analysis knowledge and local knowledge that adds value and relevance.
6. Advocacy in order to translate knowledge into concrete benefits.
7. Governance that establishes a regime for the other characteristics.

In the absence of these conditions it is not likely that any open data will promote justice. -- Highlighted mar 12, 2014

p.9: But “citizen-open” pales in comparison to what might be called “enterprise-open” data. Enterprises will have the resources to get the most out of open data -- Highlighted mar 12, 2014

p.12: what Adams (2013) refers to as “post-panoptic surveillance.” She identifies three types of observation that can replace the intensive central surveillance in Foucault’s own work, two of which would be enhanced by open data. “Sousveillance” involves observation from below rather than through hierarchy. Such surveillance occurs when, for instance, users of a service are asked to evaluate service providers by the providers’ supervisors. The role of observation is shifted in ways similar to the distinction between police patrol and fire alarm oversight of the legislative-executive relationship: (McCubbins and Schwartz 1984) the labor-intensive burden of observation is shifted away from the central authority to actors with a closer interest in observation in such a way that deviations from the desired outcomes will still be brought to the authority’s attention -- Highlighted mar 12, 2014

p.14: It seems likely, then, that a theory of information justice will be built around forms of pluralism. Information pluralism would embrace, rather than problematize, the “messiness” of data. Rather than seeing conflicting data as inherently erroneous it would encourage information systems to be designed to incorporate and highlight differences in data, identifying them as moments of conflict among assumptions and values to be resolved through social rather than algorithmic solutions. -- Highlighted mar 12, 2014
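
Note on the p.14 highlight: one way to picture an information system that "incorporates and highlights differences in data" is a record that keeps every source's value for a field and reports disagreements instead of silently choosing a winner. The sketch below is only an illustration under that reading, not something taken from the paper. The class names (`Claim`, `PluralRecord`), the field names, the source names, and the sample values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One source's recorded value for a field (hypothetical structure)."""
    source: str
    value: str


class PluralRecord:
    """Keeps every claimed value per field instead of overwriting earlier ones."""

    def __init__(self):
        self._claims = defaultdict(list)

    def add(self, field, source, value):
        # Append rather than replace, so no source's account is discarded.
        self._claims[field].append(Claim(source, value))

    def conflicts(self):
        """Return the fields where sources disagree, with all competing claims."""
        return {
            field: claims
            for field, claims in self._claims.items()
            if len({c.value for c in claims}) > 1
        }


# Hypothetical sample data: two sources describing the same land parcel.
parcel = PluralRecord()
parcel.add("owner", "land_registry", "A. Example")
parcel.add("owner", "community_survey", "B. Example")
parcel.add("area_hectares", "land_registry", "2.5")
parcel.add("area_hectares", "community_survey", "2.5")

# Disagreements are surfaced for people to resolve; nothing is auto-reconciled.
disputed = parcel.conflicts()
for field, claims in disputed.items():
    print(field, [(c.source, c.value) for c in claims])
```

The design choice here is that `conflicts()` only reports disagreement. It does not rank sources or pick a value, which leaves the resolution to a social process rather than an algorithmic one, as the highlighted passage suggests.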

p.15: Nor might we expect those with power to voluntarily cede it without challenge; even those who seek information justice—and a fair assessment of open data advocates would certainly suggest that many believe that they do so—are often unable to do so because of their unconscious biases and invisible privileges that they would change if they were conscious and visible. -- Highlighted mar 12, 2014

p.16: Ultimately it is the organizations in civil society, not philosophers, that make it possible for marginalized groups to participate collaboratively or to challenge embedded power structures in information systems. -- Highlighted mar 12, 2014