More Ethical and Empowering Data Science

The economy of today and the future is a data economy, one driven by measuring and commodifying information about users’ actions, habits, even desires. But do we understand the power dynamics, and their implications, behind the data that is shaping both our digital and corporeal worlds? For example, when people consume data visualizations, they frequently perceive them as neutral and objective because they are rooted in quantitative metrics and use conventions like clean lines and geometric shapes. Catherine D'Ignazio, Assistant Professor of Urban Science and Planning, is bringing a feminist lens to challenge our assumptions about data science and its communication, exposing the very human and imperfect design processes behind models and algorithms, and exploring pathways toward more ethical and empowering practices.

D'Ignazio is an activist, scholar, artist/designer, and “hacker mama” whose work focuses on feminist technology, data literacy, and civic engagement. She has organized and implemented participatory design events that address a wide range of issues, including breastfeeding and postpartum support, menstruation, sea level rise, and water quality. Her research examines the intersection of technology, design, and social justice. D'Ignazio is also the director of the Data + Feminism Lab, which uses data and computational methods to promote gender and racial equity, with a particular focus on space and place.

Q1. Could you speak more about your forthcoming book, Data Feminism (MIT Press)?

D'Ignazio: Lauren F. Klein and I began writing Data Feminism because we were struck by how positivist the hype around Big Data and AI was – the false idea that data always have a one-to-one correspondence with reality, and that a dataset from over here can be combined with one from over there and will always yield correlations (and profit). We were also struck by how many “data for good” projects do not consider who gets to “do good” and who gets someone else’s “good” done to them. That, and the fact that there is now so much coverage of bias and discrimination in AI. While I’m very encouraged that we are having more public discussions about ethics in data science and concepts like fairness and accountability in AI, we are still not going to avoid creating discriminatory data products without a basic understanding of structural oppression – what sociologist Patricia Hill Collins calls “the matrix of domination.”

So, we set out to write a book that could introduce feminist concepts and ideas to data scientists and, vice versa, introduce data, algorithms, and visualization to feminists. We tried to look across feminist scholarship and activism from the past forty or so years, with a particular focus on Black feminism in the US context and on feminist approaches coming out of applied fields like human-computer interaction, geography and GIS, and participatory design. We came up with seven principles for undertaking a more feminist approach to data science, and each chapter is structured around one of them. And we try to make it interesting by showcasing many examples of folks who are working with data in feminist ways.

Q2. A central tenet of the Data + Feminism Lab is activism. We don’t often associate activism with data science, so I am curious: how does data inform the activism of the lab?

D'Ignazio: In Data Feminism, we discuss how many of our current ethical quandaries in data and AI can be traced back to a lack of understanding of oppression – meaning racism, sexism, colonialism, and more. Oppression shows up in our datasets and algorithms in the same way that it shows up in our institutions and our research meetings. For example, while it might be shocking to some people that Amazon can’t develop a gender-neutral hiring algorithm, Lauren and I are definitely not surprised by that, given the historical hiring practices at tech companies. Just as we have “garbage in, garbage out,” we have “sexism in, sexism out.”

Intersectional feminism starts with the idea that oppression is real, historic, on-going, and worth dismantling. The activist part comes in at the end there – the dismantling stage. But I really don’t see it as radical to work for a world in which all people – women of color, white women, disabled people, Indigenous people, non-binary people – have equal life chances. To me, using data science to rectify these basic but pervasive power imbalances should be mainstream practice.

Q3. For students and colleagues who are excited to connect and work together, would you describe some of your future research projects?

D'Ignazio: I hope to co-create a community of people who are interested in anti-oppressive approaches to computation. To this end, the Data + Feminism Lab will hold bi-weekly, open research meetings where we will read and discuss papers, give feedback on work, and deepen our understanding of feminism and anti-racism together.

I will continue to organize hackathons in conjunction with Alexis Hope at the Media Lab. Our next one, on September 12th and 13th, 2020, is called There Will Be Blood, and it will convene students, community groups, and national leaders to hack on menstrual equity (everything from products to policy to culture change).

I’m also starting new research projects. One has to do with feminicides – gender-based killings of women, including cis and trans women. We discuss feminicides in Data Feminism as a case of “missing data,” as Mimi Onuoha calls it, or what Caroline Criado Perez calls “the gender data gap.” As with so much data that relates to women, these data go underreported, uncollected, and unprioritized, so individual activists and civic collectives step in to fill that gap, often at great personal and political cost. I’m in the early stages of conversations with collaborators based in Latin America, and we are interested in looking at how to build both technologies and solidarities. I don’t want to do this work alone, but rather in collaboration with the activist mappers who are monitoring and recording instances of feminicide when their governments have systematically failed to do so.

And then the other project I’m getting off the ground has to do with gender equity in the city’s symbolic landscape. That’s the topic of my spring course, Crowd Sourced City: how many streets and places in your city are named for men vs. women vs. non-binary people? In places where people have done these audits, like Rome, the percentage of streets named for women is on the order of 3% – terrible! We have a couple of exceptional partners who are interested in doing this work with us in different geographic contexts – for example, the Boston Public Library in the Boston area, and Geochicas, a feminist collective that has mapped a number of cities in Latin America. I’m also in discussion with a city government that is interested in auditing its symbolic infrastructure for its connections with slavery, which would be a really important project if it moves forward.

While collecting and analyzing data doesn’t solve all the problems on its own, quantifying disparities related to race and gender is a first step towards shifting the balance of power. Structural oppression is systemic and operates at macro and micro scales. Forces of oppression are actually uniquely suited to interrogation and visualization using data science – what previously may have appeared to be an isolated incident (a sexist hiring algorithm, a racist face detection system) is revealed to be part of a much larger pattern. Recognizing the pattern is the first step towards challenging and changing it.