Data Feminism - Review and Application

Co-authored by Annice Joseph and Sarayu Acharya

In the age of information, data is the lens through which we collectively view reality. Whether we’re looking for the best place to eat, the fastest route to our destination, or even whom to hire, data shapes our every move. Sometimes, however, we’re forced to wonder where this data comes from and whether it is being responsibly sourced. How would you like it if you were addressed by the wrong name at a restaurant? Or if you received the wrong tickets to a show you were looking forward to? These seem like small issues in the grand scheme of things, but even this lack of verification can create a sense of discrimination. Imagine, then, how much more dangerous errors could be in data tied to larger issues: they could cause systemic discrimination in healthcare, in the ability to rent or own a home, in hiring, and beyond. With great power comes great responsibility, but how responsibly is the data we regard so highly collected and disseminated?

The Problem with Data

In a world that relies on data for every minuscule decision we make, it is often assumed that objective data is the purest way of knowing. However, as with any other form of information, two questions must be asked when dealing with data: who collects it, and why?

In India, Persons with Disabilities (PwD) were left out of the census data for almost half a century (from 1941 to 2001)! The impact of this missing data is not only a sense of dehumanization and erasure for Persons with Disabilities; it has also meant their exclusion from overall development in national and state support plans. Despite new laws and years of activism, not much has changed even today. Inclusion continues to be a challenge in every sphere, from education to medicine to employment. It is imperative to adapt our environment to include persons with disabilities, but that is incredibly hard to do with so little verifiable data.

In the US, the Black homeownership rate at the end of 2020 was 44%, compared to 74.5% for non-Hispanic white consumers. While credit scores in the U.S. were originally designed to curb racial discrimination through redlining and other unfair practices, they have recently come under criticism for using data that reflects historical bias while omitting certain types of data (such as rental and cell phone payments) that might include a broader swath of people, including Black and Hispanic consumers. In terms of interpersonal interactions, a 2018 Cornell University paper titled ‘Debiasing Desire: Addressing Bias & Discrimination on Intimate Platforms’ dissects the complex ways in which dating app algorithms preserve the social status quo by racially and socio-economically biasing matches.

For those of us recruiters who often wonder why we are unable to find the right candidates, a 2019 Harvard Business Review article describes how the algorithms that source candidate pools can also be programmed with bias if the analysts in charge do not actively design them to promote equity. Bias can enter through how the candidate pool is defined or through the criteria used to narrow the funnel. For example: if you want candidates from a certain educational or work background but your search pool is geographically limited to only one or two cities, your candidate pool shrinks to a very small cross-section of society, and you lose out on the best diverse candidates. Since data is inherently controlled by a hegemonic group, there is inherent censorship and crafting of the data we have access to.
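To make the funnel-narrowing effect concrete, here is a minimal Python sketch. Everything in it, the candidate records, field names, and filter logic, is invented for illustration and is not drawn from any real recruiting platform:

```python
# Hypothetical candidate records; all names and fields are invented.
candidates = [
    {"name": "Asha",  "city": "Bangalore", "open_to_relocation": False},
    {"name": "Ravi",  "city": "Mysore",    "open_to_relocation": True},
    {"name": "Meera", "city": "Chennai",   "open_to_relocation": True},
]

# A naive sourcing filter limited to one city silently drops qualified
# candidates who live elsewhere but are open to relocating.
narrow_pool = [c for c in candidates if c["city"] == "Bangalore"]

# A more equity-conscious filter widens the funnel to include them.
wider_pool = [
    c for c in candidates
    if c["city"] == "Bangalore" or c["open_to_relocation"]
]

print(len(narrow_pool))  # 1 candidate survives the narrow filter
print(len(wider_pool))   # 3 candidates survive the wider one
```

The point is not the three-line filter itself but the design choice behind it: every inclusion criterion an analyst hard-codes silently defines who never appears in the results.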

The book ‘Data Feminism’ by MIT and Georgia Tech professors Catherine D'Ignazio and Lauren Klein challenges the “cleanliness” of data by fundamentally accepting that power is not distributed equally in the world: those who wield it are disproportionately elite, straight, white, able-bodied, cisgender men from the Global North. They introduce data feminism as a way of challenging the oppressive systems of power that harm all of us, undermine the quality and validity of diverse work, and hinder us from creating true and lasting social impact with data science. This makes us question our trust in data and realise that what we interpret as objective information is in fact filtered by who is seeking it, who is mining it, how much of it is being presented, and what purpose it is serving.

Across the globe, systems of power have historically left out data that is "inconvenient" to record, leaving marginalised groups and communities erased or invisible in the larger social narrative. This establishes a matrix of domination and maintains control over all areas of life, be it citizenship, interpersonal experiences, or hiring. So how do we destabilise this hegemony of data and access more inclusive ways of knowing?

Embracing Pluralism 

The book sets out seven principles for overcoming the hegemony of data: examining power, challenging power, elevating emotion and embodiment, rethinking binaries and hierarchies, embracing pluralism, considering context, and making labor visible.

Principle #5, Embrace Pluralism, is the one we would like to discuss here. This principle highlights:

  • Take into account various intersections, considering who views the data and how it is being analysed.
  • Take a transdisciplinary view of data science, AI, chatbots, and various data sources.
  • Recognise that qualitative data is equally relevant.
  • Resist the urge to filter out the fringes and focus only on the middle of the curve. The “tidyverse” can be a diversity-hiding trick, so it is important not only to look at averages but to ask why there are outliers in data sets.

Data feminism thus proposes that the most complete knowledge comes from synthesizing multiple perspectives, with priority given to local, Indigenous, and experiential ways of knowing. Applied, this would mean increasing the representation of diverse groups in data analytics as well as restructuring our acceptance of objective quantitative data as the “purest form of knowledge”, i.e. making room for qualitative data and non-algorithmic ways of collecting information.

Our strategy as a society to diversify data will need to disrupt all the aforementioned matrices of domination at the national, organisational, and interpersonal levels. And while there is still a way to go in developing a solution at a macro scale, here are some next steps we can take across levels to effect change in data today.

Key Takeaways

  1. Diversity in Data Analytics: Ensuring that the people collecting and analysing data come from diverse backgrounds and experiences will help reduce bias in the way data is collected and disseminated. Example: if all hiring data is analysed and filtered by someone who has only ever worked in the corporate sector and would not consider sourcing from nonprofits or social enterprises, you might miss out on an incredibly qualified hire from that background who would bring in a diverse perspective.
  2. Mindful Algorithms: Revising algorithms across hiring (and dating) platforms to consciously exclude bias is a vital if difficult task. Since algorithms operate on fixed rules for segregating data, equity must be deliberately programmed in if it is to be achieved at the organisational and interpersonal levels. Example: if a recruiter searches for Java programmers in India using a location filter for Bangalore, the result is a narrow funnel of candidates who happened to be in Bangalore at that point in time, missing all the candidates who could move to Bangalore from nearby locations (exactly the effect sketched in the code above).
  3. Creating Awareness: While the problem with data has been around since the advent of data analytics, very few people are aware that data may not always be accurate. Creating awareness about the sometimes faulty nature of data, in a way that encourages deeper critical thinking and factors in other ways of knowing, can greatly benefit every one of us. When we are faced with anomalies, outliers, and obscure highlights in data, instead of discounting their presence, it is important to seek the cause of these changes, either by drilling down for more detail or by considering the effects of external factors. Example: a company had two locations (A and B) in the same country. One had excellent retention of campus hires, while at the other, candidates either asked for a transfer or left the company; the assumption was that location B was simply not attractive. However, after a group discussion with the campus hires, we found that location B did not have an airport, which made it difficult to get home on weekends, especially since most campus hires were foreign students. This point was left out of the traditional data set and discovered only through the non-quantitative experience of a group discussion. Being open to such awareness will help us be more equitable and use data to bring about inclusion rather than exclusion (a minimal sketch of flagging such outliers for follow-up appears below).
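As a sketch of that last takeaway, the following Python snippet uses made-up retention figures and an arbitrary threshold to flag outlier locations for qualitative follow-up, such as the group discussion above, rather than discarding them as noise:

```python
# Made-up retention rates for campus hires by location; not real data.
retention_by_location = {
    "Location A": 0.91,
    "Location B": 0.48,  # the outlier worth investigating, not discarding
    "Location C": 0.88,
    "Location D": 0.90,
}

mean_rate = sum(retention_by_location.values()) / len(retention_by_location)

# Flag locations far from the mean for qualitative follow-up,
# e.g. a group discussion with the hires themselves.
THRESHOLD = 0.20  # arbitrary, chosen only for illustration
for location, rate in retention_by_location.items():
    if abs(rate - mean_rate) > THRESHOLD:
        print(f"{location}: retention {rate:.0%} vs mean {mean_rate:.0%}; "
              f"investigate the cause before treating it as noise")
```

The threshold and the figures are placeholders; the practice being illustrated is treating an outlier as a question to answer rather than a row to drop.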

While data is, and has always been, knowledge, it is also power. We must do everything in our capacity to ensure that this power is channelised in a way that provides an inclusive, equitable, lived experience for all.
