
The Trouble with Bias - Kate Crawford - NIPS (NeurIPS) 2017 Keynote
Suraj
Overview
Kate Crawford's keynote address at NIPS 2017 explores the pervasive issue of bias in machine learning systems. She argues that bias is not merely a technical glitch but a deeply rooted social problem, stemming from historical discrimination and cultural assumptions embedded in data and classification systems. Crawford distinguishes between 'harms of allocation' (e.g., loan denials) and 'harms of representation' (e.g., stereotyping, denigration), emphasizing the need to address the latter, which are often overlooked. She traces the history of classification to highlight its inherent subjectivity and cultural dependence. She urges the machine learning community to adopt interdisciplinary approaches, practice 'fairness forensics,' and critically examine the ethics of the systems they build, warning that unchecked bias could lead to a loss of trust and an 'AI winter.'
Chapters
- Machine learning is rapidly expanding into critical areas like healthcare and criminal justice, mirroring the societal impact of computing and mass media.
- Despite excitement, bias, stereotyping, and unfair determinations are prevalent in ML systems, from object recognition to sentiment analysis.
- High-profile examples include gender bias in job ads, racial disparities in delivery services mirroring historical redlining, and biased risk assessment scores.
- The surge in interest in bias is justified because ML affects millions of people every day, but many of the back-end systems that propagate bias are hidden and harder to detect.
- The term 'bias' has multiple, sometimes contradictory, meanings across mathematics, statistics, law, and popular usage, causing confusion.
- Statistical bias refers to systematic differences between a sample and a population (e.g., selection bias), distinct from the legal/popular sense of prejudice.
- Technical bias in ML (e.g., underfitting versus overfitting) is different from bias in the legal sense, which means judgment based on preconceived notions; a model can perform well on its data and still be biased in the legal sense.
- Bias in ML often originates from biased training data, human labeling, and cultural assumptions, reflecting historical societal discrimination.
- Existing ML fairness research primarily focuses on 'harms of allocation,' where systems unfairly distribute opportunities or resources (e.g., loans, jobs).
- 'Harms of representation' occur when systems reinforce the subordination of groups by perpetuating stereotypes or misrepresentations, regardless of resource allocation.
- Examples of representational harms include stereotyping (gendered occupations), denigration (offensive labels like 'gorilla' for Black individuals), and under-representation (lack of diversity in image search results).
- Harms of representation are often overlooked because they are harder to quantify and are a more diffuse, long-term cultural issue compared to immediate allocation decisions.
- Classification is not a neutral technical act but a social and political one, always reflecting the time, culture, and biases of its creators.
- Historical attempts at classification, from Aristotle to Enlightenment taxonomies, show how social, religious, and linguistic assumptions are embedded in categorization.
- Modern ML systems are engaged in the largest classification experiment in history, with choices about categories having significant social consequences (e.g., Facebook's evolving gender categories).
- Datasets often reflect societal hierarchies, leading to under-representation or misrepresentation of certain groups (e.g., George W. Bush being the most represented face in the 'Labeled Faces in the Wild' dataset).
- Technical fixes like 'scrubbing to neutral' are insufficient because defining 'neutrality' is complex and often ignores historical discrimination.
- Representational harms often exceed the scope of individual technical interventions and require different theoretical tools.
- The ML community must embrace 'fairness forensics' (e.g., pre-release trials across populations) and take interdisciplinarity seriously by collaborating with social scientists, ethicists, and legal experts.
- The community also needs to question the ethics of classification itself, asking 'Should we build this?' and considering who benefits from and who is harmed by the systems it creates.
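The summary describes 'fairness forensics' only as pre-release trials across populations, without giving a procedure. As one possible illustration, the sketch below compares a classifier's error rate across groups before release. It is a minimal example under stated assumptions, not a method from the talk: the function names, the group labels, and the records are all hypothetical. A check like this can only surface measurable performance gaps; it does not capture the harms of representation discussed above.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return the fraction of wrong predictions for each group.

    records: iterable of (group, true_label, predicted_label) tuples.
    This record layout is an assumption made for the example.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, y_true, y_pred in records:
        totals[group] += 1
        if y_true != y_pred:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


def largest_gap(rates):
    """Difference between the worst and best per-group error rate."""
    return max(rates.values()) - min(rates.values())


# Hypothetical trial results: (group, true label, predicted label).
records = [
    ("group_a", 1, 1),
    ("group_a", 0, 0),
    ("group_a", 1, 1),
    ("group_a", 0, 1),  # one error out of four for group_a
    ("group_b", 1, 0),  # error
    ("group_b", 0, 1),  # error
    ("group_b", 1, 1),
    ("group_b", 0, 0),  # two errors out of four for group_b
]

rates = error_rates_by_group(records)
gap = largest_gap(rates)

for group, rate in sorted(rates.items()):
    print(f"{group}: error rate {rate:.2f}")
print(f"largest gap between groups: {gap:.2f}")
```

A large gap is a signal to investigate the data and labels further, not a verdict on its own; what counts as an acceptable gap is a judgment that the talk argues should involve more than the engineering team.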
Key takeaways
- Bias in machine learning is a complex socio-technical problem, not just a data or algorithmic issue.
- Understanding the historical and cultural context of data is crucial for identifying and mitigating bias.
- Harms of representation, which shape societal perceptions, are as significant as harms of allocation, which affect resources and opportunities.
- Classification systems are inherently subjective and reflect the values and biases of their creators and their time.
- Interdisciplinary collaboration is vital for addressing the multifaceted nature of bias in AI.
- The machine learning community has a responsibility to question the ethical implications of the systems they build and to consider who might be harmed.
- There is no single 'silver bullet' solution to bias; continuous vigilance and critical assessment are required.
Test your understanding
- How does the definition of 'bias' in statistics differ from its meaning in legal and social contexts, and why is this distinction important for machine learning?
- What are the key differences between harms of allocation and harms of representation, and why is it important to consider both?
- Explain why classification systems in machine learning are considered inherently political and social, rather than purely technical.
- What does Kate Crawford mean by 'fairness forensics,' and how can it help mitigate bias in ML systems?
- Why is interdisciplinary collaboration essential for addressing bias in AI, and what are some examples of disciplines that should be involved?