AI and tech ethics resources

April 2, 2023

I follow as much discourse around ethics in machine learning, data analysis, and artificial intelligence as I can. These are the resources I’ve used over the years to help me gather knowledge and perspectives and form my own opinions about these types of technology and implementations.

I’ve co-presented two talks on machine learning bias and given another on my own about the effects of missing data on data analysis. The slides for those talks can be found on About Sarah.

As an individual person, I have specific interests and biases that inform what content goes from “encountered in my news feed or RSS reader” to “opened in a new tab” to “actually read the thing”. This list is categorized according to those loose interests and resultant taxonomy. Some items might fall into multiple categories.

I’ve denoted things as follows:

📚 for books
📄 for research papers
🎙️ for podcasts
📰 for articles in non-academic publications
💻 for blogs or blog posts
📧 for email newsletters

I’ve also included some content that I haven’t yet made the time to consume, denoted with a ⌛️ emoji.

Dataset creation and curation, including data labeling

📚 Living in Data by Jer Thorp
📚 Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass by Mary L Gray and Siddharth Suri
- I found this book to be pretty dry, and there were some editing issues in the Kindle version that I read, but it was still highly informative.
📄 Data and its (dis)contents: A survey of dataset development and use in machine learning research
📄 The Data-Production Dispositif
📄 Large image datasets: A pyrrhic win for computer vision?
📄 Datasheets for datasets
📄 An Assessment of Intrinsic and Extrinsic Motivation on Task Performance in Crowdsourcing Markets
📰 Underpaid Workers Are Being Forced to Train Biased AI on Mechanical Turk (Vice)
📰 The Exploited Labor Behind Artificial Intelligence (Noema)
🎙️ GPT-4 and the Politics of OpaqueAI (ft. Abeba Birhane) by This Machine Kills
- 📄 Discussed in this episode: The Values Encoded in Machine Learning Research
🎙️ How Britain Killed its Computing Industry w/ Mar Hicks - Tech Won’t Save Us
- I have a note from listening to this podcast with the phrase “do away with the notion of data as a mirror to society, it distorts society”, so it’s relevant if only for that line.
🎙️ How AI Makes Living Labor Undead - This Machine Kills
📚 Atlas of AI by Kate Crawford ⌛️
📚 Data Feminism by Catherine D’Ignazio and Lauren F. Klein ⌛️
📄 Wisdom for the Crowd: Discoursive Power in Annotation Instructions for Computer Vision ⌛️
📄 Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development ⌛️
📄 Documenting Computer Vision Datasets ⌛️
📄 We Haven’t Gone Paperless Yet: Why the Printing Press Can Help Us Understand Data and AI ⌛️
📄 Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data ⌛️
📄 Towards Accountability for Machine Learning Datasets ⌛️

Representation in tech and machine learning, globally, linguistically, racially, or otherwise

📚 Invisible Women: Data Bias in a World Designed for Men by Caroline Criado Perez
- This book is the canonical “missing data = bias” book, but I found it to be problematic for two reasons:
  1. The writing is fairly academic, with citations upon citations and discourse about specific research findings that goes on for so long that you can’t quite remember what point the research summary is supporting.
  2. The premise of the book is deeply gender essentialist. It’s a book about missing data that never mentions gender non-conforming or trans folks and relies on a definition of gender rooted in biological sex. A great reminder to be mindful of the biases present even in books about bias!
📰 An MIT Technology Review Series: AI Colonialism (MIT Technology Review) ⌛️
📄 [2111.15366] AI and the Everything in the Whole Wide World Benchmark ⌛️
📚 More Than a Glitch: Confronting Race, Gender, and Ability Bias in Tech by Meredith Broussard ⌛️
📚 Programmed Inequality by Mar Hicks ⌛️
📚 Black Software by Charlton D. McIlwain ⌛️
📚 Mismatch: How Inclusion Shapes Design by Kat Holmes ⌛️
📄 The Coloniality of Data Work in Latin America ⌛️
📄 Towards decolonising computational sciences ⌛️
📄 The Limits of Global Inclusion in AI Development ⌛️
📄 The Forgotten Margins of AI Ethics ⌛️

Machine learning bias, especially for decision-making

📚 The Alignment Problem by Brian Christian
🎙️ Transcript: Ezra Klein Interviews Brian Christian
📚 Weapons of Math Destruction by Cathy O’Neil
- Written much more casually than The Alignment Problem and full of personal anecdotes, which probably explains why this book was such a mainstream hit.
📄 Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power?
📰 Machine Bias (ProPublica)
- Basically the canonical example of irresponsibly deployed machine learning models.
🎙️ The Limitations of ChatGPT with Emily M. Bender and Casey Fiesler — The Radical AI Podcast
📰 Health Care Bias Is Dangerous. But So Are ‘Fairness’ Algorithms (Wired) ⌛️
📰 The Fraud-Detection Business Has a Dirty Secret (Wired, series) ⌛️
📚 Algorithms of Oppression: How Search Engines Reinforce Racism by Safiya Umoja Noble ⌛️
📚 Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor by Virginia Eubanks ⌛️
📚 The Black Box Society: The Secret Algorithms That Control Money and Information by Frank Pasquale ⌛️
📚 Predict and Surveil: Data, Discretion, and the Future of Policing by Sarah Brayne ⌛️
📚 Artificial Unintelligence: How Computers Misunderstand the World by Meredith Broussard ⌛️
📚 Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech by Sara Wachter-Boettcher ⌛️
📚 Hello World by Hannah Fry ⌛️
📚 Your Computer Is on Fire edited by Thomas S. Mullaney, Benjamin Peters, and Mar Hicks ⌛️
📚 Captivating Technology: Race, Carceral Technoscience, and Liberatory Imagination in Everyday Life edited by Ruha Benjamin ⌛️
📚 Sorting Things Out: Classification and Its Consequences by Geoffrey C. Bowker and Susan Leigh Star ⌛️

Machine learning development and implementation

📚 The Ethical Algorithm by Michael Kearns and Aaron Roth
- A practical discussion of ethical approaches to algorithm design and the involved tradeoffs.
🎙️ Deploying Machine Learning Models Safely and Systematically – The Data Exchange
📄 On the Dangers of Stochastic Parrots
🎙️ Emily M. Bender — Language Models and Linguistics | gradient-dissent – Weights & Biases
🎙️ The Power of Linguistics: Unpacking Natural Language Processing Ethics with Emily M. Bender
📚 Understand, Manage, and Prevent Algorithmic Bias: A Guide for Business Users and Data Scientists by Tobias Baer ⌛️
📚 The Oxford Handbook of Ethics of AI edited by Markus D. Dubber, Frank Pasquale, and Sunit Das ⌛️
📚 You Look Like A Thing and I Love You by Janelle Shane ⌛️
- I’m halfway through this and it’s excellent so far. A very straightforward discussion of how ML works, how it doesn’t work, and why.
📄 The Fallacy of AI Functionality ⌛️

Auditing, testing, and monitoring machine learning

Alternate approaches to data-driven and ML-driven systems

General resources

🎙️ The Radical AI Podcast
- A podcast at the intersection of ethics and artificial intelligence.
🎙️ The Data Exchange Podcast
- A podcast from folks in the data industry.
🎙️ This Machine Kills
- A podcast from researchers and critics of AI and the data industry.
🎙️ Tech Won’t Save Us
- A podcast that critically examines the tech industry, data included.
🎙️ Gradient Dissent from Weights and Biases
- A podcast from folks in the data industry.
📧 The AI Ethics Brief
💻 ★❤✰ Vicki Boykis ★❤✰
- An excellent blog from an ML practitioner and champion of data grunt work.
📧 Import AI
- Not about ethics or bias specifically, but a worthwhile perspective from an AI insider.
📧 AI Snake Oil
📧 Counting Stuff
📧 Benn Stancil
📧 Rebel Tech
📄 arXiv > Computer Science > Artificial Intelligence
📄 arXiv > Computer Science > Computer Vision and Pattern Recognition
📄 arXiv > Computer Science > Computers and Society
📄 arXiv > Computer Science > Computation and Language
📄 arXiv > Computer Science > Machine Learning
📄 arXiv > Computer Science > Human-Computer Interaction
- I use a keyword filter on the Computer Science portion of arXiv with my RSS feed reader to pull up potentially relevant articles, but these are the areas in which they usually appear.
📄 AAAI/ACM Conference on AI, Ethics, and Society (AIES)
📄 ACM Conference on Fairness, Accountability, and Transparency (FAccT)
- Useful conference proceedings to dig into each year.
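
For anyone who wants to try the keyword filter on arXiv feeds mentioned above without relying on a feed reader's built-in filtering, here is a minimal sketch of the idea. It is not the setup used for this list: the keywords, the `filter_feed` function name, and the sample feed are all made up for illustration, and it assumes a plain RSS 2.0 feed with `item`, `title`, `link`, and `description` elements. A real feed would need to be downloaded first and may use a different layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical keywords; the post does not say which ones are actually used.
KEYWORDS = ["bias", "fairness", "annotation", "dataset"]

# Made-up sample in plain RSS 2.0 shape, standing in for a downloaded feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Measuring annotation quality in crowdsourced labels</title>
      <link>https://example.org/a</link>
      <description>We study label noise.</description>
    </item>
    <item>
      <title>Faster matrix multiplication kernels</title>
      <link>https://example.org/b</link>
      <description>A systems paper.</description>
    </item>
    <item>
      <title>Auditing a hiring model</title>
      <link>https://example.org/c</link>
      <description>We find evidence of gender bias.</description>
    </item>
  </channel>
</rss>"""


def filter_feed(rss_xml, keywords):
    """Return (title, link) pairs for items whose title or description
    contains any keyword (case-insensitive substring match)."""
    root = ET.fromstring(rss_xml)
    lowered = [k.lower() for k in keywords]
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        description = item.findtext("description", default="")
        text = (title + " " + description).lower()
        if any(k in text for k in lowered):
            matches.append((title, link))
    return matches


for title, link in filter_feed(SAMPLE_FEED, KEYWORDS):
    print(title, link)
```

Substring matching is deliberately loose here, so "bias" also catches "biased" and "biases". That means more false positives to skim past, which seems like an acceptable trade when the goal is to surface potentially relevant papers rather than to classify them.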