Considering how powerful AI systems are, and the roles they increasingly play in helping to make high-stakes decisions about our lives, homes, and societies, they receive surprisingly little formal scrutiny.
That’s starting to change, thanks to the blossoming field of AI audits. When they work well, these audits allow us to reliably check how well a system is working and figure out how to mitigate any possible bias or harm.
Famously, a 2018 audit of commercial facial recognition systems by AI researchers Joy Buolamwini and Timnit Gebru found that the systems didn’t recognize darker-skinned people as accurately as they did white people. For dark-skinned women, the error rate was up to 34%. As AI researcher Abeba Birhane points out in a new essay in Nature, the audit “instigated a body of critical work that has exposed the bias, discrimination, and oppressive nature of facial-analysis algorithms.” The hope is that by doing these sorts of audits on different AI systems, we will be better able to root out problems and have a broader conversation about how AI systems are affecting our lives.
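To make the idea of a disparity in error rates concrete, here is a minimal sketch of the kind of per-group comparison a bias audit might run. It is not the method used in the 2018 audit; the function name `error_rates_by_group`, the group labels, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return the share of wrong predictions for each group.

    `records` is an iterable of (group, predicted, actual) tuples.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical toy data, not taken from any real audit.
records = [
    ("group_a", "match", "match"),
    ("group_a", "match", "match"),
    ("group_a", "no_match", "no_match"),
    ("group_a", "no_match", "match"),   # error
    ("group_b", "match", "match"),
    ("group_b", "no_match", "match"),   # error
    ("group_b", "match", "no_match"),   # error
    ("group_b", "no_match", "no_match"),
]

rates = error_rates_by_group(records)

# The gap between the best- and worst-served groups is one simple
# number an auditor might report.
gap = max(rates.values()) - min(rates.values())

for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} error rate")
print(f"gap: {gap:.0%}")
```

A real audit would need far more data, carefully defined groups, and more than one metric, but the core step of breaking results down by group rather than reporting a single overall accuracy figure is the same.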
Regulators are catching up, and that is partly driving the demand for audits. A new law in New York City will start requiring all AI-powered hiring tools to be audited for bias from January 2024. In the European Union, big tech companies will have to conduct annual audits of their AI systems from 2024, and the upcoming AI Act will require audits of “high-risk” AI systems.
It’s a great ambition, but there are some massive obstacles. There is no common understanding about what an AI audit should look like, and not enough people with the right skills to do them. The few audits that do happen today are mostly ad hoc and vary a lot in quality, Alex Engler, who studies AI governance at the Brookings Institution, told me. One example he gave is from AI hiring company HireVue, which implied in a press release that an external audit found its algorithms have no bias. It turns out that was nonsense—the audit had not actually examined the company’s models and was subject to a nondisclosure agreement, which meant there was no way to verify what it found. It was essentially nothing more than a PR stunt.
One way the AI community is trying to address the lack of auditors is through bias bounty competitions, which work in a similar way to cybersecurity bug bounties—that is, they call on people to create tools to identify and mitigate algorithmic biases in AI models. One such competition was launched just last week, organized by a group of volunteers including Twitter’s ethical AI lead, Rumman Chowdhury. The team behind it hopes it’ll be the first of many.
It’s a neat idea to create incentives for people to learn the skills needed to do audits—and also to start building standards for what audits should look like by showing which methods work best. You can read more about it here.
The growth of these audits suggests that one day we might see cigarette-pack-style warnings that AI systems could harm your health and safety. Other sectors, such as chemicals and food, have regular audits to ensure that products are safe to use. Could something like this become the norm in AI?
Anyone who owns and operates AI systems should be required to conduct regular audits, argue Buolamwini and coauthors in a paper that came out in June. They say that companies should be legally obliged to publish their AI audits, and that people should be notified when they have been subject to algorithmic decision making.
Another way to make audits more effective is to track when AI causes harm in the real world, the researchers say. There are a couple of efforts to document AI harms, such as the AI Vulnerability Database and the AI Incidents Database, built by volunteer AI researchers and entrepreneurs. Tracking failures could help developers gain a better understanding of the pitfalls or unintentional failure cases associated with the models they are using, says Subho Majumdar of the software company Splunk, who is the founder of the AI Vulnerability Database and one of the organizers of the bias bounty competition.
But whatever direction audits end up going in, Buolamwini and coauthors write, the people who are most affected by algorithmic harms, such as ethnic minorities and marginalized groups, should play a key part in the process. I agree with this, although it will be challenging to get regular people interested in something as nebulous as artificial intelligence audits. Perhaps low-barrier, fun competitions such as bias bounties are part of the solution.
Technology that lets us “speak” to our dead relatives has arrived. Are we ready?
Technology for “talking” to people who’ve died has been a mainstay of science fiction for decades. It’s an idea that’s been peddled by charlatans and spiritualists for centuries. But now it’s becoming a reality—and an increasingly accessible one, thanks to advances in AI and voice technology.
MIT Technology Review’s news editor, Charlotte Jee, has written a thoughtful and haunting story about how this kind of technology might change the way we grieve and remember those we’ve lost. But, she explains, creating a virtual version of someone is an ethical minefield—especially if that person hasn’t been able to provide consent. Read more here.
Bits and Bytes
There is a lawsuit brewing against AI code generation initiative GitHub Copilot
GitHub Copilot lets users generate code automatically with AI. Critics have warned that this could lead to copyright issues and cause licensing information to be lost. (GitHub Copilot Investigation)
France has fined Clearview AI
The French data protection agency has fined the facial-recognition company €20 million ($19.7 million) for breaching the EU’s data protection regime, the GDPR. (TechCrunch)
One company’s algorithm has been pushing rents up in the US
Texas-based RealPage’s YieldStar software is supposed to help landlords get the highest possible price on their property. From the looks of it, it’s working exactly as intended, much to the detriment of renters. (ProPublica)
Meta has developed a speech translation system for an unwritten language, Hokkien
Most AI translation systems focus on written languages. Meta’s new open-source speech-only translation system allows speakers of Hokkien, a primarily oral language widely spoken in the Chinese diaspora, to have conversations with English speakers. (Meta)
Brutal tweet of the week
People are feeding pictures of themselves into CLIP Interrogator to see which prompts it suggests for generating a similar image with a text-to-image AI. The results are brutal. (h/t to Brendan Dolan-Gavitt, or “an orc smiling to the camera”)
Thanks for making it this far! See you next week.