AI chatbots use racist stereotypes even after anti-racism training

Commercial AI chatbots demonstrate racial prejudice toward speakers of African American English – despite expressing superficially positive sentiments toward African Americans. This hidden bias could influence AI decisions about a person’s employability and criminality.

“We discover a form of covert racism in [large language models] that is triggered by dialect features alone, with massive harms for affected groups,” said Valentin Hofmann at the Allen Institute for AI, a non-profit research organisation in Washington state, in a social media post. “For example, GPT-4 is more likely to suggest that defendants be sentenced to death when they speak African American English.”

Hofmann and his colleagues discovered such covert prejudice in a dozen versions of large language models, including OpenAI’s GPT-4 and GPT-3.5, that power commercial chatbots already used by hundreds of millions of people. OpenAI did not respond to requests for comment.

The researchers first fed the AIs text in the style of African American English or Standard American English, then asked the models to comment on the texts’ authors. The models characterised African American English speakers using terms associated with negative stereotypes. In the case of GPT-4, it described them as “suspicious”, “aggressive”, “loud”, “rude” and “ignorant”.

When asked to comment on African Americans in general, however, the language models generally used more positive terms such as “passionate”, “intelligent”, “ambitious”, “artistic” and “brilliant”. This suggests the models’ racial prejudice is typically concealed beneath what the researchers describe as a superficial display of positive sentiment.

The researchers also showed how covert prejudice influenced chatbot judgements of people in hypothetical scenarios. When asked to match African American English speakers with jobs, the AIs were less likely to associate them with any employment, compared with Standard American English speakers. When the AIs did match them with jobs, they tended to assign roles that did not require university degrees or that were related to music and entertainment. The AIs were also more likely to convict African American English speakers accused of unspecified crimes, and to assign the death penalty to African American English speakers convicted of first-degree murder.

The researchers even showed that the larger AI systems demonstrated more covert prejudice against African American English speakers than the smaller models did. That echoes previous research showing how bigger AI training datasets can produce even more racist outputs.

The experiments raise serious questions about the effectiveness of AI safety training, where large language models receive human feedback to refine their responses and remove problems like bias. Such training may superficially reduce overt signs of racial prejudice without eliminating “covert biases when identity terms are not mentioned”, says Yong Zheng-Xin at Brown University in Rhode Island, who was not involved in the study. “It uncovers the limitations of current safety evaluation of large language models before their public release by the companies,” he says.
