Quantifying bias in society with ChatGPT-like tools

ChatGPT is an implementation of a so-called ‘large language model’. These models are trained on text from the internet at large, which means they inherit the biases that exist in our language and in our society. This has an interesting consequence: it suddenly becomes possible to see, in a quantitative and undeniable way, how bias changes over time.

Brian Christian explains this quite brilliantly in his book The Alignment Problem. He shows how large language models use ‘word embeddings’ to represent words in a mathematical way. This allows you to calculate with words. For example, you can ask the model something like “France – Paris + Rome”, and it will answer “Italy”. Unfortunately, this means that the model will also replicate horrible societal stereotypes, so “Doctor – Man + Woman” will likely lead to “Nurse”. These models are biased in the way that society is biased. And this bias can be quantified.
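
You can try this arithmetic yourself. Here is a minimal sketch using the gensim library and one of its downloadable pretrained GloVe models (the model name and word lists are my choices for illustration, not anything from the book):

```python
import gensim.downloader as api

# Download pretrained GloVe vectors (~66 MB on first run); any pretrained
# word-embedding model exposing most_similar() works the same way.
model = api.load("glove-wiki-gigaword-100")

# "France - Paris + Rome": subtract one capital, add another,
# and the nearest remaining vector should be the matching country.
print(model.most_similar(positive=["france", "rome"], negative=["paris"], topn=1))

# The same arithmetic surfaces the gender stereotype from the post:
# "doctor - man + woman" typically ranks "nurse" near the top.
print(model.most_similar(positive=["doctor", "woman"], negative=["man"], topn=3))
```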

Researchers have shown how this quantification of bias can be used to track historical societal trends. Christian writes:

The embeddings […] show a detailed history of the shift in racial attitudes. In 1910, for instance, the top ten words most strongly associated with Asians relative to Whites included “barbaric,” “monstrous,” “hateful,” and “bizarre.” By 1980 the story could not be more different, with the top ten words topped by “inhibited” and “passive” and ending with “sensitive” and “hearty”: stereotypes in their own right, of course, but ones that reflect an unmistakable cultural change.
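
The Garg et al. paper referenced below operationalizes this with a ‘relative norm distance’: average the vectors of two group word lists, then measure how much closer a set of attribute words sits to one group than the other. Here is a minimal sketch of that calculation, with random toy vectors standing in for the per-decade historical embeddings the paper trains, and illustrative word lists rather than the paper’s own:

```python
import numpy as np

def group_vector(emb, words):
    # Average the normalized vectors of a word list representing a group.
    vecs = [emb[w] / np.linalg.norm(emb[w]) for w in words if w in emb]
    return np.mean(vecs, axis=0)

def relative_norm_distance(emb, group1, group2, attributes):
    # For each attribute word, compare its Euclidean distance to the two
    # group vectors. A negative total means the attributes sit closer to
    # group1; tracking this number per decade reveals shifting stereotypes.
    g1, g2 = group_vector(emb, group1), group_vector(emb, group2)
    total = 0.0
    for w in attributes:
        if w in emb:
            v = emb[w] / np.linalg.norm(emb[w])
            total += np.linalg.norm(v - g1) - np.linalg.norm(v - g2)
    return total

# Toy stand-in: random vectors per decade. In the real study these would be
# embeddings trained on text from each decade (e.g. Google Books and COHA).
rng = np.random.default_rng(42)
vocab = ["he", "him", "his", "she", "her", "hers", "doctor", "nurse", "engineer"]
embeddings_by_decade = {d: {w: rng.normal(size=100) for w in vocab}
                        for d in (1910, 1950, 1990)}

for decade, emb in embeddings_by_decade.items():
    bias = relative_norm_distance(
        emb,
        group1=["he", "him", "his"],
        group2=["she", "her", "hers"],
        attributes=["doctor", "nurse", "engineer"],
    )
    print(decade, round(bias, 3))
```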

He sees potential in actively tracking whether biases are getting better or worse:

One might imagine a kind of real-time dashboard of whether society itself—or, at the very least, our public discourse—appears to be getting more or less biased: a bellwether for the shifts underway, and a glimpse of the world to come.

See: The Alignment Problem by Brian Christian, and “Word embeddings quantify 100 years of gender and ethnic stereotypes” by Nikhil Garg, Londa Schiebinger, Dan Jurafsky, and James Zou.
