In mid-October 2021, the Allen Institute for AI launched Delphi, a research prototype designed “to model people’s moral judgments on a variety of everyday situations.” In plain terms: they built a machine that tries to do ethics.
You can give it a free-form description of a situation or ask it a question (e.g. “Ignoring a phone call from a friend”), and the AI will render its moral judgment (“It’s rude”).
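For illustration, here is a minimal sketch of what querying such a prototype could look like programmatically. The endpoint URL, parameter name, and response shape below are assumptions made for the example, not Delphi’s documented API:

```python
import requests

# Hypothetical endpoint and response shape, for illustration only;
# this is NOT Delphi's documented API.
DELPHI_URL = "https://delphi.allenai.org/api/demo"  # assumed URL

def moral_judgment(situation: str) -> str:
    """Send a free-form situation description, return the model's verdict."""
    response = requests.get(DELPHI_URL, params={"action": situation}, timeout=10)
    response.raise_for_status()
    # Assumed JSON shape: {"judgment": "It's rude"}
    return response.json().get("judgment", "")

print(moral_judgment("Ignoring a phone call from a friend"))  # e.g. "It's rude"
```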
Soon after the launch, it became clear that Delphi was riddled with racist and bigoted stereotypes. Futurism reported that Delphi judged “Being a white man” more morally acceptable than “Being a black woman”, and that Delphi found a white man walking towards you at night “okay”, while a black man doing the same was “concerning”.
The surprised research team quickly updated the prototype and wrote a reply trying to address the problems. In essence, they said that they hadn’t expected adversarial input (apparently this was their first venture onto the internet), that they believe it is better to act and try to make more inclusive, ethically informed, and socially aware AIs than not to act at all, and that they had quickly improved their model (racist answers in cases where race was mentioned dropped from 9% to 2%). They failed to address any of the more structural problems.
When you read their original paper, you quickly realize that they could not have gotten any other result. Their formalization of morality is based on a (terribly naive) descriptive and situational ethics: it records what people *do* judge, not what they *should* judge. This means they aren’t modeling morality, they are modeling the racist bias that exists in society. Maybe it is time these computer scientists added one or more moral philosophers to their team?
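To make the structural problem concrete, here is a toy sketch (my construction, not the paper’s actual pipeline) of the basic move of a descriptive approach: the “gold” label for a situation is simply the majority judgment of crowd annotators, so any prejudice the annotator pool shares becomes the ground truth the model is trained on.

```python
from collections import Counter

def gold_label(annotations: list[str]) -> str:
    """Aggregate crowd judgments by majority vote: whatever most
    annotators say becomes the training label."""
    return Counter(annotations).most_common(1)[0][0]

# If annotators judge the same situation differently depending on race...
print(gold_label(["okay", "okay", "okay"]))              # -> "okay"
print(gold_label(["concerning", "concerning", "okay"]))  # -> "concerning"
# ...a model trained on these labels will faithfully reproduce the bias.
```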
When you ask Delphi: “Is it okay to continue to let your AI prototype spew out racist results?”, it will answer: “It’s bad”. At least on that point it has a deeper understanding of morality than its creators…