An example of racial bias in machine learning strikes again, this time in a program called PULSE, as reported by The Verge. Input a low-resolution image of Barack Obama – or of another person of colour, such as Alexandria Ocasio-Cortez or Lucy Liu – and the high-resolution image the AI generates is distinctly that of a white person.
To depixelate images, PULSE uses the StyleGAN algorithm to “imagine” a high-resolution version of the pixelated input, a task known as upscaling. It does not ‘zoom and enhance’ the original low-res image, as is done in TV and film; instead, it generates a brand-new high-resolution face that, when pixelated, looks the same as the one the user provided. The algorithm is not recovering detail hidden in the image but inventing a new face.
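The core idea can be sketched as a search through the generator's latent space for a face whose downscaled version matches the low-res input. The following is a toy illustration of that search, not the actual PULSE implementation: the linear `toy_generator` stands in for StyleGAN, and the optimiser is a simple finite-difference gradient descent rather than the paper's method.

```python
import numpy as np

def toy_generator(z):
    # Stand-in for StyleGAN: maps a 16-dim latent vector to an 8x8 "image".
    rng = np.random.default_rng(0)          # fixed weights for reproducibility
    W = rng.standard_normal((64, z.size))
    return (W @ z).reshape(8, 8)

def downscale(img, factor=4):
    # Average-pool the image down to the low-res size (8x8 -> 2x2).
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def pulse_style_search(low_res, steps=200, lr=0.1, eps=1e-3):
    # Search latent space for a face whose downscaled version matches
    # the low-res input, using forward finite-difference gradients.
    z = np.zeros(16)
    for _ in range(steps):
        base = np.sum((downscale(toy_generator(z)) - low_res) ** 2)
        grad = np.zeros_like(z)
        for i in range(z.size):
            dz = z.copy()
            dz[i] += eps
            grad[i] = (np.sum((downscale(toy_generator(dz)) - low_res) ** 2) - base) / eps
        z -= lr * grad
    return toy_generator(z)

low_res = np.ones((2, 2))                    # a 2x2 "pixelated" input
high_res = pulse_style_search(low_res)
residual = np.abs(downscale(high_res) - low_res).max()
print(residual)                              # small: the invented face re-pixelates to the input
```

The key point the sketch makes concrete: nothing constrains `high_res` to be the *true* face behind `low_res` – any face that survives downscaling equally well is a valid solution, so the generator's training-data biases decide which face gets invented.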
When the algorithm is used to scale up pixelated images, it often generates faces with white, Caucasian features, producing white faces far more often than faces of people of colour. The algorithm’s creators have acknowledged that this bias is likely inherited from the dataset StyleGAN was trained on: when inventing a face to match the pixelated input, it defaults to white features. This is an extremely common issue in machine learning.
These racial disparities and biases became evident only after the tool was made widely accessible. Again, this highlights a carelessness and ignorance in the AI world, among both researchers and commercial entities, who consistently neglect to ensure that people of colour are not harmed by these technologies. Deborah Raji, in the article, drives home the key point that:
bias in AI is shaped by wider social injustices and prejudices, and simply using ‘correct’ data does not address those larger injustices.
It is clear, yet again, that the problem of bias goes beyond “fair” datasets or algorithms. Machine learning systems are not biased only when data is biased; fair datasets can still create biased systems. Addressing discrimination and bias requires more than technical fixes: it demands a fundamental shift in how we think about the development and use of these technologies.