AI models like DALL-E 2 keep making art that looks way too European

October 19, 2022:

In late September, OpenAI made its DALL-E 2 AI art generator widely available to the public, allowing anyone with a computer to make one of those striking, slightly bizarre images that seem to be floating around the internet more and more these days. DALL-E 2 is by no means the first AI art generator to open to the public (the competing AI art models Stable Diffusion and Midjourney also launched this year), but it comes with a strong pedigree: Its cousin, the text-generating model known as GPT-3 — itself the subject of much intrigue and multiple gimmicky stories — was also developed by OpenAI.

Last week, Microsoft announced it would be adding AI-generated art tools — powered by DALL-E 2 — to its Office software suite, and in June DALL-E 2 was used to design the cover of Cosmopolitan magazine. The most techno-utopian proponents of AI-generated art say it provides a democratization of art for the masses; the cynics among us would argue it’s copying human artists and threatening to end their careers. Either way, it seems clear that AI art is here, and its potential has only just begun to be explored.

Naturally, I decided to try it.

As I scrolled through examples of DALL-E’s work for inspiration (I had determined that my first attempt ought to be a masterpiece), it seemed to me that AI-generated art didn’t have any particular aesthetic other than, maybe, being a bit odd. There were pigs wearing sunglasses and floral shirts while riding motorcycles, raccoons playing tennis, and Johannes Vermeer’s Girl With a Pearl Earring, tweaked ever so slightly so as to replace the titular girl with a sea otter. But as I kept scrolling, I realized there was one unifying theme underlying most of the pieces: AI art, more often than not, looks like Western art.

“All AI is only backward-looking,” said Amelia Winger-Bearskin, professor of AI and the Arts at the University of Florida’s Digital Worlds Institute. “They can only look at the past, and then they can make a prediction of the future.”

For an AI model (also known as an algorithm), the past is the data set it has been trained on. For an AI art model, that data set is art. And much of the fine art world is dominated by white, Western artists. This leads to AI-generated images that look overwhelmingly Western. This is, frankly, a little disappointing: AI-generated art, in theory, could be an incredibly useful tool for imagining a more equitable vision of art that looks very different from what we have come to take for granted. Instead, it stands to simply perpetuate the colonial ideas that drive our understanding of art today.

To be clear, models like DALL-E 2 can be asked to generate art in the style of any artist; asking for an image with the modifier “Ukiyo-e,” for example, will create works that mimic Japanese woodblock prints and paintings. But users must include those modifiers; they are rarely, if ever, the default.

DALL-E 2’s interpretation of the prompt “Hokusai painting of Artificial Intelligence”
Neel Dhanesha/Vox; Courtesy of OpenAI

Winger-Bearskin has seen the limits of AI art firsthand. When one of her students used images generated by Stable Diffusion to make a video of a nature scene, she realized the twilight backgrounds put out by the AI model looked oddly similar to the scenes painted by Disney animators in the 1950s and ’60s, which themselves had been inspired by the French Rococo movement. “There are a lot of Disney films, and what he got back was something we see a lot of,” Winger-Bearskin told Recode. “There are so many things missing in those datasets. There are millions of night scenes from all over the world that we would never see.”

AI bias is a notoriously difficult problem. Left unchecked, algorithms can perpetuate racist and sexist biases, and that bias extends to AI art as well: as Sigal Samuel wrote for Future Perfect in April, previous versions of DALL-E would spit out images of white men when asked to depict lawyers, for example, and depict all flight attendants as women. OpenAI has been working to mitigate these effects, fine-tuning its model to try to weed out stereotypes, though researchers still disagree on whether those measures have worked.

But even if they work, the problem of artistic style will persist: If DALL-E manages to depict a world free of racist and sexist stereotypes, it would still do so in the image of the West.

“You can’t fine-tune a model to be less Western if your dataset is mostly Western,” Yilun Du, a PhD student and AI researcher at MIT, told Recode. AI models are trained by scraping the internet for images, and Du thinks models made by groups based in the United States or Europe are likely predisposed to Western media. Some models made outside the United States, like ERNIE-ViLG, which was developed by the Chinese tech company Baidu, do a better job generating images that are more culturally relevant to their place of origin, but they come with issues of their own; as the MIT Technology Review reported in September, ERNIE-ViLG is better at producing anime art than DALL-E 2 but refuses to make images of Tiananmen Square.

Because AI is backward-looking, it’s only able to make variations of images it has seen before. That, Du says, is why an AI model is unable to create an image of a plate sitting on top of a fork, even though it should conceivably understand each aspect of the request. The model has simply never seen an image of a plate on top of a fork, so it spits out images of forks on top of plates instead.

Injecting more non-Western art into an existing dataset wouldn’t be a very helpful solution, either, because of the overwhelming prevalence of Western art on the internet. “It’s kind of like giving clean water to a tree that was fed with contaminated water for the last 25 years,” said Winger-Bearskin. “Even if it’s getting better water now, the fruit from that tree is still contaminated. Running that same model with new training data does not significantly change it.”

Instead, creating a better, more representative AI model would require creating it from scratch — which is what Winger-Bearskin, who is a member of the Seneca-Cayuga Nation of Oklahoma and an artist herself, does when she uses AI to create art about the climate crisis.

That’s a time-consuming process. “The hardest thing is making the data set,” said Du. Training an AI art generator requires millions of images, and Du said it would take months to create a data set that’s equally representative of all the art styles that can be found around the world.

If there’s an upside to the artistic bias inherent in most AI art models, perhaps it’s this: Like all good art, it exposes something about our society. Many modern art museums, Winger-Bearskin said, give more space to art made by people from underrepresented communities than they did in the past. But this art still only makes up a small fraction of what exists in museum archives.

“An artist’s job is to talk about what’s going on in the world, to amplify issues so we notice them,” said Jean Oh, an associate research professor at Carnegie Mellon University’s Robotics Institute. AI art models are unable to provide commentary of their own — everything they produce is at the behest of a human — but the art they produce creates a sort of accidental meta-commentary that Oh thinks is worthy of notice. “It gives us a way to observe the world the way it is structured, and not the perfect world we want it to be.”

That’s not to say that Oh believes more equitable models shouldn’t be created — they are important for circumstances where depicting an idealized world is helpful, like for children’s books or commercial applications, she told Recode — but rather that the existence of the imperfect models should push us to think more deeply about how we use them. Instead of simply trying to eliminate the biases as though they don’t exist, Oh said, we should take the time to identify and quantify them in order to have constructive discussions about their impacts and how to minimize them.
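As a rough illustration of what "identify and quantify" could mean in practice, here is a minimal Python sketch that tallies how often each stylistic tradition shows up in a batch of images. It is not a method described by Oh or any researcher quoted here. The function name `style_shares`, the region tags, and the eight sample labels are all hypothetical, and the sketch assumes someone has already tagged each image by hand.

```python
from collections import Counter


def style_shares(labels):
    """Return each label's share of the batch as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical, hand-assigned tags for eight generated images (illustrative only,
# not real measurements from DALL-E 2 or any other model).
sample_labels = [
    "western", "western", "western", "western",
    "western", "western", "east_asian", "south_asian",
]

shares = style_shares(sample_labels)
for label, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label}: {share:.1%}")
```

A tally like this says nothing about why a model leans one way, but it turns a vague impression into a number that can be compared across prompts or models.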

“The main purpose is to help human creativity,” said Oh, who’s researching ways to create more intuitive human-AI interactions. “People want to blame the AI. But the final product is our responsibility.”

This story was first published in the Recode newsletter.
