AI Art and Photography (2015-Present)

v.1.0 { 11.8.22 ///

The term AI is essentially a marketing buzzword used to describe the development of machine learning algorithms that are capable of generating realistic approximations based on massive datasets. The machine learning algorithms that underpin our popular notion of AI have their roots in post-WW2 computer science, but it wasn't until the 2000s that these algorithms were developed to analyze and reproduce visual outputs.

A flashpoint in visual machine learning development was the public release of Google's DeepDream algorithm in 2015. DeepDream used a convolutional neural network (CNN) to find and enhance patterns in images, and its outputs were famously full of dogs, slugs, and disembodied eyes splayed out in psychedelic arrays. It was a very "weird internet" way of showing how computer vision interprets patterns in images. The next developments took image synthesis to the next level, first with neural style-transfer apps (e.g., turn your selfies into Starry Night), then in 2018 with NVIDIA's StyleGAN, a Generative Adversarial Network (GAN) that could generate photorealistic faces. Around that same time OpenAI began developing advanced natural-language processing models, beginning with GPT and continuing with CLIP, which was trained on image-text pairs. The first public applications of that technology to visual art synthesis came in 2021 with OpenAI's DALLE and with CLIP-guided notebooks hosted on Google Colab, such as VQGAN+CLIP and Disco Diffusion. These algorithms allowed artists to reliably generate images directly from a text prompt, and these artists developed an aesthetic of liminal imagery culled from the noise of visual data.

In the summer of 2022, AI Art erupted into the mainstream thanks to the release of DALLE2, MidJourney, and Stable Diffusion. Apart from the obvious improvements in resolution and visual fidelity within the algorithms, their developers pivoted towards a mostly subscription-based service model that gave users access through content-moderated platforms in their browsers. Developers also ceded monetization rights to users while keeping a license on generated images and prompts.

The advancement of text-to-image generation AIs is not without controversy, as the companies behind DALLE and Stable Diffusion have been called out for how they obtain their datasets and train their models. Large-scale scraping of internet data is generally treated as fair use when it's done for research, so private companies exploit this loophole by funneling their development through research institutions that gather vast sets of data containing copyrighted material from stock image sites like Shutterstock, portfolio hosting sites like ArtStation and Flickr, and social media sites. These copyrighted images are in effect relicensed into open-source AIs with commercial applications. Critics have called this process "Data Laundering," and the result is an appropriation factory: an algorithm that can generate a replica of a living artist's work without their consent, often for a small fee.

This structural critique requires a structural solution, but it doesn't detract from the fact that Generative Image AIs are powerful tools for creators who master the art of prompting. In a way these AIs are made for poets: they benefit from an economy of language and from ritual semantic gestures. It's kind of like casting a spell, or summoning a spirit. The artist must be careful to choose the right words and to ask the right question. But what does this appropriation factory mean aesthetically?

At its most basic, the new wave of AI Art offers a kind of fast-food aesthetic, a quickly served sludge of visual culture. Among users there is a tendency towards waifus, cyberpunk, or historical fantasy, which to some degree reflects the biases of the training datasets, where bodies are white, cis, and abled by default. There is also a tendency to render contemporary pop culture as renaissance or modernist paintings, as if defining new canons. Lev Manovich perceptively points out that these tendencies are similar to other folk arts, and this is a useful lens for looking at AI Art. A new generation creates its own kitsch. These are appropriation factories in a political sense, but mathematically they are averaging machines. The AI can only interpolate between the points in its existing dataset, and while it can evoke a strange and beautiful liminality between visual forms and media, it won't be able to visualize what doesn't exist. This fact is a refuge for artists: these AIs will only do what we ask them to do. For the most experienced artists, AI is another tool in their arsenal.
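The "averaging machine" idea can be made concrete with a toy sketch. This is only an illustration of the claim, not how any real image generator is implemented: the tiny `dataset` of two-number feature vectors and the `blend` helper are hypothetical, invented for the example. It shows that a weighted average of existing points always lands somewhere between them and never outside the range the data already covers.

```python
import random


def blend(points, weights):
    """Weighted average (convex combination) of equal-length vectors."""
    total = sum(weights)
    dim = len(points[0])
    return [
        sum(w * p[i] for w, p in zip(weights, points)) / total
        for i in range(dim)
    ]


# Hypothetical "dataset": three images reduced to 2-number feature vectors.
dataset = [[0.0, 1.0], [4.0, 3.0], [2.0, 5.0]]

# Random non-negative weights stand in for a "prompt" that mixes the sources.
random.seed(0)
weights = [random.random() for _ in dataset]
sample = blend(dataset, weights)

# Every coordinate of the blend stays between the dataset's min and max.
for i, value in enumerate(sample):
    column = [p[i] for p in dataset]
    assert min(column) <= value <= max(column)
```

However the weights are chosen, the blended point stays inside the territory mapped by the dataset, which is the sense in which the output is always an approximation of what already exists.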

Photography is one field that will be profoundly affected by generative image AIs. First, because they are the ultimate culmination of the image archive, in the sense that they can evoke an infinite number of datapoints within a prescribed set. This is a moment of triumph for big data and its predictive aesthetics, which leads to the second point. Photography has had a tenuous relationship with truth for some time now, but as generative image AIs learn to create photorealistic imagery without cameras or subjects, photography becomes totally unmoored from its 20th century role as the objective observer of truth. The aesthetics of verisimilitude from 20th century photojournalism and surveillance media will give way to a 21st century aesthetics of predictive algorithms. As artists using these emerging tools, our task isn't just to create, but to question how these systems are designed to work, whom they benefit, and whom they harm, and to imagine alternatives both through material and economic structures and through liminal aesthetics and tactical approaches.
