I was very young when I began exploring the world of digital art, fascinated by the potential of software like Photoshop. It was version 5.0, which is to today’s version much as a Fiat Cinquecento is to a Tesla, but it already seemed like a revolution to me. And, looking back, I was not wrong.
Years later, rants against the digital medium were always the same, such as “the computer does it all” and “you can’t make art with these tools.” Today these words sound ridiculous, if we look at how digital graphics schools and artists have bloomed. Nonetheless, scepticism is a common reaction when a major technological innovation revolutionizes the way we manipulate symbols, in this case visual ones. If I had to generalize, I would say that we go through three stages. The first is what I would call the “OMG” stage, when the novelty of the tool is such that every product it creates seems extraordinary. The second is the “I hate it!” stage, with concerns that the new medium decrees the end of art. The third (and last) is the “OK” one, in which the wide diffusion of the technique normalizes its use. When it comes to digital art today, we now find ourselves in this last stage.
The situation changes, however, if we look at the new machine learning-based text-to-image (TTI) software. We are talking about programs capable of creating images of incredible definition and detail from a textual command provided by the user. For example, “vintage photo of a wombat in a suit” and… here it is.
I have been exploring these programs lately, both in theory and practice, and, in my opinion, they will bring a revolution for visual art equal to the advent of photography or computer graphics (and before that, of oil painting, etc.).
Art, technique, and magic
Unlike the various machine learning systems that produce text, those that work with images play in a field where human skills are less developed, because it’s easier to learn to write than to draw. The potential of these programs lies both in their quality, which I would currently compare to that of good craftsmen, and in their speed, which is vastly superior to that of any human.
Before addressing the big questions (Does the computer do everything? Is it possible to make art with it?), I will make a few technical notes. To quote Searle’s famous thought experiment, these software programs look more like “Chinese rooms” than anthropomorphic androids; they are in fact algorithmic models based on huge amounts of data created by humans, on which they work on a statistical basis in order to successfully respond to our requests. For example, if I want a picture of a kitten, TTI software like Midjourney, which has digested millions of pictures of kittens, will create a picture of some kitten in the desired style. The ingredients that make the magic possible are essentially the source material (photos of cats), the way it is labelled (“this picture is a picture of a cat jumping”), and the computational power of the machine. That’s impressive, but as Inke Arns writes in HumaniTies and Artificial Intelligence: “[Artists often] point out that AI is not something that magically acts on its own, that AI – despite the misleading name – is not something that ‘thinks’ on its own, or is even ‘intelligent’”. At the moment these kinds of software, no matter how powerful, still have limitations. Their potential is constrained by the training they have undergone, by self-censorship (to which I will return), and by some features that are currently missing. Nonetheless, these technologies are still in their infancy, and it is easy to predict that more complex image generation software will emerge in the future.
This brings me back to the big questions. The first is whether this medium that allows us to achieve so easily any visual fancy can make artworks – and the catch is in the “so easily,” which is the same mistake that was made with photography in the past. In fact, even at the time of the first daguerreotypes, intellectuals were asking whether it was possible for an instrument that so easily created a realistic representation of the world to be able to make art. Baudelaire had no doubt: no. His main fear was that the ease of realization would multiply immensely the number of cheap works – and more than a century and a half later we can say that he was absolutely right. But he was wrong in believing that this tool could not make works of art, and he was wrong because, trivially, he was not a good photographer.
By now we know very well that a camera is not enough to make a photographer – my partner, who is horrified whenever I take a picture of her, can confirm this. The camera is a complex tool, not only in terms of its technical use, but also and especially in terms of the choice, the framing, the eye of the photographer. Taking good photos, not to say artistic photos, is apparently easy, but among millions, or rather billions, of amateur photos only a small minority have risen to the empyrean quality of art. Quantity, in short, is not related to quality, and some photographs have as much value as other works of art, in spite of their reproducibility.
Creating an image with TTI technologies is perhaps even harder, because writing down what you want does not automatically generate the desired image, just as a click does not create a beautiful photo, let alone an artistic one. Of course, there are artistic techniques which are unquestionably more complex than others, and passing off a work made with TTI as a painting would be as wrong as taking a photograph and selling it as a hyperrealist oil painting. TTI tools, like photography, have a more accessible entry level, because they allow more or less anyone to produce decent images, unlike drawing. Exactly as with photography, however, the top tier remains rare. That everyone takes pictures but not everyone draws has not led to a proportional increase in excellent photographers, who are as rare in this field as in painting. TTI technologies, as well as computer graphics, are probably similar: to create a layout for a magazine with InDesign is much easier than doing it manually, but this doesn’t imply that it then becomes an easy task.
Here one might ask for a clear-cut demarcation or definition of “art,” but the fact that this task is difficult or even impossible does not mean that art does not exist. Rather, the opposite is self-evident, for through a complex movement over time of relationships among critics, artists and audiences, a nuanced and changing canon has emerged, according to which we attach the label of “artwork” to photographs or remove it from them. It is not an objective or formalizable limit, but it exists as part of our cultural practices. Even forcing this boundary is part of the art practice, as taught by Duchamp’s Fountain, which became established as a work of art whereas similar objects remain, well, urinals.
If we look at the distant past of photography, we will also notice that even the fears of those involved in painting have proved unfounded, although the impact of this technique has led painting to other, more abstract and conceptual shores. On the other hand, it is difficult for one medium to exile another if its functions are not entirely replaced and improved. Painting is a cultural product with different functions and features than photography and that is why it survives, just as paper books are still alive despite the advent of e-books. The reason? They are media with different pros and cons, notwithstanding their apparent similarities.
Is it true art?
Is it therefore possible to create works of art with the new software? My answer is undoubtedly positive. In this regard, the technical differences that exist between these programs, on the one hand, and photography and computer graphics, on the other, are irrelevant, since this is a new medium that we can compare to older ones only metaphorically, and because to those who use Stable Diffusion, Midjourney or DALL-E it is self-evident that creating works of art with them is not at all easy. The fact that in the gigantic production of TTI images there are still very few works of art paradoxically shows that it is possible to make them.
We could argue that such software is limited by the images it has “digested” and that it will therefore reproduce styles that, however suggestive, are based exclusively on existing ones; in short, the element of novelty, typical of art, would be missing. Unlike with predictive algorithms, however, here the final say does not lie with the machine, but with humans. Imitating a painting by Vermeer with TTI will not create a masterpiece comparable to those of the Dutch painter, but what about mixing his style with that of other authors, feeding the machine with different and unexpected keyword compositions, noticing among the TTI creations those that by some lucky mistake do not return a painting by Vermeer but something new, and carrying it forward into new variations? In short, recognizing and using the machine’s outputs will lead the only intelligences in the field, namely our own, to invent something new.
The creative potential of these new tools lies – as is often the case in art – mostly in their mistakes, and it will be up to us not to regard them as such, but to find new and unexpected paths through them. Of course, the tool is undoubtedly limited to what it has “digested” (even if we can intervene in its dataset), but it’s a mostly quantitative difference anyway, because in creating works of art we are also bound by the visual data we have experienced, and no matter how incomparably greater our data bank is, our creative skill is still bound to our perceptual past. If Picasso had been born five hundred years earlier, he would still have become a painter, but certainly not the one we know, because he would not have had access to the artistic revolutions of the centuries to come. A work of art never has just one author.
Now we get to the second big question: are these programs co-authors or tools? My opinion is two-sided, because, on the one hand, I think they are no more co-authors than a brush is, and, on the other hand, that the latter is much more of an author than we think. But to explain what I mean I have to take a small step back.
Software programs, as I have had occasion to write elsewhere, are not human. They are devoid of any autonomous propulsion of originality and were created by humans for exquisitely human purposes. If in the future a genuine artificial intelligence becomes autonomous and evolved enough to want to create a work of art, it is likely that we will not understand it; indeed, perhaps we will not even notice it, because if and when AIs such as these develop a human-like intelligence, it is plausible to imagine that such an intelligence will be completely alien to our own, because of the enormous structural differences between us.
At the moment we are dealing with tools, however evolved, which are as much co-authors as oil painting, the camera or Photoshop are. Here, though, I would like to propose a reversal of the idea of authorship, to suggest that in a sense even a paintbrush is a co-author, because, as Heidegger suggested, there is no such thing as a neutral technology. Oil painting, like photography and computer graphics, incorporates within itself a dense network of theoretical and technological knowledge, stylistic choices, limits and potentialities that derive from the practice of many people who over the years, centuries or millennia have worked with it. More than dwarfs on the shoulders of giants, we are dwarfs among many other dwarfs, and what we make is made possible and constrained by the discoveries and decisions of others – not only stylistic or poetic choices, but also technical and methodological ones. Anyone who has had the opportunity to draw or paint in any medium knows how material and tool are simultaneously a constraint and a creative opportunity. There’s a generative symbiosis with blurred boundaries between the artist and their medium, because the tool is not an inert object; it stands on the legacy of those who have used, developed and modified it before us. The tool is a magic wand that possesses a will of its own, with which we must come to terms, because it internalizes and bequeaths ancient knowledge that becomes explicit only through use – which is why to the second big question I would answer that, yes, this software is a co-author, but neither more nor less than a paintbrush is.
Chaos and censorship
TTI can create illustrations, drawings, stylized images and digital paintings, as well as photographs, invented but extremely realistic, such as “a sad man with a pigeon.”
As one can easily imagine, this can lead to trouble. In the future, if anyone is able to almost perfectly falsify any kind of image in immense quantities, the testimonial value of a photo, which was already in crisis with the development of digital graphics, will reach zero. It will therefore be increasingly difficult to cross-reference data to discover the truth of events that happened at a distance from us, which will lead us in turn to distrust the documentary value of images, and, conversely and more likely, to return to the testimonial source, whose reputation will be worth more than easily falsifiable evidence. To avoid this risk, along with commercially dangerous controversies, TTI systems operate a strict censorship policy, banning terms and images that might offend… well, pretty much anyone. Anything that might shock a conservative or a liberal, a homosexual or a homophobe, etc., is banned. Companies don’t want any trouble, that’s clear, but it is easy to imagine that in the future these limits will be overridden by less shy competitors like Stable Diffusion, which, as an open-source project, lets users work around censorship.
Upstream of such censorship, it is not difficult to see a confusion between the idea of fantasy and real images. What is the harm in inventing scenes of violence, sex, cruelty and anything else? Art has been making a living with them for centuries. As St. Augustine said, a man is not responsible for his own dreams, and denying even the possibility of thinking evil certainly does not keep it away.
The discourse changes if we talk about the dissemination of risky images, especially if they are passed off as true. Suppose, for example, that right now I am thinking about a scene with sexual content and I decide to take a pencil to turn my imagination into a picture to store in my drawer. It seems perfectly legitimate to me to do so, and in fact it is. What would change if instead of a pencil I used a TTI to bring my imagination to life? Of course, if I started spreading these images among non-consenting people or even passing them off as real, I would be doing serious harm. But to deny a home to sex, violence and horror in our imagination is, the Puritans will forgive me, simply ridiculous – and harmful.
This does not mean that the problem of the dissemination of false or unwelcome images is to be underestimated. Far from it: it is an important issue for which good countermeasures must be devised, but without letting ourselves be distracted by counterproductive moralisms. As Alessandro Y. Longo wrote in his reports for Reincantamento, the most serious problems of artificial intelligences are rather to be found in their environmental impact and in the increase of inequalities due to the closed nature of most software. Timnit Gebru has cited an estimate that about 300 tons of CO2 are emitted to train a large language model – a considerable and worrying quantity. Moreover, there are employment and copyright issues that are currently hotly debated. Once again, we are in danger of focusing on the smaller problems in order to ignore the more substantive ones – which are less visible because they are inherent in the flaws of our society.
As Baudelaire has shown, even for much better minds than mine it is risky to give a verdict on any new technology, and I don’t know if this text will end up being classified among the prophetic ones, or among the ones that got it wrong. If I must take a chance, however, my guess is that as soon as these tools improve, they will make for a revolution in the field of art. Not only in this isolated field of course; as happened with photography, the earthquake will be felt everywhere. It seems clear to me that we are facing a small revolution, and that it must find us excited, curious, and well prepared for the inevitable dangers.
Note: Image by Francesco D’Isa / Midjourney. The original version of this article was published here.