AI, Artists, and the Future of Images

An Introduction to Vilém Flusser, and thoughts on AI art

Nov 01, 2022

Portrait of Vilém Flusser, created with Midjourney

In recent months, AI text-to-image artworks have been flooding the internet, releasing a deluge of discourse around the role of artists in a rapidly changing world. I’ve been revisiting the 1985 book Into the Universe of Technical Images by Vilém Flusser, a philosopher and media theorist whom I first encountered in the context of film, but who turns out to be shockingly relevant to the current wave of AI image models and the questions they raise about creativity, art, and labor. A close reading of Flusser’s prophetic text can help to answer some of these questions, and to clarify the role of artists in the fast-approaching future.

Traditional Images, Technical Images

In broad strokes, Flusser’s account of cultural history can be summed up:

From traditional images (2D art images made by hand, like cave paintings),

to linear texts (i.e. written language works, like the Bible),

to technical images (images created mechanically by an apparatus, like a photograph).

Flusser lays this out using the model of a ladder as the historical progression between these steps:

“First rung: Animals and ‘primitive’ people are immersed in an animate world, a four-dimensional space-time continuum of animals and primitive peoples. It is the level of concrete experience.”

Here we can imagine the apes in 2001: A Space Odyssey, proto-humans fighting for survival in a new world.

“Second rung: The kinds of human beings that preceded us (approximately two million to forty thousand years ago) stood as subjects facing an objective situation, a three dimensional situation comprising graspable objects. This is the level of grasping and shaping, characterized by objects such as stone blades and carved figures.”

The apes have developed tools, giving them an edge over the beasts.

“Third rung: Homo sapiens sapiens slipped into an imaginary two-dimensional mediation zone between itself and its environment. This is the level of observation and imagining characterized by traditional pictures such as cave paintings.”

Some stability has set in, the origins of civilization, the dawn of human expression. Traditional images will reign for thousands of years. These are mythical images, in which the makers of images tried to “reduce their subjectivity to a minimum,” and stick to established aesthetic codes within a community. Cave paintings and tribal masks all share a similar form–the aim is maintaining a tradition, not originality. They reflect a mythic consciousness, based on a cyclical conception of time.

“Fourth rung: About four thousand years ago, another mediation zone, that of linear texts, was introduced between human beings and their images, a zone to which human beings henceforth owe most of their insights. This is the level of understanding and explanation, the historical level. Linear texts, such as Homer and the Bible, are at this level.”

Linear texts mark the beginning of civilization and history as we know it, “for only then did imagination begin to serve (and oppose) conceptual thinking, and only then did image makers concern themselves with being original, with deliberately introducing new symbols, with generating new information.” The one-directional form instigated a linear conception of history–time progressing forward towards the future.

“Fifth rung: Texts have recently shown themselves to be inaccessible. They don’t permit any further pictorial mediation. They have become unclear. They collapse into particles that must be gathered up. This is the level of calculation and computation, the level of technical images.”

It’s the final, fifth rung where we find ourselves, where advances in science and math have scattered linear text into points, bits of abstract data, mathematical formulas and binary code. No longer flowing linearly, they form an interconnected web that scatters in all directions.

“In my terminology I say that before the invention of writing, people thought in a prehistoric way, and after the invention of the alphabet, historical consciousness was elaborated. And now we are in the process of elaborating a post-historical, structural way of thinking.”

AI text-to-image models bring the dialectical relationship between text and image into a full circle synthesis, with the input of text leading directly to the output of technical images.

“Images of our time are infected with texts; they visualize texts.”

Particles & Pixels

While traditional painted images have a cohesive physicality, technical images are ephemeral arrangements of particles: whether it’s the photons captured by the photosensitive chemicals or sensor of a camera, the pixels lighting up on a monitor, or the gaussian noise patterns that AI images emerge from, technical images are reorganizations of particulate data that can only ever achieve an illusory wholeness. They are always reducible to the particles that constitute them.

“With technical images, it is about first programming the computation of particles, then deprogramming them to convert them into informative situations. It is about a gesture that takes place in a particle universe, with fingertips touching keys, and the structure of this gesture is as particulate as the structure of the universe, that is, it consists of clear and distinct mini-gestures.”

Flusser here is referring to cameras and other technology of his era, but the same principle applies pretty directly to the way that an AI is trained on a dataset of images, creating a network of trained “particles” floating in the cloud that are then reassembled into new images via prompts. So we can see AI text-to-image models not as a brand new phenomenon, but the natural apogee of the technical image as particle reassemblage. The difference, as compared to other technical images like photographs, is not of kind, but degree–the scale of access, ease of use, and number of possible images (infinite).

“Producers of technical images, those who envision (photographers, cameramen, video makers), are literally at the end of history. And in the future, everyone will envision. Everyone will be able to use keys that will permit them, together, with everyone else, to synthesize images on the computer screen.”

AI image synthesis is still in its infancy, but it is advancing with remarkable speed, and we’re not far off from a future where everyone will be using AI to generate images, all the time. If that’s the future, or even the present, where can we situate the role of existing artists now and going forward?

The Myth of the Artist

“Images that can be telematically manipulated could give rise to an art that is still inconceivable, a pictorial dialogue infinitely richer than linear, historical dialogue could ever have been. Such a society, in dialogue through images, would be a society of artists. It would dialogically envision, in images, situations that have never been seen and could not be predicted.”

The imminent future, where anyone has the ability to generate high-quality images or videos with a few keystrokes, threatens the status of the “Artist” as a singular creator, an individual who has put in the hours to develop their technical skills and unique point of view. This existential disruption to the concept of the artist is at the center of controversy around text-to-image models. Dialogues around the implications of this shift are passionately unfolding right now online, but Flusser shows us that the shift was already well underway at the time of his writing nearly 40 years ago. Motivational structures of capitalism, technological progress, and the internet have been shifting the roles of artists and their livelihoods for decades, and AI art is just the latest manifestation of these same forces. But, being the inflection point of this epochal shift, it could actually provide an opening towards something genuinely revolutionary, rather than another bandaid on the shaky foundations of institutions crumbling beneath our feet.

“The social structure that is now appearing represents a synchronization of radiating images with the dispersed, lonely, depersonalized people who sit at the terminals of these rays. Revolutionary visualization tries to replace this structure with another in such a way that the images bring new interpersonal relationships into being and lead to new social configurations, the names of which remain unknown for now.”

Flusser is talking here about the current (in the 80s) state of centralized distribution channels broadcasting externally in one direction (i.e. TV, film, radio), giving way to something more decentralized and participatory. Writing on the cusp of the invention of the internet, he could already see that technical images would be the new means of communication, eventually superseding linear text. As we can see with emojis and memes, declining literacy rates, the rise of video platforms like YouTube and TikTok over written content, and the prominence of image-sharing platforms like Instagram, Tumblr, and Pinterest, it’s inarguable that the internet has brought the global populace into dialogue through technical images.

These new platforms have been disrupting long-held values about where and how meaning is found and shared in art, particularly for new tech-literate generations. The era of “Great Artists” with huge audiences has been declining for some time–the internet has fostered a culture of interactive, interconnected niche communities, rather than a one-directional monoculture. And the old institutions that have until now determined and regulated value have been slow to fully adapt. AI will accelerate this fragmentation and make the future even more granular–rather than a list of Great Artists peppered across a linear timeline, as in the 20th century, every moment will be exploding with Artists whose work resonates and cross-pollinates in all directions to smaller and smaller niches. With the models for how artists make money, how audiences spend money, and the role of outside organizations in a state of flux, the constant that remains is art itself, and its ability to generate meaning for practitioners and audiences.

“The central problem to be discussed with regard to a dialogic society is that of generating information. It is this problem that was called ‘creativity’ in former times. How do we get information that is unpredictable and improbable? It looks as though it suddenly appears from nowhere, as if it were a miracle. Hence the concept creation ex nihilo; hence the belief in a creator god; and once the veneration of creative people, above all so-called artists. The problem of generating information must be lifted out of this mythologizing concept to grasp the revolutionary possibilities of a telematic society, a true information society” (87)

Online, the question is often framed as an existential battle between traditional (albeit frequently digital) artists and AI users, humans and machines. It’s understandable, but Flusser allows for a more optimistic framing, reversing the existing paradigm of artists as isolated, singular individuals uniquely qualified to do creative work. Contesting this account of art history, Flusser proposes that abandoning this myth of the artist will liberate everyone to have access to the same power of creation previously limited to “Artists.” While we, artists, can lament the loss of our special status earned through decades of hard work, Flusser pushes us to focus on the flipside: the liberation of everyone gaining the ability to create images. This means seeing the shift not as an act of subtraction, erasing artists and their role in society, but as one of addition. To fully understand the stakes of this shift, we should first get to the bottom of what makes an Artist, and the core of what defines artmaking.

Artist As Curator

“For from now on, human freedom no longer consists in being able to shape the world to one’s own desires (apparatuses do this better) but to instruct (program) the apparatus as to the desired form and to stop (control) it when this form has been produced.”

More than any other medium, AI text-to-image synthesis lays bare the truth that all art is really curation. An artist absorbs influences from the world–from experiences, memories, other artworks–and synthesizes them into a new form via a medium.

Traditional images embody this translation of mental labor into physical form most apparently: when we look at a painting, we perceive the direct results of a human hand, and we are presented with a record of concerted physical effort that occurred in a specific time and place. Formal choices are immediately apparent in the final image.

Technical images like photographs also reveal the hand of the artist, but more abstractly and indirectly: when we look at a photograph, we perceive a vision where the human hand is absent, because the mediums of technical images are black boxes, where programs happen invisibly. Rather than brush strokes applied over time, a photograph is created in an instant, so the human touch is felt retroactively in every choice made by the artist/operator leading up to the moment they press the shutter button–at which point the program inside the black box of the camera runs automatically.

AI images work in the same way–an operator makes choices in the construction of their prompt, and an image is produced automatically with a final keystroke. The technical mechanics are different from those of photography, but the process is structurally the same–the sum total of human decisions leading up to the moment that a program is run in a black box.

We can define these decisions in the creation of images as curation–the conscious narrowing of infinite choices towards an intentional desired outcome. Curation happens simultaneously in both the artist and their medium, and both feed off each other. An artist curates their inner self to shape the moment of creation, and the medium’s intrinsic qualities curate the qualities of the image that is produced. An artist limits themselves by their choice of medium, and the medium limits them in turn by its constraints. Art is what emerges in the space where these two levels of curation harmonize.

Obviously, on a certain level we’re all artists on a daily basis every time we idly doodle or snap a picture with our phones. But what we generally recognize as an “Artist” is someone who has, with time and practice, curated a depth in both their inner self and their medium of choice to produce a special harmonious result that generates meaning and emotion in an audience. The most significant artists will do so in a way that is unique to them, curating outcomes that could recognizably only come from them.

The barrier to entry is not identical across mediums–anyone picking up a camera for the first time could quickly press a button to take a photo, but not anyone could pick up a violin and produce a sound, let alone a song. In either case, the endgame of Artistry is never guaranteed by mastery of a medium, because achieving harmony relies on a self that is curated to meet it. Someone studying photography for decades could still lack the inner depth to produce images that resonate with an audience, or, someone who has only a basic grasp of the violin could produce music that brings an entire concert hall to tears.

Every medium has a different resonance, a different ratio of the depth it requires a person to bring to it to reach harmony. AI text-to-image art has perhaps the smallest gap between effort and quality of outcome, essentially putting the mastery of medium at your fingertips–but in turn, by cheapening the time to curate and produce, it raises the bar for the depth of self required to express something meaningful that can stand out among millions of other images. As with most art, the vast majority of AI images right now are uninspired, derivative, and repetitive. But there are exceptions, and they have the potential for much more.

The constant dance between the curation of self, medium, and harmony–and their relativity across time and culture–is what makes Art elusive, exciting, and difficult. It’s this intangibility and rarity that has led to the veneration of Artists, the few who truly find meaningful harmony–the group which Flusser says will disappear. And on some level it may–but it won’t be by eliminating Artists. It will be via the rise of a medium that, by its intuitive and unlimited possibilities, expands the potential of everyone to be Artists, if they choose to rise to the occasion. From the standpoint of art as a creative practice, this is a great thing. But it does raise questions for those who hope to make a living from their art, potentially giving a competitive playing field a lot more players.

Art and Capital

The definition of Artist gets murkier when we consider artists working within the context of capitalism; because art is not strictly speaking “useful,” there has always been an uneasy partnership between artists and capital, and most working artists’ labor is notoriously undervalued. We should acknowledge first that the system is already broken–AI isn’t barreling in out of nowhere to wreck a well-oiled machine that’s working great for everyone. The role of the working artist is always precarious when it comes to technological change–for example, the rise of CGI artists in Hollywood replacing practical special effects artists, matte painters, and stop motion animators. But while it’s true that some jobs were lost, it’s not so black and white–many of the artists of the old ways either adapted to the new technologies, or were able to leverage their experience into supervisory roles: their hard-earned knowledge, their depth of curation, still translated to the new context. Since the beginning, art has never been a stable career–change is the only constant.

There are a myriad of ways that people make money from artistic work, and there’s no one-size-fits-all answer to how AI will affect different industries. Many have argued that AI art in its current form directly threatens some categories of artistic jobs, like concept artists or storyboard artists–but businesses trying to save money are already outsourcing these roles overseas on Fiverr or cutting corners in other ways, and companies that care about art and people will find a way to use them. A human artist could become a luxury for the businesses who can afford it, an in-house prompt artist could soon be a job title, or the situation could remain relatively similar with existing artists just incorporating AI into their workflow as a tool for ideating or iterating. The exact mechanics of how things will shake out are unclear, but it’s certainly not going to be the case that overnight, trained artists can no longer find work because every company starts using AI for everything.

Whether it’s working artists trying to make a living, fine artists exhibiting in galleries, or hobbyists making art for its own sake, the core function of art remains; but wherever implemented, AI will make the speed at which it is produced and shared exponentially faster, making art more social, less precious and scarce, and endlessly modifiable. It’s the full transition from the one-way model of centralized broadcasting distribution channels, to a decentralized social network of communication, and plugging in is less like the lonely introspection of past artistic production, and more like playing a game.

Art as Play: The Rise of Homo Ludens

“The person of the future, playing at the keyboard, will be ecstatic about the creation of durable information that is nevertheless constantly available for a new synthesis… the person of the future will be absorbed in the creative process to the point of self-forgetfulness. He will rise up to play with others by means of the apparatuses. It is therefore wrong to see this forgetting of self in play as a loss of self. On the contrary, the future being will find himself, substantiate himself, through play. The “I” that eidetic reduction (and neuropsychological, psychological, and informatics analysis) has shown to be an abstract concept, to be nothing, will be realized for the first time through creative play.”

When speculating on the future of technical images, a word Flusser returns to again and again is “play.” Anyone who has practiced art can recognize that in its purest form, creativity is play–an unselfconscious, childlike engagement with the unlimited possibilities of a medium, rearranging elements to make something new. Anyone who has used a text-to-image model can recognize the same feeling, amplified and distilled. The models offer a striking balance of simplicity and depth, offering a much more instant gratification than other mediums. It’s incredibly fun, putting the play at the forefront and minimizing the frustration that often plagues the creative process. In combination with the networking possibilities of the internet, Flusser’s vision of the future is a collective ego-death where the social play of image-making supersedes the outdated isolated process of artistic creation.

“Future images will be at a high level because they will owe their production to the dialectic between theory embedded in the apparatuses and the intuitive hallucinatory power of the envisioners… creation there will not be limited to a few ‘great people’ who produce informative works empirically by means of a lonely inner dialogue… instead everyone will participate in the creative process and test their intuitions and inspirations against the theories embodied in apparatuses, of whose riches we as yet have no inkling.”

It’s important that play is often social. Most people have had fun with the interactive nature of social media in some form or another. The way that many artists reach an audience now is by distributing their content on social media channels among various communities. The groupchat structure of Midjourney’s Discord channel is a fairly literal example of Flusser’s vision, with users sharing and interacting with each other’s images in real time. The web app DALL-E Mini (renamed Craiyon) had a viral moment on social media with users sharing 3x3 grids of low-quality generated images, often meme-y or pop-culture related. It was a perfect example of the kind of dialogic exchanges that Flusser predicted. With the authors anonymous, anyone could participate in the play of creating and sharing images, and the stakes weren’t monetary. It was purely for fun.

Looking closer at the success of Craiyon, it’s important to understand why these images in particular resonated with such a wide audience, despite being “lower quality” than many of the shinier, cleaner DALL-E or Midjourney images that often just gather dust. The technical limitations of Craiyon force more emphasis on the curation of the self to create harmony. The lo-fi, uncanny nature of the images makes them better suited to funny prompts, as a vehicle for personality rather than beauty. Craiyon’s meme-like images prove that it’s not solely the technical mastery of images that makes them resonate, and sometimes it’s the opposite.

Flusser underscores the revolutionary potential of collaboration for creativity with the example of chess:

“The inner dialogue that was once so exciting can easily be simulated with chess. One sits alone at the board and alternately moves the white and black pieces. Interesting, informative situations can result. But as soon as a second player joins in, it immediately becomes clear how limited the initial situation… under pretelematic conditions, including the present, the singular way of playing was responsible for almost all information (scientific, philosophical, artistic, or political). Telematics, on the other hand, will involve very many players in the game, and the playing competence will expand exponentially. All the information generated until now by great individuals (our entire cultural inheritance) will be regarded as relatively sparse in the future. Compared to synthetically produced information of the future, compared above all with future images, the culture of the past will appear as a mere starting point. It will become clear that a systematic, conscious creativity really begins with telematics.”

It’s in some ways a utopian vision, this transition of humankind towards homo ludens (man the playful), but Flusser is clear that the utopian outcome is not guaranteed; he acknowledges the potential downsides of such a future, the anxieties of homogeneity and stagnation that are already front and center in the current state of culture. AI has the potential to either break open this stagnation, or be its ultimate fulfillment.

The Future of Images

“In their current first phase, technical images can still constantly renew themselves by feeding on history. But history is about to dry up, and this exactly because images are feeding on it, because they sit on historical threads like parasites, recoding them into circles. As soon as these circles are closed, the interaction between image and person will, in fact, become a closed feedback loop. Images will always show the same thing, and people will always want to see the same thing…”

Flusser’s thesis offers two possible outcomes from technical images–one utopian and one dystopian. We’re already in the midst of the dystopian descent into the closed feedback loop, and have been for a while; just look at the current state of Hollywood as an endless recycling of nostalgic franchise IPs, or streaming sites as data-driven, algorithmically-curated “content.” AI can either be the final nail in the coffin, or a way out.

“The current interaction between images and human beings will lead to a loss of historical consciousness in those who receive the images and, as a result, also to a loss of any historical action that could result from the reception of the images. But this current interaction is not yet leading to the development of a new consciousness, unless it changes radically, unless the feedback is interrupted and images begin to mediate between people. Such a rupture of the magical circle between image and person is the task we face, and this rupture is not only technically but above all existentially possible.”

Flusser’s prognosis evokes the stultifying content fatigue that we already face on the internet, which is about to escalate with the inevitable flood of AI content on the horizon. We risk drowning in it, but at the current inflection point, we still have the chance to steer things in a more utopian direction if we can see the situation clearly. The key is to see AI images not as a historical aberration, but an inevitable step in the revolution brought about by technical images. From this perspective, there is an opening for artists to embrace these new tools, push their unlimited potential, and break free from creative stagnation through new ways of envisioning, new ways of collectively imagining value and meaning.

A Spiritual Revolution

“Every revolution, be it political, economic, social, or aesthetic, is in the last analysis a technical revolution. If you look at the big revolution through which mankind has gone, let’s say the Neolithic Revolution or the revolution of Bronze Age, or the Iron Age, or the Industrial Revolution, every revolution is, in fact, a technical revolution. So is the present one.

But there is one difference. So far, techniques have always simulated the body. For the first time, our new techniques simulate the nervous system. So that this is for the first time, a really, if you want to say so, a really immaterial, and to use an older term, spiritual revolution”

-Vilém Flusser, 1988 interview

We are standing on the cusp of a revolution, a position that is by its nature difficult to predict. Such uncertainty requires tools for navigation, and I hope this brief introduction to Vilém Flusser can serve as a compass to help guide the conversation by adding some broader theoretical context to the AI art discourse, and offer an antidote to the despair and disdain so many artists are projecting onto the situation. We need to look head-on at the reality that AI art is here, and it’s not going away–but that doesn’t mean “art is dead.” Art has survived millennia of technical advances, and is arguably most vital and alive in these moments of transition. As long as there are humans there will be art. If we are to trust Flusser (and I think we should, given the accuracy of his vision), then we can be optimistic for the future of art and AI’s role in it, if we approach it with the right mix of thoughtfulness, caution, and hope.

JKBlues

Nov 6, 2022

This technology is scary only if one empowers it to be so. Art seems always to be one step behind technology. The technology of photography did not devalue still life or realistic artwork but it led to impressionism and abstraction. The utilization of AI will probably lead to art movements that are by nature more organic, individualistic and less reproducible by electronic means.


ghobii

Nov 5, 2022

Thanks for this. As an artist, it's helped me resolve a lot of conflicting emotions on this topic, or at least better come to terms with them. The power AI brings to the creative process is one of the most inspiring things I've experienced, but at the same time I am unsure how society will absorb the huge changes it will bring.
