AI Art is great, but who owns the copyright?
A tech company on an isolated island in the Pacific Ocean, in the style of cyberpunk. Generated using Stable Diffusion via the API provided on Hugging Face.
On August 26, 2022, Jason Allen won first prize in the digital arts/digitally-manipulated photography category at the Colorado State Fair fine arts competition with an AI-generated artwork entitled “Théâtre D’opéra Spatial.” Allen created the piece with Midjourney, a proprietary AI model featuring Text-to-Image Synthesis. The news led to heated discussions online, primarily reflections on AI art compared to traditional art by human artists. This is not the first time a piece of AI artwork has beaten traditional artwork: in 2018, Edmond de Belamy, a portrait produced by a generative neural network, sold for $432,500 at a Christie’s auction. The success of AI artwork raises a critical ethical question that has been asked repeatedly in the field of Artificial Intelligence: who owns the copyright to an AI artwork? The practitioners who use AI models to generate the artwork, the AI models themselves, or the companies and researchers who build those models?

In this essay, we investigate the copyright issues involved in AI Art. We first introduce AI Art production and classify it into Image-to-Image Translation and Text-to-Image Synthesis. After discussing the potential ethical and social consequences of copyright infringement, we approach the copyright issues by inspecting the creativity involved in an AI artwork. Based on a clear definition of creativity, we carefully trace the source of innovation in a piece of AI artwork and determine authorship accordingly. In the end, we will see that copyright issues in AI Art must be resolved case by case because of the complicated procedures involved in generating AI artwork.
AI creation of Art, or AI Art, refers to the emerging field that uses Artificial Intelligence models to generate artistic images. Starting around 2015, the field has developed rapidly, with technological advances falling into two broad categories: Image-to-Image Translation and Text-to-Image Synthesis. One of the earliest works in AI Art is the DeepDream model developed by Google Research in 2015 (Olah and Tyka, 2015). By maximizing the activations at particular layers of a convolutional neural network, DeepDream can produce hallucinatory images. Another early work, Neural Style Transfer (NST), was introduced one year later to transfer the style of well-known artworks through content and style reconstructions (Gatys et al., 2016). Around 2017, the Generative Adversarial Network (GAN) transformed the field with its high performance in Image-to-Image Translation. A GAN is composed of a generator and a discriminator. The training objective is twofold: maximize the generator’s ability to fool the discriminator by producing fake images indistinguishable from real ones, and simultaneously minimize the discriminator’s error in telling fake images from real ones. These two objectives are optimized iteratively, and after training the generator can produce convincing fake images. In 2017, researchers devised GAN-based methods to learn mappings between two domains of images with distinct styles: photorealistic pictures can be produced from outline sketches (Isola et al., 2017), and images can be translated from one kind to another (Zhu, 2017). Starting in 2018, the StyleGAN model, another GAN-based model featuring style transfer, took the lead in this field. The latest version of StyleGAN can generate high-quality faces of people who do not exist, based on a variant of the GAN architecture (Karras, 2019).
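The two adversarial objectives described above can be made concrete with a short sketch. This is a generic illustration of the standard binary cross-entropy form of the losses, not the implementation from any particular paper; the toy discriminator scores are invented:

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """The discriminator minimizes this: it wants to score real
    images near 1 and generated (fake) images near 0."""
    return -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))

def generator_loss(d_fake):
    """The generator minimizes this: it wants the discriminator
    to score its fake images near 1, i.e., to be fooled."""
    return -np.mean(np.log(d_fake))

# Toy discriminator outputs (probabilities that an image is real).
d_real = np.array([0.90, 0.80, 0.95])  # confident on real images
d_fake = np.array([0.30, 0.60, 0.10])  # partially fooled by fakes

d_loss = discriminator_loss(d_real, d_fake)
g_loss = generator_loss(d_fake)
```

During training, the two losses are minimized in alternation: a discriminator step pushes d_loss down, then a generator step pushes g_loss down, until the discriminator can no longer reliably tell fake images from real ones.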
Researchers also proposed the Creative Adversarial Network (CAN), built on the GAN architecture. The paper discussed a psychological definition of creativity and modified the generative adversarial network to achieve it (Elgammal, 2017). The advantage of GAN-based models did not last long, however, as more powerful architectures developed recently offer Text-to-Image Synthesis. Compared to Image-to-Image Translation, Text-to-Image Synthesis gives users more freedom in Art creation, since it enables practitioners of AI models to generate images from their own text prompts. A prominent example is the DALL-E model: built on the Contrastive Language-Image Pre-Training (CLIP) model, DALL-E allows users to generate images with specific semantics and style (Ramesh, 2021). Midjourney, a research lab that builds black-box AI models similar to DALL-E, has even commercialized AI Art generation, targeting modern artists. The latest model in Text-to-Image Synthesis, Stable Diffusion, an open-source latent diffusion model, enables users to synthesize high-resolution photorealistic or artistic images on their own devices at modest computational cost.
Ethical issues centered on the authorship of AI-generated artwork arose as the field of AI Art developed. Up to now, the copyright has rested with the person who came up with the text prompts and generated the image using AI models. However, if little or no human creativity was involved, should the image be credited to the model itself, or to the researchers who built the model? Additionally, underlying models like GAN synthesize images from a latent space that is essentially determined by the distribution of the training images. If the training images contain artwork uploaded by other artists, attributing authorship to the AI model practitioner may also be improper. The difficulty of assessing human creativity in AI Art further complicates the situation. Specifically, it is hard to determine whether the artwork reflects human creativity in any of the three components of a typical Text-to-Image Synthesis pipeline (Oppenlaender, 2022). First, the model is often proprietary: we do not know whether the system is adopted from a pre-trained model, where little or no creativity is involved, or whether its parameters are carefully tuned toward the generation of a specific artwork. Second, the human prompts to the model may be inaccessible to the public, which impedes any evaluation of their novelty. Third, little is known about the overall process of generating the artwork: even when the image is artistic, it is unclear if and where human creativity enters the picture. All these factors make it difficult to assess the human creativity involved in an AI artwork, which precludes a straightforward solution to copyright issues in AI Art.
If not addressed properly and promptly, the copyright or authorship of AI-generated artwork could have severe ethical and societal consequences. A direct consequence would be the displacement of traditional human artists in the job market. In an online post, Len Aoi documented the recent debates surrounding the application of AI models to generating illustrations for Japanese Anime. After “mimic,” an AI model that generates illustrations based on uploaded images, was released on Twitter, the announcement received over 40,000 retweets, and the developers became “the subject of criticism and abuse on social media” (Aoi, 2022). One tweet in response commented that “seeing people nonchalantly use an approximation of my work that I didn’t draw would break my mental health to where I couldn’t draw. That would probably result in me losing work.” The potential copyright infringement problems eventually led to a temporary suspension of the “mimic” service, and activism against AI Art, such as “No AI learning,” started trending on Twitter in Japan. These concerns are legitimate and urgent. If copyright infringement in AI Art is left unregulated, AI practitioners or companies can acquire the copyrights of beautiful AI artworks more or less unethically. The unfair advantages of these practitioners may allow them to win competitions against human artists easily and eventually threaten the balanced ecosystem of Art creation. Even worse, AI Art is starting to shift, and even shape, people’s perception of art regardless of the ethical issues. GAN-based models aim to produce fake images indistinguishable from real ones, and state-of-the-art models like Stable Diffusion can generate high-quality exotic pictures that outperform many human artists in detail and imagination. AI-generated Art has gradually become mainstream in many communities.
Before AI Art can exert a more significant impact on our society, we must carefully inspect the ethical issues entangled with it, especially the authorship of the AI artworks.
Analyzing the copyright issues in AI Art starts with a discussion of creativity. In particular, the definition of creativity and the source of innovation in a piece of AI artwork determine who should be granted authorship. While creativity may be an abstract concept that is difficult to evaluate quantitatively, there have been attempts to study creativity in AI Art formally and systematically. For example, the authors of the Creative Adversarial Network (CAN) define novelty in terms of the intensity of “arousal potential,” motivated by the theory of D. E. Berlyne; in particular, novelty requires a moderate level of “arousal potential.” Hence, the Art generation agent they proposed aims to “increase the stylistic ambiguity and deviations from style norms while simultaneously avoiding moving too far away from what is accepted as art” (Elgammal, 2017). Following this definition, they modified the GAN architecture so that the generator produces images that the discriminator cannot classify into established styles (e.g., Impressionism, Baroque). Despite this articulated concept of novelty, however, the paper does not discuss the source of that novelty. Arguably, the creativity may still lie in the randomness embedded in the inputs to the generator, a typical pattern for images produced by GAN-based models. In that case, the creative element should be attributed to the CAN model itself rather than to the practitioners who use it to synthesize images. Here, we investigate the source of innovation in AI Art by adopting Rhodes’s “Four Ps of Creativity” model: person, process, press, and product (Rhodes, 1961), using the Text-to-Image Synthesis approach to AI Art production as an example.
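The style-ambiguity idea behind CAN can be illustrated schematically. The sketch below is not the paper’s exact loss, and the four style classes and probability vectors are invented for illustration; it scores an image by the cross-entropy between a style classifier’s output and the uniform distribution, which is lowest when no single style dominates, the condition the CAN generator is pushed toward:

```python
import numpy as np

STYLES = ["Impressionism", "Baroque", "Cubism", "Pop Art"]  # illustrative

def style_ambiguity_loss(style_probs):
    """Cross-entropy between the classifier's style distribution and
    the uniform distribution; minimized when every style is equally
    likely, i.e., the image cannot be pinned to one style."""
    k = len(style_probs)
    return -np.sum((1.0 / k) * np.log(style_probs))

confident = np.array([0.97, 0.01, 0.01, 0.01])   # clearly one style
ambiguous = np.array([0.25, 0.25, 0.25, 0.25])   # stylistically ambiguous
```

A CAN-style generator would be rewarded for images like the second case, while the usual adversarial loss still penalizes images the discriminator does not accept as art at all.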
To begin with, to study the element of creativity in the practitioner, the person who provides prompts to the Text-to-Image Synthesis model and generates images from it, we analyze the prompts themselves. According to Oppenlaender, “prompt engineering” requires specific skills and domain knowledge, with which the practitioner can develop effective and creative prompts and therefore better employ the power of the AI models to generate high-quality, imaginative pictures (Oppenlaender, 2022). This means the product, the artwork itself, is no longer the only metric for evaluating the creativity of AI Art; the text prompts that supply the semantics and styles of the pictures should also be considered. The second aspect, the process of producing AI Art, also proves to be an essential factor in evaluating creativity. More specifically, a successful piece of AI artwork may require iterative prompt engineering and careful image curation (Oppenlaender, 2022). Through iterative prompt engineering, the skeleton, color, texture, and other details are gradually added, so the final output partially depends on when this iterative process is terminated. Moreover, the practitioner, or even a professional AI artist, may not display all the generated images to the public, but rather a carefully curated portfolio. In this sense, these two steps in the production of AI Art shape the picture presented to the audience, and original thought and creativity may make a difference there. The third P, press, refers to the emerging communities around Text-to-Image Synthesis and AI Art in general. Online platforms such as AIArtists.org enable newcomers to learn AI Art from established professional AI artists and create their own artwork easily.
More importantly, by providing media for AI artists to communicate and learn from each other, these communities can foster a creative environment for producing novel AI artwork. Last but not least, the product, the AI-generated image itself, may also exhibit the originality of AI Art. Here, however, we must take care to distinguish human creativity from machine automation. As with the Creative Adversarial Network above, Text-to-Image Synthesis models are capable of generating exotic images from uninspired text prompts; the produced images appear creative despite the lack of imagination in the prompts. In such cases, the output images should not be regarded as innovations of the practitioners.
With a clear definition of innovation and the source of creativity settled, the copyright issues become more tractable. In the case of “Portrait of Edmond Belamy,” which was generated by CAN, the GAN-based model, the chances are that the innovation comes from the model and the training data rather than from the practitioner (Cetinic, 2022). Therefore, if the practitioner used a pre-trained model without adjusting its parameters, it would be unethical for him or her to claim authorship of the painting and receive the award. The case of Jason Allen, who won first place in the Digital Arts/Digitally-Manipulated Photography category at the 2022 Colorado State Fair with an image created by the AI program Midjourney, is more complicated. Since Midjourney is a proprietary model and Allen refused to release his text prompts, it is impossible to analyze either the prompts or the model used, two of the criteria in the “Four Ps of Creativity” as applied to Text-to-Image Synthesis. Even though Allen won the award and his copyright was acknowledged, the lack of transparency makes the case controversial.
Until recently, however, the copyright of AI Art based on the source of creativity remained a “blind spot” in regulations and laws related to AI ethics. Artificial Intelligence itself has generally been denied copyright over any of its products in the United States and European countries (Recker, 2022). Specifically, the US Copyright Office recently ruled that “human authorship is a prerequisite to copyright protection” for AI-generated art and refused to grant copyright protection for images made by AI. This ruling, however, is currently being appealed in a federal court (Chatterjee, 2022). One opponent of the new regulation, Ryan Abbott, argued that the creative works of Artificial Intelligence deserve to be protected by law just like the intellectual labor of the human mind (Recker, 2022). Despite the debates surrounding the role of Artificial Intelligence as a patent owner, few existing laws or regulations have emphasized and scrutinized the factor of human creativity in granting copyrights for AI Art. This is exactly the situation where ethics should guide the law and motivate new laws in this emerging field. On the technology side, AI companies and researchers have also started to recognize the copyright issues and take action. For instance, the terms of use of the “mimic” model mentioned above “prohibit[s] uploading the work of others and ask users to only upload their work or illustrations they hold the rights to” (Aoi, 2022). Moreover, Midjourney’s terms of service state clearly that Artificial Intelligence is trained to produce the assets; they acknowledge that the assets might be similar to copyright-protected materials and that the rights of the authors of these materials are respected. Similar language appears in the policies of the OpenAI API, from the company that built the DALL-E model. In general, the AI Art community has gradually become aware of the ethical problems in this field.
AIArtists.org, a large online community for artists interested in AI, dedicates a whole page of resources to AI ethics. Specifically, it lists “potential harms from automated decision-making,” such as differential access to job opportunities in the form of filtering job candidates by race or genetic/health information on the individual level. However, much of the information provided about AI ethics on this website is tangential to our discussion regarding the newly emerged copyright issues of AI Art.
Undoubtedly, more action is needed to adequately protect the copyrights of AI Art. Because of the complicated procedure of AI Art production, especially in the context of Text-to-Image Synthesis, a pragmatic approach should be adopted that determines the source of innovation, and hence the copyright, case by case. For each specific AI artwork, we should determine the source of creativity, possibly based on established criteria like Rhodes’s “Four Ps of Creativity” discussed above, before granting authorship to the right person. One feasible way to achieve this in practice is to require the person claiming authorship to provide proof of the source of creativity. The Copyright Office should then closely review the submitted information. Since few people have experience with copyright issues in the newly emerging field of AI Art, the review committee could draw on specialists who have dealt with similar cases in other subfields of Artificial Intelligence, AI artists familiar with AI artwork production pipelines, and researchers who understand the technology of AI Art generation. If the supporting materials prove that the source of creativity is the person himself or herself, the copyright should be granted. For the sake of fairness, the review should be conducted “double-blind”: the supporting materials submitted to the committee should be anonymized, and the identities of the reviewers should not be released to the public. The review process should include verification of the supporting materials, simulation or reproduction of the procedure used to generate the AI artwork, and evaluation of human innovation as distinct from creativity due to machine automation. Suppose this procedure is too complicated to implement in reality.
In that case, copyrights should at least be granted conservatively: without clear evidence that the AI-generated art is the result of a person’s intellectual labor, the artwork is not copyright-protected. For AI companies and researchers, copyright issues deserve at least the same attention as privacy rights. In practice, this requires engineers to check the copyrights of the training data when collecting a dataset, to explain the algorithm of the AI model clearly to the public, to develop methods for detecting potential copyright infringement by practitioners, and more. For example, a model that compares the input, in the form of text or image, against existing copyright-protected materials online could be embedded at the beginning of the AI Art production pipeline. Researchers could also develop models that output summary statistics at each step of AI Art generation to help the practitioner decide whether a particular AI artwork is a creative work of his or her own. For instance, the semantic similarity between the user’s text prompt and existing prompts online, the difference between the parameters used in the current production process and the default parameters, and the similarity between the output image and the images in the training dataset could all be provided to the user as references.
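As a toy illustration of the prompt-similarity check suggested above, the sketch below compares simple bag-of-words vectors with cosine similarity. A real system would compare learned text embeddings (e.g., from a model like CLIP) instead, and the prompts and the 0.8 threshold here are invented for illustration:

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two prompts:
    1.0 for identical word counts, 0.0 for no shared words."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) \
         * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

user_prompt = "a tech company on an isolated island in cyberpunk style"
known_prompts = [
    "tech company on an isolated pacific island cyberpunk style",
    "a watercolor painting of a quiet mountain lake",
]

scores = [cosine_similarity(user_prompt, p) for p in known_prompts]
flagged = any(s > 0.8 for s in scores)  # warn the user before generation
```

Such a score would not decide authorship by itself, but it could serve as one of the summary statistics shown to the practitioner as a reference.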
References:
Cetinic, Eva, and James She. “Understanding and creating art with AI: Review and outlook.” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 18.2 (2022): 1-22.
Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. “Image style transfer using convolutional neural networks.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
Isola, Phillip, et al. “Image-to-image translation with conditional adversarial networks.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
Zhu, Jun-Yan, et al. “Unpaired image-to-image translation using cycle-consistent adversarial networks.” Proceedings of the IEEE international conference on computer vision. 2017.
Karras, Tero, Samuli Laine, and Timo Aila. “A style-based generator architecture for generative adversarial networks.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.
Ramesh, Aditya, et al. “Zero-shot text-to-image generation.” International Conference on Machine Learning. PMLR, 2021.
Elgammal, Ahmed, et al. “CAN: Creative Adversarial Networks, Generating ‘Art’ by Learning About Styles and Deviating from Style Norms.” arXiv preprint arXiv:1706.07068 (2017).
Oppenlaender, Jonas. “The Creativity of Text-based Generative Art.” arXiv preprint arXiv:2206.02904 (2022).
Rhodes, Mel. “An Analysis of Creativity.” The Phi Delta Kappan, vol. 42, no. 7, 1961, pp. 305–10. JSTOR, http://www.jstor.org/stable/20342603. Accessed 18 Oct. 2022.
Aoi, Len. “New AI image generating service in Japan stirs debate. Artists decry their work being used for AI art generation.” AUTOMATON, Active Gaming Media Inc., 31 Aug. 2022, automaton-media.com/en/nongaming-news/20220831-15350/.
Chatterjee, Poulomi. “Can Text-to-Image AI Learn Ethics—or Is the Future Doomed?” Analytics India Magazine, 30 Aug. 2022, analyticsindiamag.com/can-text-to-image-ai-learn-ethics-or-is-the-future-doomed/.
Recker, Jane. “U.S. Copyright Office Rules A.I. Art Can’t Be Copyrighted.” Smithsonian Magazine, 24 Mar. 2022, www.smithsonianmag.com/smart-news/us-copyright-office-rules-ai-art-cant-be-copyrighted-180979808/.