"Impressive." The word comes almost spontaneously out of one's mouth as one's eyes examine for the first time the images scrolling across the screen: a lighthouse perched on the rocks on which the waves break, an off-road vehicle speeding along a dirt road raising dust, the outskirts of Tokyo immortalized by the window of a running train on which the interior of the carriage is reflected, revealing the face of the girl author of the video.
Looking at the images generated by Sora, the tool presented a few days ago by OpenAI, are Dario Piga, Antonella Autuori and Matteo Subet: the first is a senior researcher at the Dalle Molle Institute for Artificial Intelligence Studies USI-SUPSI and head of the Generative Artificial Intelligence CAS at the Department of Innovative Technologies; the other two are research and teaching assistants in the Master's in Interaction Design at the Department of Environment, Construction and Design. Different paths, but a similar familiarity with generative artificial intelligence, of which Sora is the latest big news. In the race to build applications of these tools, no one had so far achieved comparable results in generating videos from text instructions. And although Sora, by OpenAI's own admission, is still far from ready for market, the early demonstration videos hint at its potential.
"ChatGPT had impressed me with its ability to interpret queries, perhaps even more than with its ability to generate new texts," Dario Piga began. With Sora it is the opposite: I am impressed by its ability to simulate with surprising realism-though not always perfect-physical aspects such as motion, fluid dynamics, collisions, and object interactions. Considering that we are only at the beginning, excitement grows thinking about the future. Recent scientific work proposes the integration of physical laws into generative models so that predictions respect physical laws and cause-effect relationships.
Cause-and-effect relationships are precisely a stated weakness of this tool, as Antonella Autuori and Matteo Subet note: "Let's give some examples: at the moment, in a video in which a child bites into a cookie, we will not see the bitten cookie at the end of the action, and a glass that spills will continue to contain its liquid. One has to assume that the videos unveiled by OpenAI are a selection of the best results obtained after pushing the generation tool hard. In addition, there remains the issue of moderating generated content that is biased or incites hatred, an issue on which there is still much work to be done even for the DALL-E and GPT models. That said, as far as we have been able to see, the results are better than those of existing tools."
As with any tool, the use of these emerging technologies (whether text-to-text, text-to-image or text-to-video) must be, and already is, accompanied by a learning process. For artists and designers, it is primarily a matter of getting the imagined and desired result from artificial intelligence. "Literacy is underway, and even basic knowledge of content generation through AI will spread and expand over time. Through prompting strategies, refined as users learn to formulate the right instructions and as the technologies develop, qualitatively better results will be achieved than those seen with the advent of DALL-E 3 and GPT."
The same principles are already being applied in many businesses, Dario Piga explains: "Think about vacation booking: there are already apps from major online travel companies that integrate ChatGPT, which advises us based on our wants and needs, asks us what we like and what we don't, and then makes reservations for us. Our CAS program goes beyond the use of ChatGPT, exploring open language models, developed by entities such as Meta, that offer significant benefits in terms of confidentiality of information, a crucial issue for banks, insurance companies, hospitals, and government. We teach the use of these models, but also how to customize and specialize them to meet specific business needs, ensuring that they can integrate with the internal dynamics of organizations.
"We also address emerging legal and ethical issues. Understanding these issues allows us to prepare for the future by consciously addressing the challenges these technologies bring and promoting appropriate regulation of their use. We discuss the risks associated with the misuse of these technologies, the protection of personal and corporate data, and the management of copyrights to ensure respect for the rights of others and protection of our own.
"Our goal, which we also pursue in the Bachelor in Data Science and Artificial Intelligence, is to train professionals who are not just users but experts in these technologies, capable of guiding their responsible and innovative use, making the most of the opportunities they offer."
Making the most of what generative artificial intelligence has to offer is a goal also shared by Antonella Autuori and Matteo Subet: "We firmly believe in complementarity rather than substitution. Our experience in design has taught us that the conscious integration of these technologies can enrich, not limit, the creative process. That is why, two years ago, we started a research project at SUPSI's Design Institute aimed at formalizing a new teaching model for the integration and transfer of these skills into design curricula. In our experiments to date, we have realized that the greatest power these tools give a designer is to enhance lateral design thinking, the process of 'thinking outside the box,' which is essential when one is required to propose and develop increasingly complex and original solutions. In addition, these tools prove useful in the prototyping phase, facilitating the visualization of design scenarios and reducing the time needed to render the interactions and impact that designed solutions have in the real world."
These are the thoughts of people who are already familiar with the medium and sense its potential. However, one cannot overlook the concern that Sora's first images have generated among those who have made animation or videomaking their profession. The question of the impact of AI on employment is a recurring one, and it arises here as well.
For Dario Piga: "Tools like Sora will change the graphics industry, film, video games, publishing, web, and digital and print media and can certainly replace a professional in some tasks, justifying the initial concern. However, it is crucial to ask whether this substitution is an advantage or disadvantage for the professional and for society.
"Is a low-cost tool, capable of generating relatively high-quality images and videos in a very short time, really a competitor for graphic designers and illustrators? Or does it serve different market segments? Sora may be used by amateur users or those who need to quickly create an image or video clip. I wonder if these are the main market segments currently served by a graphics expert and what the actual erosion of market share by this 'new competitor' may be.
"Similarly, I wonder if those who require the services of a professional graphic designer or videomaker, such as creators of films, video games, or marketing companies, can be satisfied with a product generated from a few textual instructions. For a movie scene, video game, or advertising campaign, the expertise of a creative professional capable of taking care of details, creating a story, a context, a scenario, and conveying a high-impact message will continue to be needed."
"It is not about replacing human work, but expanding it," Antonella Autuori and Matteo Subet add. Art and design will always evolve, as will these new tools, but individual creativity and the generation of the winning idea will remain at the heart of any design process. With this in mind, the principle of complementarity can overcome any future challenge.
Regarding this point, the words of Eryk Salvaggio during ACMI's Future of Arts, Culture and Technology Symposium (FACT) 2024 best express our thoughts:
“If artificial intelligence strips away context, human intelligence will find meaning. If AI plots patterns, humans must find scores. If AI reduces and isolates, humans must find ways to connect and to flourish.”
"Rather than being driven by the fear of being overtaken or excluded by technological innovations (AI FOMO), we believe it is crucial to invest as much as possible in learning these tools and using them critically. From our perspective, this is the only viable attitude that allows one not only to remain relevant in one's field, but also to actively contribute to shaping the future of art and design in an increasingly AI-mediated world."