No, You Can't Make Anime with AI
by Callum May
It's been difficult to ignore the conversations that have surrounded the development of AI tools lately. With Midjourney using prompts to generate images and OpenAI's ChatGPT able to generate seemingly comprehensive text, it feels like we're on the precipice of an exciting yet terrifying technological revolution.
The technology hasn't just had an impact on schools, art competitions, and publishers—after the release of Corridor Digital's attempt at an AI-generated anime, animators both within and outside of Japan have been wondering whether it's a realistic solution to the industry's issues with overproduction.
And generally, their conclusion has been: No, it's not.
Full disclosure: I find AI fun. I've played around with ChatGPT, getting it to write silly fanfiction with prompts like "write a short story about a Pokémon trainer who defeats the Elite 4 with a Weedle." I've also sent prompts to Midjourney like "Joker from Persona 5 fights robots with Spongebob Squarepants in a sci-fi Venice."
However, I've found that you rarely get what you're looking for, even when you refine your prompts to achieve a more specific result. Instead, you take the model's best guess and request iteration after iteration until you find something that's just "alright." This is most grievous with Midjourney—no matter what data it feeds upon from across the web, it will always produce something less than the sum of its parts. If it produces great art, it's because great artists somewhere produced something similar.
And the same feeling applies to Corridor Digital's "ANIME ROCK, PAPER, SCISSORS" short, a seven-minute animation based on live-action footage, using a Stable Diffusion model trained on screenshots from Madhouse's Vampire Hunter D: Bloodlust film. Even if you set aside the grossness of watching something that transparently plagiarizes dozens of artists' work for profit, the result isn't a convincing anime in the first place—because an AI only understands what animators produce, not why they produced it.
While Corridor's Niko Pueringer talks at length about "democratizing" animation in his "Did We Just Change Animation" video, he spends very little time getting to grips with the actual process he's trying to mimic. Instead, the YouTube creators shot green-screen footage of themselves in costumes performing the actions, exaggerating each gesture dramatically to fit their idea of what anime is. Then, after having the AI turn those frames into anime-inspired equivalents, they decided to include only every second frame in their final video.
This was an attempt to capture anime timing, but it missed the fundamental issue that the frequency of frames in anime is ever-changing, even within the same shots. That's why animators don't talk about anime in terms of “frames-per-second”—the frame count can change at any moment, and sometimes, characters will be animated at a different frequency even when they exist in the same scene. It may sound minor, but these inconsistencies, pioneered by the late Yasuo Otsuka, are the core of anime's iconic limited animation style. And, unfortunately for any AI attempting to copy them, animators treat these as largely intuitive.
In fact, 3D anime has struggled with this for over a decade. Attempts come across as uncanny if rendered out at a full 24 frames-per-second but choppy if rendered at a constant 12fps or 8fps. One of the main reasons Studio Orange's works are lauded even among 3D-skeptics is because they've successfully matched the timing of 2D anime after having worked on them for decades. The reality is that no matter how digital your process gets, the appeal of anime is in human inconsistency and intuition.
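The contrast between the two approaches can be sketched in a few lines of code. This is a hypothetical illustration (the function names and the example hold schedule are my own, not anything Corridor or any studio actually uses): uniform subsampling produces a constant 12fps feel, while a limited-animation timing chart holds each drawing for a varying number of frames.

```python
def uniform_twos(frames):
    """Keep every second frame: a constant "on twos" cadence throughout."""
    return [frames[i] for i in range(0, len(frames), 2)]

def variable_holds(frames, holds):
    """Hold each chosen drawing for a varying number of frames.

    `holds` is a timing chart like [3, 3, 2, 1, ...]: animate "on threes"
    for a slow moment, "on twos" normally, "on ones" for fast action.
    """
    out = []
    i = 0
    for h in holds:
        if i >= len(frames):
            break
        out.extend([frames[i]] * h)  # repeat the same drawing for h frames
        i += h
    return out

source = list(range(24))  # one second of source footage at 24fps

even_cadence = uniform_twos(source)  # 12 evenly spaced drawings
anime_cadence = variable_holds(source, [3, 3, 2, 2, 1, 1, 1, 2, 3, 3, 3])
```

Both outputs play back at 24fps, but the second shifts between ones, twos, and threes within a single shot; that shifting cadence is the intuitive decision-making the article describes, and it's exactly what a fixed "every second frame" rule throws away.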
That absence of human inconsistency is also a major issue in Corridor's work. If you tell a robot to always draw a character exactly the same way, it will do just that. But the fun of giving human animators a character design sheet is that you know they will each interpret it slightly differently. While Corridor struggles to get their model to stick to one specific style—that of Vampire Hunter D: Bloodlust—the truth is that the film never had just one look. Depending on the animation director, the animator, or even just how far the character was from the camera, faces change subtly throughout the film. It's not a mistake. It's an appeal of the medium.
The only inconsistencies Corridor's model can produce are the many glitches across the characters' faces. It's not able to understand that Vampire Hunter D: Bloodlust was a film created by dozens of different artists with many different inspirations and ways of creating art. It's not able to understand that anime characters aren't actually meant to look exactly the same all the time.
Ultimately, "Anime Rock, Paper, Scissors" is a flickering rotoscoped short film. There is one way to fix its largest issues, but that would literally be a case of hiring animators to review it and make adjustments. And at that point, why not just hire them to begin with?
Even as AI generation gets smarter at understanding our prompts and learning how art is created, it's a very long way from understanding why an artist makes the decisions they do. Currently, the best parts of Corridor Digital's short film are the parts they actually did themselves—the VFX, the lighting, the premise, etc. The rest is a clumsy attempt to capture a style that neither the AI nor Corridor themselves actually understand.
In fact, in a video posted to Corridor's channel, animators Tom and Tony Bancroft had praise for the film but noted that many of its genuinely impressive parts were the creative decisions made by the human artists. Regular people without the same kind of experience or tools, they said, couldn't produce a similar result.
It's hard to know if AI will ever truly have a place within anime production. Of course, machine learning technology and other forms of computer generation will always have a place in anime's digital components, but those are valuable because anime creators are always aware of exactly what data they're running on. For example, david production was particularly enthusiastic about the automated in-betweening tool CACANI for a while, although even then, it still required a human touch.
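To make the limits of automated in-betweening concrete, here is a minimal sketch of the general idea—this is not CACANI's actual algorithm, and the point-matching scheme is an assumption for illustration. Given two key drawings represented as matched lists of control points, the simplest in-between is a linear interpolation:

```python
def inbetween(key_a, key_b, t):
    """Linearly interpolate matched (x, y) points between two key drawings.

    t = 0.0 returns key_a, t = 1.0 returns key_b, t = 0.5 is the midpoint.
    """
    return [(ax + (bx - ax) * t, ay + (by - ay) * t)
            for (ax, ay), (bx, by) in zip(key_a, key_b)]

# Two key poses as tiny point sets (purely illustrative data).
key_a = [(0.0, 0.0), (10.0, 5.0)]
key_b = [(4.0, 2.0), (12.0, 9.0)]

mid = inbetween(key_a, key_b, 0.5)  # the halfway drawing
```

The weakness is visible even in this toy: real drawings move along arcs, squash and stretch, and occlude themselves, none of which straight-line interpolation captures—which is precisely why the tool still required a human touch.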
One of the biggest issues that seem to come up in both machine learning and AI generation is the need to have humans fix a computer's work. When Netflix experimented with AI generation for developing backgrounds in their "Dog and Boy" anime short with Wit Studio, their model had notably gotten key details wrong, prompting the background artist to repaint the entire image. It's unclear how much artificial intelligence can help and how much it can hinder.
Right now, "anime" is one of the most popular prompts on Midjourney, yet it feeds on so much fan-created data that it's not at all uncommon to find warped watermarks in the corners of generated images. At the same time, "anime" is not a style to begin with—every time you prompt an AI to create something in an anime style, it's finding either a fan artist or a professional character designer to try and copy. It's incapable of creating anything new.
Even if AI does find a place in other industries, I can only see it taking a major role in anime production if it gains sentience, goes to art school, and studies anime as a process and not just a result.