Generative AI: Images & Videos

Jump ahead by clicking on one of the projects below.


Curious Refuge AI Advertising Contest

Created in August 2024

Tools used: Midjourney, ToonCrafter, AnimateDiff, RunwayML, ElevenLabs, and SunoAI.
Timeline: 3 days
Budget: $60

For six months, I’ve been developing AI-generated image and video concepts for Western Digital, exploring various tools, methods, and media types to benefit our marketing team. Upon discovering this competition with only three days remaining, I challenged myself by applying my GenAI knowledge outside of work. Over just three nights of work, I created this video, pushing the boundaries of what I’ve learned. As an animation enthusiast, I’m incredibly proud of what I achieved in such a short time, bringing to life a project that would typically require weeks of effort!

This video can be broken down into two main sections: 1) the character in her dining room and 2) the character dancing in a karaoke dreamscape. Each section relies on a different workflow, keyframes and rotoscope respectively; below I break down the workflow for each section.

1) Dining Room – Keyframe Pipeline

This AI workflow mirrors traditional animation fundamentals. In traditional animation, animators reference a style and character sheet to learn how the character looks in various poses and the overall feel of how the character acts. Once the character sheets are established, a senior animator typically draws out the keyframes, the key poses in a sequence. Next, a junior animator draws the in-between frames, the frames that transition the character from one key pose to the next. These steps are adapted into the AI pipeline, with Midjourney acting as the character designer and senior animator and the ToonCrafter model acting as the junior animator.

Below is the reference character sheet generated by Midjourney, along with the keyframes for the first sequence. In Midjourney, the initially generated character sheet may have inconsistencies with the character, so it is important to go in manually and correct any errors. For example, in this sheet all of her expressions on the left were initially generated without any freckles, so I had to manually add them in with Adobe Photoshop.

When generating keyframes in Midjourney, use the prompt template below. It tells Midjourney the style of the animation (--sref), the style of the character (--cref), and how closely it should follow the character reference, in other words the weight of that link (--cw).
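
As a rough illustration of that template (the bracketed description, URLs, and weight value are placeholders, not the exact prompt from this project):

```
[description of the keyframe pose and setting] --sref <style_reference_image_url> --cref <character_sheet_image_url> --cw 100
```

Lower --cw values carry over mainly the character's face, while values near 100 also keep the hair and outfit from the reference, which is what you want for consistent keyframes.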

There are four key components in the workflow:

  1. Image Cropping: Before loading the entire frame, we need to crop the image to our area of interest and normalize all the resolutions. ToonCrafter only takes in and outputs low-resolution video, so it is important to first crop to the specific region of interest rather than downscaling the entire frame (see the sketch after this list).
  2. Prompts: For the ToonCrafter model it is best to use no prompts at all. Instead, focus on good keyframes that only require straightforward movements for the in-between frames. The interesting thing about ToonCrafter is that prompts are optional and often produce strange outputs when included, because the model then has to balance what it thinks your prompt looks like against the keyframes it has been given.
  3. ToonCrafter Model and Sampler: The key is to dial in the right sampler values. Play around with them to see what works best for your images.
  4. Upscaler and Frame Interpolator: ToonCrafter outputs low-resolution video; by running it through an upscaler and another interpolator, we can match the frame rate and resolution of our original video.
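
Component 1 is plain image prep that can happen before anything touches the node graph. Here is a minimal Pillow sketch, assuming a 512×320 working resolution for ToonCrafter (check your model config) and a hand-picked crop box; the file names and coordinates are placeholders:

```python
from PIL import Image

# ToonCrafter works at a low, fixed resolution; 512x320 is assumed here --
# adjust to whatever your checkpoint's config expects.
TARGET_W, TARGET_H = 512, 320

def prep_keyframe(src_path: str, crop_box: tuple[int, int, int, int], out_path: str) -> None:
    """Crop the full frame to the region of interest, then resize so every
    keyframe fed to ToonCrafter shares the same resolution."""
    img = Image.open(src_path).convert("RGB")
    roi = img.crop(crop_box)  # (left, upper, right, lower) in pixels
    roi = roi.resize((TARGET_W, TARGET_H), Image.LANCZOS)
    roi.save(out_path)

# Hypothetical example: pull the character at the dining table out of two
# 1080p keyframes so only that region is downscaled, not the whole frame.
box = (600, 200, 1624, 840)  # 1024x640 region, same 16:10 aspect as 512x320
for i, frame in enumerate(["keyframe_a.png", "keyframe_b.png"]):
    prep_keyframe(frame, box, f"tooncrafter_in_{i}.png")
```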

The low-resolution output of ToonCrafter will look something like the above. Now, we take the upscaled version (also with higher frames per second) into Adobe Premiere and match it with one of the keyframes to fill in the background. Once we do that, we’ll have a scene with the correct resolution, fps, and aspect ratio.
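
If you want to do that matching step outside the editor, a rough stand-in is ffmpeg's motion-compensated interpolation plus a lanczos upscale. The project itself used dedicated upscaling and interpolation models, and the 1920×1080 / 24 fps targets below are assumptions; compositing over the keyframe background still happens in Premiere:

```python
import subprocess

# Stand-in for the AI upscaler + frame interpolator: ffmpeg's motion-compensated
# interpolation and lanczos scaler. Target fps/resolution are assumptions --
# match them to your original keyframes and project settings.
subprocess.run(
    [
        "ffmpeg",
        "-i", "tooncrafter_out.mp4",
        "-vf", "minterpolate=fps=24:mi_mode=mci,scale=1920:1080:flags=lanczos",
        "-c:v", "libx264", "-crf", "18",
        "upscaled_24fps.mp4",
    ],
    check=True,
)
```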

2) Karaoke Dreamscape (Rotoscope Pipeline)

The whipped cream background is pretty simple, using Runway's image-to-video on their Gen-3 Alpha model, so I won't be going into that. What I will break down is the character animation, which involves using ControlNets, IP Adapters, and AnimateDiff to pull the whole thing together.

TL;DR: the recording will look strange and awkward on you, but on the cartoon character it'll look great! Trust the process.
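
The actual character pass was built as a ComfyUI node graph, but the same ingredients exist in the diffusers library. Below is a heavily simplified sketch, assuming SD 1.5-class public checkpoints (not necessarily the ones used here) and showing only the AnimateDiff motion module plus an IP-Adapter style image; the DWPose and Soft Edge ControlNets that drive the rotoscoped motion are wired in separately and are not shown:

```python
import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif, load_image

# AnimateDiff motion module on top of an SD 1.5-class checkpoint.
# Model IDs are common public ones, not necessarily what this project used.
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16
)
pipe = AnimateDiffPipeline.from_pretrained(
    "emilianJR/epiCRealism", motion_adapter=adapter, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config, beta_schedule="linear", clip_sample=False
)

# IP-Adapter supplies the style/character reference image, much like the
# IP Adapter node in the ComfyUI graph.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.7)

style_image = load_image("character_style_reference.png")  # hypothetical local file

frames = pipe(
    prompt="cartoon woman singing karaoke, dancing in a whipped cream dreamscape",
    negative_prompt="blurry, deformed, extra limbs",
    ip_adapter_image=style_image,
    num_frames=16,
    guidance_scale=7.5,
    num_inference_steps=25,
).frames[0]
export_to_gif(frames, "karaoke_character.gif")
```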

3) Final Editing!

Of course, after all that work we still need to generate sound effects (ElevenLabs) and music (SunoAI), and finally edit everything together with some good old-fashioned manual video editing!


Wedgwood Pendant Dancer

Created in April 2024
Tools Used: RunwayML and AnimateDiff

Using a simplified version of the character animation workflow, we can see our four key components for influencing the model's output (a sketch of how the first two control maps are generated follows the list):

  1. ControlNet DWPose – tracks poses
  2. ControlNet Soft Edge – dictates composition
  3. IP Adapter – style reference; for my workflow I used a mask to highlight only the blue pendant and not the background or silver trim. This can be seen in the full detailed workflow below but not in the simplified workflow above.
  4. Prompts – overall guidance for the generated output
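
For reference, the first two conditioning inputs are just preprocessed versions of the source footage. Here is a small sketch using the controlnet_aux preprocessors, where OpenPose stands in for the DWPose node and HED approximates the Soft Edge map; the frame file name is a placeholder:

```python
from controlnet_aux import HEDdetector, OpenposeDetector
from PIL import Image

# Preprocessors that produce the two ControlNet conditioning maps.
# OpenPose is used as a stand-in for DWPose; HED approximates Soft Edge.
pose_detector = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
edge_detector = HEDdetector.from_pretrained("lllyasviel/Annotators")

frame = Image.open("dance_recording_frame_0001.png")  # a frame from the reference video

pose_map = pose_detector(frame)      # skeleton image -> ControlNet DWPose slot
softedge_map = edge_detector(frame)  # soft edge image -> ControlNet Soft Edge slot

pose_map.save("pose_0001.png")
softedge_map.save("softedge_0001.png")
```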

Favorite Tutorials and Creators