Any pose in MJ: ECU on a detail and then ZOOM OUT

Glibatree (Ben Schade) recently implored on YouTube: “Do THIS to Create Amazing Poses in Midjourney!!!”

The problem with a lot of image generators is that they love selfies: front-facing portraits. But what if you want a profile? Ben has a two-step work-around:

“Generate a close-up photo of your subject’s ear and then use the editor to zoom out and create the rest of the image.”

He explains:

“The reason this works is because what Midjourney needed was a pattern interrupt. Take advantage of its usual way to generate images by finding the usual way to generate an image with a more unusual focus. It’s better to choose a focus that is already often viewed from the angle we want.

  • focus on a ponytail if we want to see the back of someone’s head
  • use a receding hairline to see someone from straight above
  • focus on the back pocket of a pair of jeans if you want the…
  • I wouldn’t recommend looking up someone’s nostril (I mean it’s an angle that works but I just wouldn’t recommend it.)

The point is we can generate any of these things using extremely simple prompts and get very unusual angles to be seeing a person from. And then starting from there once we have the angle well defined we can simply zoom out and make our chosen feature less prominent by changing our prompt to something else and so in the new image the angle we wanted is extremely well defined not by tons of keywords but by the part of the image we already generated.”

This works for expressions as well. He explains:

“If we start with a photo of just a smile or just closed eyes or just a mischievous smirk, Midjourney will spend all of its effort to create a high quality closeup version of the exact expression we wanted that now, in just one more generation, we can apply to our character by simply zooming out.”
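The two-step idea can be sketched as a tiny prompt builder. This is only an illustration: the `FOCUS_FOR_VIEW` mapping, the function name and the prompt wording are assumptions, not Ben's exact prompts, and the zoom-out itself still happens in the Midjourney editor.

```python
# Hypothetical mapping from the view you want to a detail that is usually
# photographed from that view (the examples come from the list above).
FOCUS_FOR_VIEW = {
    "profile": "ear",
    "back of head": "ponytail",
    "from above": "receding hairline",
}


def two_step_prompts(subject: str, view: str, scene: str) -> tuple[str, str]:
    """Return (close-up prompt, zoom-out prompt) for the requested view."""
    detail = FOCUS_FOR_VIEW[view]
    close_up = f"extreme close-up photo of {subject}'s {detail}"
    # The zoom-out prompt drops the detail on purpose: the angle is carried
    # by the image generated in step one, not by keywords.
    zoom_out = f"photo of {subject}, {scene}"
    return close_up, zoom_out


close_up, zoom_out = two_step_prompts(
    "an elderly fisherman", "profile", "standing on a pier at dawn"
)
print(close_up)
print(zoom_out)
```

The same shape works for expressions: swap the detail for "a mischievous smirk" and keep the second prompt about the whole character.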

My take: thank you, Ben, for cracking the code!

FaceFusion 3: the best free face swapper

Tim of Theoretically Media has a great review of FaceFusion 3.0.0 on YouTube.

In it he discusses:

  1. How to install FaceFusion 3 using Pinokio
  2. How to face swap for video
  3. The limitations of FaceFusion
  4. Face swapping with AI-generated characters
  5. Lipsync
  6. Expression controls
  7. Aging controls

A huge bonus to this pipeline is face_editor. See 14:02 for tools to alter the many elements on faces, such as smiles, frowns and eye lines. Even age.

My take: we are way beyond deep fakes now. The ability to change expression is extremely powerful! Every performance can be altered.

Kling is redefining CGI, with Grading up next

Tim Simmons from Theoretically Media just released a new look at Kling AI’s new 1.5 model.

In it he relates what’s new:

“1080p Professional Mode: Kling 1.5 now generates videos at 1080p resolution when using Professional Mode. While it costs more credits, the output quality is significantly better and sets a new standard for AI video generation.

Motion Brush: Kling has introduced Motion Brush, a long-awaited tool in the AI video generation space. Currently, it’s only supported in the 1.0 model but will be available in 1.5 soon. Stay tuned!

End Frames: End frames have been introduced in the 1.0 model and are coming soon to the 1.5 model, allowing for smoother transitions and more control over your videos.

Using Negative Prompts: Improve your outputs by adding negative prompts to filter out undesired elements. Copy and paste the following negative prompts into your settings:

ARTIFACTS, SLOW, UGLY, BLURRY, DEFORMED, MULTIPLE LIMBS, CARTOON, ANIME, PIXELATED, STATIC, FOG, FLAT, UNCLEAR, DISTORTED, ERROR, STILL, LOW RESOLUTION, OVERSATURATED, GRAIN, BLUR, MORPHING, WARPING”
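If you keep a list like this around and tweak it between projects, a small helper can tidy it before you paste it in. This is a minimal sketch: the function name is made up, and it only cleans the text. It does not talk to Kling in any way.

```python
RAW_NEGATIVE_PROMPT = (
    "ARTIFACTS, SLOW, UGLY, BLURRY, DEFORMED, MULTIPLE LIMBS, CARTOON, ANIME, "
    "PIXELATED, STATIC, FOG, FLAT, UNCLEAR, DISTORTED, ERROR, STILL, "
    "LOW RESOLUTION, OVERSATURATED, GRAIN, BLUR, MORPHING, WARPING"
)


def clean_negative_terms(raw: str) -> list[str]:
    """Split on commas, trim whitespace, and drop empty or repeated terms."""
    seen = set()
    terms = []
    for term in raw.split(","):
        term = term.strip()
        key = term.lower()  # compare case-insensitively, keep original case
        if term and key not in seen:
            seen.add(key)
            terms.append(term)
    return terms


terms = clean_negative_terms(RAW_NEGATIVE_PROMPT)
negative_prompt = ", ".join(terms)
print(len(terms), "terms")
print(negative_prompt)
```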

Of particular note is the emotion it’s able to generate.

Plus, Tim signals that Kling is about to add a full-featured Video Editor. Stay tuned indeed!

My take: of course, some will lament these advances. Yes, tasks that workers once spent their lives performing are now accomplished immediately. Looking at you, medieval scribe, hot metal typesetter, telephone exchange operator. More job transformation is sure to come. We are well into the Digital Age and its promise is bearing increasingly wondrous fruit.

Flux.1 prompting and guidance guides

CyberJungle, the YouTube channel of Hamburg-based Senior IT Product Manager Cihan Unur, recently posted a great video on consistent generated characters.

There are lots of great insights in this 20-minute video. Two outstanding takeaways:

First: a prompting guide for Flux.1. At 15:28 he reveals three prompting styles: list, natural language and hybrid.

Second: a guidance guide for Flux.1. At 17:18 he shows Photorealistic and Cinematic images with a wide scope of guidance values. He posits:

“The essence of guidance setting is a compromise or a balance between photo realism and prompt understanding.”

See 18:36 for the Photorealistic results. He prefers a level of two.

See 19:54 for the Cinematic guidance level he prefers: again two.
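To run the same kind of comparison yourself, the trick is to hold everything constant except guidance. Here is a rough sketch of that bookkeeping. The `Run` class, the prompt, the seed and the list of guidance values are all assumptions for illustration; the actual image generation is left to whatever Flux.1 front end you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    prompt: str
    guidance: float
    seed: int


def guidance_sweep(prompt: str, values: list[float], seed: int = 42) -> list[Run]:
    """Same prompt and seed for every run, so only guidance changes."""
    return [Run(prompt=prompt, guidance=g, seed=seed) for g in values]


# Assumed range; the video covers a wide scope of values and settles on two.
runs = guidance_sweep(
    "portrait of a woman in a cafe, natural light",
    [1.0, 1.5, 2.0, 2.5, 3.0, 3.5],
)
for run in runs:
    print(run)
```

Fixing the seed matters: otherwise you cannot tell whether a difference between two images comes from the guidance value or from the random starting noise.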

My take: to me, generated images too often look over-the-top, so idealized that they’re unrealistic. The key seems to be dialing the guidance down to two. Who knew? Now, you do.

You can now star in generated video

Last week we explored the latest Generated Video (GV) pipeline. This week Seattle’s Yutao Han, aka Tao Prompts, goes further and illustrates “How to Create Ai Videos of Yourself!”

The goal here is to consistently end up with the same real person in multiple generated video clips.

“In this tutorial we’ll learn how to use the Flux image generator to train a custom AI model specifically for your own face and generate AI photos of yourself. Then we’ll animate those photos with the Kling AI video generator, which in my opinion generates the best AI videos right now.”

In a nutshell, the process is:

  1. Create an archive of at least ten photos of your star
  2. Upload this to the Ostris flux-dev-lora-trainer model on Replicate
  3. Train the LoRA custom image model and use it to generate key frames
  4. Upscale these images on Magnific, optionally
  5. Generate six-second clips in Kling AI with these images
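Step 1 is the only part you do on your own machine, so here is a minimal sketch of it using Python's standard library. The function name, the accepted file extensions and the folder layout are assumptions; check the trainer's page on Replicate for what it actually expects before uploading.

```python
import zipfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}  # assumed formats
MIN_PHOTOS = 10  # "at least ten photos", per the tutorial


def build_training_archive(photo_dir, archive_path) -> int:
    """Zip every image in photo_dir and return how many were added."""
    photos = sorted(
        p
        for p in Path(photo_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if len(photos) < MIN_PHOTOS:
        raise ValueError(
            f"found {len(photos)} photos, need at least {MIN_PHOTOS}"
        )
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for photo in photos:
            # Store files flat, without the local folder structure.
            zf.write(photo, arcname=photo.name)
    return len(photos)
```

Calling `build_training_archive("photos_of_me", "star.zip")` (both names hypothetical) gives you a single file to upload in step 2.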

My take: it seems week by week we’re getting closer to truly usable generated video that rivals (or even surpasses) Hollywood’s CGI/VFX. Imagine being able to train more than one LoRA model into Flux for Kling. I have it on good authority that this is just around the corner.

New Generated Video pipeline?

A couple of very recent videos point to a potential new Generated Video, or GV, pipeline.

The first is “Create Cinematic Ai Videos with Kling Ai! – Ultra Realistic Results” by Seattle’s Yutao Han, aka Tao Prompts.

The second is “How-To Create Uncensored Images Of Anyone (Free)” by Lisbon’s Igor Pogany, aka The AI Advantage.

Imagine combining both into a new GV pipeline:

  1. Train custom character models
  2. Create key frames utilizing these custom models
  3. Animate clips with these key frames
  4. Upscale these clips
  5. Edit together.

My take: a lot of people will immediately claim this is heresy, and threatens the very foundations of cinema as we’ve come to know it over the last one hundred years. And they would be right. And yet, time marches on. I believe some variation of this is the future of ultra-low budget production. Very soon the quality will surpass the shoddy CGI that many multi-million dollar Hollywood productions have been foisting on us lately.

Compare Image Generators at a glance

Matt Wolfe has just released a wonderful comparison of top image generators tackling four different types of pictures on YouTube.

The four image categories are:

  • Human Realism
  • Landscapes
  • Scenery incorporating Text
  • Surrealistic Images

The platforms are:

  1. Ideogram 2.0
  2. MidJourney 6.1
  3. Mystic
  4. Phoenix
  5. Flux.1 (Grok)
  6. Dall-e 3
  7. SD3
  8. Firefly 3
  9. Meta Emu
  10. Imagen 3
  11. Playground v3

See the Figma board to view all eleven contenders at once.

My take: as a visual learner, I really appreciate this side-by-side comparison. Thank you, Matt!

August 2024 AI Video Pipeline

Love it or hate it, as of August 2024, AI Video still has a long way to go.

In this video, AI Samson lays out the current AI Video Pipeline. Although there are a few fledgling story-building tools in development, full-featured “story mode” is not yet available in AI video generators. The current pipeline is:

  1. Create the first and last frames of your clips
  2. Animate the clips between these frames
  3. Create audio and lip-sync the clips
  4. Upscale the clips
  5. Create music and SFX
  6. Edit everything together offline.
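Because every step happens in a different tool, it helps to track where each clip stands. The six steps above can be sketched as a simple checklist per clip. The stage names and the `Clip` class are my own shorthand, and treating all six steps as strictly sequential per clip is a simplification (music and the final edit are really project-wide).

```python
from dataclasses import dataclass, field

# Shorthand names for the six pipeline steps, in order.
STAGES = ["frames", "animate", "audio_lipsync", "upscale", "music_sfx", "edit"]


@dataclass
class Clip:
    name: str
    done: set = field(default_factory=set)

    def complete(self, stage: str) -> None:
        """Mark a stage done, refusing to skip earlier stages."""
        earlier = STAGES[: STAGES.index(stage)]
        missing = [s for s in earlier if s not in self.done]
        if missing:
            raise ValueError(f"{self.name}: finish {missing} before {stage}")
        self.done.add(stage)

    def next_stage(self):
        """Return the first unfinished stage, or None when the clip is done."""
        for stage in STAGES:
            if stage not in self.done:
                return stage
        return None


clip = Clip("shot_01")
clip.complete("frames")
clip.complete("animate")
print(clip.name, "next:", clip.next_stage())
```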

It seems new platforms emerge weekly but AI Samson makes these recommendations:

00:23 AI Art Image Generators
09:19 AI Video Generators
16:28 Voice Generators
18:02 Music Generators
20:44 Lip-Syncing
21:52 Upscaling

Keep an eye open for LTX Studio though.

My take: you know, the current pipeline makes me think of an animation pipeline. It’s eerily similar to the Machinima pipeline I used to create films in the sandbox mode of the video game The Movies over ten years ago.

July 2024 Tier List for AI Video

Igor Pogany of The AI Advantage recently released a YouTube video that succinctly summarizes the current state of AI Video.

The tools he reviews are:

His favourites (dated mid-July 2024) are:

Runway GEN-3 Alpha and Luma Dream Machine for their clip outputs, but watch out for LTX Studio because of its overall project approach.

See the full tier list at 12:48 for the tl;dr.

My take: this is a super-valuable video that can get you up-to-date in under 14 minutes. Well worth your time.

Reality check: LTX Studio mid-2024

You’ve seen the Sora samples. The Dream Machine videos. How does LTX Studio, touted as “the future of storytelling, transforming imagination into reality,” stand up?

Haydn Rushworth posted this review:

“There are whole bunch of things it does not do, but I love where it’s going and where I hope it’s going to go…. It’s brilliant for keeping track of all of the shots that you really do need to keep track of. It’s brilliant for scene wide settings and project wide settings, something I’ve been craving, and it’s really, really good at that. It’s great for casting. It’s brilliant for allowing you to then kind of just drop those characters in. I love the generative tools that will allow you to erase bits that you don’t need in your starting shot and to add other bits that you need that will help you tidy up the shot…. My two big gripes and I don’t think these are bugs that they’re going to fix, this is just fundamental features that it needs to be in there. One of them is every shot is slow motion…. Secondly, breaking the fourth wall. It drives me out of my mind!”

Note that LTX Studio can do lots of things:

  • Pitch Decks
  • Storyboards
  • Animatics
  • Videos

Check out the video at the bottom of the corporate webpage.

Riley Brown also offers a peek at actually using LTX Studio.

My take: in addition to Haydn’s slo-mo and fourth-wall gripes, I would add these requirements as well: movement and expression control, including blinking and lip-sync. In mid-2024, one has to use each of the many AI tools for what it does best and then bring all the bits together in post. As an early proponent of Machinima (using video games to make movies), I’m watching this space with interest. My conclusion: advances are being made but we’re nowhere near lucid dreaming.