Cinema-grade add-on lenses for iPhones?

Jourdan Aldredge on No Film School invites us to Meet the World’s First Cinema-Grade Mobile Lenses for iPhone.

There are at least half a dozen brands of add-on lenses for iPhone cinematography, but these, made by ShiftCam in collaboration with TUSK, promise to be the first that are cinema grade.

Beyond the optical quality and build, consider their best use cases:

  1. Discreet Filming in Crowds
  2. Fast-Paced B-Roll Capture
  3. Overhead & Tight Space Shots
  4. Quick Transitions Between Shots
  5. Budget-Friendly Aerial & Water Shots
  6. Scouting Locations
  7. Creative Time-Lapse & Motion Effects
  8. Multi-Cam Filming with Multiple Phones
  9. Professional-Quality Live Streaming
  10. Filming in Extreme Weather Conditions

Here’s the link to the Kickstarter campaign. Not cheap.

My take: I would love to see real-world test footage and charts from these lenses.

Workflow to create aerial clips

Rory Flynn has shared a workflow that uses a combination of AI tools to create aerial clips.

The tools are: Claude 3.7, Magnific and Runway.

The workflow is:

  1. Build a 3D Render in Claude 3.7
  2. Program in camera movements
  3. Screen record the render
  4. Upload this video to Runway Gen-3
  5. Extract the first frame
  6. Apply a Magnific Structure Reference to the first frame
  7. Upload this new first frame to Runway
  8. Apply the new first frame to the initially rendered video using Runway Restyle.

The Claude prompt he used in Step 1 is: “can you code a 3d version of [subject + env] in three.js?” E.g. “can you code a 3d version of an epic castle atop a mountain plateau in a valley in three.js?”
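Step 2, programming in camera movements, boils down to moving the virtual camera along a path, frame by frame. The geometry of a basic aerial orbit can be sketched in a few lines. This is a generic illustration of the idea, not Rory’s actual code, and the radius, height and frame count are invented values:

```python
import math

def orbit_keyframes(radius, height, frames):
    """Camera positions for one full orbit around the origin,
    looking down at a subject placed at (0, 0, 0)."""
    keyframes = []
    for i in range(frames):
        angle = 2 * math.pi * i / frames  # fraction of a full circle
        keyframes.append((
            radius * math.cos(angle),  # x
            height,                    # y stays constant for a level orbit
            radius * math.sin(angle),  # z
        ))
    return keyframes

# 120 frames is roughly a 5-second orbit at 24 fps
path = orbit_keyframes(radius=40.0, height=25.0, frames=120)
print(path[0])  # starts at (40.0, 25.0, 0.0)
```

In three.js you would assign each keyframe to `camera.position` and call `camera.lookAt(0, 0, 0)` before rendering the frame, then screen-record the result as in Step 3.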

The Magnific Structure Reference he used in Step 6 is: “editorial photo, epic castle on a plateau, intricate rocky textures and fine details, immaculate New Zealand landscape, white marble castle, high precision photography” with these settings:

  • Model: Mystic 2.5
  • Structure Reference
  • Structure Strength: 52%
  • Resolution: 2k
  • Creative Detailing: 75%
  • Engine: Magnific Sharpy

See his X post or LinkedIn post.

See an interview with Rory on AI in business.

My take: amazing!

Riffusion generates full songs effortlessly

Riffusion has just opened a public beta and it rocks!

Riffusion is the brainchild of Hayk Martiros and Seth Forsgren.

“Our goal is to make everyone into a musician and bring a future where music is interactive and personalized.”

TechCrunch reported their $4M seed funding in October 2023.

My take: damn! Not only will this create full songs, it will also create stems you can download for further modification in your DAW of choice.

Best Open Source TTS: Kokoro

There is a new open source Text to Speech generator in town called Kokoro-82M.

As far as I can determine, it’s being developed by one person, Hexgrad, based on earlier models.

Apparently, this is something you can install and run locally on your own computer.

You can try it out online here. You can also compare various open source models at the TTS Arena.

My take: note that this does not clone voices or emote (at all). Perhaps in the next version?

Generated Video and Emotions

Haydn Rushworth has just released COMPARED: 10 AI Emotions – Minimax / Hailuo.ai I2V-01-live vs KLING, VIDU, Runway.

He compares Minimax with Runway, Vidu and Kling.

His conclusions?

Runway was the most sedate whereas Kling was all over the place. Vidu was good, but Minimax was his favourite.

Tao Prompts also compares Sora, Kling, Minimax and Runway.

He concludes that Runway doesn’t tend to add much emotion at all.

My take: it appears that Minimax may be the best platform to generate video from images at the close of 2024. What will 2025 bring us?

How to Create Consistent AI Characters

Caleb Ward of Curious Refuge has released 2024’s best summary of how to Create Consistent Realistic Characters Using AI.

He suggests using Fal.AI to train a custom LoRA ( fal.ai/models/fal-ai/flux-lora-fast-training ) with at least 10 images of the subject. Then use this model to generate images ( fal.ai/models/fal-ai/flux-lora ) and increase their resolution using an up-res tool. Finally, you can now move on to animating them.
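For those who prefer scripting to the web UI, fal.ai also exposes these endpoints through its `fal_client` Python package. The sketch below only assembles the request; the actual call (commented out) needs an API key, and the payload field names (`images_data_url`, `trigger_word`, `steps`) are my assumptions about fal’s input schema rather than anything Caleb demonstrates — check fal’s endpoint docs before relying on them:

```python
# Sketch of driving fal.ai's LoRA trainer from Python instead of the web UI.
# The endpoint ID is the one named in the article; the payload field names
# below are assumptions -- verify them against fal's API reference.

def build_training_request(zip_url: str, trigger_word: str, steps: int = 1000) -> dict:
    """Assemble arguments for fal-ai/flux-lora-fast-training.

    zip_url: public URL of a .zip containing 10+ images of the subject.
    trigger_word: the token you will later use in prompts to invoke the LoRA.
    """
    return {
        "images_data_url": zip_url,    # assumed field name
        "trigger_word": trigger_word,  # assumed field name
        "steps": steps,                # assumed field name
    }

request = build_training_request(
    "https://example.com/subject_photos.zip", "MYCHARACTER"
)
print(request["trigger_word"])  # prints MYCHARACTER

# With fal_client installed and the FAL_KEY environment variable set,
# the training call would then be roughly:
#   import fal_client
#   result = fal_client.subscribe(
#       "fal-ai/flux-lora-fast-training", arguments=request
#   )
```

The same pattern would apply to the generation endpoint (fal-ai/flux-lora), swapping in the trained LoRA’s URL and a prompt containing the trigger word.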

CyberJungle, the YouTube channel of Hamburg-based Senior IT Product Manager Cihan Unur, also posted How to Create Consistent Characters Using Kling AI.

He details how to train a LoRA on Kling using at least eleven videos of your character. Admittedly, this pipeline is a little more involved. He also suggests Freepik as another option.

My take: basically, if you can imagine it, you can now create it.

Any pose in MJ: ECU on a detail and then ZOOM OUT

Glibatree (Ben Schade) recently implored on YouTube: Do THIS to Create Amazing Poses in Midjourney!!!

The problem with a lot of image generators is that they love selfies: front-facing portraits. But what if you want a profile? Ben has a two-step work-around:

“Generate a close-up photo of your subject’s ear and then use the editor to zoom out and create the rest of the image.”

He explains:

“The reason this works is because what Midjourney needed was a pattern interrupt. Take advantage of its usual way to generate images by finding the usual way to generate an image with a more unusual focus. It’s better to choose a focus that is already often viewed from the angle we want.

  • focus on a ponytail if we want to see the back of someone’s head
  • use a receding hairline to see someone from straight above
  • focus on the back pocket of a pair of jeans if you want the…
  • I wouldn’t recommend looking up someone’s nostril (I mean it’s an angle that works but I just wouldn’t recommend it.)

The point is we can generate any of these things using extremely simple prompts and get very unusual angles to be seeing a person from. And then, starting from there, once we have the angle well defined, we can simply zoom out and make our chosen feature less prominent by changing our prompt to something else. And so, in the new image, the angle we wanted is extremely well defined, not by tons of keywords, but by the part of the image we already generated.”

This works for Expressions as well. He explains:

“If we start with a photo of just a smile or just closed eyes or just a mischievous smirk, Midjourney will spend all of its effort to create a high quality closeup version of the exact expression we wanted that now, in just one more generation, we can apply to our character by simply zooming out.”

My take: thank you, Ben, for cracking the code!

FaceFusion 3: the best free face swapper

Tim of Theoretically Media has a great review of FaceFusion 3.0.0 on YouTube:

In it he discusses:

  1. How to install FaceFusion 3 using Pinokio
  2. How to face swap for video
  3. The limitations of FaceFusion
  4. Face swapping with AI-generated characters
  5. Lipsync
  6. Expression controls
  7. Aging controls

A huge bonus to this pipeline is face_editor. See 14:02 for tools that alter the many elements of a face, such as smiles, frowns and eye lines, and even age.

My take: we are way beyond deep fakes now. The ability to change expression is extremely powerful! Every performance can be altered.

Kling is redefining CGI, with Grading up next

Tim Simmons from Theoretically Media just released a new look at Kling AI’s new 1.5 model:

In it he relates what’s new:

1080p Professional Mode: Kling 1.5 now generates videos at 1080p resolution when using Professional Mode. While it costs more credits, the output quality is significantly better and sets a new standard for AI video generation.

Motion Brush: Kling has introduced Motion Brush, a long-awaited tool in the AI video generation space. Currently, it’s only supported in the 1.0 model but will be available in 1.5 soon. Stay tuned!

End Frames: End frames have been introduced in the 1.0 model and are coming soon to the 1.5 model, allowing for smoother transitions and more control over your videos.

Using Negative Prompts: Improve your outputs by adding negative prompts to filter out undesired elements. Copy and paste the following negative prompts into your settings:

ARTIFACTS, SLOW, UGLY, BLURRY, DEFORMED, MULTIPLE LIMBS, CARTOON, ANIME, PIXELATED, STATIC, FOG, FLAT, UNCLEAR, DISTORTED, ERROR, STILL, LOW RESOLUTION, OVERSATURATED, GRAIN, BLUR, MORPHING, WARPING

Of particular note is the emotion it’s able to generate.

Plus, Tim signals that Kling is about to add a full-featured Video Editor. Stay tuned indeed!

My take: of course, some will lament these advances. Yes, tasks that workers once spent their lives performing are now accomplished immediately. Looking at you, Medieval scribe, hot metal typesetter, telephone exchange operator. More job transformation is sure to come. We are well into the Digital Age and its promise is bearing increasingly wondrous fruit.