Luxe Prompting
ISSUE 44 · MAY 2026
|
Starting with AI video.
A working creator’s overview of what AI video actually is, where it sits today, and the five observations about the medium that change how you approach it.
|
|
|
•••
|
|
Most creators come to AI video with the same instincts that worked for image. They learn the prompt patterns. They study the platforms. They expect the craft to map across one-to-one. Some of it does. Most of it does not. The medium has its own shape, and the things that surprise creators in the first month are usually the same things, in the same order.
AI video is not AI image with motion added on top. It is a different medium with a different unit of work, a different production rhythm, and a different relationship between the prompt and the output. The work that holds up comes from creators who understand the medium on its own terms, not from creators who treat it as an extension of what came before.
Below are five observations about the shape of AI video that come up in almost every conversation I have with creators moving into the medium for the first time. None of them are about which tool to pick. All of them are about how the work actually behaves.
|
|
Observation 01
The unit of work is the clip.
|
|
AI video models generate clips. Two seconds. Four seconds. Eight seconds at the upper end. A clip is not a video any more than a photograph is a feature film. The output of an AI video tool is raw material. The video is what you build from the clips.
This single reframing changes how creators plan their work. You stop trying to generate “a video” in one prompt. You start thinking in shot lists, sequences, and assemblies. The prompt is for a clip. The video is for an editor, which is usually still you, just with different software open.
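One way to make the shot-list habit concrete is to write the plan down as data before you open a generator. The sketch below is a minimal illustration, not a feature of any platform. The `Shot` fields, the example prompts, and the four-second default clip length are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    """One planned clip. The field names are illustrative assumptions."""
    label: str
    seconds: int
    prompt: str


def clips_needed(target_seconds: int, clip_seconds: int = 4) -> int:
    """How many clips of a given length it takes to cover a target runtime."""
    return math.ceil(target_seconds / clip_seconds)


def total_runtime(shots: list[Shot]) -> int:
    """Sum of the planned clip lengths, in seconds."""
    return sum(shot.seconds for shot in shots)


# A hypothetical three-shot sequence; the prompts are placeholders.
shot_list = [
    Shot("establishing", 4, "wide shot of a rain-soaked street at dusk"),
    Shot("detail", 2, "close-up of a neon sign flickering"),
    Shot("exit", 8, "slow push-in on a figure walking away"),
]

print(total_runtime(shot_list))  # 14
print(clips_needed(30))          # 8
```

A thirty-second piece at four seconds per clip is at least eight clips, which is why the plan has to exist before the first prompt does.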
|
|
Observation 02
Three ways a clip starts.
|
|
Every AI video clip begins from one of three inputs. A text prompt alone. An image used as the first frame. Or a combination of an image plus a text prompt describing what should change across the seconds that follow. Each one is a different workflow, with different strengths.
Text-to-video gives the model the most room and the least anchor. Image-to-video locks the look but constrains the motion. Image-plus-text is the workflow most working creators settle into, because it lets you build the look in your image tool of choice and use the video tool only for the motion.
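The three starting points reduce to a simple question of which inputs you hand the model. The sketch below names the workflow from the inputs present. The function and its parameter names are hypothetical and do not correspond to any platform's API.

```python
from typing import Optional


def start_mode(prompt: Optional[str] = None, image: Optional[str] = None) -> str:
    """Name the workflow implied by which inputs are present."""
    if image and prompt:
        return "image-plus-text"
    if image:
        return "image-to-video"
    if prompt:
        return "text-to-video"
    raise ValueError("a clip needs a text prompt, an image, or both")


# The image path here is a placeholder, not a real file.
print(start_mode(prompt="camera drifts left", image="first_frame.png"))
```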
|
|
Observation 03
The platforms divide by purpose.
|
|
Sora. Veo. Kling. Higgsfield. Runway. Pika. The temptation is to treat them as competitors and pick one. The working approach is to treat them as different instruments. Sora carries narrative scenes. Veo handles audio-synced dialogue. Kling renders photoreal product motion. Higgsfield gives you cinematic camera control. Runway sits closest to traditional editing. Pika produces fast, social-shaped clips.
The platforms differ less in overall quality than in what they were built for. Picking the right one for a specific clip is more about matching the tool to the work than about choosing a single tool for everything. Creators who learn to route their work spend less time fighting tools that were not built for it.
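Routing can be as plain as a lookup table. The sketch below encodes the pairings described above as a dictionary. The purpose labels are my own shorthand, and the mapping reflects this issue's framing rather than anything the platforms publish.

```python
# Purpose labels are shorthand for this example; the pairings follow the text above.
ROUTES = {
    "narrative scene": "Sora",
    "audio-synced dialogue": "Veo",
    "photoreal product motion": "Kling",
    "cinematic camera control": "Higgsfield",
    "editing-style work": "Runway",
    "fast social clip": "Pika",
}


def route(purpose: str) -> str:
    """Return the platform this issue would route a clip to, given its purpose."""
    try:
        return ROUTES[purpose]
    except KeyError:
        known = ", ".join(sorted(ROUTES))
        raise ValueError(f"unknown purpose {purpose!r}; expected one of: {known}") from None


print(route("audio-synced dialogue"))  # Veo
```

Tagging every line of a shot list with a purpose, then routing each one, is what keeps the tool choice a per-clip decision instead of a per-project one.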
|
|
Observation 04
Consistency is the hard problem.
|
|
Generating one good clip is easier than people expect. Generating ten clips that look like they came from the same world is the actual craft. Character consistency across scenes. Environment consistency across cuts. Lighting consistency across angles. Tone consistency across the full assembly. These are the problems that separate a working AI video creator from someone making one-offs.
The platforms are racing to solve this. Soul ID, character references, style references, image-conditioned generation. Each one is an attempt to anchor more of the clip to something you control. The creators who plan for consistency from the first clip have an easier time than the ones who try to fix it in the edit.
|
|
Observation 05
The craft sits downstream of generation.
|
|
Most creators new to AI video assume the work is in the generating. It is not. The work is in the selecting, the sequencing, the scoring, the sound design, and the color treatment. A working AI video creator might generate twenty clips to use three. The other seventeen are tuition for the three that make it in.
This is closer to traditional filmmaking than to image generation. The prompt is the equivalent of a directing decision. Selection and assembly are the equivalent of editing. The output of the generator is footage, not a finished piece.
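The twenty-to-three figure doubles as a rough budgeting rule. The sketch below scales it to a shot list, rounding up. It assumes the keep rate stays constant across shots, which is a simplification, and the function name and defaults are illustrative.

```python
def clips_to_generate(finished_needed: int, generated: int = 20, kept: int = 3) -> int:
    """Estimate how many clips to generate to end up with `finished_needed` keepers.

    Assumes a constant keep rate of kept/generated and rounds up.
    """
    if finished_needed < 0 or generated <= 0 or kept <= 0:
        raise ValueError("counts must be positive (finished_needed may be zero)")
    # Integer ceiling of finished_needed * generated / kept.
    return (finished_needed * generated + kept - 1) // kept


print(clips_to_generate(3))  # 20
print(clips_to_generate(8))  # 54
```

An eight-clip assembly at that rate means budgeting for around fifty-four generations, not eight.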
|
|
The Shift
Five observations and a shift.
|
|
Put the five observations together and you can see the shift the medium asks of you. You stop thinking in single prompts and start thinking in shot lists. You stop expecting one tool to do everything and start routing work to the right one. You stop treating consistency as something to add later and start designing for it from the first clip.
AI video rewards creators who treat it as its own craft, not as an extension of image generation. The image-prompting muscles still help. The new muscles are sequencing, routing, and downstream assembly. Most of the actual work happens after the generator finishes.
|
|
•••
I am putting together a starter pack with fifteen worked examples across text-to-video, image-to-video, and image-plus-text workflows. Each one annotated with the platform I would route it to, the prompt I would use, and the assembly notes that turn the clip into a finished piece.
Want it when it ships? Reply with "send me the starter pack" and I will get it to you.
|
|
A QUESTION FOR YOU
Which of the five observations changes how you work?
Reply and tell me. The observation that hits the hardest is usually the one that closes a loop you have been working on for a while.
If this resonated, forward it to a creator who is making their move from AI image into AI video.
|
|
Until next time,
Luxe Prompting
|
|