A mysterious AI video model appeared on the blind-test arena in April and quietly outshone every model creators had been using. The team behind it stayed anonymous for two days. Then Alibaba admitted it was theirs.

Reading this in Promotions? Move it to Primary so you never miss an issue.

Luxe Prompting ISSUE 29   MAY 2026

The mysterious AI video model nobody knew about.

A model called HappyHorse appeared on the blind-test arena last month and started showing up in every comparison. Nobody knew who made it. Then Alibaba admitted it was theirs. Here is what it does.

Ghost: Free Postgres For Agents

Agents are desperate for ephemeral databases.

They spin up projects, fork environments, test ideas, and tear them down. Over and over. But every database on the market was designed for humans who provision once and stick around. Agents don't work that way.

Ghost is a database built for agents. Unlimited databases, unlimited forks, 1 TB of storage, and 100 compute hours per month. All free. Try it here.

•••

A strange thing happened on the blind-test arena for AI video last month. A model called HappyHorse appeared without any team name attached to it. No company, no announcement, no hype. Just generated clips turning up in an arena where real users compare two AI videos side by side and pick the one they prefer, without knowing which model produced either.

Within a few days, HappyHorse was winning comparison after comparison. People watching the arena noticed and got curious. The clips were obviously coming from somewhere serious. The motion was natural. The faces held together across cuts. The audio was synchronized with what was happening on screen, which almost no other model could do. Speculation about who made it ran for two days. Then Alibaba quietly admitted it was theirs.

The team turned out to be a small group inside Alibaba's Taotian lab, led by a researcher named Zhang Di who had previously built Kling AI at Kuaishou. They had been working in stealth since the end of 2025 and decided to submit the model anonymously to gather honest feedback before saying anything. The strategy worked. HappyHorse is now available to anyone with a browser, and it is the model I have been quietly switching to this week.

What It Actually Does

Three things no other video model handles as cleanly.

The thing HappyHorse does that almost no other AI video model handles well is generate the sound and the visuals together in a single pass. When a wave splashes on screen, the splash sound arrives in the right frame. When a character speaks, the mouth movement matches the syllables. The audio is not added later. It comes out of the same operation that produces the video. That synchronization changes the workflow more than the raw quality numbers suggest, because every other model still requires you to add audio in a separate step through ElevenLabs or a sound library, and the alignment is always a little off.

The second thing is multi-shot continuity in a single generation. Most AI video tools give you one drifting shot per request. HappyHorse can render a fifteen-second clip with two or three distinct shots inside it, with the same character holding their face, their clothing, and their setting across cuts. For anyone making narrative video, this collapses into a single prompt a workflow that used to mean stitching together multiple generations.

The third thing is speed. A 1080p clip comes back in around thirty-eight seconds on the data center hardware these hosting services use, which is fast enough that the iteration loop stops feeling slow. Generating ten variations of a shot before settling on one becomes practical, and the cost per generation is meaningfully lower than what Sora charged before OpenAI shut it down.
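If you want to see what that iteration loop looks like in practice, here is a sketch using fal.ai's official Python client, one of the hosts covered later in this issue. The model slug is a placeholder I made up, and the accepted parameters depend on the host, so treat this as the shape of the loop rather than a recipe:

```python
# Sketch: queue ten variations of the same shot in parallel through
# fal.ai's queue API (pip install fal-client, set FAL_KEY in your env).
# "fal-ai/happyhorse" is a placeholder slug, not a confirmed identifier.
import fal_client

PROMPT = "Woman walking through a sunlit forest path. Slow dolly-forward camera."

# submit() returns immediately with a handle; get() blocks until the
# render finishes, so all ten jobs run on the host concurrently.
handles = [
    fal_client.submit("fal-ai/happyhorse", arguments={"prompt": PROMPT})
    for _ in range(10)
]
results = [handle.get() for handle in handles]
```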

What I Tried This Week

A walk-through of one prompt.

The prompts HappyHorse responds to are different from what works in Midjourney or even other video tools. Short and specific. Around twenty words per shot. A clear subject, a clear action, a clear setting, and one cinematography cue per shot. Long prose hurts the output rather than helping it. The fal.ai guide for the model spells this out, and when I tested the advice against longer prompts, it held up.
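The twenty-word budget is easy to blow through, so a small script can act as a guardrail when you assemble a prompt. This is purely illustrative; Shot and build_prompt are names I made up, not anything HappyHorse or its hosts ship:

```python
# Illustrative helper for assembling prompts in the format described
# above: subject + action + setting plus exactly one cinematography
# cue per shot, kept near the twenty-word budget.
from dataclasses import dataclass

@dataclass
class Shot:
    start: int        # seconds into the clip
    end: int
    description: str  # clear subject, action, and setting
    camera_cue: str   # one cinematography cue, no more

def build_prompt(shots: list[Shot], word_budget: int = 20) -> str:
    parts = []
    for i, shot in enumerate(shots, start=1):
        text = (f"Shot {i}, {shot.start} to {shot.end} seconds. "
                f"{shot.description} {shot.camera_cue}")
        if len(text.split()) > word_budget + 5:  # loose guardrail
            print(f"warning: shot {i} runs long; trimming usually helps")
        parts.append(text)
    return " ".join(parts)

prompt = build_prompt([
    Shot(0, 3, "Woman walking through a sunlit forest path.",
         "Slow dolly-forward camera."),
])
```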

A shot-list format with timecodes works more reliably than continuous description for multi-shot videos. Here is one I ran this week that produced a clean clip on the first try:

Shot one, zero to three seconds. Woman walking through a sunlit forest path. Slow dolly-forward camera. Golden hour light filtering through the trees.

Shot two, three to eight seconds. Close-up of her hand brushing against the leaves. Soft focus background. Natural ambient sound of birds and footsteps.

Shot three, eight to fifteen seconds. Wide shot revealing the path opening into a clearing. Cinematic pullback. Faint wind in the trees.

Paste that into WaveSpeedAI with HappyHorse selected. The output should be a fifteen-second clip with three distinct shots, character continuity across cuts, and synchronized ambient audio. The first time I ran something like this and got back a clip with the birds chirping in the right places without me having to mix the audio afterwards, the workflow felt different.
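If you would rather script that than paste it into a browser, fal.ai, covered below, exposes its hosted models through an official Python client. A minimal sketch; the model slug is a placeholder and the argument names vary by model, so check the model's fal.ai page before running it:

```python
# Minimal text-to-video call through fal.ai's Python client
# (pip install fal-client, set FAL_KEY in your environment).
# "fal-ai/happyhorse" is a placeholder slug; the real identifier and
# exact argument names come from the model's fal.ai page.
import fal_client

prompt = (
    "Shot one, zero to three seconds. Woman walking through a sunlit "
    "forest path. Slow dolly-forward camera. Golden hour light filtering "
    "through the trees. Shot two, three to eight seconds. Close-up of her "
    "hand brushing against the leaves. Soft focus background. Natural "
    "ambient sound of birds and footsteps. Shot three, eight to fifteen "
    "seconds. Wide shot revealing the path opening into a clearing. "
    "Cinematic pullback. Faint wind in the trees."
)

result = fal_client.subscribe("fal-ai/happyhorse", arguments={"prompt": prompt})
print(result)  # typically includes a URL to the rendered clip
```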

Where to Try It

Three browser-based options.

WaveSpeedAI. The cleanest entry point. Both text-to-video and image-to-video are live with a simple interface and starter credits at no cost. The right pick for anyone who wants to start generating quickly.

fal.ai. Multi-model platform that hosts HappyHorse alongside Seedance, Kling, and Veo. Useful if you want to compare outputs across models with one account.

Alibaba Cloud. Direct access through Alibaba's own platform. Setup is more involved, but it is the official source if you want the full feature set.

What I Would Tell My Past Self

Why this changes my workflow.

For most of last year, AI video was something I would experiment with on weekends and not use in actual work. The output was good enough to be interesting but never quite good enough to ship, and the audio always required a separate workflow that took longer than the video itself. I had quietly stopped reaching for it on real projects.

HappyHorse is the first video model I have used that produces output I would actually deliver to a client without warning them it was AI. The motion is natural. The audio is in sync. The shots hold together across cuts. The cost per finished clip is low enough that I can iterate ten times before settling on one. None of those things were true two months ago.

If you have been skeptical about AI video for the same reasons I was, this week is the moment to try again. The model has been quietly available for a few weeks and most creators have not noticed yet. That window will close once the coverage catches up, which is happening as I write this. For the next handful of days, you have a small head start.

•••

I am putting together a pack of fifteen tested HappyHorse prompts across narrative shorts, social cuts, and product reveals. Each one with shot timing. Each one tuned for the model.

Want it when it ships? Reply with "send me the HappyHorse pack" and I will get it to you.

A QUESTION FOR YOU

What kind of video would you make if AI could finally do it well?

Reply and tell me. I will send back a shot-list prompt tuned for whatever you describe.

If this issue resonated, forward it to a creator who has been waiting for AI video to be usable.

Until next time,

Luxe Prompting

AI Image Generation for Creators
