LTX-2.3 — 22B open-source video model, 4K at 50 FPS, #2 globally (March 2026)

LTX-2.3 Prompt Generator

The LTX-2.3 prompt generator gives you 20 free, copy-ready prompts for Lightricks' open-source 22B video model. The model generates 4K video at 50 FPS with synchronized audio in a single pass, with no post-production and no separate audio step, and it ranked #2 among AI video models globally at launch.

What is the LTX-2.3 Prompt Generator?

The LTX-2.3 prompt generator on this page provides 20 free, professionally crafted prompts for LTX-2.3, the open-source AI video model released by Lightricks on March 5, 2026. LTX-2.3 is a 22 billion parameter Diffusion Transformer that generates 4K video at 50 frames per second with synchronized audio — all in a single generation pass, from a single text prompt.

At launch, LTX-2.3 Pro ranked #2 on the global AI video leaderboard with an Arena score of 2071, behind only Kling v3 (2097). Unlike Kling v3 and HappyHorse-1.0, which generate silent video only, LTX-2.3 generates audio and video simultaneously from the same prompt, so the model captures the relationship between what is seen and what is heard. The result is synchronized audio: a wave crash that sounds exactly when it hits, a wheel gun that fires exactly when you see it move.

Every prompt below is structured for LTX-2.3's 4K 50fps capabilities and audio-visual generation: each describes the visual scene, camera movement, and audio environment in detail, with timing cues for synchronized events. Paste directly into LTX-2.3 via ComfyUI, fal.ai, Replicate, or the Lightricks API.

How to Write LTX-2.3 Prompts

LTX-2.3 uses a gated attention text connector that parses structured prompts more accurately than keyword-based systems. Use this framework for best results:

[Camera + movement] + [Subject + action + timing] + [Frame rate spec] + [Audio: sources, volumes, sync points] + [Production quality reference] + [Duration]
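
As a rough illustration of the framework above, the sketch below assembles the six slots into one sentence-based prompt string. `PromptSpec` and `build_prompt` are hypothetical helper names invented for this example; they are not part of LTX-2.3 or any Lightricks tooling, and the output is plain text that you would still paste into your tool of choice.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the six framework slots (not an LTX-2.3 API)."""
    camera: str       # camera type + movement
    subject: str      # subject + action + timing cues
    frame_rate: str   # e.g. "Shot at 50fps"
    audio: str        # sources, volumes, sync points
    quality: str      # production quality reference
    duration_s: int   # clip length in seconds


def build_prompt(spec: PromptSpec) -> str:
    """Join the slots in framework order into full sentences, not keyword tags."""
    parts = [
        f"{spec.camera}: {spec.subject}.",
        f"{spec.frame_rate}.",
        f"Audio: {spec.audio}.",
        f"{spec.quality}.",
        f"{spec.duration_s} seconds.",
    ]
    return " ".join(parts)


example = PromptSpec(
    camera="A static wide shot from a clifftop",
    subject="at second 3, a large wave breaks against the rocks below",
    frame_rate="Shot at 50fps",
    audio="the wave impact boom synchronized to the break, wind hiss underneath",
    quality="4K, BBC Earth quality",
    duration_s=12,
)
print(build_prompt(example))
```

Keeping each slot as a separate field makes it easy to see when one is missing before a prompt is sent for generation.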

LTX-2.3 Strengths:

  • 4K native resolution — highest detail in open-source class
  • 50 FPS — fluid motion, ideal for sports and nature
  • 22B parameters — complex multi-subject scenes
  • Audio-visual synchronization in one pass
  • Open-source, commercial licence — no per-clip fees
  • Portrait mode native (1080×1920) for vertical content

Best Prompt Elements:

  • Specify "50fps" or "shot at 50fps, reviewed at half speed"
  • Include timing cues: "at second 3, the barrel breaks"
  • Name audio sources explicitly with volume relationships
  • Use physically accurate audio delays (e.g., lightning/thunder)
  • Add production quality references: "BBC Earth," "National Geographic 4K"
  • Always include duration: "12 seconds," "18 seconds"
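
For the physically accurate audio delays mentioned in the list above, the arithmetic is simply distance divided by the speed of sound. The sketch below assumes roughly 340 m/s in air, the same figure used in the demolition prompt further down; `audio_arrival_second` is a hypothetical helper name for working out the timing cue to write into a prompt.

```python
SPEED_OF_SOUND_M_S = 340.0  # assumed approximate speed of sound in air


def audio_arrival_second(event_second: float, distance_m: float) -> float:
    """Return the clip time (in seconds) at which a sound reaches the camera."""
    delay = distance_m / SPEED_OF_SOUND_M_S
    return round(event_second + delay, 1)


# A visual event at second 3, filmed from 400 metres away.
print(audio_arrival_second(3, 400))  # 4.2
```

The result can then be written into the prompt as a cue such as "the sound arrives at second 4.2".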

20 Free LTX-2.3 Prompts — Copy & Paste

1. Urban Awakening — City at Dawn

Cinematic

A slow aerial crane shot descending toward a major city skyline at first light: the sky shifts from deep blue to amber at the horizon, windows of glass towers catching the first reflections, traffic on the main boulevard below beginning to accumulate. The camera descends from 400 metres to street level over 18 seconds. Audio: the distant city hum rises gradually — engines, a tram bell, a delivery truck reversing, morning birds on a building cornice. The audio gains presence as the camera descends. 4K 50fps, golden-hour cinematography, BBC documentary quality.

2. Peregrine Falcon Stoop — 300 km/h Dive

Nature

A high-speed tracking shot following a peregrine falcon in a near-vertical stoop from 600 metres altitude: the bird is a dark silhouette against a white cloud bank, folding its wings into a teardrop shape, accelerating through frame at 300 km/h. The camera tracks from behind, then cuts to a lateral view at the moment of impact with prey over a river valley. Audio: the wind roar increases sharply as the bird accelerates — a specific aerodynamic whistle through the falcon's primary feathers, the rush of air, and at the strike, a single sharp percussive impact. 4K 50fps, National Geographic wildlife quality. 15 seconds.

3. Luxury Watch — Precision Close-Up

Commercial

A macro tracking shot moving across a luxury mechanical watch on a polished black marble surface: starting at the crown at extreme macro, moving along the case edge with light catching every bevelled detail, then settling on the dial where the sweeping seconds hand advances around the track. Depth of field is shallow, so the unsharp portions of the case blur into abstract gold light. Audio: the room is acoustically silent except for the precise mechanical tick of the movement, each tick synchronized with the visual advance of the seconds hand. Shot at 50fps, reviewed at half speed in post. Commercial luxury quality. 15 seconds.

4. Ocean Wave — Barrel Formation

Nature

An underwater-to-surface camera positioned inside the breaking zone of a large ocean wave: the barrel forms as a wall of green-blue water lifts overhead, the lip throwing forward with spray, the interior of the tube visible for 3 seconds before the wave closes. Shot at 50fps for fluid slow-motion replay. Audio: the subaqueous rumble of the wave approaching, a deep low-frequency pressure wave in the water, then the surface impact of the barrel closing with a hollow crack and rushing white-water roar. Synchronized to the visual wave break. 4K, aquatic documentary quality. 15 seconds.

5. 100m Sprint — Race Start

Sports

A track-level camera at the start line of a 100m sprint: the starting gun fires at second 1, eight athletes explode from the blocks in perfect unison, arms driving, heads down in the acceleration phase. Shot at 50fps, reviewed at half speed, with each muscle contraction visible in the athletes' legs during the first 10 metres. Audio: the starter pistol crack at second 1 synchronized precisely with the visual, followed by the rapid drumbeat of spikes on the track and the crowd's initial gasp rising to a roar. No music. 4K athletic documentary quality. 15 seconds.

6. Automated Car Assembly — Robotic Ballet

Industrial

A wide shot inside an automotive assembly plant: twelve robotic arms move in choreographed synchrony around a car body on the production line — welding, placing, torquing — each arm's movement precisely timed to the others. Shot at 50fps so the speed of the robots' movements is rendered with fluid clarity. Audio: the metallic percussion of the assembly line — the rhythmic arc-flash click of spot welders, the pneumatic hiss of a torque gun, the conveyor's low hum beneath everything, each sound source spatially placed in the stereo field. Monocle industrial quality. 15 seconds.

7. Cocktail — Flaming Citrus Peel

Commercial

A close-tracking shot behind a bartender's hands as they express a flamed orange peel over a crystal rocks glass: the peel is held over a flame, then squeezed so the expressed oil catches the flame and creates a brief flash of orange fire above the glass. Shot at 50fps — the flame and oil spray visible frame by frame. Audio: the brief flare of the citrus oil igniting — a soft whomp, followed by the quiet sizzle of the peel, the clink of the peel dropped into the glass, ambient bar atmosphere in the background. 4K commercial quality, Monocle lifestyle aesthetic. 12 seconds.

8. Cherry Blossom Fall — Tokyo Street

Nature

A slow tracking shot at head height along a Tokyo residential street lined with cherry trees in full bloom: petals fall at 50fps, rendered at half speed, creating a continuous snowfall of pink and white. A cyclist passes through frame, their movement fluid and slowed by the half-speed playback. The street is wet from earlier rain. Audio: a light breeze causes a mass release of petals at second 5, with the subtle collective sigh of wind through thousands of blossoms, a bicycle bell, the distant sound of a school, the ambient quiet of a residential morning. 4K, Terrence Malick aesthetic. 18 seconds.

9. Fashion Runway Finale — Paris

Fashion

A tracking shot moving along a major fashion runway at the finale: a procession of models walking in structured garments, the camera at floor level moving against the direction of the procession so faces are captured in sequence. The runway is lit with a single harsh overhead strip light that creates dramatic shadow under cheekbones and collarbone. Shot at 50fps for fluid movement. Audio: the rhythmic percussive bass of runway music at controlled volume — each model's footstep synchronized to the beat, the ambient volume of 400 seated guests, camera clicks from the press section. 4K editorial quality. 15 seconds.

10. Crystal Growth — Laboratory Macro

Science

An extreme macro time-lapse at 50fps of bismuth crystals growing in a controlled laboratory environment: iridescent stepped cubic structures forming from a melt, colours cycling through copper, violet, and silver as oxide layers build. The crystal grows to fill the frame over 15 seconds of accelerated footage. Audio: a clean laboratory environment — the hum of an extraction fan, the occasional drip of condensation, a scientific ambience of low frequency resonance. The crystal growth itself is nearly silent — the tension is visual. 4K macro documentary quality.

11. Night Market — Kuala Lumpur

Documentary

A walking observational shot through a Kuala Lumpur night market at 10 PM: stalls of char kway teow, satay smoke rising in columns through the sodium light, a child tugging a parent past durian vendors. The camera moves at crowd pace, turning to follow a vendor flipping flatbread on a cast-iron pan. Audio: the night market's full acoustic texture, including the sizzle of a wok at short range, the satay vendor's rhythmic fanning, a Malay pop song from a distant stall speaker distorting slightly, a motorbike passing the market perimeter, the ambient crowd hum. Spatial audio, each source positioned. 4K documentary. 18 seconds.

12. Controlled Demolition — High Rise

Documentary

A wide tripod shot at 400 metres distance of a 25-storey residential tower undergoing controlled demolition: at second 3, the base charges fire in sequence, the tower's legs buckle, and the structure collapses in a symmetrical downward pancake over 6 seconds, a massive dust cloud billowing outward. Shot at 50fps, reviewed at half speed — the glass fracture pattern visible in individual frames. Audio: the delay is physically accurate — the charges detonate visually at second 3, the sound arrives at second 4.2 (400m / 340 m/s). The sound is a low structural crack, followed by a ground-transmitted rumble, then the dust cloud's hiss. 4K BBC engineering quality. 18 seconds.

13. Flamenco Dancer — Sevilla Stage

Cultural

A tight tracking shot circling a flamenco dancer mid-performance: the camera starts behind, moves to a 45-degree lateral as the dancer executes a series of rapid footwork sequences, then pushes in to a close shot of their hands in a circular wrist motion. Red dress fabric follows the movement with a 3-frame delay. Shot at 50fps so every footstrike is individually resolved. Audio: the raw acoustic footwork, with each zapateado strike individually audible, the castanets in the right hand, the guitarist's rasgueo chord, the palmas from the corner of the stage. Every footstrike synchronized to its visual landing. Flamenco cultural documentary. 15 seconds.

14. Molten Glass Blowing — Craft Studio

Industrial

A close tracking shot of a glass blower working a molten gather at 1200°C: the glowing orange mass is reheated in the furnace, withdrawn, then inflated by breath through the pipe — the gather visibly expanding into a luminous sphere. Shot at 50fps — the heat shimmer above the gather and the colour shift from orange to yellow as it expands all rendered with fluid precision. Audio: the gas furnace roar diminishes as the blower steps back, replaced by the creak of the iron pipe, the blower's breath rhythm, the faint crackling of the cooling glass surface. 4K craft documentary. 15 seconds.

15. Coastal Lighthouse — Winter Storm

Nature

A wide angle shot from the rocks below a stone lighthouse during a severe North Atlantic winter storm: 10-metre waves strike the base of the lighthouse, spray reaching the lamp room. The beam rotates at its fixed interval — every 4 seconds — the only constant through the chaos. Shot at 50fps so each wave's fracture structure is captured in full detail. Audio: the bass boom of each wave impact against the lighthouse base arrives 0.1 seconds after the visual — the sound travels through the rock as much as through the air. Wind at sustained 100 km/h, spray hiss, the lighthouse fog horn sounding once at second 12. BBC Earth quality. 18 seconds.

16. Neon City Rain — Midnight

Cinematic

A static tripod shot at street level in a dense urban commercial district at midnight during heavy rain: neon signage reflects in the wet pavement in fractured columns of pink, cyan, and yellow. Pedestrians with umbrellas pass through frame backlit by the reflections. Shot at 50fps — every raindrop impact on the pavement surface and every reflection shimmer resolved. Audio: the heavy rain creates a layered percussion — the mass impact on asphalt, the metallic drumming on a shop awning above the camera, the gutters running full, a taxi's tyre hiss on wet tarmac. No music. 4K neo-noir quality. 18 seconds.

17. Hospital Emergency Room — Documentary

Documentary

An observational single-take tracking shot through a busy emergency department triage area: nurses moving between bays, a monitor displaying a waveform, a doctor examining a patient behind a half-drawn curtain, a wheelchair arriving through the double doors at the end of the corridor. Camera moves at walking pace, non-intrusive. Audio: the ED soundscape — the continuous beep of a cardiac monitor at 72 bpm, an IV pump alarm at bay 3, radio chatter from a nursing station, the wheels of a trolley, the PA system calling a code. Each audio source is spatially positioned and continuous. 4K medical documentary quality. 15 seconds.

18. Latte Art — Competition Pour

Commercial

An overhead close shot of a barista's hands executing a competition-level rosetta pour: the pitcher tilts, a thin stream of microfoam meets the crema surface of the espresso in the cup below, the barista's wrist oscillates in a practiced rhythm, the rosetta pattern emerging in the foam layer. Shot at 50fps — each individual foam cell structure visible, the pattern forming in real time. Audio: the quiet of a competition environment — the gentle pour of milk against the espresso surface, the barista's controlled breath, the scrape of the pitcher lip being set down. Nearly silent, intensely precise. 4K specialty coffee quality. 12 seconds.

19. Formula 1 Pit Stop — Under 2 Seconds

Sports

A pit-lane camera at floor level captures a Formula 1 pit stop: the car enters at 80 km/h, stops precisely on the marks, 20 mechanics converge in a synchronized burst, wheel guns firing, tyres removed and replaced in sequence, and the car is released after a 1.8-second stop. Shot at 50fps, with every wrench contact and tyre throw resolved in individual frames. Audio: the car enters with full V6 hybrid whine, the four wheel guns fire in an overlapping burst lasting under 2 seconds, the lollipop drops, the car's rear tyres spin as it exits. Every mechanical action locked to its visual source. 4K trackside documentary. 12 seconds.

20. Himalayan Base Camp — Pre-Dawn

Nature

A static wide shot from Everest Base Camp at 5,300 metres before dawn: the Khumbu Icefall is lit by the headlamps of climbers beginning their ascent, a chain of moving lights ascending into darkness, while the main summit pyramid and Lhotse catch the first alpenglow on the horizon, turning from deep purple to rose gold. Shot at 50fps time-lapse compressed to 18 seconds. Audio: the extreme silence of high altitude. The only sounds are the distant creak of glacier ice under thermal stress at night and, at second 10, a brief wind gust carrying ice crystal sound across the camp. Thin air acoustic environment. National Geographic quality.

LTX-2.3 vs. Other AI Video Models (2026)

LTX-2.3 is the only model in this comparison that combines open-source access, 4K output at 50 FPS, and native audio synchronization:

| Model | Open-Source | Max Resolution | Native Audio | Leaderboard |
|---|---|---|---|---|
| LTX-2.3 (Lightricks) ★ | Yes, 22B | 4K / 50 FPS | Yes, synchronized | #2 globally (2071 Elo) |
| Kling v3 (Kuaishou) | No | 1080p / 24 FPS | No (silent video) | #1 (2097 Elo) |
| SkyReels V4 (SkyWork) | Yes | 1080p / 30 FPS | Yes, #1 T2V-audio | #1 T2V-with-audio |
| HappyHorse-1.0 (Alibaba) | No | 1080p / 24 FPS | No (silent video) | #1 without audio |
| Wan 2.7 (Alibaba Tongyi) | Yes, 27B | 1080p / 24 FPS | No (silent video) | Top 5 overall |

★ LTX-2.3 Pro: #2 on the AI video generation leaderboard (Arena score 2071, March 2026). Open-source weights on HuggingFace, commercial licence. API via Lightricks, fal.ai, and Replicate.
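
To make the comparison easier to query, the table rows can be transcribed into a small data structure and filtered. This is only a sketch: the `Model` tuple and its field names are invented for the example, and the values are copied from the table above, not pulled from any live leaderboard.

```python
from typing import NamedTuple


class Model(NamedTuple):
    """One row of the comparison table (field names are illustrative)."""
    name: str
    open_source: bool
    resolution: str
    fps: int
    native_audio: bool


MODELS = [
    Model("LTX-2.3", True, "4K", 50, True),
    Model("Kling v3", False, "1080p", 24, False),
    Model("SkyReels V4", True, "1080p", 30, True),
    Model("HappyHorse-1.0", False, "1080p", 24, False),
    Model("Wan 2.7", True, "1080p", 24, False),
]

# Open-source models with native audio, then the subset that also reaches 4K at 50 FPS.
open_with_audio = [m.name for m in MODELS if m.open_source and m.native_audio]
full_match = [
    m.name
    for m in MODELS
    if m.open_source and m.native_audio and m.resolution == "4K" and m.fps >= 50
]
print(open_with_audio)  # ['LTX-2.3', 'SkyReels V4']
print(full_match)       # ['LTX-2.3']
```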

LTX-2.3 Prompting Tips

Do This:

  • Specify "50fps" — LTX-2.3 uses frame rate to pace motion blur and audio envelopes
  • Use "reviewed at half speed" when you want slow-motion detail
  • Include timing cues for audio events: "the crack arrives 1.2 seconds after visual"
  • Describe acoustic space explicitly: "stone cathedral reverb," "open outdoor cliff"
  • Add production quality references: "BBC Earth," "National Geographic 4K"
  • Use duration precisely — LTX-2.3 paces audio across the clip length

Avoid This:

  • Keyword dumps — LTX-2.3's gated attention responds to sentences, not tags
  • Vague audio: "natural ambient sound" — name specific sources
  • Omitting camera type — always specify static, tracking, or aerial
  • Over 250 words without a clear audio-visual priority hierarchy
  • Conflicting acoustic environments (indoor reverb + outdoor wind)
  • Skipping the frame rate spec on fast-motion content
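
The two checklists above can be turned into a quick pre-flight check. The sketch below is a crude string-matching heuristic, not anything LTX-2.3 itself runs: `lint_prompt`, `CAMERA_TERMS`, and `MAX_WORDS` are hypothetical names, and the checks only look for the literal cues named in the lists (camera type, "50fps", an "Audio:" section, a duration in seconds, and the 250-word ceiling).

```python
import re

CAMERA_TERMS = ("static", "tracking", "aerial")  # camera types named in the checklist
MAX_WORDS = 250                                  # word ceiling from the checklist


def lint_prompt(prompt: str) -> list[str]:
    """Return a list of warnings; an empty list means no check was tripped."""
    text = prompt.lower()
    warnings = []
    if not any(term in text for term in CAMERA_TERMS):
        warnings.append("no camera type (static, tracking, or aerial)")
    if "50fps" not in text:
        warnings.append("no frame rate spec")
    if "audio:" not in text:
        warnings.append("no explicit audio section")
    if not re.search(r"\b\d+ seconds\b", text):
        warnings.append("no duration")
    if len(prompt.split()) > MAX_WORDS:
        warnings.append(f"over {MAX_WORDS} words")
    return warnings


print(lint_prompt("sunset, ocean, cinematic, 4K"))
print(lint_prompt(
    "A static tripod shot of rain on a street. Shot at 50fps. "
    "Audio: rain on asphalt. 18 seconds."
))
```

The first call flags a keyword dump on four counts; the second returns an empty list. A passing result only means the structural cues are present, not that the prompt is good.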

Frequently Asked Questions — LTX-2.3

What is the LTX-2.3 prompt generator?

The LTX-2.3 prompt generator on this page gives you 20 free, copy-ready prompts for LTX-2.3, the open-source AI video model released by Lightricks on March 5, 2026. LTX-2.3 is a 22 billion parameter Diffusion Transformer model that generates 4K video at up to 50 frames per second with synchronized audio in a single generation pass. It ranked #2 globally on the AI video generation leaderboard at launch, behind only Kling v3.

What is LTX-2.3 and who made it?

LTX-2.3 is an open-source AI video generation model developed by Lightricks, an Israeli AI company best known for creative video tools. Released March 5, 2026, it is the third generation of the LTX series and its most significant leap: a 22B Diffusion Transformer with a rebuilt variational autoencoder (for sharper texture), a gated attention text connector (for better prompt adherence), and an upgraded vocoder for cleaner audio output. It ships in four checkpoint variants — dev, distilled, fast, and pro — with the distilled variant generating video in as few as 8 denoising steps.

How does LTX-2.3 generate 4K video at 50 FPS?

LTX-2.3 uses a Diffusion Transformer architecture that operates natively at 4K resolution, generating frames at 50 frames per second for up to 20 seconds per clip. The rebuilt VAE encodes spatial detail at a higher fidelity than previous versions, meaning textures, edges, and fine detail are sharper in the final output. The 50 FPS frame rate is particularly valuable for sports, nature, and commercial content where motion resolution matters — fast motion that looks choppy at 24fps is rendered with full detail at 50fps.
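
The frame arithmetic behind these figures is straightforward. The sketch below assumes that "reviewed at half speed" means playing 50 FPS footage back at 25 FPS, which is an interpretation and not something the model documentation is quoted as saying here; the function names are illustrative.

```python
def frame_count(fps: int, seconds: float) -> int:
    """Number of frames in a clip of the given length."""
    return round(fps * seconds)


def playback_seconds(frames: int, playback_fps: float) -> float:
    """How long a fixed set of frames lasts at a given playback rate."""
    return frames / playback_fps


frames = frame_count(50, 20)          # a maximum-length 20-second clip at 50 FPS
print(frames)                         # 1000
print(playback_seconds(frames, 25))   # 40.0 (half-speed playback doubles the duration)
```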

How do I write good prompts for LTX-2.3?

LTX-2.3 responds strongly to structured prompts with explicit visual and audio direction. Use this framework: (1) Camera type and movement: 'static wide shot,' 'tracking shot from floor level,' 'overhead close-up'; (2) Subject and action, described in concrete physical terms with timing cues ('at second 3, the wave breaks'); (3) Technical spec: '50fps,' 'shot at 50fps reviewed at half speed,' '4K'; (4) Audio description: LTX-2.3 generates synchronized audio from the prompt, so describe specific sounds, their sources, volume relationships, and timing; (5) Production quality reference: 'BBC Earth,' 'National Geographic,' '4K commercial quality'; (6) Duration: '12 seconds,' '18 seconds.' The more precisely you describe both what is seen and what is heard, the more tightly the output follows the prompt.

Where can I access LTX-2.3?

LTX-2.3 model weights are available open-source on HuggingFace under Lightricks' repository, with a permissive licence covering commercial and research use. Cloud API access is available through the official Lightricks API (models: ltx-2-3-fast and ltx-2-3-pro), fal.ai, and Replicate. Local inference requires a GPU with at least 24GB VRAM for the full Pro checkpoint; the Distilled variant runs on 16GB VRAM GPUs. ComfyUI supports LTX-2.3 via dedicated nodes for both text-to-video and image-to-video workflows.
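
As a small illustration of the VRAM guidance above, the sketch below picks a checkpoint from available GPU memory. The two thresholds are the ones stated in the text (24GB for Pro, 16GB for Distilled); `pick_checkpoint` is a hypothetical helper, and the other variants are left out because no requirement is given for them here.

```python
from typing import Optional

# VRAM floors as stated in the text above; treat them as rough guidance.
VRAM_REQUIREMENTS_GB = {"pro": 24, "distilled": 16}


def pick_checkpoint(vram_gb: float) -> Optional[str]:
    """Return the largest listed checkpoint that fits, or None if neither does."""
    # Check the most demanding checkpoint first.
    for name, needed in sorted(VRAM_REQUIREMENTS_GB.items(), key=lambda kv: -kv[1]):
        if vram_gb >= needed:
            return name
    return None  # below 16GB: a cloud API is the likelier route


print(pick_checkpoint(24))  # pro
print(pick_checkpoint(16))  # distilled
print(pick_checkpoint(12))  # None
```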

How does LTX-2.3 compare to SkyReels V4, HappyHorse-1.0, and Kling v3?

LTX-2.3 (Lightricks) and SkyReels V4 (SkyWork AI) are the two open-source models in this group — both generate synchronized audio and video in one pass, and both can be run locally without per-clip API fees. Kling v3 (Kuaishou) leads the overall leaderboard for raw video quality but generates silent video and requires API access. HappyHorse-1.0 (Alibaba) also generates silent video with high cinematic fidelity. For open-source use, LTX-2.3 Pro has a slight quality edge from its 22B parameter count; SkyReels V4 leads on the T2V-with-audio leaderboard specifically. Both are strong options depending on your GPU and use case.
