Text-to-Video

Create videos from text descriptions using AI. Simply describe what you want to see, and Chat Video Pro generates it.

Text-to-Video is the starting point for creating AI video from scratch: you describe a scene and the model generates it. Use it when you need footage that doesn't exist yet, such as B-roll, establishing shots, product showcases, or social content.

Not sure which mode to use?

You have... - Use instead

  • An existing still image to animate - Image-to-Video

  • Two images and want a morph transition

  • Reference photos of a character - Reference mode

  • Existing footage to modify or extend

  • Want an expert prompt written for you first - Video Prompter Assistant

How It Works

  1. Enable Generate Media - Toggle the Generate Media button in the composer

  2. Select Video Model - Choose from Sora 2, Veo 3.1 (including Veo 3.1 Lite), Grok, Kling, Hailuo, Seedance 2, or Wan 2.7

  3. Set Parameters - Choose aspect ratio, duration, resolution

  4. Write Your Prompt - Describe the video you want

  5. Generate - Click send and wait for your video

Writing Effective Prompts

Essential Elements

A good video prompt includes:

  1. Subject - What or who is the focus?

  2. Setting - Where does this take place?

  3. Action - What's happening?

  4. Camera Movement - How is it shot?

  5. Style - What's the aesthetic?

  6. Mood/Atmosphere - What's the feeling?

Prompt Structure
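
One way to keep all six essential elements in a prompt is to fill them in as named fields and join them into a single description. The sketch below is illustrative only: `VideoPrompt` and `build_prompt` are hypothetical helper names, not part of Chat Video Pro, and the sample values are invented for the demonstration.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """The six essential elements of a video prompt (hypothetical helper)."""
    subject: str
    setting: str
    action: str
    camera: str
    style: str
    mood: str


def build_prompt(p: VideoPrompt) -> str:
    """Join the six elements into one prompt string, subject first."""
    core = f"{p.subject} {p.action} {p.setting}"
    details = [p.camera, p.style, p.mood]
    return ". ".join([core, *details]) + "."


# Invented sample values, used only to show the shape of the result.
example = VideoPrompt(
    subject="A barista",
    setting="in a small sunlit cafe",
    action="pours latte art",
    camera="Slow push-in",
    style="Cinematic film look",
    mood="Calm, warm atmosphere",
)
print(build_prompt(example))
```

Leaving any field empty makes the gap obvious before you spend a generation on an underspecified prompt.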

Good Examples

Detailed and Specific:

Action-Focused:

Cinematic:

Bad Examples

Too Vague:

Missing Details:

No Style:

Model-Specific Tips

Sora 2 / Sora 2 Pro

Best For:

  • Longer clips (up to 12 seconds)

  • Cinematic quality

  • Detailed scene descriptions

Prompt Tips:

  • Use cinematic language

  • Describe camera movement clearly

  • Longer, detailed prompts work well

  • Mention style and mood

Example:

Veo 3.1

Best For:

  • Dialogue and speaking characters

  • Audio generation

  • Character consistency

Prompt Tips:

  • Include dialogue in quotes: "Hello, welcome to our channel."

  • Describe the speaking character clearly

  • Mention audio needs

  • Use reference mode for character consistency
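
The dialogue tip above can be sketched as a small text helper that places a quoted line after the scene description. `add_dialogue` is a hypothetical name and the scene text is invented; this only formats a string and does not call any Veo or Chat Video Pro API.

```python
def add_dialogue(scene: str, speaker: str, line: str) -> str:
    """Append one quoted line of dialogue to a scene description."""
    # Strip quotes the caller may have added so the line is quoted exactly once.
    spoken = line.strip().strip('"')
    # Drop trailing periods/spaces from the scene so the join stays clean.
    return f'{scene.rstrip(". ")}. {speaker} says: "{spoken}"'


prompt = add_dialogue(
    scene="A host stands in a bright studio, facing the camera",
    speaker="The host",
    line="Hello, welcome to our channel.",
)
print(prompt)
```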

Example:

Veo 3.1 Lite

Best For:

  • Fast, budget-friendly text-to-video when you do not need AI-generated audio in the clip

  • Visuals only — add voiceover, music, or dialogue in Premiere

Prompt Tips:

  • Describe the scene, motion, and style like any Veo prompt, but do not expect synced speech in the render

  • Duration options align with other Veo 3.1 models (e.g. 4s / 6s / 8s); resolution is 720p or 1080p (no 4K)

See also: Supported Video Models for the full Veo lineup comparison.

Kling Models

Best For:

  • Complex camera movement

  • Dynamic scenes

  • Cinematic camera work

Prompt Tips:

  • Emphasize camera movement: "360° orbit", "FPV drone shot"

  • Describe complex motion clearly

  • Use cinematic terminology

  • Kling 3.0 understands very detailed prompts

Example:

Hailuo 2.3

Best For:

  • Action and sports

  • Dynamic movement

  • Fast-paced content

Prompt Tips:

  • Focus on action and movement

  • Describe dynamic elements

  • Emphasize motion and flow

Example:

Seedance 2

Best For:

  • Natural motion and stable subjects

  • Cinematic footage with native audio

  • Wide aspect ratio support including 21:9 ultrawide

Prompt Tips:

  • Describe motion and mood naturally

  • Audio is generated by default — include sound cues in your prompt if desired

  • Supports "auto" duration (the model picks the best length for your prompt)

  • Use Seedance 2 Fast for quicker iterations at the same quality tier

Example:

Wan 2.7

Best For:

  • High-resolution 1080p output

  • Flexible duration (2-15s)

  • All standard aspect ratios plus 4:3 and 3:4

Prompt Tips:

  • Works well with detailed scene descriptions

  • Supports negative prompts for avoiding unwanted elements

  • Prompt expansion enabled by default (enhances short prompts)

  • Good for cinematic and narrative content

Example:

Grok (xAI)

Best For:

  • Fast video generation

  • Mobile-first content (TikTok, Instagram Reels)

  • Unique aspect ratios (2:1, 1:2, 20:9, 19.5:9, 9:19.5, 9:20)

  • Quick iterations and previews

Prompt Tips:

  • Simple, natural language prompts work well

  • Describe motion and mood clearly

  • Works great with direct, descriptive prompts

  • Also supports Image-to-Video and Video-to-Video modes

Unique Features:

  • Duration up to 15 seconds - Matches the longest clips offered by other models (Wan 2.7, Seedance 2)

  • Mobile aspect ratios - 19.5:9, 9:19.5 for modern smartphones

  • Panoramic formats - 2:1, 1:2 for ultra-wide/tall content

  • Video-to-Video support - Transform existing videos

Example:

Setting Parameters

Aspect Ratio

Choose based on your platform:

  • 16:9 - YouTube, general video (supported by all models)

  • 9:16 - TikTok, Instagram Reels, YouTube Shorts (supported by all models)

  • 1:1 - Instagram posts (not supported by Veo 3.1)

  • 21:9 - Ultrawide cinematic (Seedance 2)

  • 4:3 - Classic/vintage look (Wan 2.7, Seedance 2)

  • 3:4 - Portrait alternative (Wan 2.7, Seedance 2)

  • 2:1 - Panoramic landscape (Grok only)

  • 1:2 - Extra tall portrait (Grok only)

  • 20:9 / 19.5:9 - Modern smartphone landscape (Grok only)

  • 9:19.5 / 9:20 - Modern smartphone portrait (Grok only)

Note: Veo 3.1 models support only the 16:9 and 9:16 aspect ratios. For 1:1, 4:3, or 3:4, use Wan 2.7 or Seedance 2. For 21:9 ultrawide, use Seedance 2. For mobile-specific ratios (20:9, 19.5:9), use Grok.
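
The ratio list and note above can be read as a lookup from aspect ratio to model. The sketch below transcribes only what this page states; `RATIO_TO_MODELS` and `models_for` are hypothetical names, and the 1:1 entry reflects the note's recommendation rather than a complete support list.

```python
# Aspect ratio -> models, transcribed from the list and note above.
ANY_MODEL = ["any model"]  # the list marks these ratios as supported by all models
RATIO_TO_MODELS = {
    "16:9": ANY_MODEL,
    "9:16": ANY_MODEL,
    "1:1": ["Wan 2.7", "Seedance 2"],  # the note's recommendation; not Veo 3.1
    "21:9": ["Seedance 2"],
    "4:3": ["Wan 2.7", "Seedance 2"],
    "3:4": ["Wan 2.7", "Seedance 2"],
    "2:1": ["Grok"],
    "1:2": ["Grok"],
    "20:9": ["Grok"],
    "19.5:9": ["Grok"],
    "9:19.5": ["Grok"],
    "9:20": ["Grok"],
}


def models_for(ratio: str) -> list:
    """Return the models this guide names for an aspect ratio."""
    try:
        return RATIO_TO_MODELS[ratio]
    except KeyError:
        raise ValueError(f"Aspect ratio {ratio!r} is not listed in this guide") from None


print(models_for("21:9"))
print(models_for("9:19.5"))
```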

Duration

  • 4 seconds - Quick clips, social media

  • 8 seconds - Standard length (most models)

  • 10 seconds - Longer clips (Kling, Sora)

  • 12 seconds - Maximum for Sora 2

  • 15 seconds - Longest available (Wan 2.7, Seedance 2, Grok)
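
A quick way to avoid requesting a clip longer than a model allows is to check the requested length against the maximums listed above. This is a minimal sketch: `MAX_SECONDS` and `duration_ok` are hypothetical names, and only models whose maximum is stated on this page are included. It returns `None` for any other model rather than guessing.

```python
from typing import Optional

# Maximum clip lengths in seconds, as stated in the Duration list above.
MAX_SECONDS = {
    "Sora 2": 12,
    "Wan 2.7": 15,
    "Seedance 2": 15,
    "Grok": 15,
}


def duration_ok(model: str, seconds: int) -> Optional[bool]:
    """True/False if the maximum is known, None if this page does not state it."""
    limit = MAX_SECONDS.get(model)
    if limit is None:
        return None
    return 0 < seconds <= limit


print(duration_ok("Sora 2", 15))
print(duration_ok("Grok", 15))
```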

Resolution

  • 720p - Faster generation, lower cost

  • 1080p - Standard quality (most models)

  • 4K - Highest quality (Veo 3.1 and Veo 3.1 Fast only; Veo 3.1 Lite is limited to 720p / 1080p)

Audio

Enable audio for models that support it:

  • Veo 3.1 - Full audio generation

  • Veo 3.1 Lite - Visuals only (no AI-generated audio in the clip)

  • Seedance 2 - Native audio generation (on by default)

  • Kling 3.0 - Audio support (Pro and Standard)

  • Other models (Wan 2.7, Hailuo, Sora, Grok) - Add audio in post
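
The audio list above boils down to a yes/no per model, which decides whether to enable the audio toggle or plan for sound in post. The sketch below encodes that list as written; `AUDIO_SUPPORT` and `audio_plan` are hypothetical names, and the model labels are copied from the bullets above.

```python
# Whether each model generates audio with the clip, per the list above.
AUDIO_SUPPORT = {
    "Veo 3.1": True,
    "Veo 3.1 Lite": False,
    "Seedance 2": True,
    "Kling 3.0": True,
    "Wan 2.7": False,
    "Hailuo": False,
    "Sora": False,
    "Grok": False,
}


def audio_plan(model: str) -> str:
    """Say whether to generate audio in the clip or add it afterwards."""
    if model not in AUDIO_SUPPORT:
        raise ValueError(f"No audio information for {model!r} in this guide")
    if AUDIO_SUPPORT[model]:
        return "generate audio with the clip"
    return "add audio in post"


print(audio_plan("Seedance 2"))
print(audio_plan("Veo 3.1 Lite"))
```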

Advanced Techniques

Iterative Refinement

  1. Generate initial video

  2. Review result

  3. Ask for adjustments: "Make it more cinematic" or "Add more motion."

  4. Regenerate with improvements

Style Transfer

Describe the style you want:

  • "Cinematic film look"

  • "Documentary style"

  • "Animated/cartoon style"

  • "Vintage 16mm film"

  • "Modern digital"

Camera Movement

Specify camera work:

  • Static - No movement

  • Push-in - Camera moves closer

  • Pull-out - Camera moves away

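  • Dolly - Smooth camera travel on a track, toward, away from, or alongside the subject (sideways travel is also called trucking)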

  • Orbit - Circular movement around a subject

  • FPV - First-person view, dynamic

  • Aerial - Drone-like perspective

  • Tracking - Follows the subject

Lighting Descriptions

  • Golden hour - Warm, soft, sunset/sunrise

  • Blue hour - Cool, twilight

  • Natural daylight - Bright, clear

  • Studio lighting - Controlled, professional

  • Film noir - High contrast, dramatic shadows

  • Soft, natural - Diffused, gentle

Common Use Cases

B-Roll Creation

Prompt Example:

Establishing Shots

Prompt Example:

Product Showcases

Prompt Example:

Social Media Content

Prompt Example:

Troubleshooting

"Video doesn't match my prompt."

Solutions:

  • Be more specific in your description

  • Include camera movement details

  • Mention style and mood

  • Try a different model

  • Iterate: "Make it more [desired quality]."

"Generation failed"

Solutions:

  • Check that your Fal.ai account has credits

  • Verify internet connection

  • Try a shorter prompt

  • Check model availability

  • Reduce resolution or duration
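
The last suggestion above, reducing resolution or duration, can be treated as a retry ladder: lower the resolution first, then shorten the clip. The sketch below only computes the fallback settings. `cheaper_settings` is a hypothetical helper, the resolution order follows the Resolution list in this guide, and the 4-second floor is an assumption based on the shortest duration listed.

```python
from typing import Iterator, Tuple

# Highest to lowest, following the Resolution list in this guide.
RESOLUTIONS = ["4K", "1080p", "720p"]


def cheaper_settings(
    resolution: str, duration: int, min_duration: int = 4
) -> Iterator[Tuple[str, int]]:
    """Yield progressively lighter (resolution, duration) pairs to retry with."""
    start = RESOLUTIONS.index(resolution)
    # First step down the resolution while keeping the requested duration.
    for res in RESOLUTIONS[start + 1:]:
        yield res, duration
    # Then, at the lowest resolution, fall back to the shortest clip.
    if duration > min_duration:
        yield RESOLUTIONS[-1], min_duration


for res, secs in cheaper_settings("1080p", 8):
    print(res, secs)
```

Whether a lower setting actually resolves a failure depends on the cause, so check credits and model availability first.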

"Video quality is low."

Solutions:

  • Use higher-resolution models (Sora 2 Pro, Wan 2.7)

  • Increase the resolution setting

  • Use upscaling after generation

  • Check prompt clarity (better prompts = better results)

"Audio not generating."

Solutions:

  • Ensure you selected a model with audio: Veo 3.1, Veo 3.1 Fast, Seedance 2, or Kling 3.0 (Veo 3.1 Lite does not generate audio)

  • Check "Audio" toggle is enabled

  • Mention audio in your prompt

  • Some models don't support audio

Best Practices

  1. Start specific - More details = better results

  2. Include camera movement - "Static shot" vs. "push-in."

  3. Describe lighting - "Golden hour" or "studio lighting."

  4. Mention style - "Cinematic" or "documentary."

  5. Iterate - Refine based on results

  6. Use reference mode - For character/product consistency

  7. Match model to use case - See model comparison guide


Next: Learn about Image-to-Video to animate still images. Also see: Video Prompter Assistant, which writes an expert prompt for you before you generate.
