# Text-to-Video

**Text-to-Video** is the starting point for creating AI video from scratch: you describe a scene and the model generates it. Use it when you need footage that doesn't exist yet, such as B-roll, establishing shots, product showcases, or social content.

**Not sure which mode to use?**

| If you...                                   | Use instead                                                                    |
| ------------------------------------------- | ------------------------------------------------------------------------------ |
| Have an existing still image to animate     | [Image-to-Video](/features/video-generation/image-to-video.md)                 |
| Have two images and want a morph transition | [Transition Mode](/features/video-generation/transition-mode.md)               |
| Have reference photos of a character        | [Reference Mode](/features/video-generation/reference-mode.md)                 |
| Have existing footage to modify or extend   | [Video Canvas Editor](/features/video-editing-tools/video-canvas-editor.md)    |
| Want an expert prompt written for you first | [Video Prompter Assistant](/conversation-starters/video-prompter-assistant.md) |

### How It Works

1. **Enable Generate Media** - Toggle the Generate Media button in the composer
2. **Select Video Model** - Choose from Sora 2, Veo 3.1 (including **Veo 3.1 Lite**), Grok, Kling, Hailuo, Seedance 2, or Wan 2.7
3. **Set Parameters** - Choose aspect ratio, duration, and resolution
4. **Write Your Prompt** - Describe the video you want
5. **Generate** - Click send and wait for your video

### Writing Effective Prompts

#### Essential Elements

A good video prompt includes:

1. **Subject** - What or who is the focus?
2. **Setting** - Where does this take place?
3. **Action** - What's happening?
4. **Camera Movement** - How is it shot?
5. **Style** - What's the aesthetic?
6. **Mood/Atmosphere** - What's the feeling?

#### Prompt Structure

```
[CAMERA MOVEMENT] [SUBJECT] [ACTION] in [SETTING] at [TIME], 
[STYLE], [LIGHTING], [COLOR PALETTE], [MOOD]
```
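As a rough illustration, the template above can be filled in programmatically. The Python sketch below assumes nothing about the product itself: `ShotSpec` and `build_prompt` are hypothetical names invented for this example, and the slot values are sample text only.

```python
from dataclasses import dataclass


@dataclass
class ShotSpec:
    """Hypothetical container for the template slots (not a product API)."""
    camera_movement: str
    subject: str
    action: str
    setting: str
    time: str
    style: str
    lighting: str
    color_palette: str
    mood: str


def build_prompt(spec: ShotSpec) -> str:
    """Join the shot description and the look into one prompt string."""
    # First template line: what is filmed and how the camera moves.
    shot = (
        f"{spec.camera_movement} {spec.subject} {spec.action} "
        f"in {spec.setting} at {spec.time}"
    )
    # Second template line: the visual treatment, comma-separated.
    look = ", ".join([spec.style, spec.lighting, spec.color_palette, spec.mood])
    return f"{shot}, {look}"


example = ShotSpec(
    camera_movement="Slow push-in on",
    subject="a steaming coffee cup",
    action="resting on a wooden table",
    setting="a cozy coffee shop",
    time="golden hour",
    style="cinematic style",
    lighting="warm natural lighting",
    color_palette="amber and brown tones",
    mood="peaceful and inviting atmosphere",
)
print(build_prompt(example))
```

Keeping each slot as a separate field makes it easy to vary one element, such as the camera movement, while holding the rest of the prompt constant between generations.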

#### Good Examples

✅ **Detailed and Specific:**

```
Slow push-in on a steaming coffee cup on a wooden table in a cozy 
coffee shop during golden hour, cinematic style, warm natural lighting 
streaming through large windows, soft bokeh background, peaceful and 
inviting atmosphere
```

✅ **Action-Focused:**

```
A dynamic tracking shot following a skateboarder weaving through 
urban streets at sunset, energetic and fast-paced, vibrant colors, 
street art in the background, urban energy
```

✅ **Cinematic:**

```
Wide establishing shot of a mountain range at dawn, cinematic 
aerial perspective, dramatic clouds, golden hour lighting, epic 
and majestic atmosphere, film grain texture
```

#### Bad Examples

❌ **Too Vague:**

```
A coffee shop
```

❌ **Missing Details:**

```
Person walking
```

❌ **No Style:**

```
A video of a car
```

### Model-Specific Tips

#### Sora 2 / Sora 2 Pro

**Best For:**

* Longer clips (up to 12 seconds)
* Cinematic quality
* Detailed scene descriptions

**Prompt Tips:**

* Use cinematic language
* Describe camera movement clearly
* Longer, detailed prompts work well
* Mention style and mood

**Example:**

```
Cinematic slow-motion shot of raindrops hitting a puddle in an 
urban alley at night, film noir style, neon reflections, moody 
atmosphere, shallow depth of field, 35mm film aesthetic
```

#### Veo 3.1

**Best For:**

* Dialogue and speaking characters
* Audio generation
* Character consistency

**Prompt Tips:**

* Include dialogue in quotes: "Hello, welcome to our channel."
* Describe the speaking character clearly
* Mention audio needs
* Use reference mode for character consistency

**Example:**

```
Medium shot of a friendly host speaking directly to the camera, 
"Welcome to today's tutorial, where we'll learn something amazing." 
with background music, professional lighting, clean background, 
engaging and energetic tone
```

#### Veo 3.1 Lite

**Best For:**

* Fast, budget-friendly **text-to-video** when you do not need AI-generated audio in the clip
* Visuals-only clips: add voiceover, music, or dialogue in post (for example, in Premiere)

**Prompt Tips:**

* Describe the scene, motion, and style as you would for any Veo prompt, but don't expect synced speech in the render
* Duration options align with other Veo 3.1 models (e.g. 4s / 6s / 8s); resolution is **720p or 1080p** (no 4K)

**See also:** [Supported Video Models](/features/video-generation/supported-video-models.md) for the full Veo lineup comparison.

#### Kling Models

**Best For:**

* Complex camera movement
* Dynamic scenes
* Cinematic camera work

**Prompt Tips:**

* Emphasize camera movement: "360° orbit", "FPV drone shot"
* Describe complex motion clearly
* Use cinematic terminology
* Kling 3.0 understands very detailed prompts

**Example:**

```
Smooth 360° orbit around a vintage motorcycle in a garage, 
cinematic camera movement, golden hour lighting through windows, 
dust particles in air, detailed textures, professional cinematography
```

#### Hailuo 2.3

**Best For:**

* Action and sports
* Dynamic movement
* Fast-paced content

**Prompt Tips:**

* Focus on action and movement
* Describe dynamic elements
* Emphasize motion and flow

**Example:**

```
Dynamic action shot of a basketball player making a slam dunk, 
slow-motion at peak moment, sports arena atmosphere, dramatic 
lighting, energetic and powerful, crowd in the background
```

#### Seedance 2

**Best For:**

* Natural motion and stable subjects
* Cinematic footage with native audio
* Wide aspect ratio support including 21:9 ultrawide

**Prompt Tips:**

* Describe motion and mood naturally
* Audio is generated by default — include sound cues in your prompt if desired
* Supports "auto" duration (the model picks the best length for your prompt)
* Use Seedance 2 Fast for quicker iterations at the same quality tier

**Example:**

```
A lone violinist performing on a foggy bridge at dawn, slow crane shot 
rising to reveal the river below, strings echoing through the mist, 
soft golden light breaking through the clouds
```

#### Wan 2.7

**Best For:**

* High-resolution 1080p output
* Flexible duration (2-15s)
* All standard aspect ratios plus 4:3 and 3:4

**Prompt Tips:**

* Works well with detailed scene descriptions
* Supports negative prompts for avoiding unwanted elements
* Prompt expansion enabled by default (enhances short prompts)
* Good for cinematic and narrative content

**Example:**

```
Cinematic shot of a vintage car driving down a coastal highway at sunset, 
smooth camera tracking from the side, ocean waves visible in background, 
warm golden lighting, film grain texture, nostalgic atmosphere
```

#### Grok (xAI)

**Best For:**

* Fast video generation
* Mobile-first content (TikTok, Instagram Reels)
* Unique aspect ratios (2:1, 1:2, 20:9, 19.5:9, 9:19.5, 9:20)
* Quick iterations and previews

**Prompt Tips:**

* Simple, natural language prompts work well
* Describe motion and mood clearly
* Works great with direct, descriptive prompts
* Also supports Image-to-Video and Video-to-Video modes

**Unique Features:**

* **Duration up to 15 seconds** - Matches the longest durations available (Wan 2.7, Seedance 2)
* **Mobile aspect ratios** - 19.5:9, 9:19.5 for modern smartphones
* **Panoramic formats** - 2:1, 1:2 for ultra-wide/tall content
* **Video-to-Video support** - Transform existing videos

**Example:**

```
A person walking through a bustling city street at night, 
neon signs reflecting on wet pavement, dynamic urban atmosphere,
smooth camera following motion, cinematic mood
```

### Setting Parameters

#### Aspect Ratio

Choose based on your platform:

* **16:9** - YouTube, general video (supported by all models)
* **9:16** - TikTok, Instagram Reels, YouTube Shorts (supported by all models)
* **1:1** - Instagram posts (not supported by Veo 3.1)
* **21:9** - Ultrawide cinematic (Seedance 2)
* **4:3** - Classic/vintage look (Wan 2.7, Seedance 2)
* **3:4** - Portrait alternative (Wan 2.7, Seedance 2)
* **2:1** - Panoramic landscape (Grok only)
* **1:2** - Extra tall portrait (Grok only)
* **20:9 / 19.5:9** - Modern smartphone landscape (Grok only)
* **9:19.5 / 9:20** - Modern smartphone portrait (Grok only)

**Note:** Veo 3.1 models support only 16:9 and 9:16 aspect ratios. For 1:1, 4:3, or 3:4, use Wan 2.7 or Seedance 2. For 21:9 ultrawide, use Seedance 2. For mobile-specific ratios (20:9, 19.5:9), use Grok.
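The aspect-ratio guidance above can be written as a small lookup. This Python sketch only restates what this page says; `suggest_models` and both tables are hypothetical names for illustration, and the real model picker may offer combinations that are not listed here.

```python
from typing import List

# Ratios this guide lists as supported by all models.
UNIVERSAL_RATIOS = {"16:9", "9:16"}

# Ratios this guide ties to specific models. For 1:1 the guide only
# recommends Wan 2.7 or Seedance 2; other models may support it as well.
MODELS_BY_RATIO = {
    "1:1": ["Wan 2.7", "Seedance 2"],
    "4:3": ["Wan 2.7", "Seedance 2"],
    "3:4": ["Wan 2.7", "Seedance 2"],
    "21:9": ["Seedance 2"],
    "2:1": ["Grok"],
    "1:2": ["Grok"],
    "20:9": ["Grok"],
    "19.5:9": ["Grok"],
    "9:19.5": ["Grok"],
    "9:20": ["Grok"],
}


def suggest_models(ratio: str) -> List[str]:
    """Return the models this guide points to for a given aspect ratio."""
    if ratio in UNIVERSAL_RATIOS:
        return ["any model"]
    if ratio not in MODELS_BY_RATIO:
        raise ValueError(f"Aspect ratio {ratio!r} is not covered in this guide")
    return MODELS_BY_RATIO[ratio]


print(suggest_models("21:9"))  # ['Seedance 2']
print(suggest_models("9:16"))  # ['any model']
```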

#### Duration

* **4 seconds** - Quick clips, social media
* **8 seconds** - Standard length (most models)
* **10 seconds** - Longer clips (Kling, Sora)
* **12 seconds** - Maximum for Sora 2
* **15 seconds** - Longest available (Wan 2.7, Seedance 2, Grok)
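To pick a model for a target clip length, the stated maximums can be compared directly. The Python sketch below includes only the limits this page gives explicitly; `MAX_SECONDS` and `models_for_duration` are hypothetical names, and models without a stated maximum are deliberately left out rather than guessed.

```python
from typing import List

# Longest clip length, in seconds, that this guide states for each model.
# Wan 2.7 also has a stated minimum of 2 seconds, which is not checked here.
MAX_SECONDS = {
    "Sora 2": 12,
    "Wan 2.7": 15,
    "Seedance 2": 15,
    "Grok": 15,
}


def models_for_duration(seconds: int) -> List[str]:
    """List models whose stated maximum covers the requested length."""
    return sorted(name for name, cap in MAX_SECONDS.items() if seconds <= cap)


print(models_for_duration(12))  # ['Grok', 'Seedance 2', 'Sora 2', 'Wan 2.7']
print(models_for_duration(15))  # ['Grok', 'Seedance 2', 'Wan 2.7']
```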

#### Resolution

* **720p** - Faster generation, lower cost
* **1080p** - Standard quality (most models)
* **4K** - Highest quality (Veo 3.1 and Veo 3.1 Fast only; **Veo 3.1 Lite** is limited to 720p / 1080p)

#### Audio

Enable audio for models that support it:

* **Veo 3.1 / Veo 3.1 Fast** - Full audio generation
* **Veo 3.1 Lite** - Visuals only (no AI-generated audio in the clip)
* **Seedance 2** - Native audio generation (on by default)
* **Kling 3.0** - Audio support (Pro and Standard)
* **Other models** (Wan 2.7, Hailuo, Sora, Grok) - No native audio; add it in post
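The audio list above reduces to a yes/no table per model. The Python sketch below mirrors this page only; `NATIVE_AUDIO` and `audio_plan` are hypothetical names for illustration, and the Veo 3.1 Fast entry follows the troubleshooting section of this guide.

```python
# Native audio support as described in this guide.
NATIVE_AUDIO = {
    "Veo 3.1": True,
    "Veo 3.1 Fast": True,   # listed as an audio model under Troubleshooting
    "Veo 3.1 Lite": False,  # visuals only
    "Seedance 2": True,     # on by default
    "Kling 3.0": True,      # Pro and Standard
    "Wan 2.7": False,
    "Hailuo": False,
    "Sora": False,
    "Grok": False,
}


def audio_plan(model: str) -> str:
    """Say whether to rely on the model's audio or add sound afterwards."""
    if model not in NATIVE_AUDIO:
        raise ValueError(f"No audio information for {model!r} in this guide")
    if NATIVE_AUDIO[model]:
        return "generate audio with the clip"
    return "add audio in post"


print(audio_plan("Seedance 2"))    # generate audio with the clip
print(audio_plan("Veo 3.1 Lite"))  # add audio in post
```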

### Advanced Techniques

#### Iterative Refinement

1. Generate initial video
2. Review result
3. Ask for adjustments: "Make it more cinematic" or "Add more motion."
4. Regenerate with improvements

#### Style Transfer

Describe the style you want:

* "Cinematic film look"
* "Documentary style"
* "Animated/cartoon style"
* "Vintage 16mm film"
* "Modern digital"

#### Camera Movement

Specify camera work:

* **Static** - No movement
* **Push-in** - Camera moves closer
* **Pull-out** - Camera moves away
* **Dolly** - Camera travels smoothly toward or away from the subject (sideways travel is usually called a truck)
* **Orbit** - Circular movement around a subject
* **FPV** - First-person view, dynamic
* **Aerial** - Drone-like perspective
* **Tracking** - Follows the subject

#### Lighting Descriptions

* **Golden hour** - Warm, soft, sunset/sunrise
* **Blue hour** - Cool, twilight
* **Natural daylight** - Bright, clear
* **Studio lighting** - Controlled, professional
* **Film noir** - High contrast, dramatic shadows
* **Soft, natural** - Diffused, gentle
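One way to use the camera and lighting vocabularies above is as a quick self-check before generating. The Python sketch below is a deliberately naive substring match; `missing_elements` and both term lists are hypothetical names for illustration, and a prompt can describe camera work or lighting perfectly well in words this check does not recognize.

```python
from typing import List

# Terms taken from the Camera Movement and Lighting Descriptions lists.
CAMERA_TERMS = [
    "static", "push-in", "pull-out", "dolly",
    "orbit", "fpv", "aerial", "tracking",
]
LIGHTING_TERMS = [
    "golden hour", "blue hour", "natural daylight",
    "studio lighting", "film noir", "soft, natural",
]


def missing_elements(prompt: str) -> List[str]:
    """Report which of the two vocabularies the prompt never mentions."""
    text = prompt.lower()
    missing = []
    if not any(term in text for term in CAMERA_TERMS):
        missing.append("camera movement")
    if not any(term in text for term in LIGHTING_TERMS):
        missing.append("lighting")
    return missing


print(missing_elements("A coffee shop"))
# ['camera movement', 'lighting']
print(missing_elements("Slow push-in on a coffee cup during golden hour"))
# []
```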

### Common Use Cases

#### B-Roll Creation

**Prompt Example:**

```
Smooth push-in on a laptop on a modern desk, shallow depth of field, 
soft natural lighting, clean minimalist aesthetic, professional 
and polished
```

#### Establishing Shots

**Prompt Example:**

```
Wide aerial shot of a bustling city at sunset, cinematic perspective, 
golden hour lighting, dynamic clouds, epic and grand scale
```

#### Product Showcases

**Prompt Example:**

```
360° slow rotation of a premium watch on a dark background, 
professional product photography style, dramatic lighting, 
luxury aesthetic, detailed textures
```

#### Social Media Content

**Prompt Example:**

```
Energetic vertical shot of a person dancing in an urban setting, 
vibrant colors, dynamic camera movement, trendy and engaging, 
perfect for social media
```

### Troubleshooting

#### "Video doesn't match my prompt"

**Solutions:**

* Be more specific in your description
* Include camera movement details
* Mention style and mood
* Try a different model
* Iterate: "Make it more \[desired quality]."

#### "Generation failed"

**Solutions:**

* Check that your Fal.ai account has credits
* Verify internet connection
* Try a shorter prompt
* Check model availability
* Reduce resolution or duration

#### "Video quality is low"

**Solutions:**

* Use a higher-resolution model (Sora 2 Pro, Wan 2.7, or Veo 3.1 for 4K)
* Increase the resolution setting
* Use upscaling after generation
* Check prompt clarity (better prompts = better results)

#### "Audio not generating"

**Solutions:**

* Make sure you selected a model that generates audio: **Veo 3.1**, **Veo 3.1 Fast**, **Seedance 2**, or **Kling 3.0** (**Veo 3.1 Lite** does not)
* Check that the "Audio" toggle is enabled
* Mention audio in your prompt
* Wan 2.7, Hailuo, Sora, and Grok don't generate audio; add it in post

### Best Practices

1. **Start specific** - More details = better results
2. **Include camera movement** - "Static shot" vs. "push-in."
3. **Describe lighting** - "Golden hour" or "studio lighting."
4. **Mention style** - "Cinematic" or "documentary."
5. **Iterate** - Refine based on results
6. **Use reference mode** - For character/product consistency
7. **Match model to use case** - See model comparison guide

***

**Next:** Learn about [Image-to-Video](/features/video-generation/image-to-video.md) to animate still images.\
**Also see:** [Video Prompter Assistant](/conversation-starters/video-prompter-assistant.md) to learn how to create expert prompts.

