> For the complete documentation index, see [llms.txt](https://docs.chatvideopro.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.chatvideopro.com/features/video-generation/reference-mode.md).

# Reference Mode

Reference Mode generates video using one or more images as visual anchors. Instead of asking the model to invent a character, product, outfit, location, or style from text alone, you provide images that show what should stay consistent.

Use it when identity matters: a recurring character, presenter, product, mascot, brand object, wardrobe, location, or visual look.

{% hint style="info" %}
If you attach exactly two images and the app switches to Transition Mode, that means Chat Video Pro is treating them as a start frame and end frame. To use the images as references instead, choose a reference-capable model such as Kling O3 Reference, Seedance 2 Reference, Wan 2.7 Reference, or Veo 3.1 Reference.
{% endhint %}

***

### When To Use Reference Mode

Use Reference Mode when you want a generated video to follow visual examples:

* A presenter should look like the same person across shots.
* A product should keep its shape, color, and design.
* A character should remain recognizable.
* A location, wardrobe, or brand style should carry across generations.
* You need multiple videos that feel like the same campaign.
* Text-only prompting is not enough to preserve the subject.

Choose another workflow when:

<table><thead><tr><th width="400">You have...</th><th>Use instead</th></tr></thead><tbody><tr><td>One image that should become the first frame of a video</td><td>Image-to-Video or Studio Motion Director</td></tr><tr><td>Two images that should connect as start/end frames</td><td>Transition Mode or Studio AI Transitions</td></tr><tr><td>A still image that needs alternate camera angles</td><td>Studio Multi-Cam</td></tr><tr><td>A cinematic still that needs to be created first</td><td>Studio Cinematic Lab</td></tr><tr><td>Existing video to edit</td><td>Studio</td></tr></tbody></table>

***

### How It Works

1. Enable **Generate Media**.
2. Attach reference images of the same subject, product, character, or look.
3. Choose a reference-capable model.
4. Write a prompt describing the scene, action, camera, and mood.
5. Configure duration, aspect ratio, resolution, and audio options when available.
6. Generate the video.
7. Review whether the subject stayed consistent.

The references provide identity and visual direction. The prompt provides the new scene and action.

***

### Reference Mode vs. Transition Mode

This is the most common point of confusion.

<table><thead><tr><th width="361">If the images are...</th><th>Use...</th></tr></thead><tbody><tr><td>The first and last frame of a shot</td><td>Transition Mode</td></tr><tr><td>Examples of the same person/product/character</td><td>Reference Mode</td></tr><tr><td>Two frames you want to connect with a polished style</td><td>Studio AI Transitions</td></tr><tr><td>A single still you want to animate</td><td>Image-to-Video or Motion Director</td></tr></tbody></table>

Transition Mode asks: **How should image A become image B?**

Reference Mode asks: **What should stay consistent while the model creates a new shot?**

***

### Choose Strong Reference Images

Good references are clear, consistent with each other, and focused on what you want preserved.

Use images that show:

* The same person, product, or character.
* A clear face, silhouette, product shape, or key design detail.
* Different useful angles when possible.
* Similar identity even if pose, expression, or lighting changes.
* Enough resolution for the model to read details.
* The most important visual traits you want preserved.

Avoid:

* Blurry, dark, or low-quality images.
* Images of different people or different products.
* Contradictory styles.
* Sets that contain only extreme angles.
* Heavily filtered or distorted images.
* Backgrounds full of unrelated clutter.
* Large sets of images that conflict with each other.

More references are not always better. A small set of clean references often works better than a large set of mixed-quality images.

***

### How Many References To Use

Use the smallest set that explains the subject.

<table><thead><tr><th width="197">Reference count</th><th>Best for</th></tr></thead><tbody><tr><td>1 image</td><td>Simple product, logo-like subject, or one clear character anchor.</td></tr><tr><td>2-3 images</td><td>Most people, products, presenters, and brand subjects.</td></tr><tr><td>4-7 images</td><td>Complex characters, varied angles, or stronger identity preservation.</td></tr><tr><td>8-9 images</td><td>When the selected model supports it and you have genuinely useful angle/style variety.</td></tr></tbody></table>

If results ignore your references, try better references before adding more. If results copy the references too literally, reduce the count or use more varied images.

***

### What To Prompt

Do not spend the whole prompt describing the character if the reference images already show them. Use the prompt to describe the new shot.

Focus on:

<table><thead><tr><th width="231">Prompt layer</th><th>What to describe</th></tr></thead><tbody><tr><td>Setting</td><td>Where the subject is now.</td></tr><tr><td>Action</td><td>What the subject does.</td></tr><tr><td>Camera</td><td>Shot size, movement, and angle.</td></tr><tr><td>Mood/style</td><td>Cinematic, commercial, documentary, playful, dramatic.</td></tr><tr><td>Audio/dialogue</td><td>Only if the selected model supports generated audio.</td></tr><tr><td>Preservation</td><td>What must stay consistent: face, outfit, product shape, logo, color.</td></tr></tbody></table>

Useful structure:

{% code overflow="wrap" %}

```
The referenced subject [action] in [setting], [camera movement/composition], [lighting/style], preserving [identity/product details].
```

{% endcode %}
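
If you write many prompts for one campaign, the structure above can be filled in programmatically. The following is a minimal Python sketch and not part of Chat Video Pro: `build_reference_prompt` and its parameter names are hypothetical and only mirror the prompt layers in the table. It formats text; it does not call any API.

```python
def build_reference_prompt(subject, action, setting, camera, style, preserve):
    """Assemble a Reference Mode prompt from the prompt layers.

    Every name here is illustrative. `preserve` is a list of details
    that must stay consistent, such as ["face", "outfit"].
    """
    if not preserve:
        raise ValueError("name at least one detail to preserve")

    # Join the preserved details as "a", "a and b", or "a, b, and c".
    if len(preserve) == 1:
        kept = preserve[0]
    elif len(preserve) == 2:
        kept = f"{preserve[0]} and {preserve[1]}"
    else:
        kept = ", ".join(preserve[:-1]) + f", and {preserve[-1]}"

    return (
        f"The referenced {subject} {action} in {setting}, "
        f"{camera}, {style}, preserving the {kept}."
    )


prompt = build_reference_prompt(
    subject="presenter",
    action="speaks directly to camera",
    setting="a clean modern studio",
    camera="medium shot",
    style="soft key light",
    preserve=["face", "outfit"],
)
print(prompt)
```

Keeping the preservation details as a separate list makes it easy to reuse the same list across every shot while changing only the action, setting, and camera.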

***

### Prompt Examples

#### Presenter Video

{% code overflow="wrap" %}

```
The referenced presenter speaks directly to camera in a clean modern studio, confident and friendly delivery, medium shot, soft key light, subtle background blur, preserve the face and outfit from the references.
```

{% endcode %}

#### Character Scene

{% code overflow="wrap" %}

```
The referenced character walks through a neon-lit alley at night, looking over their shoulder with suspicion, slow handheld tracking shot, cinematic noir lighting, preserve the character's face, jacket, and silhouette.
```

{% endcode %}

#### Product Clip

{% code overflow="wrap" %}

```
The referenced product sits on a dark reflective surface while the camera slowly orbits, studio rim light catching the edges, premium commercial style, preserve the product shape, color, and label.
```

{% endcode %}

#### Brand Mascot

{% code overflow="wrap" %}

```
The referenced mascot waves from a bright trade show booth, cheerful expression, slow push-in, colorful brand environment, preserve the mascot proportions and costume details.
```

{% endcode %}

***

### Weak Prompts To Avoid

Only describes the reference:

```
A person with brown hair and a blue jacket.
```

Better:

{% code overflow="wrap" %}

```
The referenced person walks confidently through a modern office lobby, natural daylight, medium tracking shot, preserve their face and blue jacket.
```

{% endcode %}

Too vague:

```
The character does something cool.
```

Better:

{% code overflow="wrap" %}

```
The referenced character steps out of a sports car at night, camera low and close, neon reflections on wet pavement, confident cinematic reveal.
```

{% endcode %}

Conflicts with the reference:

```
Make the referenced black backpack into a red suitcase.
```

Better:

{% code overflow="wrap" %}

```
Show the referenced black backpack on a clean studio pedestal, slow orbit, premium product lighting, preserve the backpack shape and material.
```

{% endcode %}

If you want to change the subject itself, use image editing or another workflow. Reference Mode is strongest when references are meant to remain recognizable.

***

### Choosing A Model

Choose based on what matters most.

<table><thead><tr><th width="437">Need</th><th>Good starting point</th></tr></thead><tbody><tr><td>Dialogue or talking head with references</td><td>Veo 3.1 Reference</td></tr><tr><td>Flexible duration and strong reference quality</td><td>Kling O3 Reference</td></tr><tr><td>More reference images and no generated audio</td><td>Wan 2.7 Reference</td></tr><tr><td>Reference video with native audio/ambient sound</td><td>Seedance 2 Reference</td></tr><tr><td>Guided character/product still creation before video</td><td>Studio Cinematic Lab</td></tr></tbody></table>

For a broader model chooser, see [Supported Video Models](/features/video-generation/supported-video-models.md).
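
For teams that script their shot planning, the table above can be expressed as a simple lookup. This is an illustrative Python sketch only: the keys in `STARTING_POINTS` and the `suggest_model` helper are hypothetical labels invented for this example, and the model names are copied from the table. List the shot's hardest requirement first, since the first match wins.

```python
# Starting points copied from the table above.
# The dictionary keys are hypothetical labels, not app settings.
STARTING_POINTS = {
    "dialogue": "Veo 3.1 Reference",
    "flexible_duration": "Kling O3 Reference",
    "more_references_no_audio": "Wan 2.7 Reference",
    "native_audio": "Seedance 2 Reference",
}


def suggest_model(needs):
    """Return the starting point for the first recognized need, else None."""
    for need in needs:
        if need in STARTING_POINTS:
            return STARTING_POINTS[need]
    return None


print(suggest_model(["dialogue", "flexible_duration"]))
```

Treat the result as a starting point to test, not as a guarantee that the model fits every requirement of the shot.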

***

### When To Use Studio Instead

Studio is often better when the task has a more specific creative shape.

| Goal                                      | Better Studio workflow |
| ----------------------------------------- | ---------------------- |
| Create a consistent cinematic still first | Cinematic Lab          |
| Animate a still with a camera move        | Motion Director        |
| Create alternate angles of a subject      | Multi-Cam              |
| Transition between two intentional frames | AI Transitions         |
| Transfer motion to a character image      | Motion Capture         |

Use Reference Mode when you want direct model control from the composer. Use Studio when you want the workflow to guide the prompt, model, and asset setup.

***

### Best Practices

#### Build A Small Reference Set

For recurring people, products, or characters, keep 3-5 strong images in your Library or Recents. Reuse the same set across generations for more consistent results.

#### Keep References Focused

Do not mix unrelated styles unless style mixing is the goal. A product render, a blurry phone photo, and a stylized illustration may confuse the model if they are all meant to define the same subject.

#### Prompt The Scene, Not The Biography

References handle appearance. Your prompt should direct the new shot: where the subject is, what they are doing, how the camera moves, and what the mood is.

#### Preserve What Matters

If a detail must stay consistent, name it:

```
Preserve the product label, black color, rounded silhouette, and silver zipper.
```

```
Preserve the face, hairstyle, red jacket, and slim silhouette.
```

#### Expect Some Drift

Reference Mode improves consistency, but it is not a perfect identity lock. For critical brand, legal, or celebrity likeness work, review carefully and use manual finishing where needed.

***

### Example Workflows

#### Consistent Presenter Clip

1. Attach 2-4 clear images of the presenter.
2. Choose a model that supports audio if the presenter should speak.
3. Prompt the setting, delivery, camera framing, and dialogue.
4. Review face consistency and lip-sync.
5. Finish sound and edits in Premiere.

#### Product Campaign Shot

1. Attach 2-3 clean product references.
2. Prompt a new commercial scene or product movement.
3. Preserve shape, color, label, and material.
4. Generate multiple versions with different camera or lighting direction.
5. Upscale or edit the best result if needed.

#### Character Series

1. Build a small reference set for the character.
2. Reuse it across each shot.
3. Change the prompt for each scene, action, and camera move.
4. Keep wardrobe and core details consistent unless the story requires a change.
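
The series workflow above amounts to holding the reference set and the preservation language fixed while only the shot description changes. A minimal Python sketch of that idea follows. The file names, shot descriptions, and the `plan_series` helper are hypothetical and exist only for illustration; the code builds a plan and does not submit anything to Chat Video Pro.

```python
# Hypothetical reference files and preservation clause, reused for every shot.
REFERENCE_SET = ("hero_front.png", "hero_profile.png", "hero_full_body.png")
PRESERVE = "preserve the character's face, jacket, and silhouette"

# Only the scene, action, and camera change from shot to shot.
SHOTS = [
    "walks through a neon-lit alley at night, slow handheld tracking shot",
    "steps out of a sports car at night, camera low and close",
]


def plan_series(shots, references=REFERENCE_SET, preserve=PRESERVE):
    """Pair every shot prompt with the same references and preservation clause."""
    return [
        {
            "references": list(references),
            "prompt": f"The referenced character {shot}, {preserve}.",
        }
        for shot in shots
    ]


for job in plan_series(SHOTS):
    print(job["prompt"])
```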

***

### Troubleshooting

#### Reference Mode does not activate

Make sure Generate Media is enabled, images are attached, no video is attached, and a reference-capable model is selected. If exactly two images trigger Transition Mode, manually switch to a reference model.
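
The conditions above can be read as a checklist. The Python sketch below models that checklist so each unmet condition is reported separately. It is an assumption-laden illustration, not the app's actual routing logic: `ComposerState` and `reference_mode_blockers` are hypothetical names, and the model list contains only the reference-capable models named on this page.

```python
from dataclasses import dataclass

# Reference-capable models named on this page (not an exhaustive list).
REFERENCE_MODELS = {
    "Kling O3 Reference",
    "Seedance 2 Reference",
    "Wan 2.7 Reference",
    "Veo 3.1 Reference",
}


@dataclass
class ComposerState:
    """Hypothetical snapshot of the composer settings."""
    generate_media: bool
    image_count: int
    video_attached: bool
    model: str


def reference_mode_blockers(state):
    """Return the unmet conditions for Reference Mode (empty list if none)."""
    blockers = []
    if not state.generate_media:
        blockers.append("Enable Generate Media.")
    if state.image_count < 1:
        blockers.append("Attach at least one reference image.")
    if state.video_attached:
        blockers.append("Remove the attached video.")
    if state.model not in REFERENCE_MODELS:
        blockers.append("Select a reference-capable model.")
    return blockers


state = ComposerState(
    generate_media=True,
    image_count=2,
    video_attached=False,
    model="Veo 3.1 Reference",
)
print(reference_mode_blockers(state))
```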

#### The character does not look consistent

Use clearer references, add more useful angles, and include preservation language in the prompt. Avoid mixing references that show different people, outfits, or styles unless that variation is intentional.

#### The output copies the reference too closely

Use fewer references or add more scene/action detail. The model may be treating your references as the whole shot instead of the identity anchor.

#### The result ignores the scene prompt

Your references may be too dominant or too visually similar. Reduce the reference count and make the prompt more specific about setting, action, and camera.

#### The wrong mode activates

Two images often route to Transition Mode. If you want a start/end transition, stay there. If you want identity references, choose Reference Mode manually with a compatible reference model.

#### The duration or audio options are not what you expected

Reference models have different limits. Some support audio, some do not. Some have fixed duration. Choose the model based on the shot's hardest requirement: audio, duration, reference count, or visual quality.

***

### Related Pages

* [Supported Video Models](/features/video-generation/supported-video-models.md) - Choose the right model.
* [Transition Mode](/features/video-generation/transition-mode.md) - Connect two frames as start/end images.
* [Image-to-Video](/features/video-generation/image-to-video.md) - Animate one source image.
* [Cinematic Lab](/features/studio/cinematic-lab.md) - Create consistent cinematic source frames.
* [Multi-Cam](/features/studio/multi-cam.md) - Generate alternate angles.
* [Motion Director](/features/studio/motion-director.md) - Animate a still with guided camera movement.

***

**Next:** If your two images are meant to become a start and end frame, use Transition Mode. If you want a guided transition style, use Studio AI Transitions.