Create Consistent, Seamless Shots of the Same Person in Sora AI

Tutorial

January 8th, 2025


Category - Video generation
Difficulty - Intermediate
Tools - Sora, ChatGPT

Creating consistent, seamless shots of the same person is one of the key challenges in AI video generation: a person's physical characteristics, clothing, and overall appearance must remain the same across different shots and camera angles. Sora lets creators generate diverse scenes featuring the same individual, but getting the same person in every shot is still tough.

In this tutorial, we will use a prompting technique called ‘Chain of Thought.’ Although this technique doesn’t guarantee consistent results, we will tweak the prompts to achieve consistent shots of the same person in two video clips of the same length. Each clip will be a 5-second video in 720p resolution with an aspect ratio of 16:9.

By the end of this tutorial, you’ll be able to:

Tweak settings for 5-second videos
Write a prompt to achieve consistent shots of the same person in both videos
Create another video by uploading the video generated using the first prompt

Let's get right into it!

Tweak settings for 5-second videos

First, we have to tweak the settings for 5-second video clips. Sora doesn’t allow a Plus account to create 10-second videos in 720p resolution: you can select either a 5-second clip in 720p or a 10-second clip in 480p. If you subscribe to the Pro plan ($200/month), you can select a 10-second video in 720p resolution.

For our example, we’ll use a 5-second video clip in 720p resolution. Before writing a prompt and generating a video, we need to tweak other settings. Let’s do that first.

Access Sora.com and click the ‘Presets’ button in the prompt box.

Presets are ready-made styles that give your videos a distinct aesthetic. Here is a quick rundown of what each preset does:

Balloon World

With its exaggerated shapes and bright colors, this preset gives your video a playful, cartoon-like appearance. It is perfect for content for younger audiences or projects requiring a whimsical touch.

Stop Motion

It emulates the charm of traditional stop-motion animation, introducing subtle imperfections and textures that mimic handcrafted models, adding a nostalgic and artistic feel to your videos.

Archival

It incorporates elements like film grain, dust, and scratches to create a vintage, historical feel, ideal for projects that require an aged or documentary-style aesthetic.

Film Noir

It captures the essence of classic black-and-white cinema with high-contrast imagery and dramatic lighting, evoking mystery, drama, or a retro cinematic atmosphere.

We are not going to use any preset in this tutorial. We want to show you how to achieve consistent shots of the same person in different 5-second clips. Leave the default preset setting.

Next is the aspect ratio. There are three choices. Here’s the rundown of what each option means:

16:9 - The widescreen (landscape) aspect ratio used for standard YouTube videos.
1:1 - A square aspect ratio. It can be used for thumbnails or recreational purposes.
9:16 - The vertical aspect ratio used for TikTok, Instagram Reels, and YouTube Shorts. This one is useful, especially when posting videos on multiple platforms.

Regardless of where you want to post the video, 16:9 is the best aspect ratio to experiment with Sora AI.
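To make the three aspect ratios above more concrete, here is a minimal Python sketch that turns a ratio and a resolution label into frame dimensions. The helper name frame_size is hypothetical, and the sketch assumes that the resolution label (720 for 720p) is the length of the shorter side in pixels; the exact dimensions Sora produces may differ.

```python
from fractions import Fraction


def frame_size(aspect: str, short_side: int = 720) -> tuple:
    """Return (width, height) in pixels for an aspect ratio such as '16:9'.

    Assumption: the resolution label (e.g. 720 for 720p) is the length of
    the shorter side. Sora's actual output dimensions may differ.
    """
    w, h = (int(part) for part in aspect.split(":"))
    ratio = Fraction(w, h)
    if ratio >= 1:
        # Landscape or square: the height is the short side.
        return int(short_side * ratio), short_side
    # Portrait: the width is the short side.
    return short_side, int(short_side / ratio)


for aspect in ("16:9", "1:1", "9:16"):
    print(aspect, frame_size(aspect))
```

Under that assumption, 16:9 and 9:16 describe the same frame rotated by 90 degrees, which is why the choice mostly depends on where the video will be watched.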

Next, select the resolution. As mentioned earlier, on a Plus account 480p is the only option for a 10-second video, and 720p is available only for a 5-second video. If you want to generate a video in 1080p, you need a Pro account ($200/month).

The duration setting defines the length of the video. Here is some general information to help you understand how video length works in Sora:

Definition: Sets the total length of the generated video.

Options: Sora allows video durations of up to 20 seconds, suitable for concise and impactful content.

Impact: Longer durations enable more complex narratives but may increase generation time and resource usage.

Again, with a Pro account, you can select a 15- or 20-second duration for your video. Since we cannot choose a 15- or 20-second duration for our 720p video, we will choose a 5-second duration.

The last setting to tweak is variation. It lets you generate multiple video versions from a single text prompt. This feature allows you to explore different interpretations of your idea, providing a selection of outputs to choose from. By requesting multiple variations, you can select the video that best aligns with your creative vision.

For this tutorial, we chose:

No presets
16:9 aspect ratio
720p resolution
5-second duration
2 variations
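The settings listed above can also be written down as a small record and checked against the Plus-plan limits described in this tutorial. This is only a sketch: ClipSettings, PLUS_MAX_SECONDS, and fits_plus_plan are hypothetical names rather than part of any Sora API, the limits are the ones stated in this article, and the sketch assumes that any duration up to the stated maximum is allowed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClipSettings:
    preset: Optional[str]  # None means no preset
    aspect_ratio: str
    resolution: str
    seconds: int
    variations: int


# Longest clip per resolution on a Plus account, as stated in this tutorial.
# 1080p is absent because the tutorial says it needs a Pro account.
PLUS_MAX_SECONDS = {"480p": 10, "720p": 5}


def fits_plus_plan(settings: ClipSettings) -> bool:
    """Return True if the settings stay within the Plus limits above.

    Assumption: durations shorter than the stated maximum are also allowed.
    """
    limit = PLUS_MAX_SECONDS.get(settings.resolution)
    return limit is not None and settings.seconds <= limit


# The settings used in this tutorial.
TUTORIAL_SETTINGS = ClipSettings(
    preset=None,
    aspect_ratio="16:9",
    resolution="720p",
    seconds=5,
    variations=2,
)

print(fits_plus_plan(TUTORIAL_SETTINGS))
```

Writing the settings down once makes it easier to reuse exactly the same values for the second clip.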

Write a prompt to achieve consistent shots of the same person in both videos

Now comes the most important yet fascinating part of this tutorial. We will write a comprehensive prompt to give us the best shot at generating the same person in two videos.

First, we will use ChatGPT to create a prompt. Why use ChatGPT for Sora? We want a prompt that helps us generate a consistent shot of the same person across two videos; the face and body features should be at least 95% similar. Rather than spend our energy writing the detailed scene ourselves, we will let ChatGPT draft it and then tweak the draft to personalize it and make it even more comprehensive.

Open ChatGPT and select the o1 model for deep reasoning.

Use the following prompt to create a prompt for Sora.

Prompt:

Write a detailed prompt for Sora AI that gives us the exact details of the person. I want to show the same person in all the clips in Sora. The person should remain consistent. The prompt should generate a video that shows a 20-year-old girl with a fair complexion, big eyes, and rosy lips. She smiles and drinks coffee from a cup, sitting on a comfy chair near a window. The afternoon sun shines on her face, creating a calm and serene atmosphere. The setting is a cozy restaurant with antique furnishings. The girl is wearing an orange sweater and shiny bands on her cuff, not her hands.

ChatGPT's response looks good to us. Let’s tweak it a bit and use it to generate a video in Sora.

Tweaked prompt:

Create a video featuring a 20-year-old girl with a fair complexion, big expressive eyes, and rosy lips that form a warm smile. She is sitting comfortably on a cozy chair near a window in a charming restaurant adorned with antique furnishings, such as wooden chairs, vintage tables, and soft lighting fixtures. The afternoon sun streams through the window, casting a gentle, natural glow on her face and enhancing the calm and serene atmosphere of the scene. She is dressed in an orange sweater paired with jeans (or specify another bottom if desired) and accessorized with shiny bands on her cuffs, positioned on her sleeves rather than her hands. As she smiles and drinks coffee from a cup, her relaxed posture and the inviting environment convey a sense of tranquility and warmth. Ensure that her hairstyle remains consistent (e.g., long wavy hair or straight bob) and that the orange sweater and shiny cuff bands are always present in every clip. The background should consistently depict the cozy restaurant setting with antique décor and the specific lighting from the afternoon sun. Incorporate soft background sounds or gentle ambient restaurant noise to complement the serene mood. Use medium and close-up camera angles to focus on her expressions and actions, maintaining uniformity in her appearance, attire, and the overall inviting atmosphere.

Click the ‘Storyboard’ link at the bottom right corner of the prompt box.

The storyboard feature allows you to manipulate every frame of the video. Since we want to generate a consistent shot of the same person across all video clips, we will paste this prompt in the first frame and click ‘Create.’

Here is the output.

Let’s select the first variation on the left and download it. Click the download button in the top right corner of the screen. When the dialog appears, click ‘Download.’

Create another video by uploading the video generated using the first prompt

Now that we have our first video, let’s create a second one with the same settings. Click the Storyboard link and use the following prompt to generate a video of the same person in another setting.

Prompt:

Refer to the video you created previously. Create another video featuring the same 20-year-old girl with a fair complexion, big expressive eyes, and rosy lips that now form a frown, reflecting anger. She is sitting comfortably on a cozy chair near a window in the same charming restaurant adorned with antique furnishings, such as wooden chairs, vintage tables, and soft lighting fixtures. The afternoon sun still streams through the window, casting a gentle, natural glow on her face and maintaining the calm yet tense atmosphere of the scene. She is dressed in the familiar orange sweater paired with jeans (or the previously specified bottoms) and accessorized with shiny bands on her cuffs, positioned on her sleeves rather than her hands. In this clip, she is engaged in a conversation with a guy, her body language and facial expressions conveying frustration and anger. Her relaxed posture shifts to a more tense stance as she speaks, emphasizing her emotional state. Ensure that her hairstyle remains consistent with the previous clips (e.g., long wavy hair or straight bob) and that the orange sweater and shiny cuff bands are always present. The background should consistently depict the cozy restaurant setting with antique décor and the specific lighting from the afternoon sun. Incorporate soft background sounds or gentle ambient restaurant noise to complement the scene, with perhaps subtle changes to reflect the tension in the interaction. Use medium and close-up camera angles to focus on her expressions and interactions, maintaining uniformity in her appearance, attire, and the overall inviting yet emotionally charged atmosphere throughout the video clip.

Paste the prompt in the storyboard window and click ‘Create.’

Here are the results:

If you want to blend the videos together, click the ‘Blend’ button at the bottom of the video player and select ‘Transition’ from the drop-up menu.

Here are the results:

There you have it. The person and the settings are similar in both video clips. The secret lies in writing a detailed prompt that repeats the same description of the person, outfit, and setting in every clip. Once you get the hang of it, creating consistent shots of the same person in multiple video clips will be easy.
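One way to keep the repeated details identical between clips is to store them once and build every prompt from that single description. The Python sketch below illustrates the idea; CharacterSheet, build_prompt, and missing_traits are hypothetical helpers, the generated wording is a shortened paraphrase of the prompts used in this tutorial, and the hairstyle is an example choice because the tutorial's prompts leave it open.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class CharacterSheet:
    """Details that should read identically in every clip's prompt."""
    subject: str
    outfit: str
    hairstyle: str
    setting: str
    lighting: str


def build_prompt(sheet: CharacterSheet, expression: str, action: str, mood: str) -> str:
    """Combine the fixed details with the parts that change per clip."""
    return " ".join([
        f"Create a video featuring {sheet.subject}, with {expression}.",
        f"She is sitting on a cozy chair near a window in {sheet.setting}.",
        f"{sheet.lighting}.",
        f"She is dressed in {sheet.outfit}.",
        f"{action}.",
        f"Ensure that she always has {sheet.hairstyle} and that her outfit never changes.",
        f"The overall atmosphere is {mood}.",
    ])


def missing_traits(sheet: CharacterSheet, prompt: str) -> list:
    """List any fixed detail that does not appear verbatim in the prompt."""
    return [trait for trait in astuple(sheet) if trait not in prompt]


GIRL = CharacterSheet(
    subject="a 20-year-old girl with a fair complexion, big expressive eyes, and rosy lips",
    outfit="an orange sweater with shiny bands on the cuffs of her sleeves",
    hairstyle="long wavy hair",  # example choice; the tutorial leaves this open
    setting="a charming restaurant adorned with antique furnishings",
    lighting="The afternoon sun streams through the window onto her face",
)

clip_1 = build_prompt(GIRL, "a warm smile",
                      "She smiles and drinks coffee from a cup", "calm and serene")
clip_2 = build_prompt(GIRL, "a frown that reflects anger",
                      "She is talking with a guy, visibly frustrated", "calm yet tense")

print(clip_1)
print(clip_2)
print(missing_traits(GIRL, clip_1), missing_traits(GIRL, clip_2))
```

Only the expression, action, and mood change between the two prompts, so any manual edits to a generated prompt can be re-checked with missing_traits before pasting it into Sora. As with the prompts above, this does not guarantee identical results.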


Posted By: ai_base_admin