Grok Imagine Video 1.5 Prompt Guide:
Write Prompts That Actually Work
Grok Imagine Video 1.5 Preview is xAI's image-to-video model that animates still images into short videos with synchronized audio. The model already sees your uploaded image — your prompt should focus on motion, camera, atmosphere, and audio. This guide follows the official image-to-video prompt structure with copy-ready examples and the mistakes to avoid.
// Image-to-Video Prompt Formula
The Image-to-Video Prompt Structure
Grok Imagine Video 1.5 is image-to-video only. Your uploaded image provides the scene — your prompt describes what should change: motion, camera, atmosphere, and audio.
// The Formula
Motion / Action
What should change — the action, movement, or transformation. Be specific about degree and speed.
Camera Movement
How the camera moves — use standard cinematic language the model understands.
Atmosphere / Lighting
Mood, time of day, or light quality — not what's already visible in the image.
Audio
Native audio is generated automatically. Describe music, SFX, ambience, or dialogue.
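The four-part formula above can be sketched as a small prompt builder. This is a minimal illustration rather than part of any xAI tooling: the `VideoPrompt` class, its field names, and the sentence-joining format are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """Hypothetical container for the four formula parts (names are illustrative)."""
    motion: str            # what should change, with degree and speed
    camera: str            # shot type and camera movement
    atmosphere: str = ""   # mood or light quality not already visible in the image
    audio: str = ""        # music, SFX, ambience, or short dialogue

    def render(self) -> str:
        # Join the non-empty parts in formula order, one short sentence each.
        parts = [self.motion, self.camera, self.atmosphere, self.audio]
        sentences = [p.strip().rstrip(".") + "." for p in parts if p.strip()]
        return " ".join(sentences)


prompt = VideoPrompt(
    motion="The car races past at high speed",
    camera="Static wide shot",
    atmosphere="Golden hour light",
    audio="Engine revving loudly",
)
print(prompt.render())
```

Keeping each part in its own field makes it easy to spot a missing camera or audio instruction before you paste the prompt.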
// Camera Movements
Cinematic Camera Language
Grok Imagine Video 1.5 understands standard cinematic camera terminology. Always specify a shot type and camera movement in your image-to-video prompts.
| Movement | What it does |
|---|---|
| Pan left / right | Camera rotates horizontally to reveal a scene |
| Tilt up / down | Camera rotates vertically for dramatic reveals |
| Zoom in / out | Lens zooms closer or further |
| Dolly in / out | Camera physically moves forward or backward (more cinematic than zoom) |
| Tracking / follow shot | Camera follows a moving subject |
| Orbit / surround | Camera circles around the subject |
| Aerial / drone | Elevated bird's-eye perspective |
| Handheld | Natural shake for documentary feel or urgency |
| Slow push-in | Gradual forward movement to build tension |
| Static / tripod | No camera movement for stable, formal compositions |
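
If you write many prompts, a quick check that each one names a camera movement from the table can catch omissions. The sketch below is an assumption-laden helper, not a model feature: the `CAMERA_TERMS` list is copied from the table, and the prefix-matching rule (so that "orbiting" counts as "orbit") is a choice made for this example.

```python
import re
from typing import List

# Terms copied from the camera table; the matching rule is an assumption.
CAMERA_TERMS = [
    "pan left", "pan right", "tilt up", "tilt down", "zoom in", "zoom out",
    "dolly in", "dolly out", "tracking", "follow shot", "orbit", "aerial",
    "drone", "handheld", "push-in", "static", "tripod",
]


def find_camera_terms(prompt: str) -> List[str]:
    """Return the camera terms found in the prompt, in table order."""
    text = prompt.lower()
    # A leading word boundary lets "orbit" match "orbiting" but not "suborbital".
    return [t for t in CAMERA_TERMS if re.search(r"\b" + re.escape(t), text)]


print(find_camera_terms("The sneaker rotates smoothly, camera orbiting at eye level."))
print(find_camera_terms("Steam rising from the coffee cup"))
```

An empty result suggests the prompt leaves the camera to chance.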
// Audio Prompts
Native Audio Generation
Grok Imagine Video 1.5 generates audio natively alongside the video. Mention music, sound effects, ambience, or short dialogue in your prompt.
Background music
- “with upbeat electronic music”
- “dramatic orchestral score”
Sound effects
- “footsteps on gravel”
- “wind howling”
- “engine revving”
Ambient audio
- “quiet café ambience”
- “forest sounds with birdsong”
Short dialogue
- “a quiet whisper: 'We made it.'”
- “urgent shout: 'Stop him!'”
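One way to keep audio cues like these apart from the visual instructions is to collect them under a trailing `AUDIO:` label. The helper below is a sketch: the function name, the newline before the label, and the comma-separated layout are assumptions, not a documented format.

```python
from typing import List


def with_audio_section(visual: str, audio_cues: List[str]) -> str:
    """Append audio cues under a trailing 'AUDIO:' label (layout is assumed)."""
    cues = [c.strip() for c in audio_cues if c.strip()]
    if not cues:
        # Nothing to add: return the visual instructions unchanged.
        return visual.strip()
    return visual.strip() + "\nAUDIO: " + ", ".join(cues)


print(with_audio_section(
    "Car racing past at high speed. Static wide shot.",
    ["engine revving", "wind howling"],
))
```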
Add an AUDIO: section at the end of your prompt for clarity. This separates the visual instructions from the audio instructions.
// Prompt Keywords
Prompt Keyword Library
Combine motion, camera, atmosphere, and audio terms for image-to-video prompts.
Motion & Action
Camera Movement
Atmosphere & Lighting
Audio (Native Generation)
// 30 Copy-Ready Prompts
Prompt Examples by Scene
All examples are motion-focused for image-to-video: upload your image first, then paste a prompt and customize it for your scene.
- “The sneaker rotates smoothly on the pedestal, camera orbiting at eye level, dramatic spotlight sweeping across the surface.”
- “Slow 360-degree rotation. Studio lighting sweeping across surface. Subtle electronic hum.”
- “Static shot, steam rising from cup. Natural kitchen sounds with distant conversation.”
- “Water droplets falling on watch face in slow motion. Dramatic rim light sweeping. Orchestral swell building.”
- “Product slowly tilts forward revealing details. Clean studio lighting. Quiet ambient tone.”
// Bad vs Good Prompts
What Makes a Prompt Actually Work
The fastest way to learn is to see what doesn't work next to what does. These image-to-video prompt comparisons show exactly what to fix — motion, camera, and audio instead of re-describing your image.
| Bad | Good |
|---|---|
| A woman with brown hair and blue dress walking on a beach at sunset with waves | Slow pull-back as she walks forward. Ocean breeze moving her hair. Ambient wave sounds. |
| Car passing | Car racing past at high speed. Static wide shot. Engine revving loudly. |
| A woman dances gracefully | The man slowly nods and smiles. Gentle camera push-in. Soft room tone. |
| Steam rising from the coffee cup | Static close-up, steam rising gently. Morning window light. Quiet café ambience. |
| No blur, avoid shaking, without grain | Sharp focus, stable tripod shot, clean cinematic look. |
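
The last comparison above, negative phrasing swapped for a positive description, can be sketched as a simple lookup. The three pairs come from that example; the `POSITIVE_REWRITES` name and the case-insensitive substitution are assumptions for illustration, and real prompts will need pairs of your own.

```python
import re

# Pairs taken from the comparison above; extend with your own.
POSITIVE_REWRITES = {
    "no blur": "sharp focus",
    "avoid shaking": "stable tripod shot",
    "without grain": "clean cinematic look",
}


def rewrite_negatives(prompt: str) -> str:
    """Replace known negative phrases with positive equivalents (case-insensitive)."""
    result = prompt
    for negative, positive in POSITIVE_REWRITES.items():
        result = re.sub(re.escape(negative), positive, result, flags=re.IGNORECASE)
    return result


print(rewrite_negatives("No blur, avoid shaking, without grain"))
```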
// Common Mistakes
7 Prompt Mistakes That Kill Your Output
These are the most frequent problems we see with Grok Imagine Video 1.5 prompts. Each mistake leads to blurry, inconsistent, or off-target results — and each has a simple fix.
Re-describing the image
The model already sees it. Describing what's in the photo wastes prompt budget and can cause drift.
Fix: Focus on motion, camera movement, atmosphere, and audio.
Contradicting the source image
Writing actions or subjects that don't match the uploaded photo confuses the output.
Fix: Match your prompt to what's actually in the image.
Tag stacking
"knight, castle, epic, 8K, cinematic" doesn't help — the model needs intent, not keywords.
Fix: Write a natural sentence with clear motion and camera direction.
Too many simultaneous actions
Multiple unrelated actions at once produce inconsistent results.
Fix: Keep it to one subject, one action, one camera move — or list multi-beat actions in order.
No camera direction
Without camera direction, the model defaults to static or unpredictable motion.
Fix: Always specify a shot type and camera movement.
Vague motion
"The thing moves" gives the model nothing to work with.
Fix: Use specific verbs with intensity modifiers — 'racing past at high speed' not 'passing.'
Using negative prompts
"No blur", "avoid shaking" — the model ignores negative instructions entirely.
Fix: Describe what you want instead.
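Several of the mistakes above can be flagged mechanically before you generate. The linter below is a rough heuristic sketch, not a model feature: the patterns, the four-fragment threshold for tag stacking, and the warning labels are all assumptions, and simple prefix matching will produce some false positives (for example, "panel" would count as a camera term).

```python
import re
from typing import List

# Heuristic patterns; both word lists are assumptions for this sketch.
NEGATIVE_PATTERN = re.compile(r"\b(no|avoid|without|don't|never)\b", re.IGNORECASE)
CAMERA_PATTERN = re.compile(
    r"\b(pan|tilt|zoom|dolly|tracking|follow|orbit|aerial|drone|handheld"
    r"|push-in|static|tripod)",
    re.IGNORECASE,
)


def lint_prompt(prompt: str) -> List[str]:
    """Return warning labels for negative phrasing, missing camera, and tag stacking."""
    warnings = []
    if NEGATIVE_PATTERN.search(prompt):
        warnings.append("negative phrasing")
    if not CAMERA_PATTERN.search(prompt):
        warnings.append("no camera direction")
    # Tag stacking: four or more comma-separated fragments of one or two words each.
    fragments = [f.strip() for f in prompt.split(",") if f.strip()]
    if len(fragments) >= 4 and all(len(f.split()) <= 2 for f in fragments):
        warnings.append("tag stacking")
    return warnings


print(lint_prompt("knight, castle, epic, 8K, cinematic"))
print(lint_prompt("Car racing past at high speed. Static wide shot. Engine revving loudly."))
```

A check like this cannot tell whether a prompt re-describes or contradicts the image; those two mistakes still need a human look at the source photo.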
// Think Like a Director
Write Prompts for Image-to-Video
Think like a director — your image is the scene. Write about motion, camera, and audio, not description. Every generation requires an input image.
Don't re-describe the image
The model sees it. Tell it what should change — the action, the camera movement, the atmosphere.
Don't contradict the image
If there's a man in the photo, don't write 'a woman dances.' Match your prompt to what's actually there.
Be specific about motion
'Car passing' is vague — 'car racing past at high speed' gives the model something to work with.
Anchor the subject
Mention prominent features: 'the old man wearing glasses' or 'the woman in the red jacket.'
Negative prompts don't work
The model ignores them. Describe what you want instead.
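Anchoring the subject and keeping actions in order can be combined in one small helper. This is a sketch under stated assumptions: the `ordered_beats` name and the ", then" connector are illustrative choices, not a format the model is documented to require.

```python
from typing import List


def ordered_beats(subject: str, beats: List[str]) -> str:
    """Build one sentence that names the subject and lists its actions in order."""
    cleaned = [b.strip() for b in beats if b.strip()]
    if not cleaned:
        raise ValueError("at least one action beat is required")
    # Joining with ', then ' keeps the sequence explicit.
    return subject.strip() + " " + ", then ".join(cleaned) + "."


print(ordered_beats("The old man wearing glasses", ["slowly nods", "smiles"]))
```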
// What You Can Make
Use Cases for Image-to-Video
Grok Imagine Video 1.5 animates still images into short videos with synchronized audio. It handles both visual generation and audio synthesis in one pass.
Product Showcases
Transform static product photography into dynamic demonstrations. A watch photo becomes a luxury ad with an elegant wrist turn. A sneaker shot gets a 360-degree rotation with dramatic lighting.
Character Animation
Turn illustrated characters into smooth animations. The model understands cartoon physics and exaggerated motion, creating professional-quality animation that would typically require an entire animation team.
Portrait Videos
Animate professional headshots into video introductions with natural human motion. The model handles realistic facial expressions, head turns, and body language.
Creative Projects
Bring concept art to life, animate historical photos, or turn memes into short video clips with appropriate sound effects and music.
// FAQ
Frequently Asked Questions
Common questions about writing prompts for Grok Imagine Video 1.5 — covering length, structure, audio, camera control, and consistency.
// Ready to Generate
Put Your Prompts to Work
Upload a still image, paste a motion-focused prompt from this guide, and generate a short clip with native audio — no API key required on this site.