The best way to write image prompts (2026 edition)

A seven-slot formula, five worked examples across Flux, Recraft, DALL-E, SDXL, and Ideogram, and a 100-modifier word bank you can steal. Plus how to test the same prompt across every major image generator in one tab.

By ZeroTwo Editorial · Published May 3, 2026 · Updated May 3, 2026 · 12-min read
TL;DR
The best way to write image prompts is to follow a seven-slot formula (Subject, Style, Composition, Lighting, Mood, Technical parameters, and Negative prompt), then compare the result across multiple models. Good prompt writing comes down to specificity plus structure: dense descriptive tokens beat poetry, and the same prompt produces radically different results in Flux, DALL-E, SDXL, Recraft, and Ideogram. This guide gives you the formula, five worked examples, and a 100-modifier word bank.

Why prompt structure matters

Diffusion models are conditioned on text embeddings, and the conditioning signal is sharper when the prompt is structured. The original SDXL paper from Stability AI (Podell et al., 2023) added micro-conditioning on image size and crop coordinates, alongside a larger architecture, and reported a clear human-preference win over SD 1.5. The lesson generalizes: the more structured signal you give the model, the more it can give you back.

OpenAI's DALL-E 3 system card credits its caption-improvement pipeline, which re-captions training images with richer synthetic captions, for the leap in prompt-following over DALL-E 2 (OpenAI, 2023). On the user side, the inverse holds: better-structured prompts at inference time produce noticeably better images. Structure is the shared language between you and the model.

The seven-slot prompt formula

Memorize seven slots in order. Skip a slot only when you genuinely mean to. Each slot maps to a different conditioning lever inside the model.

Slot | Example tokens
Subject | a weathered fisherman, 60s, deep wrinkles
Style | editorial photography, Magnum Photos aesthetic
Composition | medium close-up, rule of thirds, shallow DOF
Lighting | golden hour rim light, soft fill, overcast key
Mood | contemplative, melancholic, weather-worn
Technical | 85mm f/1.4, ISO 200, --ar 3:2 --stylize 250
Negative | no text, no watermark, no extra fingers
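
One way to keep the slot order honest is to assemble prompts from named fields instead of free-typing them. The sketch below is only an illustration, not any model's API: `PromptSlots` and `build_prompt` are hypothetical names, and it assumes a target that accepts the negative prompt as a separate string.

```python
from dataclasses import dataclass


@dataclass
class PromptSlots:
    """The seven slots in formula order. An empty string means 'skipped on purpose'."""
    subject: str
    style: str = ""
    composition: str = ""
    lighting: str = ""
    mood: str = ""
    technical: str = ""
    negative: str = ""


def build_prompt(slots: PromptSlots) -> tuple[str, str]:
    """Return (positive_prompt, negative_prompt), joining filled slots in order."""
    ordered = [slots.subject, slots.style, slots.composition,
               slots.lighting, slots.mood, slots.technical]
    positive = ", ".join(part.strip() for part in ordered if part.strip())
    return positive, slots.negative.strip()


# Values copied from the table above; the Midjourney-style flags are left out
# of the technical slot here because they are not part of the descriptive text.
fisherman = PromptSlots(
    subject="a weathered fisherman, 60s, deep wrinkles",
    style="editorial photography, Magnum Photos aesthetic",
    composition="medium close-up, rule of thirds, shallow DOF",
    lighting="golden hour rim light, soft fill, overcast key",
    mood="contemplative, melancholic, weather-worn",
    technical="85mm f/1.4, ISO 200",
    negative="no text, no watermark, no extra fingers",
)
positive, negative = build_prompt(fisherman)
print(positive)
print("Negative:", negative)
```

Because skipped slots default to empty strings, leaving one out is a visible decision in the code and not an accident of typing.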

Five worked examples

One prompt per major model family. Steal the structure, swap the subject.

Flux 1.1 Pro excels at photoreal skin texture and natural light falloff.
Prompt
Editorial portrait of a 67-year-old female lighthouse keeper, weathered face, salt-grey hair pulled back, wool cable-knit sweater, standing on rocky Atlantic coast at golden hour, rim-lit by low sun, 85mm f/1.4, shallow depth of field, Kodak Portra 400 grain, Magnum Photos style, contemplative gaze
Model: Flux 1.1 Pro · Params: --ar 3:4, guidance 3.5, steps 30, seed 42
Recraft V3's vector-aware engine produces print-ready editorial work with consistent palette.
Prompt
Editorial magazine illustration for a story about urban loneliness, lone figure on midnight subway platform, sodium-vapor amber light, long shadows, flat geometric shapes, limited palette of teal navy and ochre, paper grain texture, New Yorker cover energy, Christoph Niemann influence, no text
Model: Recraft V3 · Params: style: digital_illustration/grain, --ar 4:5
DALL-E 3 follows long natural-language prompts faithfully; no parameter syntax needed.
Prompt
A high-end Octane render of a brutalist concrete pavilion floating above a still mirror lake at dawn, soft volumetric fog, polished concrete reflections, warm sodium interior glow spilling through narrow vertical slits, ultra-wide cinematic composition, Architectural Digest cover quality, 8K detail
Model: DALL-E 3 · Params: natural-language only, 1792x1024
Booru-tag prompting + a quality-booster suffix is the canonical SDXL anime pattern.
Prompt
1girl, solo, silver twin tails, red ribbon, sailor school uniform, sitting on rooftop at sunset, distant cityscape, golden hour, soft cinematic lighting, detailed eyes, masterpiece, best quality, very aesthetic, newest. Negative: lowres, bad anatomy, worst quality, jpeg artifacts, extra fingers
Model: SDXL + anime LoRA (Animagine XL) · Params: CFG 7, steps 28, sampler DPM++ 2M Karras
Ideogram 2.0 is unmatched for legible typography baked directly into the image.
Prompt
Abstract risograph poster, overlapping organic blobs in fluorescent pink coral and cobalt blue, visible misregistration, halftone dots, paper grain, the words "SLOW DOWN" set in heavy condensed sans-serif at bottom, Swiss design influence, screen-print aesthetic
Model: Ideogram 2.0 · Params: style: design, --ar 1:1, magic prompt off

Compare across models

See the same prompt across all five models

ZeroTwo Image Studio bundles Flux, DALL-E, SDXL, Recraft, Ideogram, and more behind one prompt box. One subscription, every major image model. $19.99/month.

The 100-modifier word bank

Drop these into any slot. Mix two style modifiers, one lighting modifier, and one composition modifier; that's a starter recipe.

Style · 35
  • cinematic
  • editorial photography
  • Magnum Photos
  • Kodak Portra 400
  • fashion editorial
  • Vogue cover
  • National Geographic
  • documentary
  • film noir
  • 1970s film grain
  • Wes Anderson palette
  • Hayao Miyazaki
  • Studio Ghibli
  • ukiyo-e
  • art nouveau
  • art deco
  • bauhaus
  • brutalist
  • minimalist
  • maximalist
  • baroque
  • rococo
  • impressionist
  • Monet
  • Caravaggio chiaroscuro
  • Vermeer lighting
  • Hopper lonely
  • Rothko color field
  • Basquiat raw
  • graffiti
  • vaporwave
  • synthwave
  • cyberpunk
  • solarpunk
  • cottagecore
Lighting · 36
  • golden hour
  • blue hour
  • harsh midday
  • overcast soft
  • rim light
  • backlight
  • split lighting
  • Rembrandt lighting
  • loop lighting
  • butterfly lighting
  • broad lighting
  • short lighting
  • low-key
  • high-key
  • chiaroscuro
  • volumetric fog
  • god rays
  • dappled sunlight
  • neon glow
  • sodium-vapor amber
  • fluorescent green
  • moonlight
  • candlelight
  • firelight
  • studio softbox
  • ring light
  • single window light
  • Caravaggio shadow
  • silhouette
  • subsurface scattering
  • specular highlights
  • lens flare
  • anamorphic flare
  • bokeh
  • tilt-shift miniature
  • aurora borealis
Composition · 30
  • rule of thirds
  • centered symmetry
  • leading lines
  • negative space
  • shallow depth of field
  • deep focus
  • Dutch angle
  • low angle hero shot
  • high angle bird's-eye
  • worm's-eye view
  • over-the-shoulder
  • close-up
  • extreme close-up
  • medium shot
  • wide establishing
  • ultra-wide cinematic
  • macro
  • fish-eye
  • anamorphic 2.39:1
  • 9:16 portrait
  • 1:1 square
  • diptych
  • triptych
  • framing within frame
  • foreground silhouette
  • depth layering
  • Fibonacci spiral
  • isometric
  • axonometric
  • orthographic
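
The starter recipe (two style modifiers, one lighting, one composition) is easy to automate. The following is a rough sketch using a short excerpt of the bank; the function name `starter_recipe` is made up for illustration, and the `seed` here only controls which modifiers get picked, not the image generator's own seed.

```python
import random

# A small excerpt of the word bank above; extend each list as needed.
STYLE = ["cinematic", "editorial photography", "film noir", "ukiyo-e", "art deco"]
LIGHTING = ["golden hour", "blue hour", "rim light", "Rembrandt lighting", "neon glow"]
COMPOSITION = ["rule of thirds", "leading lines", "negative space", "Dutch angle", "macro"]


def starter_recipe(subject: str, seed: int) -> str:
    """Append two style modifiers, one lighting and one composition modifier."""
    rng = random.Random(seed)       # local RNG: same seed, same recipe
    styles = rng.sample(STYLE, 2)   # two distinct style modifiers
    lighting = rng.choice(LIGHTING)
    composition = rng.choice(COMPOSITION)
    return ", ".join([subject, *styles, lighting, composition])


print(starter_recipe("a 67-year-old lighthouse keeper", seed=42))
```

Changing the recipe seed gives a fresh combination to try; keeping it fixed lets you reproduce a combination that worked.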

Negative prompts and technical parameters

Every model exposes a small set of dials beyond the prompt itself. The Midjourney parameter docs (docs.midjourney.com) list the canonical set: --ar for aspect ratio, --stylize for artistic license (0–1000), --seed for reproducibility, --no for exclusions, --cref for character reference, and --sref for style reference. SDXL and Flux expose CFG scale, steps, sampler, and seed via their respective UIs.

Three rules of thumb: lock seeds when iterating, vary seeds when exploring, and treat aspect ratio as a creative choice (3:2 for editorial photo, 4:5 for editorial illustration, 9:16 for vertical social, 1:1 for poster work).
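
To make those dials concrete, here is a small sketch that builds a Midjourney-style parameter suffix from the flags named above (--ar, --stylize, --seed, --no). The helper `midjourney_suffix` and the `ASPECT_BY_USE` table are hypothetical conveniences, not part of Midjourney itself; the aspect ratios simply encode the rules of thumb from this section.

```python
from typing import Optional

# Aspect-ratio defaults taken from the rules of thumb above.
ASPECT_BY_USE = {
    "editorial photo": "3:2",
    "editorial illustration": "4:5",
    "vertical social": "9:16",
    "poster": "1:1",
}


def midjourney_suffix(use: str, stylize: int = 100,
                      seed: Optional[int] = None,
                      exclude: tuple[str, ...] = ()) -> str:
    """Build a parameter suffix using --ar, --stylize, --seed and --no."""
    if not 0 <= stylize <= 1000:
        raise ValueError("--stylize must be between 0 and 1000")
    parts = [f"--ar {ASPECT_BY_USE[use]}", f"--stylize {stylize}"]
    if seed is not None:
        parts.append(f"--seed {seed}")  # lock the seed when iterating
    if exclude:
        parts.append("--no " + ", ".join(exclude))
    return " ".join(parts)


print(midjourney_suffix("editorial photo", stylize=250, seed=42,
                        exclude=("text", "watermark")))
```

Leaving `seed` as `None` omits the flag entirely, which matches the "vary seeds when exploring" rule.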

Six common mistakes

  1. Vague subjects: 'a 67-year-old lighthouse keeper' beats 'a person' every time. Specificity wins.
  2. Stacking conflicting styles: 'photorealistic anime watercolor' confuses every model.
  3. Skipping the negative prompt: for SDXL/Flux this is the difference between hands and horror.
  4. Treating prompts as poetry: models reward dense descriptive tokens, not flowery prose.
  5. No seed control: without a fixed seed you can't iterate on a near-miss; you start over each time.
  6. Testing on one model: Flux, DALL-E, and SDXL interpret the same prompt very differently. Compare.

By the numbers

  • SDXL's micro-conditioning lifted aesthetic preference over SD 1.5 in human evals; see Section 4 of the SDXL paper (arXiv 2307.01952).
  • DALL-E 3 substantially outperforms DALL-E 2 on prompt-following benchmarks per OpenAI's own evals (DALL-E 3 system card, OpenAI 2023).
  • Midjourney exposes 14+ parameters that materially change output; most users tune fewer than three (Midjourney parameter docs).
  • Adobe's 2024 Creative Cloud research found a majority of first prompts produce unusable output without iteration; see Adobe Firefly research (adobe.com/firefly). The takeaway is to iterate on seed-locked variants.
  • Long-prompt benefits in diffusion models plateau past a single dense paragraph (arXiv 2403.06952).
"Prompts are the new lens: describing what you see is becoming a craft."
David Holz, founder of Midjourney, in The Verge.

Frequently asked questions

What is the best way to write image prompts?

The best way to write image prompts is to use a seven-slot formula (Subject, Style, Composition, Lighting, Mood, Technical parameters, and Negative prompt), then test the same prompt across multiple models. Specificity beats poetry: 'a 67-year-old lighthouse keeper at golden hour, 85mm f/1.4, Magnum Photos aesthetic' will beat 'an old person by the sea' on every model.

How long should an image prompt be?

Most modern models (Flux, DALL-E 3, Ideogram 2.0) handle 75–250 token prompts well. SDXL works best with 60–120 tokens of comma-separated descriptors. Stuffing a prompt past ~250 tokens often causes the model to drop or under-weight part of it. Recent research on long-prompt benefits (arXiv 2403.06952) shows diminishing returns past a single dense paragraph.

Do negative prompts actually work?

Yes. For diffusion pipelines that use classifier-free guidance, such as SDXL and Flux setups that expose a negative prompt, the negative prompt takes the place of the empty unconditional prompt in the guidance step, so sampling is steered away from it. Standard negatives like 'lowres, bad anatomy, worst quality, jpeg artifacts, extra fingers, watermark' meaningfully reduce common failure modes. DALL-E 3 and Midjourney handle negatives differently: Midjourney uses --no, while DALL-E 3 ignores them entirely, so you should phrase exclusions in natural language.
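
The per-model differences above can be captured in one small routing helper. This is a sketch under stated assumptions: `apply_exclusions` and the family labels "sdxl", "midjourney", and "dalle3" are hypothetical, and the returned dictionary is just a plain container, not a real request format for any of these services.

```python
def apply_exclusions(prompt: str, exclusions: list[str], family: str) -> dict:
    """Route exclusions per model family, following the answer above."""
    if not exclusions:
        return {"prompt": prompt}
    joined = ", ".join(exclusions)
    if family == "sdxl":
        # Separate negative-prompt field, sent alongside the positive prompt.
        return {"prompt": prompt, "negative_prompt": joined}
    if family == "midjourney":
        # Exclusions ride along as a --no parameter.
        return {"prompt": f"{prompt} --no {joined}"}
    if family == "dalle3":
        # No negative field: phrase the exclusion in natural language.
        phrase = " and no ".join(exclusions)
        return {"prompt": f"{prompt}. The image contains no {phrase}."}
    raise ValueError(f"unknown model family: {family}")


for family in ("sdxl", "midjourney", "dalle3"):
    print(family, apply_exclusions("a lighthouse at dusk", ["text", "watermark"], family))
```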

What does CFG scale (or --stylize) actually do?

CFG (classifier-free guidance) controls how strictly the model follows your prompt. Low CFG (3–5) gives more creative, looser images; high CFG (10–15) hugs the prompt tightly but can over-saturate. Midjourney's --stylize parameter is the inverse: higher = more artistic license. Sweet spots: SDXL CFG 7, Flux guidance 3.5, Midjourney --stylize 100–250.
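
For readers who want the arithmetic behind the dial, classifier-free guidance combines two noise predictions as uncond + scale * (cond - uncond). The toy sketch below applies that formula to three-element lists standing in for full latent tensors; the numbers are invented purely to show the effect of the scale.

```python
def cfg_combine(uncond: list[float], cond: list[float], scale: float) -> list[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), elementwise."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy "noise predictions"; real pipelines do this on latent tensors at every step.
uncond = [0.0, 1.0, -1.0]   # prediction without the prompt
cond = [1.0, 1.0, 0.0]      # prediction with the prompt

for scale in (1.0, 7.0):
    print(scale, cfg_combine(uncond, cond, scale))
```

At scale 1.0 the result is just the prompt-conditioned prediction; larger scales push further along the direction from unconditional to conditional, which is why high values hug the prompt and can over-saturate.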

Why does the same prompt look different in DALL-E vs Midjourney vs Flux?

Each model was trained on different data with different conditioning. DALL-E 3 uses GPT-rewritten captions and rewards natural language. Midjourney is fine-tuned for aesthetic appeal. Flux 1.1 Pro is more literal and photoreal-leaning. SDXL is open-weights and depends entirely on the checkpoint. This is why testing the same prompt across 5+ models is the single highest-leverage habit.

What is a seed and why does it matter?

The seed is the random starting noise. Same prompt + same seed + same model + same parameters = identical image. Lock the seed when you want to iterate on a near-miss (change one word, keep everything else). Vary the seed when you want fresh variations. In Midjourney use --seed 42; in SDXL/Flux it's a parameter; DALL-E 3 doesn't expose seeds publicly.
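
The determinism claim is easy to see in miniature. This sketch uses Python's standard `random` module as a stand-in for a generator's noise source; `starting_noise` is a hypothetical helper, and real models draw a much larger latent tensor, but the principle is the same.

```python
import random


def starting_noise(seed: int, size: int = 4) -> list[float]:
    """Stand-in for the initial latent noise: Gaussian samples from a seeded RNG."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = starting_noise(42)
same_b = starting_noise(42)
other = starting_noise(43)

print(same_a == same_b)  # identical seed gives identical noise
print(same_a == other)   # a different seed gives different noise
```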

Which AI gives the best image from this formula?

There is no single winner. Flux 1.1 Pro leads on photoreal portraits. Ideogram 2.0 wins for typography. DALL-E 3 follows long natural-language prompts most faithfully. SDXL with the right LoRA wins on stylized work. Recraft V3 is the editorial illustration leader. The fastest way to find out for your prompt is to run it across all of them in ZeroTwo Image Studio.

How do I get consistent characters across multiple images?

Three options: (1) lock the seed and change only one descriptor at a time; (2) train a LoRA on 10–20 reference images for SDXL/Flux; (3) use Midjourney's --cref (character reference) parameter pointing at a previous output URL. DALL-E 3 supports gen_id continuity in ChatGPT. For commercial work, LoRA training gives the most reliable identity lock.
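
Option (1) can be scripted so that every job differs from the base prompt in exactly one slot while the seed stays fixed. The sketch below only prepares the job list; `one_change_variants` is a hypothetical helper, and how each job is actually submitted depends on the generator you use.

```python
def one_change_variants(base: dict[str, str], slot: str,
                        options: list[str], seed: int) -> list[dict]:
    """Return one job per option, changing only `slot` and keeping the seed fixed."""
    if slot not in base:
        raise KeyError(f"unknown slot: {slot}")
    jobs = []
    for option in options:
        slots = {**base, slot: option}  # copy, then override a single slot
        jobs.append({"prompt": ", ".join(slots.values()), "seed": seed})
    return jobs


base = {
    "subject": "67-year-old female lighthouse keeper, salt-grey hair",
    "lighting": "golden hour",
    "composition": "medium close-up",
}
jobs = one_change_variants(base, "lighting",
                           ["golden hour", "blue hour", "overcast soft"], seed=42)
for job in jobs:
    print(job)
```

Since only the lighting slot changes between jobs, any drift in the character's appearance can be attributed to that one descriptor.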

Key takeaways
  • The seven-slot formula (Subject, Style, Composition, Lighting, Mood, Technical, Negative) works across every major image model.
  • Specificity beats poetry. Dense descriptive tokens outperform flowery prose every time.
  • Lock the seed when iterating; vary it when exploring. This single habit can noticeably raise your hit rate.
  • Negative prompts matter for SDXL and Flux; Midjourney uses --no; DALL-E 3 ignores negatives, so phrase exclusions naturally instead.
  • The same prompt looks radically different across Flux, DALL-E, SDXL, Recraft, and Ideogram, so always test on at least three models.

ZeroTwo Editorial
The ZeroTwo team writes hands-on AI guides drawing on direct experience shipping ZeroTwo's multi-model platform: 60+ frontier text and image models behind one prompt box.
