Prompt Length & Token Limits
Diffusion models don't read your prompt as English text. They convert it into tokens: numeric IDs drawn from a fixed vocabulary. Each model family has a maximum token budget, and any tokens past that budget are dropped.
Sogni Studio shows a live token count next to the prompt field and turns the field's border red when you exceed the active model's limit.
# How tokenization works
A token isn't always a word. Common words map to a single token; unusual words get split into sub-words. For example, "reddog" tokenizes as red + dog (two tokens), and "photorealistic" might tokenize as photo + realistic. Punctuation and weighting syntax also consume tokens.
The token count covers the Prompt and Style fields combined: your style picks draw from the same budget as your prompt, not from a separate one.
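To make the sub-word idea concrete, here is a minimal sketch of a greedy longest-match splitter. It is a toy: the vocabulary `TOY_VOCAB` and the functions `toy_tokenize` and `combined_token_count` are hypothetical names invented for illustration, and real text encoders use far larger vocabularies and different splitting rules, so actual counts in Studio will differ.

```python
# Toy vocabulary (an assumption for illustration); real encoders have
# tens of thousands of entries and their own splitting rules.
TOY_VOCAB = {"red", "dog", "photo", "realistic", "in", "the", "park", ","}


def toy_tokenize(text: str) -> list[str]:
    """Split text into sub-word pieces by greedy longest match."""
    tokens: list[str] = []
    # Pad commas with spaces so punctuation becomes its own piece.
    for word in text.lower().replace(",", " , ").split():
        start = 0
        while start < len(word):
            # Try the longest remaining piece first, then shrink.
            for end in range(len(word), start, -1):
                if word[start:end] in TOY_VOCAB:
                    tokens.append(word[start:end])
                    start = end
                    break
            else:
                # No known piece: fall back to a single character.
                tokens.append(word[start])
                start += 1
    return tokens


def combined_token_count(prompt: str, style: str) -> int:
    """Count Prompt and Style tokens against one shared budget."""
    return len(toy_tokenize(prompt)) + len(toy_tokenize(style))


print(toy_tokenize("reddog"))           # ['red', 'dog']
print(toy_tokenize("photorealistic"))   # ['photo', 'realistic']
print(combined_token_count("reddog in the park", "photorealistic"))  # 7
```

Note that the last call counts five tokens from the prompt and two from the style text, which mirrors how both fields share one budget.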
# Token budgets
The two limits you'll hit most often:
- On-device or Relaxed Supernet — 77 tokens. CoreML pipelines and Relaxed workers both cap at 77 tokens (the classic Stable Diffusion CLIP encoder limit). Anything past that is truncated.
- Fast Supernet — up to 750 tokens. Long-prompt support for models with larger text encoders (T5-based Flux, SD3-style triple encoders).
Studio knows the active model and processing mode, so the limit shown next to the prompt field is the effective limit for your current job. The red border and red counter tell you when you've crossed it.
Beyond those two ceilings, individual model architectures have their own native limits (SD 1.x/2.x and SDXL are CLIP-based and natively 77 tokens; SD3 and Flux use larger encoders and can absorb longer prompts when running on Fast Supernet). You don't need to memorize them — the live counter is the source of truth.
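The rules in this section can be summarized as a small lookup. The sketch below is an assumption about how the pieces fit together, not Studio's actual logic: the mode strings, the family labels, and the function `effective_token_limit` are all hypothetical, and since Fast Supernet supports "up to" 750 tokens, a given model may have a lower ceiling than the constant used here.

```python
CLIP_LIMIT = 77             # on-device and Relaxed Supernet cap
FAST_SUPERNET_LIMIT = 750   # upper bound for long-prompt models on Fast Supernet

# Hypothetical labels for families with larger text encoders.
LONG_PROMPT_FAMILIES = {"sd3", "flux"}


def effective_token_limit(mode: str, model_family: str) -> int:
    """Return the token budget for a processing mode and model family."""
    if mode in ("on-device", "relaxed"):
        # These modes cap at the CLIP limit regardless of model.
        return CLIP_LIMIT
    if mode == "fast":
        # Only larger-encoder families benefit from the long-prompt budget.
        if model_family in LONG_PROMPT_FAMILIES:
            return FAST_SUPERNET_LIMIT
        return CLIP_LIMIT
    raise ValueError(f"unknown processing mode: {mode!r}")


print(effective_token_limit("relaxed", "flux"))  # 77
print(effective_token_limit("fast", "flux"))     # 750
print(effective_token_limit("fast", "sdxl"))     # 77
```

In practice the live counter already reports this value, so a lookup like this is only useful for reasoning about why the number changes when you switch model or mode.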
# What happens when you go over
- The token counter goes red.
- A red border highlights the prompt field.
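- On generate, the prompt is truncated to the limit: tokens past the cutoff are silently dropped.
The model still renders, but only the first N tokens of your prompt contribute to the result, where N is the active limit. Because the cut happens at a token boundary rather than a word boundary, a word that splits into several tokens can be cut partway through.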
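Truncation itself is simple to picture. The sketch below assumes the prompt has already been turned into a list of tokens; it uses a plain whitespace split as a stand-in for a real tokenizer, and `truncate_to_budget` is a hypothetical helper, not part of Studio.

```python
def truncate_to_budget(tokens: list[str], limit: int) -> tuple[list[str], list[str]]:
    """Split tokens into the part the model sees and the part that is dropped."""
    kept = tokens[:limit]      # the first `limit` tokens contribute to the image
    dropped = tokens[limit:]   # everything after the cutoff is discarded
    return kept, dropped


# Whitespace split is only a stand-in; real token counts are usually higher.
tokens = "a red dog running through a sunny park".split()
kept, dropped = truncate_to_budget(tokens, limit=5)

print(kept)               # ['a', 'red', 'dog', 'running', 'through']
print(dropped)            # ['a', 'sunny', 'park']
print(len(dropped) > 0)   # True: this prompt is over budget
```

Anything that lands in `dropped` has no effect on the render, which is why the end of an over-long prompt is the riskiest place for important details.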
# Tips
- Use Prompt Enhancer for budget-aware rewriting. Prompt Enhancer takes your model's active token budget into account and rewrites to fit.
- Frontload the important stuff. Earlier tokens tend to carry more weight in many diffusion models, and they are also the ones that survive truncation. Subject first, then style, then details.
- Save style for the Style field. It's tokenized into the same budget, but separating concerns keeps your prompt readable.
- Bail out and switch models for very long prompts. If you have a paragraph of description, Flux on Fast Supernet is the right home for it. SDXL will only see the first 77 tokens regardless of how much you write.