Sogni API Reference — docs.sogni.ai

Introduction

The Sogni API exposes the Sogni Intelligence platform — an OpenAI-compatible LLM endpoint, a durable agent runtime, and a creative-workflow engine — over a single REST surface at https://api.sogni.ai. Requests are JSON; auth is a bearer API key; long-running operations stream progress over Server-Sent Events.

Surfaces at a glance

Chat Completions — OpenAI-compatible chat with optional server-side Sogni tool execution (image, video, music generation, plus composition planners).
Chat Runs — durable counterpart to chat completions: persisted state, replayable SSE events, cancel and resume, cost-approval pauses.
Creative Workflows — pre-planned multi-step jobs (storyboards, image→video, batch generation) with an in-band dependency graph and durable execution.
Workflow Templates — saveable, parameterized recipes. Invoke by ID with inputs to compile a fresh durable workflow run.
Media + Image URLs — presigned S3-style POST URLs for uploading reference assets and downloading generated artifacts.
Wallet + Replay — on-chain balance lookups and the RunRecord ingest/read surface for replay tooling.

Quick start

The shortest path — a chat completion that runs Sogni creative tools server-side:

$ curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer $SOGNI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Generate a cinematic image of a neon alley in Tokyo during rain."}
    ]
  }'

Conceptual context. This page is the endpoint reference. For prose explanations of how the pieces fit together, read the Sogni Intelligence guides.

Authentication

Sogni API requests authenticate with a bearer token in the Authorization header. Tokens are issued from the Sogni account dashboard.

Authorization: Bearer YOUR_API_KEY

Two credential types

API keys — long-lived UUIDs scoped to a wallet, intended for backend and SDK use. Required for durable chat runs and creative workflow execution. Sign in to app.sogni.ai, open the profile menu in the top-right, and click API Key to generate or rotate yours.
Session JWTs — short-lived browser tokens issued by the Sogni auth flow. Recognized by the leading eyJ header marker that JWTs always carry. Most read endpoints accept either credential type.

The legacy api-key header is also accepted as a fallback, but new integrations should use Authorization: Bearer.
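
As a rough illustration of the two credential types, the sketch below tells a session JWT from an API key by the eyJ prefix and builds the preferred Authorization header. The helper names looks_like_jwt and auth_headers are hypothetical and not part of any Sogni SDK.

```python
def looks_like_jwt(token: str) -> bool:
    """Heuristic from the text above: session JWTs start with 'eyJ'."""
    return token.startswith("eyJ")


def auth_headers(token: str) -> dict:
    """Build headers using the preferred Authorization: Bearer scheme."""
    if not token:
        raise ValueError("missing credential")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

A backend that must use an API key (for example before starting a durable chat run) could call looks_like_jwt first and fail early when handed a browser token.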

Treat API keys like passwords. They authorize spend on your behalf. Rotate from the Sogni dashboard if a key is exposed; revoked keys stop authenticating immediately.

Versioning

Resources are mounted under /v1/*. A handful of resources have a newer-shape replacement at /v2/* (today: media and image upload URLs, wallet balance). This reference documents only the latest version of each endpoint. Legacy endpoints remain callable but are not promoted here — they live in upgrade and changelog notes.

Compatibility posture

Additive changes — new optional fields, new endpoints — ship without a version bump.
Behavioral changes that could surprise existing callers — new default values, stricter validation — are announced ahead of the bump.
Breaking changes ship under a new version prefix (e.g., /v2/); the prior version keeps running until the migration window closes.

Errors

The Sogni API uses standard HTTP status codes. Successful responses return 200, 201, or 202; failures return 4xx for caller mistakes and 5xx for server-side problems. Error bodies are JSON.

200	OK — synchronous success
201	Created — durable run, workflow, template, or replay record persisted
202	Accepted — durable chat run accepted for background execution
400	Validation error — body or query parameter failed validation
401	Authentication missing or invalid
402	Insufficient balance — vendor-model run needs Premium Spark or VIP status
404	Resource not found, or hidden from the caller
409	Conflict — duplicate confirm-cost, too many active workflows, invalid run state transition
413	Payload too large — replay record exceeds 1 MB
422	Unprocessable entity — workflow template failed schema validation, or template compile errors
429	Rate limited — see `Retry-After` header
500	Internal error — retry after backoff; report persistent failures

Error envelopes

LLM routes (/v1/chat/completions, /v1/models) emit the OpenAI-compatible error shape:

{
  "error": {
    "message": "'messages' is required and must be a non-empty array",
    "type": "invalid_request_error",
    "param": null,
    "code": "invalid_request_error"
  }
}

All other endpoints emit the Sogni envelope:

{
  "status": "error",
  "errorCode": 102,
  "message": "Durable workflow requires at least one step"
}
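
Because two envelopes exist, client code may want to normalize them before handling. The following is a minimal sketch that assumes only the fields shown in the two samples above; ApiError and parse_error are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class ApiError:
    message: str
    code: Union[str, int, None]
    kind: str  # "openai" or "sogni"


def parse_error(body: dict) -> Optional[ApiError]:
    """Return a normalized error, or None when the body is not an error envelope."""
    err = body.get("error")
    if isinstance(err, dict):  # OpenAI-compatible shape (LLM routes)
        return ApiError(err.get("message", ""), err.get("code"), "openai")
    if body.get("status") == "error":  # Sogni envelope (all other routes)
        return ApiError(body.get("message", ""), body.get("errorCode"), "sogni")
    return None
```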

Rate limits

Sogni applies per-route rate limits keyed on the authenticated wallet (or IP for unauthenticated bursts). When a limit is exceeded, the API returns 429 Too Many Requests with a Retry-After header indicating the cool-down period in seconds.
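
One way a client might honor Retry-After is sketched below. It assumes the header carries a plain number of seconds, as described above, and falls back to capped exponential backoff for 429 responses without the header and for 5xx responses. The function name and the base and cap defaults are illustrative choices, not documented values.

```python
from typing import Mapping, Optional


def retry_delay(status_code: int, headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (0-based), or None if not retryable."""
    if status_code == 429:
        value = headers.get("Retry-After")
        if value is not None:
            try:
                return max(0.0, float(value))
            except ValueError:
                pass  # not a plain number of seconds; fall back to backoff
    if status_code == 429 or status_code >= 500:
        return min(cap, base * (2 ** attempt))
    return None
```

Note that a plain dict lookup is case-sensitive; the headers object returned by the requests library is case-insensitive and can be passed directly.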

Per-endpoint family

Chat completions — moderate IP-keyed limit (60 requests/minute by default).
Chat runs — moderate IP-keyed limit on the start endpoint as defense-in-depth, plus a per-wallet start cap (30 starts/hour by default). confirm-cost has a strict per-wallet limit (30/minute) to dampen double-click loops.
Creative workflows — per-wallet start rate limit (10 starts/hour by default), plus an active-workflow cap (default 3 concurrent per wallet) and a global active-workflow ceiling.

Defaults can be overridden by environment configuration on the API host. The values above are the in-repo defaults.

Active-workflow cap. With the default cap of three, starting a fourth concurrent creative workflow returns 409. Cancel or finish an existing run before submitting another, or batch the work into a single multi-step workflow.

Idempotency

Write endpoints that can produce side effects accept an idempotency key. Reusing the same key for the same caller returns the original result instead of starting a duplicate run. Use this to make retries safe across network failures and double-clicks.

Headers (preferred)

Idempotency-Key: 7c9e6f7c-23a1-4f06-9d33-2dd5d6c8f5fb
# or
X-Idempotency-Key: 7c9e6f7c-23a1-4f06-9d33-2dd5d6c8f5fb

Supported endpoints

POST /v1/chat/runs — start a durable chat run (also accepts idempotency_key in the body)
POST /v1/chat/runs/:id/confirm-cost — resume a cost-approval pause
POST /v1/creative-agent/workflows — start a durable workflow

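Scope. The key is scoped to the calling wallet. A duplicate key from a different wallet does not collide. The maximum key length depends on the endpoint: 200 characters for chat runs and 192 for creative workflows, as listed in each endpoint's header table; UUIDs are ideal.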
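
A minimal sketch of safe-retry usage: generate one key per logical operation and send the same key on every retry of that operation. The helper names are hypothetical, and the length check uses 192, the stricter of the limits quoted in this reference.

```python
import uuid

MAX_KEY_LENGTH = 192  # stricter of the limits quoted in this reference


def new_idempotency_key() -> str:
    """One key per logical operation; reuse it for every retry of that operation."""
    return str(uuid.uuid4())


def idempotency_headers(key: str) -> dict:
    """Build the preferred Idempotency-Key header after a local length check."""
    if not key or len(key) > MAX_KEY_LENGTH:
        raise ValueError(f"idempotency key must be 1..{MAX_KEY_LENGTH} characters")
    return {"Idempotency-Key": key}
```

Generating a fresh key inside the retry loop would defeat the purpose, since each attempt would then look like a new request.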

Billing & tokens

Spend is denominated in two token types. Pick one explicitly via token_type on a request, or let the API pick automatically.

Token	How acquired	Used for
sogni	Native — earned via Supernet participation or staking	Default for Sogni-native models (Stable Diffusion, Flux, Qwen, LTX, WAN, ACE-Step)
spark	Purchased with cash	Required for external vendor models (e.g. `gpt-image-2`, `seedance2` family)

Selecting a token

{
  "token_type": "auto"   // "auto" (default) | "sogni" | "spark"
}

The X-Token-Type header is also accepted; the body field wins when both are present. Vendor-model jobs are normalized to spark automatically, regardless of preference.
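
The precedence rules above can be restated as a small function. This is an illustration of the documented behavior for client-side reasoning, not the server's code; resolve_token_type and its uses_vendor_model flag are hypothetical.

```python
VALID_TOKEN_TYPES = {"auto", "sogni", "spark"}


def resolve_token_type(body: dict, headers: dict, uses_vendor_model: bool = False) -> str:
    """Body field wins over the X-Token-Type header; vendor-model jobs always use spark."""
    if uses_vendor_model:
        return "spark"
    choice = body.get("token_type") or headers.get("X-Token-Type") or "auto"
    if choice not in VALID_TOKEN_TYPES:
        raise ValueError(f"unknown token type: {choice!r}")
    return choice
```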

Vendor model gating

Models from external vendors (OpenAI GPT Image 2, ByteDance Seedance 2.0) require an explicit opt-in by name ("model": "gpt-image-2", "videoModel": "seedance2"). The LLM router will never pick them on the caller's behalf. Workflows that bind a vendor model in a step return 402 immediately if the calling account is not eligible for Premium Spark, so no upstream steps run before the gate.

Cost approval

Creative workflows use a two-step confirmation: submit with confirm_cost: false to receive a 400 carrying the structured estimatedCapacity, then resubmit with confirm_cost: true to proceed. Use max_estimated_capacity_units as a hard cap — submissions over budget are rejected before persistence regardless of confirmation.
Chat runs opt in via runtime_config.requireJobConfirmation: true. Each paid media tool call then pauses the run in waiting_for_user with reason cost_approval_required and emits a run_awaiting_cost_confirmation SSE event; resume via POST /v1/chat/runs/:id/confirm-cost. The jobConfirmationThresholdUsd runtime-config field skips the pause when the estimate is below the threshold.
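
The two-step workflow confirmation could be wrapped roughly as follows. The transport is injected as a hypothetical post callable so the flow stays independent of any HTTP library, and approve is a caller-supplied callback that inspects the 400 body. The exact location of estimatedCapacity inside that body is not specified here, so the sketch leaves that lookup to the callback.

```python
from typing import Callable, Tuple

# Hypothetical transport: takes a JSON payload, returns (status_code, json_body).
Post = Callable[[dict], Tuple[int, dict]]


def start_with_cost_check(post: Post, payload: dict,
                          approve: Callable[[dict], bool]) -> Tuple[int, dict]:
    """Submit once for an estimate, then resubmit with confirm_cost: true if approved."""
    status, body = post({**payload, "confirm_cost": False})
    if status != 400:
        return status, body  # anything other than the estimate rejection
    if not approve(body):
        return status, body  # caller declined, or the 400 was a validation error
    return post({**payload, "confirm_cost": True})
```

Since a 400 can also signal a validation failure, the approve callback should return False when the body carries no estimate.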

Quick recipes

Direct generation

Need a single image, video clip, or music track and don't want an LLM in the loop? Submit a one-step creative workflow. The same POST /v1/creative-agent/workflows endpoint that powers multi-step storyboards also runs single hosted-tool calls — no chat session, no tool routing, no prompt engineering. The server validates the step, dispatches to the worker, streams progress over SSE, and returns the artifact URL when done.

Generate an image

POST /v1/creative-agent/workflows Auth required

Single-step text-to-image. The default model is flux2; swap arguments.model for any image model from GET /v1/models or the Creative Workflows catalog.

$ curl https://api.sogni.ai/v1/creative-agent/workflows \
  -H "Authorization: Bearer $SOGNI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "title": "Neon Tokyo alley",
      "steps": [
        {
          "id": "image1",
          "toolName": "generate_image",
          "arguments": {
            "prompt": "A cinematic neon-lit Tokyo alley during rain, shallow depth of field",
            "model": "flux2"
          }
        }
      ]
    }
  }'

const wf = await sogni.workflows.start({
  input: {
    title: 'Neon Tokyo alley',
    steps: [
      {
        id: 'image1',
        toolName: 'generate_image',
        arguments: {
          prompt: 'A cinematic neon-lit Tokyo alley during rain, shallow depth of field',
          model: 'flux2',
        },
      },
    ],
  },
});

for await (const event of sogni.workflows.streamEvents(wf.workflowId)) {
  if (event.type === 'workflow_completed') break;
}

import os, requests

resp = requests.post(
    "https://api.sogni.ai/v1/creative-agent/workflows",
    headers={"Authorization": f"Bearer {os.environ['SOGNI_API_KEY']}"},
    json={
        "input": {
            "title": "Neon Tokyo alley",
            "steps": [
                {
                    "id": "image1",
                    "toolName": "generate_image",
                    "arguments": {
                        "prompt": "A cinematic neon-lit Tokyo alley during rain, shallow depth of field",
                        "model": "flux2",
                    },
                }
            ],
        }
    },
)
workflow = resp.json()["data"]["workflow"]

Poll GET /v1/creative-agent/workflows/:id or subscribe to /v1/creative-agent/workflows/:id/events/stream for the SSE event stream. The completed artifact URL is on workflow.steps[0].artifacts[0].url.
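
A defensive way to read that path from a workflow snapshot is sketched below. It assumes only the steps[].artifacts[].url shape named above; first_artifact_url is an illustrative helper that returns None while the step has not produced an artifact yet.

```python
from typing import Optional


def first_artifact_url(workflow: dict, step_index: int = 0) -> Optional[str]:
    """Return workflow.steps[step_index].artifacts[0].url, or None if not available yet."""
    steps = workflow.get("steps") or []
    if step_index >= len(steps):
        return None
    artifacts = steps[step_index].get("artifacts") or []
    return artifacts[0].get("url") if artifacts else None
```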

Generate a video from a prompt

POST /v1/creative-agent/workflows Auth required

Single-step text-to-video. Common models: ltx23 (Sogni-native), wan22, or seedance2 / seedance2-fast (Premium Spark only).

$ curl https://api.sogni.ai/v1/creative-agent/workflows \
  -H "Authorization: Bearer $SOGNI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "title": "Neon city flythrough",
      "steps": [
        {
          "id": "clip",
          "toolName": "generate_video",
          "arguments": {
            "prompt": "A slow drone fly-through above a rain-soaked neon city, cinematic lighting",
            "videoModel": "ltx23",
            "duration": 5
          }
        }
      ]
    }
  }'

const wf = await sogni.workflows.start({
  input: {
    title: 'Neon city flythrough',
    steps: [
      {
        id: 'clip',
        toolName: 'generate_video',
        arguments: {
          prompt: 'A slow drone fly-through above a rain-soaked neon city, cinematic lighting',
          videoModel: 'ltx23',
          duration: 5,
        },
      },
    ],
  },
});

import os, requests

resp = requests.post(
    "https://api.sogni.ai/v1/creative-agent/workflows",
    headers={"Authorization": f"Bearer {os.environ['SOGNI_API_KEY']}"},
    json={
        "input": {
            "title": "Neon city flythrough",
            "steps": [
                {
                    "id": "clip",
                    "toolName": "generate_video",
                    "arguments": {
                        "prompt": "A slow drone fly-through above a rain-soaked neon city, cinematic lighting",
                        "videoModel": "ltx23",
                        "duration": 5,
                    },
                }
            ],
        }
    },
)

Image-to-video (animate a reference image)

POST /v1/creative-agent/workflows Auth required

Two-step composition: first generate_image emits a keyframe, then generate_video animates from it via an in-band dependsOn binding.

$ curl https://api.sogni.ai/v1/creative-agent/workflows \
  -H "Authorization: Bearer $SOGNI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "title": "Robot sketch → video",
      "steps": [
        {
          "id": "keyframe",
          "toolName": "generate_image",
          "arguments": { "prompt": "A graphite robot sketch on a drafting table" }
        },
        {
          "id": "clip",
          "toolName": "generate_video",
          "arguments": { "prompt": "Slow dolly-in as the sketch comes alive.", "duration": 5 },
          "dependsOn": [{
            "sourceStepId": "keyframe",
            "sourceArtifactIndex": 0,
            "targetArgument": "referenceImageIndices",
            "transform": "image_index",
            "required": true
          }]
        }
      ]
    }
  }'

const wf = await sogni.workflows.start({
  input: {
    title: 'Robot sketch → video',
    steps: [
      {
        id: 'keyframe',
        toolName: 'generate_image',
        arguments: { prompt: 'A graphite robot sketch on a drafting table' },
      },
      {
        id: 'clip',
        toolName: 'generate_video',
        arguments: { prompt: 'Slow dolly-in as the sketch comes alive.', duration: 5 },
        dependsOn: [{
          sourceStepId: 'keyframe',
          sourceArtifactIndex: 0,
          targetArgument: 'referenceImageIndices',
          transform: 'image_index',
          required: true,
        }],
      },
    ],
  },
});

import os, requests

resp = requests.post(
    "https://api.sogni.ai/v1/creative-agent/workflows",
    headers={"Authorization": f"Bearer {os.environ['SOGNI_API_KEY']}"},
    json={
        "input": {
            "title": "Robot sketch → video",
            "steps": [
                {
                    "id": "keyframe",
                    "toolName": "generate_image",
                    "arguments": {"prompt": "A graphite robot sketch on a drafting table"},
                },
                {
                    "id": "clip",
                    "toolName": "generate_video",
                    "arguments": {"prompt": "Slow dolly-in as the sketch comes alive.", "duration": 5},
                    "dependsOn": [{
                        "sourceStepId": "keyframe",
                        "sourceArtifactIndex": 0,
                        "targetArgument": "referenceImageIndices",
                        "transform": "image_index",
                        "required": True,
                    }],
                },
            ],
        }
    },
)
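
Before submitting a multi-step plan, a client can run a cheap local sanity check on the dependsOn bindings. The sketch below checks only for duplicate step IDs, self-references, and references to unknown steps. The server's actual validation rules are not described in this section, so treat validate_steps as a hypothetical pre-flight helper.

```python
def validate_steps(steps: list) -> None:
    """Raise ValueError on duplicate ids, self-references, or unknown sourceStepId values."""
    ids = [step["id"] for step in steps]
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    if duplicates:
        raise ValueError(f"duplicate step ids: {duplicates}")
    known = set(ids)
    for step in steps:
        for dep in step.get("dependsOn", []):
            source = dep["sourceStepId"]
            if source == step["id"]:
                raise ValueError(f"step {source!r} depends on itself")
            if source not in known:
                raise ValueError(f"step {step['id']!r} depends on unknown step {source!r}")
```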

Generate music

POST /v1/creative-agent/workflows Auth required

Single-step text-to-music using the ACE-Step audio family. For vocal songs, compose lyrics first with compose_lyrics (synchronous) and pass them as arguments.lyrics.

$ curl https://api.sogni.ai/v1/creative-agent/workflows \
  -H "Authorization: Bearer $SOGNI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "title": "Synthwave loop",
      "steps": [
        {
          "id": "track",
          "toolName": "generate_music",
          "arguments": {
            "prompt": "Mid-tempo synthwave with arpeggiated bass and warm pads, 90 BPM",
            "duration": 30
          }
        }
      ]
    }
  }'

const wf = await sogni.workflows.start({
  input: {
    title: 'Synthwave loop',
    steps: [
      {
        id: 'track',
        toolName: 'generate_music',
        arguments: {
          prompt: 'Mid-tempo synthwave with arpeggiated bass and warm pads, 90 BPM',
          duration: 30,
        },
      },
    ],
  },
});

import os, requests

resp = requests.post(
    "https://api.sogni.ai/v1/creative-agent/workflows",
    headers={"Authorization": f"Bearer {os.environ['SOGNI_API_KEY']}"},
    json={
        "input": {
            "title": "Synthwave loop",
            "steps": [
                {
                    "id": "track",
                    "toolName": "generate_music",
                    "arguments": {
                        "prompt": "Mid-tempo synthwave with arpeggiated bass and warm pads, 90 BPM",
                        "duration": 30,
                    },
                }
            ],
        }
    },
)

Going further. Direct generation uses the same persisted workflow runtime as multi-step jobs — every direct call gets a workflowId, SSE event stream, cancel, resume, and reseed support. See Creative Workflows for the full surface; see Chat Completions if you want the LLM to choose the tool and arguments from a natural-language prompt instead.

Surface · OpenAI-compatible

Chat Completions

OpenAI-compatible chat with optional server-side Sogni creative tools. Drop-in for any client that speaks the OpenAI chat shape. Supports streaming via SSE, vision input (inline data URIs), custom function tools, and the Sogni tool families.

POST /v1/chat/completions Auth required

Create a chat completion. Drop-in OpenAI-compatible. Returns a single JSON response, or a stream of OpenAI-style SSE events when stream: true.

Body

Name	Type	In	Description
messages*	array	body	Non-empty OpenAI-style message array. The `developer` role is normalized to `system`. User messages may carry mixed `text` + `image_url` parts (vision).
model	string	body	LLM model id. Defaults to `qwen3.6-35b-a3b-gguf-iq4xs`. Vendor models (e.g. `gpt-image-2`) require explicit naming and Premium Spark.
stream	boolean	body	When `true`, returns OpenAI-style SSE chunks.
max_tokens	integer	body	Maximum output tokens. `max_completion_tokens` is accepted as an OpenAI-SDK alias.
temperature	number	body	Sampling temperature. Forwarded to the LLM worker.
top_p	number	body	Nucleus sampling. Forwarded to the LLM worker.
tools	array	body	Standard OpenAI function-tool array. Merged with the auto-injected Sogni tool family unless `sogni_tools` is `"none"`.
tool_choice	string|object	body	OpenAI tool-choice. Defaults to `"auto"` when Sogni tools are injected.
sogni_tools	string|boolean	body	Sogni tool family. `"creative-tools"` (default): image/video/music + composition. `"creative-agent"`: adds workflow planners and asset-manifest tools. `false` or `"none"`: text-only.
sogni_tool_execution	boolean	body	When `true` (default), the API executes Sogni tool calls server-side and returns the final assistant message with media URLs. Set `false` to receive raw `tool_calls` and run the loop yourself. Only takes effect with API-key auth.
task_profile	string	body	Optional task profile. `general`, `coding`, or `reasoning`. Defaults to `coding` when any message uses the `developer` role.
token_type	string	body	`spark`, `sogni`, or `auto` (default). `X-Token-Type` header accepted; body wins.
media_references	array	body	Optional uploaded/request media metadata available to hosted creative tools.
chat_template_kwargs	object	body	Forwarded to the worker. Thinking-mode controls and similar template kwargs go here. The API merges `enable_thinking: true` on top.
reasoning_effort	string	body	Optional reasoning hint: `minimal`, `low`, `medium`, `high`. Also accepts `reasoning.effort`.
response_format	object	body	OpenAI structured-output / JSON-mode hint. Forwarded.
app_source	string	body	Optional caller label (max 128 chars). `X-App-Source` header accepted.

Vision limits. Up to 20 images per request; each image ≤ 10 MB and ≤ 1024 px on its longest side; PNG or JPEG only; must be an inline base64 data: URI.
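
The sketch below builds a user message with inline images and checks the count, size, and format limits client-side. It assumes the standard OpenAI image_url content-part shape, treats 10 MB as 10 × 1024 × 1024 bytes of raw image data, and does not check pixel dimensions (that would need an image library). The helper names are illustrative.

```python
import base64

MAX_IMAGES = 20
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: binary megabytes, measured on raw bytes
MIME_BY_MAGIC = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}


def image_part(raw: bytes) -> dict:
    """Encode PNG or JPEG bytes as an inline base64 data: URI content part."""
    if len(raw) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds 10 MB")
    for magic, mime in MIME_BY_MAGIC.items():
        if raw.startswith(magic):
            encoded = base64.b64encode(raw).decode("ascii")
            return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}
    raise ValueError("only PNG or JPEG images are accepted")


def user_message(text: str, images: list) -> dict:
    """Build a mixed text + image_url user message."""
    if len(images) > MAX_IMAGES:
        raise ValueError("at most 20 images per request")
    return {"role": "user", "content": [{"type": "text", "text": text}, *map(image_part, images)]}
```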

Tool families

Value	What's injected
"creative-tools"	Default. Image/video/music generation + editing + analysis, plus synchronous composition tools (`enhance_prompt`, `compose_script`, `compose_lyrics`, `compose_instrumental`).
"creative-agent"	Everything in `creative-tools` plus workflow control, asset-manifest tools, and the workflow planner `compose_workflow_template`.
false / "none"	No Sogni tools injected. Text-only or your own custom `tools`.

$ curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer $SOGNI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Generate a cinematic image of a neon alley in Tokyo during rain."}
    ],
    "sogni_tools": "creative-tools"
  }'

import { SogniClient } from '@sogni-ai/sogni-client';

const sogni = await SogniClient.createInstance({
  appId: 'your-app',
  apiKey: process.env.SOGNI_API_KEY,
});

const result = await sogni.chat.hosted.create({
  messages: [
    { role: 'user', content: 'Generate a cinematic image of a neon alley in Tokyo during rain.' },
  ],
  sogni_tools: 'creative-tools',
});

console.log(result.choices[0].message);

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.sogni.ai/v1",
    api_key=os.environ["SOGNI_API_KEY"],
)

response = client.chat.completions.create(
    model="qwen3.6-35b-a3b-gguf-iq4xs",
    messages=[
        {"role": "user", "content": "Generate a cinematic image of a neon alley in Tokyo during rain."},
    ],
    extra_body={"sogni_tools": "creative-tools"},
)

print(response.choices[0].message)

Response (synchronous)

{
  "id": "chatcmpl-XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "object": "chat.completion",
  "created": 1731950400,
  "model": "qwen3.6-35b-a3b-gguf-iq4xs",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Here's your image:\n\n![Generated image](https://…/image.png)"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 18,
    "total_tokens": 60,
    "prompt_tokens_details": { "cached_tokens": 0 }
  }
}

Media URLs. When the API executes Sogni tools server-side, generated media URLs are injected into the assistant message as Markdown — ![alt](url) for images, [▶ Generated video](url) for video, [▶ Generated music](url) for audio. Set sogni_tool_execution: false to receive raw tool_calls and run the loop yourself.
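
If a client needs the URLs rather than the rendered Markdown, they can be pulled out with regular expressions. This sketch assumes the three link forms listed above and URLs without spaces or closing parentheses; extract_media is a hypothetical helper.

```python
import re

IMAGE_RE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")
LINK_RE = re.compile(r"(?<!!)\[▶ Generated (video|music)\]\(([^)\s]+)\)")


def extract_media(content: str) -> dict:
    """Collect image, video, and music URLs from an assistant message."""
    media = {"image": IMAGE_RE.findall(content), "video": [], "music": []}
    for kind, url in LINK_RE.findall(content):
        media[kind].append(url)
    return media
```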

Streaming. Set stream: true and consume text/event-stream chunks. Each chunk is an OpenAI-compatible delta. Sogni tool progress is injected as content deltas in the same stream when sogni_tool_execution is enabled.
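
A minimal parser for such a stream is sketched below. It assumes the usual OpenAI conventions: each chunk arrives on a data: line as JSON with choices[].delta.content, and the stream ends with a data: [DONE] sentinel. Whether Sogni emits that sentinel is an assumption here; the loop also ends cleanly when the input is exhausted.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text of each content delta from OpenAI-style SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

With the requests library, the lines could come from response.iter_lines(decode_unicode=True) on a request made with stream=True.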

Surface · Catalog

Models

Discover LLMs available to your account. The exact catalog rotates as new workers come online — always consult the live endpoint rather than hardcoding model IDs.

GET /v1/models Auth required

List the LLM models currently routable for the caller's account.

$ curl https://api.sogni.ai/v1/models \
  -H "Authorization: Bearer $SOGNI_API_KEY"

const res = await fetch('https://api.sogni.ai/v1/models', {
  headers: { Authorization: `Bearer ${process.env.SOGNI_API_KEY}` },
});
const { data } = await res.json();

import os

from openai import OpenAI

client = OpenAI(base_url="https://api.sogni.ai/v1", api_key=os.environ["SOGNI_API_KEY"])
models = client.models.list()

Response

{
  "object": "list",
  "data": [
    {
      "id": "qwen3.6-35b-a3b-gguf-iq4xs",
      "object": "model",
      "created": 1776384000,
      "owned_by": "qwen",
      "capabilities": { "reasoning": true }
    },
    {
      "id": "qwen3.5-35b-a3b-abliterated-gguf-q4km",
      "object": "model",
      "created": 1772582400,
      "owned_by": "qwen",
      "capabilities": { "reasoning": true }
    }
  ]
}

GET /v1/models/:model_id Auth required

Read a single model record.

Path parameters

Name	Type	In	Description
model_id*	string	path	An ID from `GET /v1/models`.

Surface · Durable agent

Chat Runs (durable)

A durable counterpart to POST /v1/chat/completions. Use it when a single chat turn may run long, call multiple tools, hit a safety gate, or require human cost approval before spending Spark. The server persists the run, streams typed events over SSE, and exposes cancel + cost-approval primitives. Status values: queued, running, completed, partial_failure, waiting_for_user, failed, cancelled.

POST /v1/chat/runs Auth required

Start a durable chat run. Returns 202 Accepted on first submission, 200 OK with idempotent: true when an idempotency key matches an existing run.

Headers

Name	Type	In	Description
Idempotency-Key	string	header	Optional. `X-Idempotency-Key` also accepted; body `idempotency_key` accepted as fallback. Max 200 chars.

Body

Name	Type	In	Description
messages*	array	body	Non-empty OpenAI-style message array.
tools	array	body	Custom OpenAI function-tool array.
tool_choice	string|object	body	OpenAI tool-choice.
model	string	body	Optional LLM model id.
sampling	object	body	Optional sampling parameters (temperature, top_p, etc.).
media_references	array	body	Optional durable media references. All durable URLs must be externally addressable.
media_context	object	body	Optional initial media context: `images[]`, `videos[]`, `audio[]`, plus `uploadedImages[]` / `uploadedVideos[]` / `uploadedAudio[]` for caller-supplied uploads.
max_estimated_capacity_units	number	body	Recorded on the request snapshot so callers can surface the ceiling alongside the run. Not currently enforced server-side on chat runs.
confirm_cost	boolean	body	Recorded on the request snapshot. To actually pause chat runs for cost approval, set `runtime_config.requireJobConfirmation: true` (see below) — paid media tool calls will then emit `run_awaiting_cost_confirmation` SSE events and wait for `confirm-cost`.
session_id	string	body	Optional caller session identifier.
client_message_id	string	body	Optional caller message identifier — useful for client-side correlation.
token_type	string	body	`spark`, `sogni`, or `auto`.
app_source	string	body	Optional caller label.
runtime_config	object	body	Run-time tuning. Fields: `qualityTier` (`fast`|`hq`|`pro`), `safeContentFilter` (bool), `personaNames` (string[]), `requireJobConfirmation` (bool; set `true` to pause before each paid media tool dispatch), `jobConfirmationThresholdUsd` (number; skip the pause when the estimate is below this).

$ curl https://api.sogni.ai/v1/chat/runs \
  -H "Authorization: Bearer $SOGNI_API_KEY" \
  -H "Idempotency-Key: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Storyboard a 5-shot neon teaser, 9:16, 15s."}
    ],
    "runtime_config": { "requireJobConfirmation": true }
  }'

const run = await sogni.chat.runs.create({
  messages: [{ role: 'user', content: 'Storyboard a 5-shot neon teaser, 9:16, 15s.' }],
  runtimeConfig: { requireJobConfirmation: true },
});

for await (const event of sogni.chat.runs.streamEvents(run.runId)) {
  if (event.type === 'run_awaiting_cost_confirmation') {
    await sogni.chat.runs.confirmCost(run.runId, {
      toolCallId: event.payload.toolCallId,
      decision: 'confirm',
    });
  }
}

import os, requests, uuid

resp = requests.post(
    "https://api.sogni.ai/v1/chat/runs",
    headers={
        "Authorization": f"Bearer {os.environ['SOGNI_API_KEY']}",
        "Idempotency-Key": str(uuid.uuid4()),
    },
    json={
        "messages": [{"role": "user", "content": "Storyboard a 5-shot neon teaser, 9:16, 15s."}],
        "runtime_config": {"requireJobConfirmation": True},
    },
)
run = resp.json()["data"]["run"]

Response (`HTTP 202 Accepted` on first submit, `HTTP 200 OK` when an idempotency key matches an existing run)

{
  "status": "success",
  "data": {
    "run": {
      "runId": "run_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
      "ownerWalletAddress": "0x…",
      "status": "queued",
      "schemaVersion": "…",
      "backbone": { "schemaVersion": "…", "modelKnowledgeVersion": "…", "routingPolicyVersion": "…" },
      "timestamps": { "createdAt": "…", "updatedAt": "…" },
      "scope": { "ownerWalletAddress": "0x…", "appSource": "sogni-api" },
      "request": { /* request snapshot */ },
      "messages": [],
      "toolCalls": [],
      "toolResults": [],
      "mediaContext": { "images": [], "videos": [], "audio": [], "uploadedImages": [], "uploadedVideos": [], "uploadedAudio": [] },
      "events": [{ "sequence": 0, "type": "run_created", "at": "…" }],
      "createTime": 1731950400000,
      "updateTime": 1731950400000
    },
    "idempotent": false
  }
}

GET /v1/chat/runs/:id Auth required

Read the full run snapshot — current status, request, events, and (when paused) the waiting reason.

GET /v1/chat/runs/:id/events Auth required

Read the persisted event log. Use ?after=<sequence> to fetch only events past a known sequence number.

Query parameters

Name	Type	In	Description
after	integer	query	Only return events with `sequence > after`.

GET /v1/chat/runs/:id/events/stream Auth required

Server-Sent Events stream. Replays persisted events, then polls for new ones until the run reaches a terminal status (completed, failed, partial_failure, cancelled). Supports Last-Event-ID for resume and ?after=<sequence>.

Event types

Lifecycle — run_created, run_resumed, run_completed, run_partial_failure, run_failed, run_cancelled
Rounds — llm_round_started, assistant_message_delta, assistant_message_completed
Tool calls — tool_call_dispatched, tool_call_progress, tool_call_resolved
Media + state — media_context_updated, media_turn_intent_classified, asset_manifest_updated, billing_preview_updated
Pause + resume — run_waiting_for_user, run_awaiting_cost_confirmation (emits toolCallId + estimate), run_cost_confirmation_resolved
Synthetic — run_status frames emitted by the stream on status transitions

id: 42
event: tool_call_dispatched
data: {"runId":"run_…","sequence":42,"type":"tool_call_dispatched","payload":{…}}

id: 43
event: run_awaiting_cost_confirmation
data: {"runId":"run_…","sequence":43,"type":"run_awaiting_cost_confirmation","payload":{"toolCallId":"call_…","estimatedCapacityUnits":120}}
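
Frames like the ones above can be parsed with a few lines of code. The sketch below handles the id, event, and data fields, dispatches on the blank line that ends each frame, and assumes every data payload is JSON. iter_sse_events is an illustrative name, not an SDK function.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {"id", "event", "data"} for each complete SSE frame."""
    event_id, event_type, data = None, None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the frame
            if data:
                yield {"id": event_id, "event": event_type, "data": json.loads("\n".join(data))}
            event_id, event_type, data = None, None, []
            continue
        if line.startswith(":"):
            continue  # SSE comment / keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "id":
            event_id = value
        elif field == "event":
            event_type = value
        elif field == "data":
            data.append(value)
```

Remembering the id of the last frame processed lets a client reconnect with Last-Event-ID (or ?after=<sequence>) and continue where it left off.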

POST /v1/chat/runs/:id/cancel Auth required

Cooperative cancel. Flips the run to cancelled, halts any in-flight tool calls owned by the run, and appends a run_cancelled event so any SSE listeners see the transition.

Body

Name	Type	In	Description
reason	string	body	Optional cancellation reason. Defaults to `user_cancelled`.

Response

{
  "status": "success",
  "data": { "run": { "runId": "run_…", "status": "cancelled" }, "aborted": true }
}

POST /v1/chat/runs/:id/confirm-cost Auth required

Resume a run that paused with waiting_for_user + cost_approval_required. Records the caller's decision, flips the run back to running, and dispatches the held tool calls.

Body

Name	Type	In	Description
tool_call_id*	string	body	ID of the paused tool call (from the `run_awaiting_cost_confirmation` event).
decision*	string	body	`"confirm"` or `"cancel"`.
overrides	object	body	Optional argument overrides applied to the tool call on resume.
reason	string	body	Optional caller-supplied reason recorded with the decision.
idempotency_key	string	body	Optional. `Idempotency-Key` / `X-Idempotency-Key` headers also accepted.

Insufficient-credits and safety-review pauses cannot be resumed via confirm-cost. An insufficient-credits pause requires topping up and starting a fresh run; a safety-review pause requires POST /v1/chat/runs/:id/cancel to release the run. The API returns 409 with a routing message in either case.
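
A small builder for the confirm-cost request body, using only the fields in the table above. confirm_cost_body is a hypothetical helper; it validates the decision locally so that a typo fails before the request is sent.

```python
from typing import Optional


def confirm_cost_body(tool_call_id: str, decision: str,
                      overrides: Optional[dict] = None,
                      reason: Optional[str] = None) -> dict:
    """Build the JSON body for POST /v1/chat/runs/:id/confirm-cost."""
    if not tool_call_id:
        raise ValueError("tool_call_id is required")
    if decision not in ("confirm", "cancel"):
        raise ValueError('decision must be "confirm" or "cancel"')
    body = {"tool_call_id": tool_call_id, "decision": decision}
    if overrides:
        body["overrides"] = overrides
    if reason:
        body["reason"] = reason
    return body
```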

Surface · Pre-planned execution

Creative Workflows

Durable multi-step creative jobs with an explicit steps[] dependency graph. Use this when your application has already decided what to do — storyboards, image→video, batch generation. The API executes, persists state, streams SSE events, and supports cancel, resume, and reseed. (User-facing surfaces call these "cloud workflows".)

POST /v1/creative-agent/workflows Auth required

Start a durable creative workflow. Provide an inline input.steps plan or invoke a saved template by workflow_id + inputs. The two are mutually exclusive.

Headers

Name	Type	In	Description
Idempotency-Key	string	header	Optional. `X-Idempotency-Key` also accepted. Max 192 chars.
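One possible way to choose the key, sketched here under assumptions, is to derive it from the request body so that a retried submission of the same plan reuses the same key. The helper name idempotency_key_for and the wf prefix are hypothetical; the only constraint taken from the table is the 192-character maximum.

```python
import hashlib
import json

MAX_IDEMPOTENCY_KEY_CHARS = 192  # limit stated in the Headers table


def idempotency_key_for(payload, prefix="wf"):
    """Derive a stable key from the JSON payload (same content, same key)."""
    # Sorting keys makes the serialisation independent of dict ordering.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    key = f"{prefix}-{digest}"
    if len(key) > MAX_IDEMPOTENCY_KEY_CHARS:
        raise ValueError("idempotency key exceeds 192 characters")
    return key


payload = {"input": {"title": "Robot sketch", "steps": []}, "confirm_cost": True}
key = idempotency_key_for(payload)
```

A content-derived key also collapses two deliberate submissions of an identical plan into one. When that is not wanted, a random UUID generated once and reused across retries of the same attempt is the safer choice.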

Body — inline steps

Name	Type	In	Description
input	object	body	Inline workflow plan. Allowed fields: `title`, `steps`. Required when `workflow_id` is absent.
input.title	string	body	Optional human-readable title for the run.
input.steps*	array	body	Array of step inputs. Each step: `id`, `toolName`, `arguments`, and an optional `dependsOn` array linking it to upstream artifacts.
workflow_id	string	body	Optional saved-template ID. Compiled server-side into a fresh `steps[]` before execution.
inputs	object	body	Object of typed input values when invoking a template. Keys match `WorkflowTemplate.inputs[].name`.
token_type	string	body	`spark`, `sogni`, or `auto`.
app_source	string	body	Optional caller label.
max_estimated_capacity_units	number	body	Hard ceiling on estimated capacity units. Over-budget submissions are rejected before persistence.
confirm_cost	boolean	body	Cost-confirmation gate. Submit with `false` to request an estimate — the API rejects with `400` and a structured `estimatedCapacity` body so you can show the user the cost. Resubmit with `true` to proceed.
media_references	array	body	Optional request media references available to `$input_media` bindings and negative media indices.

$ curl https://api.sogni.ai/v1/creative-agent/workflows \
  -H "Authorization: Bearer $SOGNI_API_KEY" \
  -H "Idempotency-Key: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "title": "Robot sketch → video",
      "steps": [
        {
          "id": "keyframe",
          "toolName": "generate_image",
          "arguments": { "prompt": "A graphite robot sketch on a drafting table" }
        },
        {
          "id": "clip",
          "toolName": "generate_video",
          "arguments": { "prompt": "Slow dolly-in on the keyframe.", "duration": 5 },
          "dependsOn": [{
            "sourceStepId": "keyframe",
            "sourceArtifactIndex": 0,
            "targetArgument": "referenceImageIndices",
            "transform": "image_index",
            "required": true
          }]
        }
      ]
    },
    "token_type": "auto",
    "max_estimated_capacity_units": 25,
    "confirm_cost": true
  }'

const wf = await sogni.workflows.start({
  input: {
    title: 'Robot sketch → video',
    steps: [
      {
        id: 'keyframe',
        toolName: 'generate_image',
        arguments: { prompt: 'A graphite robot sketch on a drafting table' },
      },
      {
        id: 'clip',
        toolName: 'generate_video',
        arguments: { prompt: 'Slow dolly-in on the keyframe.', duration: 5 },
        dependsOn: [{
          sourceStepId: 'keyframe',
          sourceArtifactIndex: 0,
          targetArgument: 'referenceImageIndices',
          transform: 'image_index',
          required: true,
        }],
      },
    ],
  },
  max_estimated_capacity_units: 25,
  confirm_cost: true,
});

import os, requests, uuid

resp = requests.post(
    "https://api.sogni.ai/v1/creative-agent/workflows",
    headers={
        "Authorization": f"Bearer {os.environ['SOGNI_API_KEY']}",
        "Idempotency-Key": str(uuid.uuid4()),
    },
    json={
        "input": {
            "title": "Robot sketch → video",
            "steps": [
                {
                    "id": "keyframe",
                    "toolName": "generate_image",
                    "arguments": {"prompt": "A graphite robot sketch on a drafting table"},
                },
                {
                    "id": "clip",
                    "toolName": "generate_video",
                    "arguments": {"prompt": "Slow dolly-in on the keyframe.", "duration": 5},
                    "dependsOn": [{
                        "sourceStepId": "keyframe",
                        "sourceArtifactIndex": 0,
                        "targetArgument": "referenceImageIndices",
                        "transform": "image_index",
                        "required": True,
                    }],
                },
            ],
        },
        "max_estimated_capacity_units": 25,
        "confirm_cost": True,
    },
)
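The dependsOn entries in a plan like the one above form a dependency graph, so it can be useful to check locally that every sourceStepId resolves and that the graph has no cycles before submitting. The sketch below does that with the standard-library graphlib module (Python 3.9+). The function name execution_order is illustrative, and the checks are a client-side convenience; they are not a description of the server's validation.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(steps):
    """Return step ids in an order that respects dependsOn, or raise ValueError."""
    ids = [step["id"] for step in steps]
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate step id")
    graph = {}
    for step in steps:
        # Each dependsOn entry names the upstream step it reads from.
        upstream = {dep["sourceStepId"] for dep in step.get("dependsOn", [])}
        unknown = upstream - set(ids)
        if unknown:
            raise ValueError(
                f"step {step['id']!r} depends on unknown step(s): {sorted(unknown)}"
            )
        graph[step["id"]] = upstream
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


steps = [
    {"id": "keyframe", "toolName": "generate_image", "arguments": {}},
    {
        "id": "clip",
        "toolName": "generate_video",
        "arguments": {},
        "dependsOn": [{"sourceStepId": "keyframe", "sourceArtifactIndex": 0}],
    },
]
order = execution_order(steps)
```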

Response

{
  "status": "success",
  "data": {
    "workflow": {
      "workflowId": "wf_…",
      "status": "queued",
      "input": { "title": "Robot sketch → video", "steps": [ /* … */ ] },
      "events": []
    }
  }
}
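The confirm_cost gate described in the Body table lends itself to a two-phase submit: send the plan with confirm_cost set to false, show the returned estimate, then resubmit with true. The sketch below shows that control flow with the HTTP call injected as a function, so no response shape has to be guessed. The names start_with_cost_gate and fake_post are hypothetical, and the stub's response bodies and 200 status are invented purely to exercise the flow.

```python
import copy


def start_with_cost_gate(post, payload, approve):
    """Two-phase submit: request an estimate, then resubmit if approved.

    post(payload) must return (status_code, parsed_json_body).
    approve(body) receives the 400 body and returns True to proceed.
    Returns the final (status_code, body), or None if the caller declined.
    """
    probe = copy.deepcopy(payload)
    probe["confirm_cost"] = False
    status, body = post(probe)
    if status != 400:
        return status, body  # not the documented estimate rejection; hand it back
    if not approve(body):
        return None
    confirmed = copy.deepcopy(payload)
    confirmed["confirm_cost"] = True
    return post(confirmed)


# Stand-in transport for illustration only. The body shapes and the 200 status
# are invented; a real `post` would call POST /v1/creative-agent/workflows.
calls = []


def fake_post(payload):
    calls.append(payload["confirm_cost"])
    if not payload["confirm_cost"]:
        return 400, {"estimatedCapacity": {"units": 12}}
    return 200, {"status": "success"}


result = start_with_cost_gate(
    fake_post,
    {"input": {"steps": []}, "max_estimated_capacity_units": 25},
    approve=lambda body: body["estimatedCapacity"]["units"] <= 25,
)
```

Because an ordinary validation error can also surface as 400, a real approve callback should first confirm that the body actually carries an estimate before treating it as one.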

Active-workflow cap. Each wallet may run 3 workflows concurrently by default. A 4th submission returns 409; cancel or finish one before submitting another.

GET /v1/creative-agent/workflows Auth required

List the caller's workflows.

Query parameters

Name	Type	In	Description
offset	integer	query	Default `0`.
limit	integer	query	Default `20`, max `100`.

Response

{
  "status": "success",
  "data": { "workflows": [ /* … */ ], "next": 20 }
}
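To walk the full list, a client can keep requesting pages until the response stops advertising a following offset. The sketch below assumes that next is the offset of the following page and that it is absent or null on the last page; the reference shows only the value 20, so treat that reading as an assumption. The names iter_workflows and fake_fetch and the wf_example_ IDs are illustrative.

```python
def iter_workflows(fetch_page, limit=20):
    """Yield every workflow, following the `next` offset between pages.

    fetch_page(offset, limit) must return the parsed JSON response body.
    """
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    offset = 0
    while True:
        data = fetch_page(offset, limit)["data"]
        yield from data["workflows"]
        next_offset = data.get("next")
        # Assumption: a missing/null `next` marks the last page. The second
        # check guards against looping forever if the offset does not advance.
        if next_offset is None or next_offset <= offset:
            return
        offset = next_offset


# Stand-in for the HTTP call, serving 45 invented workflows.
ALL_WORKFLOWS = [{"workflowId": f"wf_example_{i}"} for i in range(45)]


def fake_fetch(offset, limit):
    data = {"workflows": ALL_WORKFLOWS[offset:offset + limit]}
    if offset + limit < len(ALL_WORKFLOWS):
        data["next"] = offset + limit
    return {"status": "success", "data": data}


ids = [wf["workflowId"] for wf in iter_workflows(fake_fetch, limit=20)]
```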

GET /v1/creative-agent/workflows/:id Auth required

Read a single workflow snapshot.

GET /v1/creative-agent/workflows/:id/events Auth required

Read the persisted event log for a workflow.

GET /v1/creative-agent/workflows/:id/events/stream Auth required

Server-Sent Events stream of workflow events. Replays persisted history, then streams live updates until terminal.

POST /v1/creative-agent/workflows/:id/resume Auth required

Resume a workflow from persisted state (e.g. after a waiting state clears).

Body

Name	Type	In	Description
token_type	string	body	Optional. Overrides the original run's billing preference.
app_source	string	body	Optional caller label.

POST /v1/creative-agent/workflows/:id/cancel Auth required

Cooperative cancel. Flips the workflow to cancelled and emits a cancel event for SSE replay.

POST /v1/creative-agent/workflows/:id/reseed Auth required

Clone a completed workflow with fresh RNG seeds — "alternate takes" without retyping a plan. Returns a new workflowId.

Body

Name	Type	In	Description
seed_overrides	object	body	Optional per-step seed overrides. Omit to let the server generate fresh seeds for every step.
token_type	string	body	Optional billing preference.
app_source	string	body	Optional caller label.

Surface · Reusable recipes

Workflow Templates

Saveable, parameterized versions of a workflow plan. A template declares typed inputs[] and a list of parameterized stages[]. Invoking by workflow_id + inputs compiles a fresh steps[] and starts a normal creative-workflow run.

POST /v1/creative-agent/workflows/templates Auth required

Create a workflow template. Returns 422 with structured issues when the template fails schema validation.

Body

Name	Type	In	Description
name*	string	body	Human-readable template name.
description	string	body	Plain-text description shown in builders and listings.
brief	string	body	Optional planner brief that produced this template.
category	string	body	Template category. Defaults to `custom`.
stability	string	body	`experimental` (default), `beta`, or `stable`.
author	string	body	Defaults to `system`.
visibility	string	body	`private` (default), `public`.
inputs*	array	body	Typed input slots. Each entry declares `name`, `type`, optional default + validation.
stages*	array	body	Parameterized stages that compile into `steps[]` at run time. Bind to inputs via `$inputs.<name>`.
exposeToLLM	boolean	body	When `true`, the template appears as a callable tool in chat (`workflow:<id>`).
tags	string[]	body	Free-form tags.
graph	object	body	Optional visual-builder graph metadata. Round-trips with the form-based view.
previewArtifacts	object	body	Optional preview artifacts (sample image URLs etc.) used in template listings.
estimatedCapacityUnits	number\|object	body	Optional capacity-unit estimate. Used by callers to show price before run.
llmPriority	number	body	Optional ranking signal when the LLM exposes saved workflows as tools.
metadata	object	body	Arbitrary caller metadata.
id	string	body	Optional caller-chosen ID. Otherwise minted server-side.

Response

{
  "status": "success",
  "data": { "template": { "id": "wf_…", "version": "0.0.1", /* … */ } }
}
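To make the $inputs.<name> binding in stages concrete, the sketch below resolves such bindings against a set of input values, which is roughly what compiling a stage into a step involves. It is an illustration only and not the server's compiler: the stage shape (mirroring a step's id, toolName, and arguments), the count and style arguments, and the function name resolve_bindings are all assumptions, and only whole-string bindings are handled.

```python
def resolve_bindings(value, inputs):
    """Recursively replace "$inputs.<name>" strings with values from inputs."""
    prefix = "$inputs."
    if isinstance(value, str) and value.startswith(prefix):
        name = value[len(prefix):]
        if name not in inputs:
            raise KeyError(f"missing template input: {name}")
        return inputs[name]
    if isinstance(value, dict):
        return {key: resolve_bindings(item, inputs) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_bindings(item, inputs) for item in value]
    return value  # numbers, booleans, None, and plain strings pass through


# Hypothetical stage; the real stage schema is not shown in this reference.
stage = {
    "id": "hero",
    "toolName": "generate_image",
    "arguments": {
        "prompt": "$inputs.subject",
        "count": "$inputs.takes",
        "style": "graphite",
    },
}
step = resolve_bindings(stage, {"subject": "A graphite robot sketch", "takes": 2})
```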

GET /v1/creative-agent/workflows/templates Auth required

List workflow templates. Filter by visibility.

Query parameters

Name	Type	In	Description
visibility	string	query	`own`, `public`, or `all` (default).
offset	integer	query	Default `0`.
limit	integer	query	Default `20`, max `100`.

GET /v1/creative-agent/workflows/templates/:id Auth required

Fetch a single template.

PATCH /v1/creative-agent/workflows/templates/:id Auth required

Update a template. Same field set as create; the new shape is re-validated before persistence.

DELETE /v1/creative-agent/workflows/templates/:id Auth required

Delete a template. Existing in-flight runs continue against their pinned template version.

POST /v1/creative-agent/workflows/templates/:id/fork Auth required

Fork a public or shared template into the caller's namespace. Returns the new template.

Surface · Asset transport

Media Upload URLs

Presigned S3-style URLs for uploading reference media (audio, video) and downloading generated artifacts. The latest revision returns a presigned POST with form fields and a server-enforced max file size, replacing the previous PUT-style flow.

GET /v2/media/uploadUrl Auth required

Request a presigned POST URL plus form fields for uploading a single media object up to maxSizeBytes.

Query parameters

Name	Type	In	Description
type*	string	query	Asset type. Artist-side: `referenceAudio`, `referenceVideo`. Worker-side: `complete`, `preview`.
jobId*	string	query	Job identifier the asset belongs to.
id	string	query	Artifact identifier. Required for worker uploads; omit for artist pre-uploads.
contentType	string	query	Optional MIME type. Pinned so downloads sign the same S3 object key.

Response

{
  "status": "success",
  "data": {
    "url": "https://sogni-uploads.s3-accelerate.amazonaws.com/",
    "fields": {
      "Content-Type": "video/mp4",
      "bucket": "sogni-uploads",
      "key": "video/2026-05-18/<jobId>/reference.mp4",
      "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
      "X-Amz-Credential": "…",
      "X-Amz-Date": "…",
      "Policy": "<base64-encoded-policy>",
      "X-Amz-Signature": "<hex-signature>"
    },
    "maxSizeBytes": 104857600,
    "allowedContentTypes": [ /* see below — varies by `type` */ ]
  }
}

allowedContentTypes depends on the asset type:

Video uploads (e.g. referenceVideo): video/mp4, video/quicktime, video/webm
Audio uploads (e.g. referenceAudio): audio/mp4, audio/mpeg, audio/flac, audio/wav, audio/x-wav, audio/wave
Preview frames: image/jpeg, image/jpg, image/png, image/webp

How to upload. POST to url as multipart/form-data. Include every key/value pair from fields first, then add the file field last. A successful upload returns 204.
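The field ordering is easy to get wrong when assembling the form by hand, so here is a standard-library sketch that builds the multipart body with the signed fields first and the file part last, after checking the file against maxSizeBytes and allowedContentTypes. The helper names are hypothetical, the grant dictionary is an abbreviated copy of the sample response, and naming the file part file follows the usual S3 presigned-POST convention, which is assumed to apply here.

```python
import uuid


def check_upload(grant, file_bytes, content_type):
    """Reject files the presigned policy would refuse anyway."""
    if len(file_bytes) > grant["maxSizeBytes"]:
        raise ValueError("file exceeds maxSizeBytes")
    if content_type not in grant["allowedContentTypes"]:
        raise ValueError(f"content type not allowed: {content_type}")


def build_presigned_post(fields, filename, file_bytes, content_type):
    """Return (body_bytes, content_type_header) with the file part last."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    # Assumption: the file part is named "file", as in S3 presigned POSTs.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: {content_type}\r\n\r\n'.encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


grant = {
    "url": "https://sogni-uploads.s3-accelerate.amazonaws.com/",
    "fields": {"Content-Type": "video/mp4", "key": "video/2026-05-18/<jobId>/reference.mp4"},
    "maxSizeBytes": 104857600,
    "allowedContentTypes": ["video/mp4", "video/quicktime", "video/webm"],
}
clip = b"\x00\x00\x00\x18ftypmp42"  # stand-in bytes, not a real video
check_upload(grant, clip, "video/mp4")
body, header = build_presigned_post(grant["fields"], "reference.mp4", clip, "video/mp4")
```

The body and header can then be sent with any HTTP client, for example urllib.request.Request(grant["url"], data=body, headers={"Content-Type": header}, method="POST").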

GET /v2/media/downloadUrl Auth required

Get a presigned download URL for a previously uploaded asset or a completed job artifact.

Query parameters

Name	Type	In	Description
type*	string	query	Asset type — same values as upload.
jobId*	string	query	Job identifier.
id	string	query	Artifact id.
contentType	string	query	Optional MIME hint. The server uses the stored upload contentType when available.

Response

{
  "status": "success",
  "data": { "downloadUrl": "https://s3.amazonaws.com/sogni-…" }
}

Surface · Asset transport

Image Upload URLs

Image-specific counterpart to the media endpoints. Same presigned-POST flow with image-aware validation and content-type pinning.

GET /v2/image/uploadUrl Auth required

Presigned POST URL plus form fields for image uploads (PNG, JPEG, WEBP, GIF).

Query parameters

Name	Type	In	Description
type*	string	query	Asset type. Artist-side: `startingImage`, `referenceImage`, `referenceImageEnd`, `contextImage*`. Worker-side: `complete`, `preview`.
jobId*	string	query	Job identifier.
imageId	string	query	Artifact id. Required for worker uploads.
contentType	string	query	Optional MIME type — pinned for download.
startContentType	string	query	Optional content type for the starting image (used by paired image+ref flows).
includeMetadata	boolean	query	Whether to compute and surface `assetSha256` on download (worker-side only).

Response

{
  "status": "success",
  "data": {
    "url": "https://sogni-uploads.s3-accelerate.amazonaws.com/",
    "fields": {
      "Content-Type": "image/png",
      "bucket": "sogni-uploads",
      "key": "2026-05-18/<jobId>/startingImage.png",
      "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
      "X-Amz-Credential": "…",
      "X-Amz-Date": "…",
      "Policy": "<base64-encoded-policy>",
      "X-Amz-Signature": "<hex-signature>"
    },
    "maxSizeBytes": 104857600,
    "allowedContentTypes": ["image/png", "image/jpeg", "image/jpg", "image/webp", "image/gif"]
  }
}

GET /v2/image/downloadUrl Auth required

Get a presigned download URL for an image artifact.

Response

{
  "status": "success",
  "data": {
    "downloadUrl": "https://s3.amazonaws.com/sogni-…",
    "assetSha256": "…"   // only when includeMetadata=true on internal worker downloads
  }
}

Surface · On-chain

Wallet & Balance

Read on-chain balances and RPC endpoints for the wallet associated with an account. Useful for surfacing Sogni and Spark balances inside an integration.

GET /v2/wallet/balance Public

Read SOGNI, Spark, and native (ether) balances for a wallet on the chosen provider.

Query parameters

Name	Type	In	Description
walletAddress*	string	query	Checksummed EVM address.
provider	string	query	Network provider. Defaults to `BASE`; also accepts `ETHERLINK`.

Response

{
  "status": "success",
  "data": {
    "sogni": "1234.56",
    "spark": "78.90",
    "ether": "0.0123"
  }
}
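The balances arrive as decimal strings, so a client that compares or sums them is better served by decimal.Decimal than by float. A minimal sketch, using the sample response above; the function name parse_balances and the threshold of 50 Spark are illustrative.

```python
from decimal import Decimal


def parse_balances(body):
    """Convert the balance strings in a response body to Decimal values."""
    if body.get("status") != "success":
        raise ValueError("balance lookup did not succeed")
    return {token: Decimal(amount) for token, amount in body["data"].items()}


body = {
    "status": "success",
    "data": {"sogni": "1234.56", "spark": "78.90", "ether": "0.0123"},
}
balances = parse_balances(body)
has_enough_spark = balances["spark"] >= Decimal("50")  # hypothetical threshold
```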

GET /v1/wallet/rpc-endpoints Public

Read the canonical RPC endpoint list for a network provider. Useful for adding Sogni's network to an EVM wallet.

Query parameters

Name	Type	In	Description
provider	string	query	`BASE` (default) or `ETHERLINK`.

Response

{
  "status": "success",
  "data": {
    "RPCEndpoints": {
      "endpoint1": "https://mainnet.base.org",
      "endpoint2": "https://base-rpc.publicnode.com"
    }
  }
}

Surface · Replay store

Replay

Persistent RunRecord storage for replay tooling. Server-side workflow executions write to this store automatically; client-side agents (e.g. sogni-chat) ingest their own records through the public ingest endpoint. As a defense-in-depth measure, the server redacts records on every write.

POST /v1/replay/records Auth required

Ingest a RunRecord. The owner is derived from the authenticated wallet — clients cannot write into someone else's namespace.

Body

Name	Type	In	Description
schemaVersion*	integer	body	RunRecord schema version. Must be in the server's supported range.
run_id*	string	body	Caller-chosen run id (max 128 chars). Uniqueness is per-owner.
user_request*	string	body	The user prompt or request that triggered the run.
rounds*	array	body	Recorded LLM/tool rounds. The server replays these for review.

Payload cap. A single record is capped at 1 MB. Larger payloads return 413.
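A client can catch the most common rejections before sending anything. The sketch below checks the required fields, the 128-character run_id limit, and the serialized size. It reads "1 MB" as 1,000,000 bytes, which is an assumption (the server may count 1,048,576), and it measures the size of Python's own JSON serialization, which may differ slightly from what an HTTP client sends. The function name precheck_run_record is illustrative.

```python
import json

MAX_RECORD_BYTES = 1_000_000  # assumption: "1 MB" read as 10**6 bytes
MAX_RUN_ID_CHARS = 128        # limit stated in the Body table
REQUIRED_FIELDS = ("schemaVersion", "run_id", "user_request", "rounds")


def precheck_run_record(record):
    """Return a list of problems; an empty list means the record looks sendable."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in record:
            problems.append(f"missing required field: {field}")
    run_id = record.get("run_id", "")
    if isinstance(run_id, str) and len(run_id) > MAX_RUN_ID_CHARS:
        problems.append("run_id longer than 128 characters")
    size = len(json.dumps(record).encode("utf-8"))
    if size > MAX_RECORD_BYTES:
        problems.append(f"payload is {size} bytes, over the 1 MB cap")
    return problems


record = {
    "schemaVersion": 2,
    "run_id": "run_example",  # placeholder id
    "user_request": "Storyboard a 5-shot neon teaser.",
    "rounds": [],
}
problems = precheck_run_record(record)
```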

Response

{
  "runId": "run_…",
  "schemaVersion": 2,
  "redacted": true,
  "createTime": 1731950400,
  "updateTime": 1731950400
}

GET /v1/replay/records Auth required

List recent RunRecords for the caller (summary view).

Query parameters

Name	Type	In	Description
limit	integer	query	Default `50`, max `200`.

Response

{
  "records": [
    {
      "runId": "run_…",
      "schemaVersion": 2,
      "createTime": 1731950400,
      "updateTime": 1731950400,
      "userRequest": "Storyboard a 5-shot neon teaser.",
      "finalResponse": "Here's your storyboard.",
      "modelId": "qwen3.6-35b-a3b-gguf-iq4xs",
      "rounds": 3
    }
  ]
}

GET /v1/replay/records/:id Auth required

Read a full RunRecord (suitable for the replay viewer).

Response

{
  "record": { "schemaVersion": 2, "run_id": "run_…", "rounds": [ /* … */ ] },
  "createTime": 1731950400
}

Resources

SDKs & Resources

The REST surface above is the source of truth. Most teams build against one of the following.

@sogni-ai/sogni-client: Flagship JS/TS SDK covering account, chat, durable runs, workflows, and replay. Mirrors the REST surface 1:1.
@sogni-ai/sogni-intelligence-client: Public mirror of the Sogni creative-agent, including contracts, tool catalog, OpenAI manifests, workflow primitives, and the public skill runtime.
@sogni-ai/sogni-protocol: Language-neutral JSON Schemas, OpenAI tool manifests, prompt contracts, and enums. Zero runtime deps.
@sogni-ai/sogni-creative-agent-skill: Public agent skill + sogni-agent CLI, an adapter for Claude Code, OpenClaw, Hermes Agent, and Manus.
OpenAI SDK compatibility: Drop-in instructions for the openai npm + Python packages and other OpenAI-compatible clients.
Sogni Intelligence guides: Conceptual narrative covering when to use chat completions vs runs vs workflows, the billing model, and the asset manifest.
GitHub (Sogni-AI): Source for the SDKs, the creative-agent skill, and example integrations.
Sogni SDK guide: JavaScript/TypeScript access to the same capabilities, ready to drop into Node and browser apps.
