
Durable Chat Runs

POST /v1/chat/runs starts a durable hosted chat turn. Use it when an LLM should interpret the user's request and choose Sogni hosted tools, but your application also needs persisted state, replayable progress events, cancellation, and recovery across client disconnects or API restarts.

Durable chat runs are the durable counterpart to /v1/chat/completions with server-side Sogni tool execution. A run owns the LLM round loop, dispatched tool calls, tool results, media context, child workflow IDs, artifact references, billing previews, final assistant response, and an append-only event log.

#Chat Run Endpoints

Endpoint Method Use
/v1/chat/runs POST Start a durable hosted chat run.
/v1/chat/runs/:id GET Read the latest run snapshot.
/v1/chat/runs/:id/events GET Read the persisted event log. Supports ?after=<sequence>.
/v1/chat/runs/:id/events/stream GET Stream persisted and live run events over SSE. Supports Last-Event-ID and ?after=<sequence>.
/v1/chat/runs/:id/cancel POST Cooperatively cancel a queued or running chat run.

All routes are scoped to the authenticated wallet. Starting a run requires an API key so the executor can perform Sogni hosted media work; first-party account sessions may use the owner's stored API key, while API clients should send Authorization: Bearer YOUR_API_KEY.

The start response returns after the run is persisted and scheduled. Treat the response as acceptance plus the first run snapshot, not completion. Read the snapshot or stream events until the status reaches a terminal state.
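A minimal polling sketch of that pattern, assuming a hypothetical `fetch_run` helper that performs the authenticated GET /v1/chat/runs/:id and returns the decoded data.run object:

```python
import time

# Terminal statuses listed in the Run Status section below.
TERMINAL_STATUSES = {"completed", "partial_failure", "failed", "cancelled"}

def wait_for_terminal(fetch_run, run_id, interval_s=2.0, max_polls=100):
    """Poll the run snapshot until it settles.

    `fetch_run` is an injected callable (hypothetical helper) so this
    sketch stays network-agnostic. The loop also stops on
    waiting_for_user, which is a durable pause rather than a terminal
    state: the caller should surface waiting.message and collect input.
    """
    for _ in range(max_polls):
        run = fetch_run(run_id)
        if run["status"] in TERMINAL_STATUSES or run["status"] == "waiting_for_user":
            return run
        time.sleep(interval_s)
    raise TimeoutError(f"run {run_id} did not settle after {max_polls} polls")
```

Streaming /events/stream is preferable for interactive UIs; polling like this suits simple scripts and backends.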

#Start Request

The request body accepts public REST fields in either snake_case or camelCase where noted.

Field or Header Use
messages Required OpenAI-style message array.
model Optional model ID. Defaults to qwen3.6-35b-a3b-gguf-iq4xs.
tools Optional OpenAI-style tool definitions visible to the LLM. The durable executor automatically runs Sogni hosted tools; use /v1/chat/completions manual mode if your app needs to execute its own tool loop.
tool_choice, toolChoice Optional OpenAI-style tool choice. A forced tool choice is applied only to the first LLM request, so the run cannot keep invoking the same paid tool on every round after a tool result.
sampling Optional runtime controls such as max_tokens, temperature, top_p, top_k, min_p, penalties, task_profile / taskProfile, and think.
media_references, mediaReferences Optional HTTPS media references seeded into the hosted tool media context.
media_context, mediaContext Optional existing media context snapshot with images, videos, audio, uploadedImages, uploadedVideos, or uploadedAudio. Values must be HTTP(S) URLs.
max_estimated_capacity_units, maxEstimatedCapacityUnits Optional estimated-cost ceiling captured on the run request.
confirm_cost, confirmCost Optional cost-confirmation flag captured on the run request.
session_id, sessionId Optional caller session ID for grouping UI turns.
client_message_id, clientMessageId Optional caller message ID for deduping UI state.
token_type, tokenType Optional billing token preference: spark, sogni, or auto. External media providers still settle in Spark.
app_source, appSource Optional caller identifier for analytics and support. Defaults to sogni-api.
Idempotency-Key, X-Idempotency-Key, idempotency_key, idempotencyKey Optional retry key. Reusing the same key returns the existing run instead of launching duplicate media work.

Unknown fields are rejected with 400 so clients notice misspelled or unsupported options early. Durable chat runs always use the hosted Sogni tool execution path; request fields such as stream, sogni_tools, and sogni_tool_execution belong to /v1/chat/completions, not this endpoint.

#Durable Media Rules

Durable runs cannot store inline base64 data: URIs. Upload media first, then pass HTTP(S) URLs in message image_url.url, request media_references, or media_context. Use Media Upload URLs when your app needs Sogni-hosted presigned URLs for local files.

For example, this durable run shape is valid:

{
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Animate this product photo into a 5 second launch teaser." },
        { "type": "image_url", "image_url": { "url": "https://...presigned-download-url..." } }
      ]
    }
  ],
  "token_type": "spark",
  "confirm_cost": true
}

This differs from /v1/chat/completions, where inline PNG/JPEG data: URIs are accepted for short-lived vision input. Durable records must survive retries, event replay, recovery, and UI refreshes, so persisted media references must be retrievable URLs.

#Start A Run

curl https://api.sogni.ai/v1/chat/runs \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: chat-run-demo-001" \
  -d '{
    "session_id": "campaign-chat-42",
    "client_message_id": "msg-001",
    "model": "qwen3.6-35b-a3b-gguf-iq4xs",
    "messages": [
      {
        "role": "user",
        "content": "Create a cinematic product-launch image and then suggest a short video direction."
      }
    ],
    "sampling": {
      "max_tokens": 4096,
      "temperature": 0.7,
      "task_profile": "general"
    },
    "token_type": "spark",
    "confirm_cost": true,
    "app_source": "my-product-ui"
  }'

Representative response:

{
  "status": "success",
  "data": {
    "run": {
      "runId": "run_0f08d0b9-...",
      "status": "queued",
      "sessionId": "campaign-chat-42",
      "clientMessageId": "msg-001",
      "messages": [],
      "toolCalls": [],
      "toolResults": [],
      "mediaContext": {
        "images": [],
        "videos": [],
        "audio": [],
        "uploadedImages": [],
        "uploadedVideos": [],
        "uploadedAudio": []
      },
      "artifacts": [],
      "events": [
        { "sequence": 0, "type": "run_created", "at": "2026-05-15T12:00:00.000Z" }
      ]
    },
    "idempotent": false
  }
}

A newly accepted run returns 202. Retried submissions with the same idempotency key return the same run snapshot; use data.run.runId, data.run.status, and data.idempotent to reconcile caller state.
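A small reconciliation sketch for that response shape, assuming the caller keeps a `pending` dict mapping its own clientMessageId values to run IDs (a hypothetical piece of UI state, not part of the API):

```python
def reconcile_start_response(body, pending):
    """Reconcile a POST /v1/chat/runs response envelope with local state.

    Returns (run_id, status, fresh) where `fresh` is False when the
    submission hit an existing idempotency key and the API returned the
    prior run instead of launching duplicate media work.
    """
    run = body["data"]["run"]
    run_id, status = run["runId"], run["status"]
    fresh = not body["data"]["idempotent"]
    # Point the caller's message ID at the run so a retried submit with
    # the same Idempotency-Key converges on the same record.
    pending[run.get("clientMessageId")] = run_id
    return run_id, status, fresh
```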

#Run Status

Status Meaning
queued The run was persisted and is waiting for an executor lease.
running An executor owns the lease and is driving LLM and tool rounds.
waiting_for_user The run reached a user-decision boundary such as a clarifying question, media selection, cost approval, or safety review. Read waiting for details.
completed The final assistant response is available in finalResponse and any generated media is listed in artifacts.
partial_failure The run hit a non-fatal boundary such as round-limit exhaustion. Earlier completed artifacts can still be present.
failed The run failed before reaching a useful terminal response. Read failureReason and recent events.
cancelled The caller cancelled the run. Read cancellationReason and the run_cancelled event.

Terminal statuses are completed, partial_failure, failed, and cancelled. waiting_for_user is a durable pause state: show waiting.message to the user, collect the next answer or approval, and submit the next turn as a new run with the updated message history and the same session_id.
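The waiting_for_user handoff can be sketched as building the next-turn request body from the previous snapshot. This assumes the snapshot's messages array is what your app wants to replay; how much history to carry forward is an application choice:

```python
def build_resume_request(prev_run, user_reply):
    """Build the next POST /v1/chat/runs body after a waiting_for_user
    pause: replay the accumulated message history, append the user's
    answer or approval, and keep the same session_id so the turns group
    together. `prev_run` is the snapshot from GET /v1/chat/runs/:id.
    """
    messages = list(prev_run.get("messages", []))
    messages.append({"role": "user", "content": user_reply})
    return {
        "session_id": prev_run.get("sessionId"),
        "messages": messages,
    }
```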

#Stream Events

curl https://api.sogni.ai/v1/chat/runs/run_0f08d0b9-.../events/stream \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Accept: text/event-stream"

The SSE stream replays persisted events first, then polls for new events. Each persisted run event is emitted with its sequence as the SSE id, the run event type as the SSE event, and the full event JSON as data:

id: 4
event: tool_call_resolved
data: {"sequence":4,"type":"tool_call_resolved","at":"2026-05-15T12:00:07.000Z","payload":{"toolCallId":"call_abc","status":"ok","mediaUrls":[{"url":"https://...","mediaType":"image"}]}}

The stream also emits run_status events with { "runId": "...", "status": "..." } snapshots and : keepalive comments. It closes when the run reaches a terminal status. If your client reconnects, send the last seen SSE id as Last-Event-ID or pass ?after=<sequence> to replay only newer events.

EventSource cannot send the Authorization header. Use fetch with ReadableStream, or another HTTP client that can set headers, when consuming the stream from a browser.
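A minimal parser sketch for the stream's SSE framing, fed decoded lines from any HTTP client that can set the Authorization header. It yields (id, event, data) tuples, skips keepalive comments, and the last yielded id is what a reconnecting client would send as Last-Event-ID:

```python
def parse_sse(lines):
    """Parse SSE-framed lines into (event_id, event_type, data) tuples.

    Follows the framing shown above: `id:` carries the run event
    sequence, `event:` the run event type, `data:` the event JSON, and
    a blank line dispatches the buffered event. Comment lines starting
    with ':' (keepalives) are ignored.
    """
    event_id, event_type, data = None, None, []
    for line in lines:
        if line.startswith(":"):          # keepalive comment
            continue
        if line == "":                    # blank line dispatches the event
            if data:
                yield event_id, event_type, "\n".join(data)
            event_id, event_type, data = None, None, []
        elif line.startswith("id:"):
            event_id = line[3:].strip()
        elif line.startswith("event:"):
            event_type = line[6:].strip()
        elif line.startswith("data:"):
            data.append(line[5:].strip())
```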

#Event Types

Event type Meaning
run_created Initial run record was created.
run_resumed Recovery reacquired a stale queued or running run.
assistant_message_delta Assistant text progress emitted by the executor.
assistant_message_completed Assistant text for a round was persisted.
tool_call_dispatched The LLM selected a hosted tool and the executor dispatched it.
tool_call_progress A hosted tool reported intermediate or final progress on this run's event stream.
tool_call_resolved A hosted tool finished and any media URLs or artifact refs were persisted.
media_context_updated Generated or uploaded media context changed for future rounds.
asset_manifest_updated Asset manifest state changed.
billing_preview_updated A hosted tool returned a billing preview.
run_waiting_for_user The run paused for a user decision.
run_completed The run reached a final assistant response.
run_failed The run failed.
run_partial_failure The run stopped after a partial failure, such as too many LLM rounds.
run_cancelled The caller cancelled the run.

Renderable media appears in tool_call_progress.payload.mediaUrls, tool_call_resolved.payload.mediaUrls, tool_call_resolved.payload.artifacts, and the final run snapshot's artifacts[].
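A sketch that gathers renderable media from those locations; callers may also want to dedupe by URL, since the same media can appear in both a progress event and the resolving event:

```python
def collect_media(events, final_run):
    """Collect renderable media from persisted run events plus the
    final run snapshot, following the locations listed above.
    """
    media = []
    for ev in events:
        payload = ev.get("payload", {})
        if ev["type"] in ("tool_call_progress", "tool_call_resolved"):
            media.extend(payload.get("mediaUrls", []))
        if ev["type"] == "tool_call_resolved":
            media.extend(payload.get("artifacts", []))
    media.extend(final_run.get("artifacts", []))
    return media
```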

#Read Events

curl "https://api.sogni.ai/v1/chat/runs/run_0f08d0b9-.../events?after=3" \
  -H "Authorization: Bearer YOUR_API_KEY"

Representative response:

{
  "status": "success",
  "data": {
    "events": [
      {
        "sequence": 4,
        "type": "tool_call_resolved",
        "at": "2026-05-15T12:00:07.000Z",
        "payload": {
          "toolCallId": "call_abc",
          "status": "ok",
          "mediaUrls": [{ "url": "https://...", "mediaType": "image" }]
        }
      }
    ]
  }
}

#Cancel A Run

curl -X POST https://api.sogni.ai/v1/chat/runs/run_0f08d0b9-.../cancel \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "reason": "user_cancelled" }'

Cancellation first transitions the owned run record to cancelled, then aborts active in-process work when this API worker owns the executor. The response includes the updated run and aborted, which tells you whether an in-process executor was actively signalled.

#Recovery

The executor uses a durable lease and heartbeat while it runs LLM and tool rounds. If an API worker dies or loses its lease, the recovery worker can scan stale queued or running runs, append run_resumed, reacquire a lease, and continue with the owner's API key. Completed, failed, partial-failure, cancelled, and waiting-for-user runs are not automatically resumed.

#Choosing An Endpoint

Use /v1/chat/runs when:

  • An LLM should decide which Sogni media tools to use.
  • The turn may take longer than a synchronous HTTP request.
  • Your UI needs persisted progress, event replay, generated artifact refs, cancellation, or recovery.
  • Media references are already uploaded or publicly fetchable as HTTP(S) URLs.

Use /v1/chat/completions when you need OpenAI-compatible chat, regular streaming tokens, inline vision data: URIs, manual custom-tool loops, or a single synchronous response.

Use /v1/creative-agent/workflows when your app already knows the exact media steps and wants deterministic durable orchestration without model-selected tool calls.