🎨Creative-Agent Workflows

POST /v1/creative-agent/workflows starts a durable creative workflow from a single request shape: an input object with explicit steps. Your application provides the exact steps and tool arguments up front.

Use this endpoint when your application already knows the media operation to run and needs durable state, replayable event logs, SSE progress, resume, or cancellation.

#Workflow Endpoints

| Endpoint | Method | Use |
| --- | --- | --- |
| /v1/creative-agent/workflows | POST | Start a durable creative-agent workflow. |
| /v1/creative-agent/workflows | GET | List the caller's workflows. |
| /v1/creative-agent/workflows/:id | GET | Read a workflow snapshot. |
| /v1/creative-agent/workflows/:id/events | GET | Read the persisted event log. |
| /v1/creative-agent/workflows/:id/events/stream | GET | Stream workflow events over SSE. |
| /v1/creative-agent/workflows/:id/resume | POST | Resume a recoverable workflow from persisted step state. |
| /v1/creative-agent/workflows/:id/cancel | POST | Cooperatively cancel a workflow. |
| /v1/creative-agent/workflows/:id/reseed | POST | Clone a completed or partial-failure workflow with fresh seeds. |

All workflow routes require authentication. Starting a workflow requires API-key auth.

The POST response returns after validation, plan creation, and initial persistence. Treat the 201 response as acceptance plus the first workflow snapshot, not as completion. Use the workflow snapshot, event log, or SSE stream to observe progress until the status becomes terminal.

Workflow status values are queued, running, completed, partial_failure, waiting_for_user, failed, and cancelled. SSE streams close on completed, failed, partial_failure, waiting_for_user, or cancelled. A partial_failure can still include artifacts from earlier completed steps.
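
The status rules above can be encoded in a small client-side helper. This is a minimal sketch, not part of any Sogni SDK: the function names are hypothetical, and the snapshot is assumed to be a dict carrying the `status` and `artifacts` fields shown in the representative response on this page.

```python
# Status values and stream-closing statuses as listed in the paragraph above.
ALL_STATUSES = {
    "queued", "running", "completed", "partial_failure",
    "waiting_for_user", "failed", "cancelled",
}
STREAM_CLOSING_STATUSES = {
    "completed", "failed", "partial_failure", "waiting_for_user", "cancelled",
}


def stream_closes_on(status: str) -> bool:
    """Return True when the SSE stream is documented to close on this status."""
    if status not in ALL_STATUSES:
        raise ValueError(f"unknown workflow status: {status!r}")
    return status in STREAM_CLOSING_STATUSES


def has_usable_artifacts(snapshot: dict) -> bool:
    """A partial_failure snapshot can still carry artifacts from earlier steps."""
    finished = snapshot.get("status") in {"completed", "partial_failure"}
    return finished and bool(snapshot.get("artifacts"))
```

A polling loop could call `stream_closes_on(snapshot["status"])` after each snapshot read and stop once it returns True.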

#Start Request

Public REST fields are accepted in either snake_case or camelCase; the table below shows the preferred snake_case form.

| Field or Header | Use |
| --- | --- |
| input | Required durable workflow input: { "title": "...", "steps": [...] }. |
| input.steps | Required array of exact creative-agent tool steps. |
| workflow_id | Alternative to input.steps. Compile and run a saved workflow template by ID. |
| inputs | Input values for a workflow_id template run. |
| Idempotency-Key, X-Idempotency-Key | Optional retry key. Reusing the same key returns the existing workflow instead of launching duplicate media jobs. |
| max_estimated_capacity_units | Optional ceiling. Rejects the workflow before persistence if the shared estimate is above this value. |
| confirm_cost | Optional explicit cost confirmation. Set false to reject positive estimated-cost workflows until your caller retries with confirmation. |
| token_type | Optional billing token preference: spark, sogni, or auto. External media providers still settle in Spark. |
| app_source | Optional caller identifier for analytics and support. |
| media_references | Optional request-level media references available to workflow steps and $input_media dependency bindings. |

The API also enforces active workflow and workflow-start rate limits per owner, plus a global active-workflow capacity limit. If a start is rejected by those limits, retry after another workflow completes or cancel a running workflow.
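
The fields and headers above can be assembled into a request without any third-party client. The following is a sketch using only the Python standard library; `build_start_request` is a hypothetical helper, and the URL is the one used in the curl examples on this page.

```python
import json
import urllib.request

API_URL = "https://api.sogni.ai/v1/creative-agent/workflows"

# Optional body fields from the Start Request table (snake_case form).
OPTIONAL_FIELDS = {
    "max_estimated_capacity_units", "confirm_cost",
    "token_type", "app_source", "media_references",
}


def build_start_request(api_key, workflow_input, idempotency_key=None, **options):
    """Build (but do not send) a POST request that starts a workflow."""
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if idempotency_key:
        # Reusing the key on retry returns the existing workflow.
        headers["Idempotency-Key"] = idempotency_key
    body = {"input": workflow_input, **options}
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(API_URL, data=data, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen` sends it. Keeping construction separate from sending makes it straightforward to retry the same request with the same idempotency key.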

#Step Shape

Each step supplies an exact hosted tool call:

{
  "id": "clip",
  "toolName": "generate_video",
  "arguments": {
    "prompt": "Slow dolly-in on the generated keyframe.",
    "duration": 5
  },
  "dependsOn": [
    {
      "sourceStepId": "keyframe",
      "sourceArtifactIndex": 0,
      "targetArgument": "referenceImageIndices",
      "mediaType": "image",
      "transform": "image_index",
      "required": true
    }
  ]
}

Supported hosted tools include generation and editing tools (generate_image, edit_image, restore_photo, apply_style, refine_result, change_angle, generate_video, animate_photo, sound_to_video, video_to_video, generate_music, stitch_video, orbit_video, dance_montage, extend_video, replace_video_segment, overlay_video, and add_subtitles), media inspection tools (analyze_image, analyze_video, and extract_metadata), plus the currently hosted agent/control and manifest tools: create_asset_manifest, inspect_asset, label_asset, map_assets_for_model, validate_asset_references, ask_clarifying_question, and finalize_response.
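
Before submitting, a client can sanity-check that every dependsOn entry points at a known step. The sketch below is a local convenience check, not the API's own validation. It assumes steps are listed in execution order, so a dependency must reference an earlier step or the special "$input_media" source; that ordering rule is an assumption, not something this page states.

```python
def check_dependencies(steps):
    """Return a list of human-readable problems found in a steps[] array."""
    problems = []
    seen = set()
    for step in steps:
        step_id = step["id"]
        if step_id in seen:
            problems.append(f"duplicate step id {step_id!r}")
        for dep in step.get("dependsOn", []):
            source = dep["sourceStepId"]
            # "$input_media" binds request-level media rather than a step output.
            if source != "$input_media" and source not in seen:
                problems.append(
                    f"step {step_id!r} depends on unknown or later step {source!r}"
                )
        seen.add(step_id)
    return problems
```

An empty list means no problems were found locally; the server still performs its own validation.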

#Planning From A Brief

If you have a creative brief but not an exact steps[] array, call the compose_workflow planner via /v1/chat/completions with sogni_tools: "creative-agent" and tool_choice forced to compose_workflow. It returns a validated plan (matching this endpoint's input shape), an estimated_capacity_units value, and a fits_budget flag computed against the caller-supplied max_estimated_capacity_units. Submit the plan unchanged here for durable execution, and pair the submission with an explicit Idempotency-Key, because the planner itself is non-deterministic.

compose_workflow_template is the sibling planner for builder UIs: it returns a savable, parameterized template_draft (typed inputs[], stages[] referencing $inputs.<name>, optional graph layout) alongside an example plan for the inputs the planner used. Save template_draft through /v1/creative-agent/workflows/templates, then start future runs with workflow_id and inputs.

See Chat Completions → Workflow Planning for the end-to-end Plan → Review → Execute example.
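
One way to pair a compose_workflow plan with a stable Idempotency-Key is to hash the plan itself, so that retries of the same plan reuse the same key. This is a sketch under assumptions: the planner result is taken to be an already-extracted dict with keys `plan`, `estimated_capacity_units`, and `fits_budget` (the exact key for the plan is assumed), and `submission_from_plan` is a hypothetical helper.

```python
import hashlib
import json


def submission_from_plan(planner_result, max_units):
    """Turn a planner result into (idempotency_key, request_body)."""
    if not planner_result["fits_budget"]:
        raise ValueError(
            f"plan estimate {planner_result['estimated_capacity_units']} "
            f"exceeds budget {max_units}"
        )
    plan = planner_result["plan"]
    # Canonical JSON so the same plan always hashes to the same key.
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    key = "plan-" + digest[:32]
    # The plan is submitted unchanged as the workflow input.
    body = {"input": plan, "max_estimated_capacity_units": max_units}
    return key, body
```

Deriving the key from the plan rather than from the brief matters here: calling the planner again may yield a different plan, which should count as a different workflow.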

#Run From A Template

Saved templates compile into normal durable workflow runs. Send workflow_id plus an inputs object instead of input.steps:

curl https://api.sogni.ai/v1/creative-agent/workflows \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: template-run-001" \
  -d '{
    "workflow_id": "wf_two-shot-product-teaser_a1b2c3d4",
    "inputs": {
      "brief": "A translucent speaker on a glossy black table, neon rim light"
    },
    "token_type": "spark",
    "confirm_cost": true
  }'

The API rejects requests that include both workflow_id and input.steps. Compile failures return 422 with compileErrors and compileWarnings. See Workflow Templates for template CRUD, visibility, and fork behavior.
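
Because workflow_id and input.steps are mutually exclusive, a client can enforce the same rule before sending anything. A minimal sketch, with `build_run_body` as a hypothetical helper name:

```python
def build_run_body(workflow_id=None, inputs=None, title=None, steps=None):
    """Build a start body for either a template run or an explicit-steps run."""
    if workflow_id is not None and steps is not None:
        # Mirrors the documented server-side rejection.
        raise ValueError("send either workflow_id or input.steps, not both")
    if workflow_id is None and steps is None:
        raise ValueError("one of workflow_id or steps is required")
    if workflow_id is not None:
        return {"workflow_id": workflow_id, "inputs": dict(inputs or {})}
    return {"input": {"title": title, "steps": list(steps)}}
```

For example, `build_run_body(workflow_id="wf_two-shot-product-teaser_a1b2c3d4", inputs={"brief": "..."})` yields the template form used in the curl request above.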

Request-level media_references seed the same media execution context used by hosted chat, so creative tool arguments can use negative indices such as sourceImageIndex: -1, sourceVideoIndex: -1, audioSourceIndex: -1, or referenceImageIndices: [-1] to reference request media. A dependency with sourceStepId: "$input_media" can bind one of those request media references into a later step as an image, video, audio URL, media index, or structured asset_ref. Dependency bindings are considered during validation, so a required media field may be omitted from arguments when a dependency supplies it.

#Generated Keyframe To Video

To generate an image and then animate it, send two explicit steps. The generated image is passed to the video step as an artifact dependency.

OpenAI GPT Image 2 (model: "gpt-image-2") and ByteDance Seedance 2.0 (videoModel: "seedance2" or "seedance2-fast") are external media models that require Premium Spark purchased by credit card. Use token_type: "spark" for explicit billing; requests using auto are normalized to Spark for those media jobs.

curl https://api.sogni.ai/v1/creative-agent/workflows \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: robot-sketch-001" \
  -d '{
    "input": {
      "title": "Generated keyframe to video",
      "steps": [
        {
          "id": "keyframe",
          "toolName": "generate_image",
          "arguments": {
            "prompt": "A graphite robot sketch on a drafting table",
            "model": "flux2",
            "width": 1024,
            "height": 576
          }
        },
        {
          "id": "clip",
          "toolName": "generate_video",
          "arguments": {
            "prompt": "The camera pushes in as the sketch comes alive",
            "videoModel": "ltx23",
            "duration": 5,
            "width": 1024,
            "height": 576
          },
          "dependsOn": [
            {
              "sourceStepId": "keyframe",
              "sourceArtifactIndex": 0,
              "targetArgument": "referenceImageIndices",
              "mediaType": "image",
              "transform": "image_index",
              "required": true
            }
          ]
        }
      ]
    },
    "token_type": "spark",
    "max_estimated_capacity_units": 25,
    "confirm_cost": true
  }'

Representative response:

{
  "status": "success",
  "data": {
    "workflow": {
      "workflowId": "wf_durable_workflow_...",
      "status": "running",
      "input": {},
      "plan": {},
      "events": [],
      "artifacts": [],
      "createTime": 1773353812,
      "updateTime": 1773353812
    }
  }
}
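
Reading the envelope above takes only a few lines. This sketch assumes the response body has exactly the shape shown in the representative response; `read_snapshot` is a hypothetical helper name.

```python
import json


def read_snapshot(response_text):
    """Extract (workflow_id, status) from a start or snapshot response body."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise RuntimeError(f"unexpected envelope status: {payload.get('status')!r}")
    workflow = payload["data"]["workflow"]
    return workflow["workflowId"], workflow["status"]
```

The returned ID is what the snapshot, events, stream, resume, cancel, and reseed routes expect in place of :id.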

#Request Media

Seedance hosted video steps accept HTTPS image, video, and audio reference URLs. The model runs at a fixed 24 fps, supports clips of 4 to 15 seconds, does not use negative prompts, and requires Premium Spark purchased by credit card. Use Media Upload URLs when your app needs Sogni-hosted presigned URLs for local files.

curl https://api.sogni.ai/v1/creative-agent/workflows \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "media_references": [
      { "kind": "video", "url": "https://...presigned-download-url..." }
    ],
    "input": {
      "title": "Seedance V2V",
      "steps": [
        {
          "id": "seedance_v2v",
          "toolName": "video_to_video",
          "arguments": {
            "prompt": "Transform the source clip into a polished perfume commercial with glass reflections",
            "videoSourceIndex": -1,
            "videoModel": "seedance2",
            "controlMode": "seedance-v2v",
            "duration": 4
          }
        }
      ]
    },
    "token_type": "spark"
  }'

#Chained Media Steps

Hosted workflows can chain GPT Image 2 outputs into Seedance clips by passing the image artifact into the video step as a dependency:

curl https://api.sogni.ai/v1/creative-agent/workflows \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "title": "GPT Image 2 keyframe to Seedance",
      "steps": [
        {
          "id": "keyframe",
          "toolName": "generate_image",
          "arguments": {
            "prompt": "A premium sneaker hero shot on wet asphalt with neon rim light",
            "model": "gpt-image-2",
            "gptImageQuality": "high",
            "outputFormat": "webp",
            "width": 1280,
            "height": 720
          }
        },
        {
          "id": "clip",
          "toolName": "generate_video",
          "arguments": {
            "prompt": "Use @Image1 as the first frame. Slow dolly-in, rain glints, polished product ad energy.",
            "videoModel": "seedance2",
            "duration": 5,
            "width": 1280,
            "height": 720
          },
          "dependsOn": [
            {
              "sourceStepId": "keyframe",
              "sourceArtifactIndex": 0,
              "targetArgument": "referenceImageIndices",
              "mediaType": "image",
              "transform": "image_index"
            }
          ]
        }
      ]
    },
    "token_type": "spark"
  }'

Hosted preflight validation blocks workflow steps that would generate more than 20 minutes of video content in one request, including long-video segment plans, variations, and batch fan-out. Split larger jobs into multiple workflows.
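
A rough client-side estimate can flag oversized requests before the hosted preflight does. The sketch below only sums the `duration` argument of video steps. It does not account for variations, batch fan-out, or long-video segment plans, so treat the result as a lower bound. The set of tool names counted as video-producing is an assumption and can be overridden.

```python
VIDEO_LIMIT_SECONDS = 20 * 60  # 20 minutes per request, per the text above

# Assumed subset of hosted tools whose "duration" argument is video seconds.
DEFAULT_VIDEO_TOOLS = frozenset({"generate_video", "video_to_video"})


def planned_video_seconds(steps, video_tools=DEFAULT_VIDEO_TOOLS):
    """Sum explicit durations of video steps (lower bound on generated video)."""
    total = 0
    for step in steps:
        if step.get("toolName") in video_tools:
            total += step.get("arguments", {}).get("duration", 0)
    return total


def likely_exceeds_limit(steps, video_tools=DEFAULT_VIDEO_TOOLS):
    """True when the explicit durations alone already pass the 20 minute cap."""
    return planned_video_seconds(steps, video_tools) > VIDEO_LIMIT_SECONDS
```

When the estimate is over the cap, split the steps into several workflows as described above.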

Storyboarding workflows that generate a GPT Image 2 storyboard sheet should treat the sheet as production planning material, not as the final video frame. Unless the user explicitly specifies a storyboard canvas or output aspect, generated storyboard sheets default to a landscape board. Visible text is scoped to the scene or end-card where it is requested; do not assume earlier scene text should repeat on later panels or the final frame.

stitch_video joins whole clips end-to-end. For alternating or interleaved slices of existing videos, use repeated replace_video_segment steps with explicit replacementStartSeconds / replacementEndSeconds source windows.

#Sogni Agent CLI

The public Sogni Creative Agent Skill wraps durable workflows with sogni-agent --api-workflow:

sogni-agent --api-workflow \
  --video-prompt "The camera slowly pushes in as the sketch comes alive" \
  --duration 5 \
  "A graphite robot sketch on a drafting table"

Use --workflow-input for exact JSON:

sogni-agent --api-workflow \
  --workflow-input @workflow.json \
  --watch-workflow

Workflow management helpers map directly to the REST routes: --list-workflows, --get-workflow <id>, --workflow-events <id>, --stream-workflow <id>, --resume-workflow <id>, and --cancel-workflow <id>. The --watch-workflow flag streams shared human-readable progress labels for planning, approvals, repairs, tool execution, waiting states, and terminal errors.

Hosted API modes require SOGNI_API_KEY. The CLI forwards media references from flags such as --ref, --ref-audio, and --ref-video as hosted API metadata. For durable workflows, prefer public HTTPS URLs or Sogni artifact URLs, because the backend must retrieve non-inline media. Use the direct CLI path for private or large local media.

#Stream Events

curl https://api.sogni.ai/v1/creative-agent/workflows/wf_.../events/stream \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Accept: text/event-stream"

The SSE stream replays known events on connect, then tails the persisted workflow record. It closes when the workflow reaches a terminal status.

EventSource cannot send the Authorization header. Use fetch with ReadableStream, or another HTTP client that can set headers, when consuming the stream from a browser.
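
Outside the browser, any HTTP client that can set the Authorization header and read the body line by line can consume the stream. The sketch below covers only the parsing half: a small generator that follows the standard text/event-stream framing (`event:` and `data:` fields, blank line as delimiter, `:` comments). The event names and payload shapes Sogni emits are not specified on this page, so the parser leaves `data` as a raw string.

```python
from typing import Dict, Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield {"event", "data"} dicts from decoded text/event-stream lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line ends the current event; skip it if no data was seen.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

Feeding it the decoded lines of a streaming response (for example, each line of a `urllib.request.urlopen` response decoded as UTF-8) yields events as they arrive, including the replayed ones sent on connect.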

#Resume A Workflow

curl -X POST https://api.sogni.ai/v1/creative-agent/workflows/wf_.../resume \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "token_type": "spark" }'

For recoverable queued/running workflows, resume starts a background executor from the persisted workflow plan and returns 202 with the current snapshot:

{
  "status": "success",
  "data": {
    "workflow": { "...": "..." },
    "resumed": true
  }
}

The executor reacquires a workflow lease, rehydrates completed step outputs, skips steps already marked completed, and continues from the next uncompleted step. Use it for executor interruptions or stale in-progress runs; completed and cancelled workflows are not resumable, and failed or partial-failure runs should generally be inspected and relaunched with corrected input. The API can also run a recovery worker that scans stale queued/running workflows, reacquires expired leases, and resumes them with the owner's API key when available.
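
The guidance above reduces to a small lookup a client can apply before calling the resume route. This is a sketch of that decision only. The return labels are hypothetical, and waiting_for_user is reported as unspecified because this section does not say how resume treats it.

```python
def resume_advice(status):
    """Map a workflow status to the action suggested by the resume guidance."""
    if status in {"queued", "running"}:
        return "resume"  # recoverable: interrupted executor or stale run
    if status in {"completed", "cancelled"}:
        return "not_resumable"
    if status in {"failed", "partial_failure"}:
        return "inspect_and_relaunch"  # fix the input, then start a new run
    return "unspecified"
```

A caller would typically read the snapshot first, then call POST .../resume only when the advice is "resume".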

#Reseed A Workflow

curl -X POST https://api.sogni.ai/v1/creative-agent/workflows/wf_.../reseed \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "token_type": "spark",
    "seed_overrides": {
      "keyframe": 123456789
    }
  }'

Reseed clones a completed or partial_failure workflow plan, replaces seed values, starts a new durable workflow, and returns 201 with the new workflow plus reseed metadata:

{
  "status": "success",
  "data": {
    "workflow": { "...": "..." },
    "reseed": {
      "cloned_from_run_id": "wf_...",
      "steps": [
        { "stepId": "keyframe", "previousSeed": 111, "newSeed": 123456789 }
      ]
    }
  }
}

seed_overrides is optional. When omitted, every seedable step receives a fresh random seed. Reseed uses the same active-workflow, global-capacity, and workflow-start rate limits as a new workflow start.
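
To record which seeds changed, a client can flatten the reseed metadata into a per-step mapping. A minimal sketch, assuming the response has the shape shown above; `seed_changes` is a hypothetical helper name.

```python
def seed_changes(response):
    """Map each reseeded step ID to its (previous_seed, new_seed) pair."""
    steps = response["data"]["reseed"]["steps"]
    return {s["stepId"]: (s["previousSeed"], s["newSeed"]) for s in steps}
```

Storing this mapping next to the new workflow ID makes it possible to reproduce a favored variation later by passing the same values in seed_overrides.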

#Cancel A Workflow

curl -X POST https://api.sogni.ai/v1/creative-agent/workflows/wf_.../cancel \
  -H "Authorization: Bearer YOUR_API_KEY"

Cancellation is idempotent. If the workflow is already terminal, the API returns the current snapshot with transitioned: false.

#Structured Contracts + Permission Gate

Durable workflow execution shares the Structured Contracts dispatch behavior with /v1/chat/completions: gating policies, repair recipes, prompt contracts, typed media/session state, and the destructive-tool permission gate all apply. The currently hosted app/control subset is non-destructive. Future destructive tools default to blocked until a shared explicit-intent permission rule exists.

#Replay Records

/v1/creative-agent/workflows produces durable workflow records. /v1/chat/runs produces durable chat-run records for LLM-steered hosted tool turns. /v1/replay/records is a separate, lighter surface that captures one RunRecord per chat turn. See Chat Completions → Replay Records for the endpoint shape.

  • Workflows answer "what state is this multi-step plan in?" - durable state, event replay, cancellation, and presigned media URLs.
  • Chat runs answer "what state is this model-selected hosted tool turn in?" - durable LLM/tool rounds, event replay, cancellation, recovery, final response, and artifact refs.
  • Replay records answer "what did the agent do for this user turn?" - the assistant message, the tool calls (with cost_class + risk_level chips from the shared per-tool cost metadata table), the tool results, and aggregated cost.

#Choosing Workflows

Use /v1/creative-agent/workflows when:

  • Your app already knows the exact steps and tool arguments.
  • You need durable workflow state, replayable event logs, SSE progress, or cooperative cancellation.
  • You want to use uploaded Sogni artifact URLs or presigned HTTPS media URLs instead of inline chat data: URIs.
  • You want deterministic orchestration rather than model-selected tools.
  • You are building a production media pipeline where the UI should track each creative step independently of a chat response.

Use Durable Chat Runs when an LLM should interpret the user's request, choose hosted Sogni tools automatically, and still provide durable progress, event replay, cancellation, and recovery.

Use Chat Completions when an LLM should interpret the user's request and return a synchronous OpenAI-compatible response.

For agent runtime or CLI-style integration, use the public Sogni Creative Agent Skill. It wraps Sogni media generation as an installable agent skill and CLI while the REST workflow API remains the durable backend integration surface.
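
The choice between the three REST surfaces can be summarized as a two-question decision. The sketch below is a simplification of the guidance in this section: it uses the paths named elsewhere on this page, and the function name and boolean parameters are hypothetical.

```python
def choose_surface(knows_exact_steps: bool, needs_durability: bool) -> str:
    """Pick a REST surface from the two questions this section asks."""
    if knows_exact_steps:
        # Deterministic orchestration with explicit steps and tool arguments.
        return "/v1/creative-agent/workflows"
    if needs_durability:
        # An LLM picks the tools, but progress and recovery stay durable.
        return "/v1/chat/runs"
    # An LLM picks the tools and the response is synchronous.
    return "/v1/chat/completions"
```

For example, a production pipeline that already has a steps[] array maps to the workflows route regardless of the second answer.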