Creative Agent Skill
The Sogni Creative Agent Skill plugs the Sogni Supernet into the agent runtime you already use — Claude Code, OpenClaw, Hermes Agent, Manus AI, and others. It ships as a Node CLI (sogni-agent), a SKILL.md source any skill-aware agent can load, and a published OpenClaw plugin.
#npm: @sogni-ai/sogni-creative-agent-skill

#What it does
- Image generation and editing. Quality presets (-Q fast|hq|pro) pick the right model/steps/size; gpt-image-2, Flux 2 Dev, Z-Image Turbo, and Qwen Image Edit are first-class.
- Video generation. Text-to-video, image-to-video, sound-to-video, audio-to-video, video-to-video, and animate workflows, auto-routed across LTX-2.3, WAN 2.2, and Seedance 2.0 with native dialogue and audio.
- Music. Direct instrumental or full-song generation with lyrics, BPM, key, and time-signature controls.
- Hosted creative workflows. /v1/chat/completions with Sogni creative-agent tools and /v1/creative-agent/workflows durable workflows including a storyboard-video preset.
- Durable chat and replays. --durable-chat maps to /v1/chat/runs, while --list-replays, --get-replay, and --ingest-replay expose redacted Sogni Intelligence RunRecords for debugging.
- Personas, memory, personality. Saved reference photos and voice clips (--persona), persistent preferences (--memory-set), and per-agent personality (--personality-set) survive across sessions at ~/.config/sogni/.
- Per-skill loading. Hosts can load the root SKILL.md or focused skills/*.md manifests for image, video, music, media analysis, persona/memory, app settings, composition planning, session control, and asset-reference management.
- Token auto-fallback. --token-type auto tries free Spark first and falls back to SOGNI when Spark runs out.
#Install
#Node CLI (default)
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
sogni-agent --version
Then point your agent runtime at this repo's SKILL.md. For agents that fetch over HTTPS, llm.txt is the condensed install/setup reference.
The Sogni Intelligence docs keep the API-focused reference for this package, including the alpha release track and diagnostic flags: Sogni Creative Agent Skill.
#OpenClaw plugin
openclaw plugins install sogni-creative-agent-skill
OpenClaw reads defaults from openclaw.plugin.json; CLI flags always override.
#Hermes Agent / Manus / other frameworks
Install the CLI globally as above and point the agent at SKILL.md. The agent invokes sogni-agent directly.
#Setup
Get an API key from dashboard.sogni.ai and save it:
mkdir -p ~/.config/sogni
echo 'SOGNI_API_KEY=your_api_key' > ~/.config/sogni/credentials
chmod 600 ~/.config/sogni/credentials
Or export SOGNI_API_KEY in the environment.
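Before handing control to an agent, it can help to confirm that one of the two credential locations above is actually populated. The following is a minimal sketch of such a preflight check. The function name sogni_key_source is hypothetical and not part of sogni-agent, and the order in which it looks (environment first, then file) is only a choice made for this sketch, not a statement about how the CLI itself resolves the key.

```shell
# sogni_key_source: hypothetical preflight helper (not part of sogni-agent).
# Prints "env", "file", or "missing" depending on where a key is found.
# The lookup order here is this sketch's own choice, not the CLI's documented behavior.
sogni_key_source() {
  local cred_file="${HOME}/.config/sogni/credentials"
  if [ -n "${SOGNI_API_KEY:-}" ]; then
    # Key exported in the environment.
    echo "env"
  elif [ -r "$cred_file" ] && grep -q '^SOGNI_API_KEY=..*' "$cred_file"; then
    # Credentials file exists and has a non-empty SOGNI_API_KEY= line.
    echo "file"
  else
    echo "missing"
    return 1
  fi
}
```

An agent bootstrap could call sogni_key_source and stop early with a setup hint when it prints missing, rather than discovering the problem on the first render.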
#CLI at a glance
# Image
sogni-agent -Q hq -o dragon.png "a dragon eating tacos"
# Edit existing image
sogni-agent -c subject.jpg "add a neon cyberpunk glow"
# Photobooth face transfer
sogni-agent --photobooth --ref face.jpg "80s fashion portrait"
# Text-to-video with dialogue
sogni-agent --video 'A narrator says "welcome to the story" as ocean waves crash'
# Seedance 2.0 (4–15s, native audio)
sogni-agent --video -m seedance2 --duration 8 "A polished product reveal"
# Direct music
sogni-agent --music --duration 30 "uplifting cinematic synthwave"
# Storyboard → Seedance video sequence (durable workflow)
sogni-agent --api-workflow storyboard-video --storyboard-frames 7 --duration 12 -Q hq \
"Create a 9:16 bakery launch video with a neon street-window reveal"
# Hosted chat with creative-agent tools
sogni-agent --api-chat "Create a 4-shot product video concept for a red sneaker"
Add --json for machine-parseable success/error payloads in agent contexts. --last reads the previous render's metadata.
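When an agent consumes --json output, a thin wrapper can catch the cases where nothing parseable came back. The sketch below assumes nothing about the payload's field names, which are not described here. run_json is a hypothetical helper, and the JSON check only runs when jq happens to be installed.

```shell
# run_json: hypothetical wrapper (not part of sogni-agent).
# Runs the given command and captures stdout. It fails if the command fails
# or prints nothing; when jq is available it also fails if stdout is not valid JSON.
# No payload field names are assumed.
run_json() {
  local out
  # On failure, still pass through whatever was printed (it may be an error payload).
  out="$("$@")" || { printf '%s\n' "$out"; return 1; }
  [ -n "$out" ] || return 1
  if command -v jq >/dev/null 2>&1; then
    # jq exits non-zero when the input cannot be parsed as JSON.
    printf '%s' "$out" | jq . >/dev/null 2>&1 || return 1
  fi
  printf '%s\n' "$out"
}
```

For example, run_json sogni-agent --json -Q hq -o dragon.png "a dragon eating tacos" would return non-zero if the CLI failed or produced no output, so the calling agent can branch on the exit status before parsing.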
#Workflows
Storyboard to video. --api-workflow storyboard-video generates a storyline, renders a GPT Image 2 storyboard sheet, then drives Seedance from that sheet as the visual reference — one command produces a multi-shot video.
Persona-driven generation. Save a face photo and voice clip with --persona-add, then every image or video with --persona <name> injects the reference and (for video) the voice identity automatically.
Local segment + concat. Generate two video clips with --video-start offsets, then --concat-videos out.mp4 clip-1.mp4 clip-2.mp4 --concat-audio song.mp3 stitches them with ffmpeg locally.
Sogni Intelligence replays. --list-replays and --get-replay <id> expose redacted RunRecord storage for debugging hosted runs.
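For the local segment + concat workflow, the CLI drives ffmpeg itself, and how it does so is not described here. As a rough illustration of what such a stitch involves, the sketch below builds a list file for ffmpeg's concat demuxer and prints (without running) a command that joins the clips and lays one audio track over them. concat_plan and its argument order are hypothetical, and the ffmpeg options shown are one plausible choice, not necessarily what sogni-agent uses.

```shell
# concat_plan LIST OUT AUDIO CLIP...
# Hypothetical helper; NOT how sogni-agent is known to invoke ffmpeg.
# Writes a concat-demuxer list to LIST and prints an ffmpeg command (dry run).
# Limitation: paths containing single or double quotes are not escaped.
concat_plan() {
  local list="$1" out="$2" audio="$3" clip path
  shift 3
  : > "$list"
  for clip in "$@"; do
    # Use absolute paths so the list does not depend on where LIST is stored.
    case "$clip" in
      /*) path="$clip" ;;
      *)  path="$PWD/$clip" ;;
    esac
    # Concat demuxer syntax: one "file '<path>'" line per input clip.
    printf "file '%s'\n" "$path" >> "$list"
  done
  # -safe 0 permits absolute paths; video is stream-copied, audio comes from AUDIO.
  printf 'ffmpeg -f concat -safe 0 -i "%s" -i "%s" -map 0:v -map 1:a -c:v copy -shortest "%s"\n' \
    "$list" "$audio" "$out"
}
```

Stream-copying with the concat demuxer only works cleanly when the clips share codec, resolution, and frame rate, which is the usual case for clips rendered with the same model and settings.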
#Tips for getting started
- Lead with -Q, not -m. Quality presets map to known-good models, steps, and sizes. Use -m when you have a reason.
- Use --api-workflow for durable work, --api-chat for conversation. Workflows persist with event streaming and cancellation; chat is best for natural-language exploration.
- Cost-cap durable workflows. --workflow-max-cost <n> rejects starts above a capacity-unit ceiling; --confirm-cost requires explicit billing confirmation.
- Respect SSRF defaults. The CLI rejects localhost and private-network URLs before forwarding to hosted models; only public HTTPS references are forwarded as Seedance multimodal context.
- Don't generate bootstrap scripts. When upgrading from inside an agent, run npm install -g …@latest directly. Inline bash -c clone-or-pull scripts trip sandboxes and stall.
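The SSRF tip can be approximated on the agent side, so that an obviously unusable reference URL is caught before it is passed to the CLI. The sketch below is an illustration only. is_public_https_url is a hypothetical helper, it inspects just the literal host in the URL without resolving DNS, and it is not the CLI's actual check.

```shell
# is_public_https_url: hypothetical helper approximating the SSRF tip above.
# Looks only at the literal host; no DNS resolution, so it is not a complete
# SSRF defence. It errs toward rejection (e.g. any host starting with "10.").
is_public_https_url() {
  local url="$1" host
  case "$url" in
    https://*) ;;
    *) return 1 ;;              # only the https scheme is accepted
  esac
  host="${url#https://}"        # drop scheme
  host="${host%%[/?#]*}"        # drop path, query, fragment
  host="${host##*@}"            # drop userinfo
  case "$host" in
    \[*) return 1 ;;            # bracketed IPv6 literals: rejected wholesale in this sketch
  esac
  host="${host%%:*}"            # drop port
  host="$(printf '%s' "$host" | tr 'A-Z' 'a-z')"
  case "$host" in
    ''|localhost|*.localhost) return 1 ;;
    127.*|10.*|192.168.*|169.254.*|0.*) return 1 ;;        # loopback, private, link-local
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 1 ;;     # 172.16.0.0/12
  esac
  return 0
}
```

A caller might use it as a guard, for example is_public_https_url "$ref" || echo "skipping non-public reference: $ref", and leave final enforcement to the CLI.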
#See also
- Sogni n8n Plugin: the same Sogni capabilities for n8n workflows.
- Sogni Voice: local TTS/STT that pairs naturally with agent runtimes.
- Sogni Intelligence: the LLM platform behind --api-chat.
- Sogni Client SDK: the underlying SDK.
- Source on GitHub