Face and Pose Capture

Face and Pose Capture extracts a stick-figure skeleton (and, optionally, a set of face landmarks) from a reference image, then uses that skeleton to guide the composition of the generated image. The output can be a different person, an alien, a robot, or a stylized character: the pose and facial structure come from the reference, while the identity, style, and setting come from your prompt.

This is the OpenPose ControlNet, running inside Pocket. It works from a still you've selected from your library or a live camera capture.

#What it captures

Face and Pose Capture extracts two complementary signals, which you can use together or separately:

#Pose Capture

Analyzes full-body posture and produces a "bones and joints" skeleton: torso, limbs, hands, head position. This is what locks the silhouette of your output to the reference. Use it for sports, dance, action, choreography, and hero shots: anywhere the body needs to be doing a specific thing.

#Face Capture

Detects faces and marks their landmarks — eyes, eyebrows, mouth, jaw. This carries over the expression and angle of the face, but not the identity. The model still invents the face; it just constrains how that face is posed and what it's emoting. (If you want identity preservation, use Face Transfer instead.)

Both signals feed the same OpenPose ControlNet model alongside your prompt.
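
One way to picture how the two toggles shape what reaches the model is the short Python sketch below. It is illustrative only: `CaptureToggles`, `build_control_points`, the `pose/` and `face/` prefixes, and the sample coordinates are all hypothetical names and values, not Pocket's internals or OpenPose's actual output format.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point = Tuple[float, float]  # normalized (x, y) position in the reference image


@dataclass
class CaptureToggles:
    # Hypothetical stand-ins for the Pose Capture and Face Capture switches.
    pose_capture: bool = True
    face_capture: bool = True


def build_control_points(
    pose_points: Dict[str, Point],
    face_points: Dict[str, Point],
    toggles: CaptureToggles,
) -> Dict[str, Point]:
    """Merge only the enabled keypoint groups into one set of control points."""
    control: Dict[str, Point] = {}
    if toggles.pose_capture:
        control.update({f"pose/{name}": pt for name, pt in pose_points.items()})
    if toggles.face_capture:
        control.update({f"face/{name}": pt for name, pt in face_points.items()})
    return control


# Made-up sample keypoints, for illustration only.
pose = {"head": (0.50, 0.12), "torso": (0.50, 0.40), "left_hand": (0.30, 0.55)}
face = {"left_eye": (0.47, 0.10), "right_eye": (0.53, 0.10), "mouth": (0.50, 0.16)}

pose_only = build_control_points(pose, face, CaptureToggles(face_capture=False))
both = build_control_points(pose, face, CaptureToggles())

print(sorted(pose_only))  # body keypoints only
print(len(both))          # body and face keypoints together
```

The point of the sketch is that both groups end up in a single control signal; turning a toggle off simply leaves that group out, so the model is free to invent that part.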

#How to start

1. Open Advanced Guidance and tap Pose & Facial Expressions.
2. Choose a reference: pick an image from your library or use the live camera.
3. Toggle Face Capture and/or Pose Capture depending on what you want to keep.
4. Write a prompt for the new subject and style.
5. Generate.

#When to use each

You want to keep	Turn on
Body posture only	Pose
Expression and head angle only	Face
Both — a full performance	Both

Use Pose alone when the subject shouldn't have a human face at all (a robot, a mech, an animal in a human stance).

#Prompt tips

Describe the subject, not the pose. The skeleton already encodes the pose. Spend tokens on who and what — "a samurai in storm-lit rain, ink wash style."
If unwanted human faces keep showing up, add "human face" to the Avoid prompt field. Pose Capture can introduce faces even when you haven't asked for them.
Lower ControlNet strength to about 75% to give the model more freedom with the body — useful when the reference pose is anatomically extreme.
Match camera framing in the prompt — "full body shot," "from above," "low angle." The skeleton doesn't carry the lens.
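
The tips above can be summarized as a small settings sketch in Python. Everything here is hypothetical: `GenerationSettings`, `build_settings`, and the field names only mirror the controls described in this section, and the assumption that the default strength is 100% (1.0) is mine, not something Pocket documents here.

```python
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    # Hypothetical field names mirroring the prompt, Avoid prompt, and strength controls.
    prompt: str
    avoid_prompt: str = ""
    control_strength: float = 1.0  # assumed default: follow the skeleton fully


def build_settings(
    subject: str,
    framing: str = "",
    avoid_faces: bool = False,
    extreme_pose: bool = False,
) -> GenerationSettings:
    """Apply the prompt tips: describe the subject, state the framing, relax if needed."""
    # The prompt names the subject and the camera framing, never the pose itself.
    prompt = f"{subject}, {framing}" if framing else subject
    return GenerationSettings(
        prompt=prompt,
        # Push unwanted faces out through the Avoid prompt.
        avoid_prompt="human face" if avoid_faces else "",
        # About 75% gives the model more freedom with an anatomically extreme pose.
        control_strength=0.75 if extreme_pose else 1.0,
    )


settings = build_settings(
    "a samurai in storm-lit rain, ink wash style",
    framing="full body shot, low angle",
    extreme_pose=True,
)
print(settings)
```

Keeping the framing as a separate argument is a reminder that the skeleton carries the pose but not the lens, so the framing has to be stated in words.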

#Workflows

Celebrity pose, different subject. Reference a famous magazine cover or movie still, capture the pose, prompt for a completely different character. Common for fan art and album covers.

Choreography reference. Capture a dancer mid-move from a photo; prompt for a stylized character to repeat the move. Good for storyboarding music videos.

Sports and action. A snapshot of an athlete becomes a comic-book superhero, a video-game character, or a fantasy warrior in the same moment of motion.

Consistent character poses. Use the same skeleton across several generations to build a turn-around or pose sheet, then re-prompt for variations.
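
As a rough sketch of the pose-sheet workflow, the Python below pairs one captured skeleton with several prompt variations. `Skeleton`, `Job`, and `pose_sheet_jobs` are hypothetical names used for illustration, and the keypoint values are made up; in the app you would reuse the same reference and re-prompt by hand.

```python
from dataclasses import dataclass
from typing import List, Tuple

Keypoint = Tuple[float, float]  # normalized (x, y) position


@dataclass(frozen=True)
class Skeleton:
    # Frozen so the same captured pose cannot be altered between generations.
    keypoints: Tuple[Keypoint, ...]


@dataclass
class Job:
    skeleton: Skeleton
    prompt: str


def pose_sheet_jobs(skeleton: Skeleton, character: str, variations: List[str]) -> List[Job]:
    """Pair one fixed skeleton with several prompt variations."""
    return [Job(skeleton, f"{character}, {variation}") for variation in variations]


# Made-up keypoints standing in for a captured pose.
skeleton = Skeleton(keypoints=((0.5, 0.1), (0.5, 0.4), (0.3, 0.6), (0.7, 0.6)))

jobs = pose_sheet_jobs(
    skeleton,
    "a fantasy warrior",
    ["ink wash style", "comic-book style", "video-game render"],
)
for job in jobs:
    print(job.prompt)
```

Only the prompt changes from job to job; the skeleton object is shared, which is what keeps the pose consistent across the sheet.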

#Tips

One clear subject in the reference works best. Crowds confuse skeleton detection.
High-contrast, well-lit references give cleaner skeletons.
Pose and Face can run together — they are not exclusive.
If hands look broken, that's expected: hand keypoints are the noisiest part of OpenPose. You can end the prompt with "detailed hands" or simply accept the limitation; both are valid.