How to Use Multiple ControlNets to Generate a Single Image

A frequently requested feature is the ability to assign more than one ControlNet reference image and model. This would let you, for example, use a depth map to guide the background and an OpenPose skeleton to pose the subject.

We haven't released this yet, mainly because of memory management: loading multiple models simultaneously is resource-intensive. We are getting closer, though.

The following steps achieve the same effect by replicating what would happen if multiple models were loaded at once. The process is slower, but it gives you more control along the way:

1. (Optional) Set a Guide image.

2. Drag the image you want to use for the Depth map into the ControlNet model.

3. Enable the Depth preprocessor to generate the depth map and select the Depth CN model.

4. Click "Imagine" to generate the image.

5. (Optional) Adjust your prompt, steps, guidance, and other settings and regenerate until you are happy with the result.

6. Set the generated image as the Guide Image and set its Strength value to 50% for a balanced result. The lower the Guide Image Strength, the more the next ControlNet can affect the image.

7. Set the image that contains the pose you would like the subject to have as the new ControlNet reference image.

8. Select the Face/Pose Capture preprocessors to generate a skeleton and/or face map from the reference image. Additionally, select the OpenPose CN model.

9. Click "Imagine" to generate the image.

This process also gives you extra control by allowing you to use different prompts in each "layer" and adjust any other settings.
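
The steps above can be sketched as a chain of passes, where each pass uses the previous result as its Guide Image. This is only an illustration of the workflow, not an API of the app: the `Layer` fields, `run_layers`, the file names, the prompts, and the `fake_generate` stand-in for clicking "Imagine" are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Layer:
    """Settings for one generation pass (all field names are hypothetical)."""
    prompt: str
    control_image: str            # ControlNet reference image
    preprocessor: str             # e.g. "depth" or "pose"
    control_model: str            # e.g. "depth" or "openpose"
    guide_image: Optional[str] = None
    guide_strength: float = 0.0   # 0.0 to 1.0; 0.5 corresponds to 50%


def run_layers(layers: List[Layer], generate: Callable[[Layer], str]) -> List[str]:
    """Run each layer in order, using the previous output as the next Guide Image."""
    outputs: List[str] = []
    for layer in layers:
        if outputs:
            # Steps 6-8: the previous result becomes the Guide Image.
            layer = replace(layer, guide_image=outputs[-1])
        outputs.append(generate(layer))
    return outputs


def fake_generate(layer: Layer) -> str:
    """Stand-in for clicking "Imagine"; returns a label instead of an image."""
    base = layer.guide_image or "noise"
    return f"{base}+{layer.control_model}"


# Illustrative values only: a depth pass for the background, then a pose pass.
layers = [
    Layer("a forest clearing", "background.png", "depth", "depth"),
    Layer("a dancer in a forest clearing", "pose.png", "pose", "openpose",
          guide_strength=0.5),
]
results = run_layers(layers, fake_generate)
print(results[-1])
```

Because each layer carries its own prompt and settings, the sketch also shows where per-layer adjustments would go: change a field on one `Layer` without touching the others.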
