ComfyStream is a Livepeer-maintained extension of ComfyUI that swaps batch image generation for a real-time video loop. A ComfyUI workflow that takes an image and returns an image becomes a live-video-to-video pipeline when run through ComfyStream: video frames flow in over WebRTC, the workflow processes each frame, and transformed frames flow back out at sub-second latency.

Phase 4 (January 2026) hardened ComfyStream for production. The runtime added audio processing, data-channel output, dynamic workflow warm-up, and PyTrickle-based BYOC packaging. Daydream and Embody both run on ComfyStream infrastructure.

The canonical install reference is docs.comfystream.org. The repository is livepeer/comfystream. The current Docker image is livepeer/comfystream, with livepeer/comfyui-base:stable as the BYOC base.

Pipeline Modes

ComfyStream workflows produce one of four output types. Every workflow declares its output mode through the nodes it composes.
| Mode | Input | Output | Representative node |
| --- | --- | --- | --- |
| Image-to-image (live) | Live video frames | Transformed video frames | StreamDiffusionSampler |
| Video-to-video | Video segment | Processed video | StreamDiffusion V2 |
| Audio processing | Audio track from stream | Audio (pass-through or transformed) | LoadAudioTensor |
| Data-channel output | Audio or video frames | Structured text alongside video | AudioTranscription + data output node |

A single ComfyStream container can host multiple pipelines (Phase 4 addition). Dynamic warm-up loads new workflows mid-stream without restarting the server, which lets one orchestrator advertise multiple capabilities from one image.

Node Ecosystem

ComfyStream uses standard ComfyUI custom nodes. Any node that executes per-frame without maintaining incompatible state runs in a real-time workflow.

Core I/O Nodes

Required for every ComfyStream workflow. They handle the real-time tensor handoff between the stream and the ComfyUI graph.

| Node | Source | Purpose |
| --- | --- | --- |
| LoadTensor | livepeer/comfystream | Load a video frame tensor from the live stream |
| LoadAudioTensor | livepeer/comfystream | Load an audio frame tensor for audio-aware processing |
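
As a minimal sketch of the requirement above, the following check scans an API-format workflow (a mapping of node IDs to objects with a `class_type` field) for the core input nodes named in the table. The helper name `find_core_inputs`, the `HypotheticalEffect` node, and the empty `inputs` shown for `LoadTensor` are illustrative assumptions, not part of ComfyStream.

```python
# Node class names taken from the table above.
CORE_INPUT_NODES = {"LoadTensor", "LoadAudioTensor"}


def find_core_inputs(workflow: dict) -> list[str]:
    """Return the sorted class names of core input nodes found in the workflow."""
    found = []
    for node in workflow.values():
        # Each API-format node is expected to be a dict with a "class_type" key.
        class_type = node.get("class_type") if isinstance(node, dict) else None
        if class_type in CORE_INPUT_NODES:
            found.append(class_type)
    return sorted(found)


# Illustrative workflow: "HypotheticalEffect" is a made-up node name, and the
# input wiring is an assumption for demonstration only.
example = {
    "1": {"class_type": "LoadTensor", "inputs": {}},
    "2": {"class_type": "HypotheticalEffect", "inputs": {"image": ["1", 0]}},
}

print(find_core_inputs(example))
```

An empty result suggests the workflow has no stream input and would need one of the core nodes added before it can run in real time.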

Real-Time Control Nodes

These nodes update their output on every workflow execution, which makes them suitable for animating parameters across a continuous stream.

| Node | Source | Purpose |
| --- | --- | --- |
| FloatControl | ComfyUI_RealtimeNodes | Outputs a float that changes over time (sine, bounce, random) |
| IntControl | ComfyUI_RealtimeNodes | Same as FloatControl for integer values |
| StringControl | ComfyUI_RealtimeNodes | Cycles through a list of strings per frame |
| FloatSequence | ComfyUI_RealtimeNodes | Cycles through comma-separated float values |
| IntSequence | ComfyUI_RealtimeNodes | Cycles through comma-separated integer values |
| Motion detection nodes | ComfyUI_RealtimeNodes | Detect motion between frames; can trigger parameter changes |
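
To illustrate the idea of a value that advances on every workflow execution, here is a conceptual sketch in plain Python. It is not the ComfyUI_RealtimeNodes implementation; the function names and the `minimum`, `maximum`, and `period` parameters are assumptions chosen for the example.

```python
import math
from itertools import cycle


def sine_control(minimum: float, maximum: float, period: int):
    """Yield one float per execution, sweeping between minimum and maximum.

    `period` is the number of executions (frames) in one full sine cycle.
    """
    midpoint = (minimum + maximum) / 2
    amplitude = (maximum - minimum) / 2
    step = 0
    while True:
        yield midpoint + amplitude * math.sin(2 * math.pi * step / period)
        step += 1


def float_sequence(csv_values: str):
    """Cycle forever through comma-separated float values."""
    values = [float(v) for v in csv_values.split(",") if v.strip()]
    if not values:
        raise ValueError("expected at least one float value")
    return cycle(values)


# One value is consumed per frame, so the parameter animates across the stream.
strength = sine_control(0.0, 1.0, period=4)
steps = float_sequence("0.2, 0.5, 0.8")
for _ in range(4):
    print(round(next(strength), 3), next(steps))
```

Because each generator holds only a step counter or a position in a list, the per-frame cost is negligible next to model inference.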

StreamDiffusion Nodes (Phase 4)

The primary generative video nodes. Ported from Daydream’s StreamDiffusion pipeline.

| Node | Purpose |
| --- | --- |
| StreamDiffusionCheckpoint | Loads a StreamDiffusion checkpoint model. Use with SD1.5 or SDXL |
| StreamDiffusionConfig | Configures CFG, t-index, acceleration mode |
| StreamDiffusionSampler | Runs StreamDiffusion inference per frame |
| StreamDiffusionLPCheckpointLoader | Alternative checkpoint loader for Livepeer-hosted models |
| StreamDiffusionTensorRTEngineLoader | Loads a pre-compiled TensorRT engine. Not compatible with all ControlNets |

StreamDiffusion V2 adds video-to-video mode and stable diffusion V2 base models.

Phase 4 Additions

  • SuperResolution. Real-time video upscaling. Input: standard-resolution frame. Output: upscaled frame.
  • AudioTranscription. Whisper-based real-time speech transcription. Two output modes: SRT subtitles burned into video, or text delivered to the application via WebRTC data channel.
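
For readers unfamiliar with the SRT subtitle format mentioned above, the following sketch formats a transcribed segment as one SRT cue (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, then text). It illustrates the file format only; how the AudioTranscription node builds its cues internally is not described here, and the function names are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: index line, time range line, then the subtitle text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


print(srt_cue(1, 0.0, 2.25, "hello from the stream"))
```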

Workflow Format

ComfyStream requires workflows in ComfyUI API format, not the default save format. The default ComfyUI export includes layout metadata that ComfyStream does not parse. To export a workflow in API format:
  1. Enable Developer Mode in ComfyUI settings.
  2. Use Save (API Format) to produce the JSON file.
Workflows saved in the default format will not load correctly. API format is the only supported input.

Place the workflow file in your ComfyStream workspace’s workflows/ directory. For Docker deployments, mount this directory as a volume. The canonical workspace layout is in docs.comfystream.org.

When the workflow loads in the ComfyStream UI, the server compiles TensorRT engines for the relevant nodes. First run takes between two and ten minutes depending on the model and the GPU. Subsequent loads skip compilation.
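
Since the two export formats are easy to confuse, a quick pre-flight check can catch a default-format file before it reaches the server. The sketch below relies on the general shape of ComfyUI exports: API format is a JSON object mapping node IDs to objects with `class_type` and `inputs`, while the default save carries top-level layout keys such as `nodes` and `links`. The helper name `is_api_format` is an assumption, and this heuristic is not ComfyStream's own validation.

```python
import json


def is_api_format(workflow) -> bool:
    """Heuristically decide whether parsed workflow JSON is in ComfyUI API format."""
    if not isinstance(workflow, dict) or not workflow:
        return False
    # Every top-level value must look like an API-format node.
    return all(
        isinstance(node, dict) and "class_type" in node and "inputs" in node
        for node in workflow.values()
    )


api_json = '{"1": {"class_type": "LoadTensor", "inputs": {}}}'
default_json = '{"last_node_id": 1, "nodes": [], "links": []}'

print(is_api_format(json.loads(api_json)))
print(is_api_format(json.loads(default_json)))
```

Running the same check over every file in the workflows/ directory before starting the container is one way to fail fast on a bad export.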

Data-Channel Output

Phase 4 added a structured-text output path that runs alongside video. The ComfyStream WebRTC connection now carries a data channel, and workflows containing a data output node emit text over it to the browser or application connected to the server. Use cases:
  • Real-time audio transcription delivered as text to a downstream application
  • Frame-level metadata (object labels, confidence scores) delivered to an overlay UI
  • Any workflow where the output is data, not video
To receive data-channel output from a browser client, use , which handles WebRTC video streaming and the data channel from the same connection.
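
On the receiving side, an application has to turn each data-channel message into something it can use. The payload schema is not specified here, so the sketch below makes a loud assumption: messages arrive either as JSON objects or as plain text, and both are normalized to a dict with at least a `text` key. The function name and the `start` field in the example are hypothetical.

```python
import json


def decode_data_message(raw) -> dict:
    """Normalize one data-channel message (str or bytes) into a dict.

    Assumption: JSON objects are passed through as-is; anything else is
    treated as plain text and wrapped as {"text": ...}.
    """
    text = raw.decode("utf-8") if isinstance(raw, (bytes, bytearray)) else raw
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return {"text": text}
    return payload if isinstance(payload, dict) else {"text": text}


print(decode_data_message('{"text": "hello", "start": 1.0}'))
print(decode_data_message(b"plain transcript line"))
```

A handler like this would typically be called from the WebRTC library's data-channel message callback, keeping transport code separate from application logic.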

Performance Characteristics

ComfyStream compiles TensorRT engines and runs torch.compile on model components at first run. This is a one-time cost per workflow on each machine.

| Operation | Duration | Frequency |
| --- | --- | --- |
| TensorRT compilation | 2-10 minutes | First run per machine, per workflow |
| torch.compile (ControlNet, VAE) | On first frame | First frame per session |
| Subsequent workflow loads | Immediate | All later runs |

Achievable frame rate depends on model complexity, GPU, and image resolution. Reference figures from community testing on an RTX 4090:
  • SD1.5 + DMD one-step + DepthControlNet workflow: 14-15 fps at 640x360 input
  • StreamDiffusion with TensorRT: higher throughput at the same resolution
Frame rates vary substantially with LoRA stack and ControlNet load. Test under expected concurrency before production launch.
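
When testing, it helps to translate frame rate into a per-frame time budget: at 14-15 fps, each frame has roughly 67-71 ms for the whole workflow. The sketch below shows that arithmetic and a simple way to compute achieved fps from frame arrival timestamps. The function names are illustrative, and how timestamps are collected is left to the test harness.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def measured_fps(frame_times: list[float]) -> float:
    """Average fps from monotonic arrival timestamps (in seconds)."""
    if len(frame_times) < 2:
        raise ValueError("need at least two timestamps")
    elapsed = frame_times[-1] - frame_times[0]
    # N timestamps span N - 1 frame intervals.
    return (len(frame_times) - 1) / elapsed


print(round(frame_budget_ms(15), 1))  # 66.7
print(round(frame_budget_ms(14), 1))  # 71.4
print(measured_fps([0.0, 0.5, 1.0, 1.5, 2.0]))  # 2.0
```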

Hardware Requirements

ComfyStream requires an NVIDIA GPU. The server component runs on Linux only; Windows and macOS are not supported for the server, though the browser client runs anywhere.

| Workload | Minimum VRAM | Recommended |
| --- | --- | --- |
| Real-time AI (ComfyStream) | 12 GB | 16 GB+ |

VRAM headroom matters for stability. A workflow that runs at 12 GB may stutter under load that 16 GB absorbs cleanly. Source: the . CUDA 12.0+ is required. Current ComfyStream releases target CUDA 12.8 with NVIDIA driver 570.124.06 or later.
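
A host can be screened against these figures before deployment. The sketch below parses one line of `nvidia-smi --query-gpu=memory.total,driver_version --format=csv,noheader,nounits` output and compares it with the 12 GB minimum and the 570.124.06 driver that current releases target. Treating that driver version as a hard minimum is an assumption made for this example, and the function names are hypothetical.

```python
import subprocess

MIN_VRAM_GB = 12            # minimum VRAM from the table above
MIN_DRIVER = (570, 124, 6)  # driver targeted by current releases (assumed floor)
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.total,driver_version",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line: str) -> tuple[int, tuple[int, ...]]:
    """Parse 'memory_mib, driver_version' into (vram_gb, driver_tuple)."""
    memory_mib, driver = [field.strip() for field in line.split(",")]
    # Cards often report slightly under a round figure (e.g. 12282 MiB),
    # so round to the nearest whole GiB.
    vram_gb = round(int(memory_mib) / 1024)
    return vram_gb, tuple(int(part) for part in driver.split("."))


def meets_requirements(line: str) -> bool:
    """True if the GPU described by this line meets both thresholds."""
    vram_gb, driver = parse_gpu_line(line)
    return vram_gb >= MIN_VRAM_GB and driver >= MIN_DRIVER


def check_local_gpus() -> list[bool]:
    """Run nvidia-smi on this host and check every GPU it reports."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return [meets_requirements(l) for l in result.stdout.splitlines() if l.strip()]


print(meets_requirements("24564, 570.124.06"))
```

Keeping the parsing separate from the `subprocess` call lets the thresholds be tested on a machine without a GPU.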

Relationship to BYOC

ComfyStream is itself BYOC-compatible. Phase 4 integrated ComfyStream with PyTrickle, which means the livepeer/comfystream Docker image can register directly as a BYOC capability on an orchestrator without rewriting the workflow as a custom container.

| If you want to | Use |
| --- | --- |
| Run a ComfyUI workflow as a real-time pipeline | ComfyStream directly |
| Run a custom Python model that isn’t a ComfyUI workflow | |
| Run multiple ComfyStream workflows from one orchestrator | ComfyStream’s multi-pipeline mode (Phase 4) |
| Earn fees from your ComfyStream instance | Register as a BYOC capability on go-livepeer |

The ComfyStream quickstart gets you to a working pipeline in under 30 minutes. Start there.

Next Steps

ComfyStream Quickstart

Docker, RunPod, or local install. First real-time AI effect on a webcam in fifteen minutes.

Workflow Authoring

Build a custom workflow, configure StreamDiffusion, tune for latency.

ComfyStream as BYOC

Register a ComfyStream instance as a BYOC capability and earn fees.

docs.comfystream.org

Canonical install reference, hardware deep-dive, troubleshooting.
Last modified on May 19, 2026