Production hardening checklist

Gateway selection

| Environment | Gateway | Notes |
| --- | --- | --- |
| Development / testing | `dream-gateway.livepeer.cloud` | Free, no API key required, unpredictable latency, no SLA |
| Production (managed) | Gateway provider with API key | Authenticated, rate-limited, provider-specific SLA |
| Production (self-hosted) | go-livepeer in broadcaster mode | Full control over pricing, routing, and orchestrator selection. Requires ETH deposit on Arbitrum One. |

Switch to a production gateway before launch. Do not ship user-facing applications on the community gateway (`dream-gateway.livepeer.cloud`).

Authentication

API key stored in environment variables or secrets manager, never in source code
Backend API key used for server-side requests; CORS-enabled key for browser-side
Key rotation schedule set (90-day recommended)
.env files excluded from version control via .gitignore
For self-hosted gateways: Ethereum keystore secured, TicketBroker deposit funded
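The first item in the list above can be sketched as a small loader that reads the key from the environment and fails fast when it is missing. This is a minimal sketch, not a required pattern: the variable name `LIVEPEER_API_KEY`, the `MissingApiKeyError` class, and the `Bearer` header scheme are assumptions for illustration; check your gateway provider's documentation for the exact header it expects.

```python
import os
from typing import Mapping, Optional


class MissingApiKeyError(RuntimeError):
    """Raised when the gateway API key is not configured (hypothetical helper)."""


def load_api_key(env_var: str = "LIVEPEER_API_KEY",
                 environ: Optional[Mapping[str, str]] = None) -> str:
    """Read the API key from the environment instead of source code.

    `env_var` is an assumed name; use whatever your deployment defines.
    `environ` can be injected for testing and defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    if not key:
        # Fail at startup rather than sending unauthenticated requests.
        raise MissingApiKeyError(f"{env_var} is not set")
    return key


def auth_headers(api_key: str) -> dict:
    """Build request headers; the Bearer scheme is an assumption."""
    return {"Authorization": f"Bearer {api_key}"}
```

Calling `load_api_key()` once at process start surfaces a missing key as a configuration error before any request is made.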

Error handling

401 handler: log and surface a configuration error; do not retry (the key is wrong)
422 handler: log the full response body to identify the failing field; fix the request shape
503 handler: retry with exponential backoff; cold model load is expected behaviour, not a failure
429 handler: back off and retry after the rate limit window resets
500 handler: retry once; if persistent, check model_id, input dimensions, and file format
Global timeout set on all requests (recommended: 300 seconds for cold model scenarios)
SDK retry configuration enabled with exponential backoff
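The handlers above can be combined into one retry loop. The following is a sketch under stated assumptions: `Response`, `GatewayError`, `call_with_retries`, and the retry limits of five attempts for 503 and 429 are hypothetical choices, and `send` stands in for whatever HTTP client or SDK call the application uses. Only the retry-once rule for 500 and the no-retry rule for 401 and 422 come from the checklist.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Response:
    """Minimal stand-in for an HTTP response (hypothetical type)."""
    status: int
    body: str = ""
    retry_after: Optional[float] = None  # seconds, if the gateway sent a hint


class GatewayError(Exception):
    def __init__(self, response: Response):
        super().__init__(f"gateway returned {response.status}: {response.body}")
        self.response = response


# Retries allowed per status. 401 and 422 are absent, so they never retry.
# The value 5 for 503/429 is an assumption; 500 retries once per the checklist.
MAX_RETRIES: Dict[int, int] = {503: 5, 429: 5, 500: 1}


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: base * 2**attempt, capped at `cap` seconds."""
    return min(cap, base * 2 ** attempt)


def call_with_retries(send: Callable[[], Response],
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it succeeds or the retry budget for a status runs out."""
    retries_used: Dict[int, int] = {}
    while True:
        response = send()
        if response.status < 400:
            return response
        used = retries_used.get(response.status, 0)
        if used >= MAX_RETRIES.get(response.status, 0):
            # Log response.body here; for 422 it identifies the failing field.
            raise GatewayError(response)
        delay = backoff_delay(used)
        if response.status == 429 and response.retry_after is not None:
            delay = response.retry_after  # wait for the rate limit window
        sleep(delay)
        retries_used[response.status] = used + 1
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. Adding random jitter to `backoff_delay` is a common refinement when many clients retry at once.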

Model selection (AI applications)

Warm models respond immediately. Cold models take 30 seconds to 5 minutes to load. Use warm models for latency-sensitive paths:

| Pipeline | Warm model |
| --- | --- |
| text-to-image | `SG161222/RealVisXL_V4.0_Lightning` |
| image-to-image | `timbrooks/instruct-pix2pix` |
| audio-to-text | `openai/whisper-large-v3` |
| image-to-text | `Salesforce/blip-image-captioning-large` |
| LLM | `meta-llama/Meta-Llama-3.1-8B-Instruct` |

Custom model_id values cold-start on every request until an orchestrator warms them. Test custom models under expected load before launch.
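One way to enforce the warm-model rule in application code is a lookup that latency-sensitive paths must go through. This is an illustrative sketch: `choose_model` and its parameters are hypothetical, and the mapping simply restates the table above, so it needs updating whenever the set of warm models changes.

```python
from typing import Optional

# Warm models per pipeline, copied from the table above.
WARM_MODELS = {
    "text-to-image": "SG161222/RealVisXL_V4.0_Lightning",
    "image-to-image": "timbrooks/instruct-pix2pix",
    "audio-to-text": "openai/whisper-large-v3",
    "image-to-text": "Salesforce/blip-image-captioning-large",
    "LLM": "meta-llama/Meta-Llama-3.1-8B-Instruct",
}


def choose_model(pipeline: str,
                 custom_model_id: Optional[str] = None,
                 latency_sensitive: bool = True) -> str:
    """Pick a model_id for a request (hypothetical helper).

    Latency-sensitive paths always get the warm model. Other paths may
    use a custom model and accept the cold-start delay.
    """
    warm = WARM_MODELS.get(pipeline)
    if latency_sensitive:
        if warm is None:
            raise ValueError(f"no warm model known for pipeline {pipeline!r}")
        return warm
    model = custom_model_id or warm
    if model is None:
        raise ValueError(f"no model available for pipeline {pipeline!r}")
    return model
```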

Video applications

Transcoding profiles defined for all target renditions (resolution, bitrate, FPS)
Webhook endpoint configured and signature verification implemented (Livepeer-Signature header)
Access control policy set on streams and assets (JWT or webhook playback policy)
Recording enabled if VOD archival is required (record: true on stream creation)
Player tested across target browsers (HLS fallback for non-WebRTC environments)
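Webhook signature verification from the list above could look like the sketch below. The header layout (`t=<timestamp>,v1=<hex digest>`) and the signing scheme (HMAC-SHA256 over the raw request body with the webhook's shared secret) are assumptions here, not confirmed by this page; verify both against the webhook documentation before relying on this. The function names are hypothetical.

```python
import hashlib
import hmac


def parse_signature_header(header: str) -> dict:
    """Split a header such as 't=123,v1=abc' into {'t': '123', 'v1': 'abc'}.

    The comma-separated key=value layout is an assumption.
    """
    parts = {}
    for item in header.split(","):
        key, sep, value = item.strip().partition("=")
        if sep:
            parts[key] = value
    return parts


def verify_webhook(raw_body: bytes, header: str, secret: str) -> bool:
    """Return True if the header's v1 value matches HMAC-SHA256(secret, raw_body).

    Assumes the signature covers the unmodified request body bytes.
    """
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    provided = parse_signature_header(header).get("v1", "")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), provided.encode())
```

Verify against the raw bytes received, before any JSON parsing, since re-serialising the payload can change the bytes and invalidate the signature.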

BYOC applications

Container /health endpoint returns {"status": "ok"} under load
Container registered in orchestrator’s aiModels.json with correct pipeline, model_id, and price_per_unit
Container handles graceful shutdown on SIGTERM
Container tested against local orchestrator before registering on mainnet
GPU memory usage profiled under concurrent session load
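The health endpoint and SIGTERM items above can be sketched with the Python standard library alone. This is a minimal illustration, not a BYOC reference implementation: the port, the 404 body, and the names `health_payload`, `HealthHandler`, and `serve` are assumptions, and a real container would serve inference routes alongside `/health`.

```python
import json
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def health_payload(path: str):
    """Return (status, body) for a GET request; only /health is handled."""
    if path == "/health":
        return 200, json.dumps({"status": "ok"}).encode()
    return 404, json.dumps({"error": "not found"}).encode()  # assumed body


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = health_payload(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run until SIGTERM, then stop accepting and let in-flight requests finish."""
    server = ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
    # Non-daemon handler threads are joined by server_close(), so requests
    # already in progress complete before the process exits.
    server.daemon_threads = False

    def on_sigterm(signum, frame):
        # shutdown() blocks until serve_forever() returns, so it must run
        # in a different thread from the one executing serve_forever().
        threading.Thread(target=server.shutdown, daemon=True).start()

    signal.signal(signal.SIGTERM, on_sigterm)
    server.serve_forever()
    server.server_close()
```

Running `serve()` as the container entry point keeps `/health` answering from its own threads while other work is in progress, and lets an orchestrator stop the container without cutting off active requests.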

Cost estimation

AI inference pricing is orchestrator-set and denominated in wei. Indicative rate for text-to-image: approximately $0.019 per megapixel of output. Illustrative example for text-to-image at 1024x1024:

1024 x 1024 = 1,048,576 pixels ≈ 1.05 megapixels
At $0.019/MP: approximately $0.020 per image
At 1,000 images/day: approximately $20/day

Actual rates vary by orchestrator and pipeline. Monitor real costs after the first week of production traffic to calibrate estimates. Video transcoding is priced in wei per pixel across all output renditions.
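The text-to-image arithmetic above can be reproduced with a short helper, which is useful for re-running the projection at other resolutions or volumes. The default rate is the indicative figure quoted in this section, not a fixed price, and `image_cost_usd` is a hypothetical name.

```python
def image_cost_usd(width: int, height: int,
                   usd_per_megapixel: float = 0.019) -> float:
    """Estimate the cost of one output image.

    The default rate is the indicative figure from this section; replace it
    with rates observed from your orchestrators.
    """
    megapixels = width * height / 1_000_000
    return megapixels * usd_per_megapixel


per_image = image_cost_usd(1024, 1024)  # about 0.0199 USD
per_day = per_image * 1_000             # about 19.92 USD at 1,000 images/day
```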

Monitoring

Log HTTP status codes for all gateway requests
Track p50, p95, p99 latency per pipeline
Alert on sustained 503 rate above 5% (warm model unavailability)
Alert on sustained 5xx rate above 1% (inference failures)
For self-hosted gateways: Prometheus metrics enabled (-metrics flag) and scraped
TicketBroker deposit balance monitored (for self-hosted gateways)
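The latency and error-rate items above can be sketched as two small functions over one window of logged requests. Several things here are assumptions: the nearest-rank percentile method, the reading of the 5xx threshold as covering 5xx codes other than 503, and the function names. "Sustained" is not modelled; a real alert would require the condition to hold over several consecutive windows.

```python
import math
from typing import List, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of latency samples (pct in 0..100)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_error_rates(status_codes: Sequence[int],
                      max_503_rate: float = 0.05,
                      max_other_5xx_rate: float = 0.01) -> List[str]:
    """Return alert messages for one window of HTTP status codes.

    Assumption: 503 is counted separately from the remaining 5xx codes.
    """
    total = len(status_codes)
    if total == 0:
        return []
    rate_503 = sum(1 for s in status_codes if s == 503) / total
    rate_5xx = sum(1 for s in status_codes
                   if 500 <= s <= 599 and s != 503) / total
    alerts = []
    if rate_503 > max_503_rate:
        alerts.append(f"503 rate {rate_503:.1%}: warm model unavailability")
    if rate_5xx > max_other_5xx_rate:
        alerts.append(f"5xx rate {rate_5xx:.1%}: inference failures")
    return alerts
```

In practice these calculations usually live in the metrics system rather than application code; the sketch only makes the thresholds concrete.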

Launch readiness

Production gateway configured (not dream-gateway.livepeer.cloud)
API key in secrets manager, not in code
Retry policy implemented with exponential backoff
All error types (401, 422, 503, 429, 500) handled
Warm models used for latency-sensitive paths (AI applications)
Cost projection completed
Monitoring and alerting configured
Incident response contact identified: Livepeer Discord #builders

See job debugging for the error diagnosis flow when production issues arise.

Last modified on May 19, 2026
