Live SSAI Mode
Real-time ad insertion for live streams with strict latency requirements, sliding window management, and resilience patterns.
Latency Requirements
Live SSAI operates under strict timing constraints. The system must detect ad break signals, run decisioning, and stitch ad segments before the player requests them.
Latency Budget (per Ad Break)
| Stage | Target | p99 Max | Notes |
|---|---|---|---|
| Signal Detection | <10ms | 50ms | SCTE-35/DATERANGE parsing |
| Pod Decisioning | <100ms | 200ms | Partner requests + filtering |
| Manifest Stitching | <20ms | 50ms | Segment list + discontinuities |
| Total | <130ms | 300ms | Before break start |
Deadline Enforcement
The DecisionSLAEnforcer tracks the time remaining until break start. If less than 200ms remains, the system enters "safe mode": partner fanout is reduced and slate (fallback) content is served, rather than risking a late decision that causes a black screen.
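The deadline check described above can be sketched as follows. This is a minimal illustration, not the actual DecisionSLAEnforcer API: the class name BreakDeadline, its fields, and the millisecond clock are assumptions. Only the 200ms threshold comes from the text.

```python
from dataclasses import dataclass

# From the text: less than 200ms remaining before break start triggers safe mode.
SAFE_MODE_THRESHOLD_MS = 200.0


@dataclass
class BreakDeadline:
    """Hypothetical helper that tracks time left until an ad break starts."""

    break_start_ms: float  # break start, on the same clock as now_ms

    def remaining_ms(self, now_ms: float) -> float:
        # Negative once the break start has already passed.
        return self.break_start_ms - now_ms

    def in_safe_mode(self, now_ms: float) -> bool:
        # Safe mode covers both "deadline close" and "deadline missed".
        return self.remaining_ms(now_ms) < SAFE_MODE_THRESHOLD_MS


deadline = BreakDeadline(break_start_ms=1000.0)
print(deadline.in_safe_mode(now_ms=700.0))  # 300ms left: normal decisioning
print(deadline.in_safe_mode(now_ms=850.0))  # 150ms left: safe mode
```

A caller would check in_safe_mode() before fanning out to partners and fall back to slate when it returns True.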
Sliding Window Management
Live manifests maintain a sliding window of segments. As new segments are added, old segments slide out. The SSAI system handles this correctly even when ad breaks span window boundaries.
HLS Live Window
- Window size: liveWindowSize segments (default: 6)
- EXT-X-MEDIA-SEQUENCE increments monotonically
- EXT-X-DISCONTINUITY-SEQUENCE tracks sliding discontinuities
- No EXT-X-PLAYLIST-TYPE for live streams
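To illustrate how the two sequence counters move as the window slides, here is a small sketch. The Segment and LiveWindow names are hypothetical and do not reflect the product's internal types. The rule shown is the HLS one: the media sequence advances for every segment removed, and the discontinuity sequence advances whenever a segment carrying an EXT-X-DISCONTINUITY tag leaves the window.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Segment:
    uri: str
    discontinuity: bool = False  # True if EXT-X-DISCONTINUITY precedes this segment


@dataclass
class LiveWindow:
    """Hypothetical sliding window; size mirrors liveWindowSize (default 6)."""

    size: int = 6
    media_sequence: int = 0          # value for EXT-X-MEDIA-SEQUENCE
    discontinuity_sequence: int = 0  # value for EXT-X-DISCONTINUITY-SEQUENCE
    segments: deque = field(default_factory=deque)

    def append(self, segment: Segment) -> None:
        self.segments.append(segment)
        # Drop the oldest segments until the window is back at its size.
        while len(self.segments) > self.size:
            removed = self.segments.popleft()
            self.media_sequence += 1
            if removed.discontinuity:
                # The discontinuity tag left the playlist with its segment.
                self.discontinuity_sequence += 1


window = LiveWindow()
for i in range(1, 7):
    window.append(Segment(f"C{i}.ts"))
for i in range(1, 5):
    # The first ad segment is marked as a discontinuity boundary.
    window.append(Segment(f"A{i}.ts", discontinuity=(i == 1)))
print(window.media_sequence, [s.uri for s in window.segments])
```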
DASH Live Window
- type="dynamic" MPD
- timeShiftBufferDepth="PT5M" (default)
- minimumUpdatePeriod="PT2S" for manifest refresh
- Stable UTCTiming for player sync
Window Behavior During Ad Break
Timeline:
[C1][C2][C3][C4][C5][C6] ← Window before break (6 content segments)
↓ Break signaled
[C2][C3][C4][C5][C6][A1] ← First ad segment enters window
[C3][C4][C5][C6][A1][A2] ← Second ad segment, content sliding out
[C4][C5][C6][A1][A2][A3] ← Ad break in middle of window
↓ Break ends
[C6][A1][A2][A3][A4][C7] ← Content resumes, ads sliding out
[A2][A3][A4][C7][C8][C9] ← Mixed window (normal)
[C7][C8][C9][C10][C11][C12] ← Ad break fully out of window
Resilience Patterns
Live streams require robust fault tolerance. The SSAI system implements multiple resilience patterns based on the SafeModeManager and SSAICircuitBreaker.
Circuit Breaker
Per-partner circuit breakers prevent cascading failures. Based on the actual SSAICircuitBreaker implementation:
- Error threshold: a 50% failure rate triggers the open state
- Timeout threshold: partner responses slower than 2s are tracked
- Cooldown: 30s in the open state before a half-open retry
- Recovery: 3 consecutive successes to close
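The state machine implied by these thresholds can be sketched as below. This is an illustration only, not the SSAICircuitBreaker code: the class name PartnerBreaker, the rolling window of 20 outcomes, and the 10-sample minimum are assumptions. The 50% failure rate, 30s cooldown, and 3 successes to close come from the list above. A caller could count a response slower than 2s as a failure, which is also an assumption here.

```python
from collections import deque


class PartnerBreaker:
    """Hypothetical per-partner breaker: closed -> open -> half_open -> closed."""

    def __init__(self, window=20, min_samples=10, failure_rate=0.5,
                 cooldown_s=30.0, successes_to_close=3):
        self.outcomes = deque(maxlen=window)  # True = success, False = failure
        self.min_samples = min_samples
        self.failure_rate = failure_rate
        self.cooldown_s = cooldown_s
        self.successes_to_close = successes_to_close
        self.state = "closed"
        self.opened_at = 0.0
        self.half_open_successes = 0

    def allow(self, now: float) -> bool:
        # After the cooldown, let trial requests through in half-open state.
        if self.state == "open" and now - self.opened_at >= self.cooldown_s:
            self.state = "half_open"
            self.half_open_successes = 0
        return self.state != "open"

    def record(self, success: bool, now: float) -> None:
        if self.state == "open":
            return  # no requests should be in flight; ignore stragglers
        if self.state == "half_open":
            if not success:
                self._open(now)  # any failure during the trial re-opens
                return
            self.half_open_successes += 1
            if self.half_open_successes >= self.successes_to_close:
                self.state = "closed"
                self.outcomes.clear()
            return
        self.outcomes.append(success)
        failures = self.outcomes.count(False)
        if (len(self.outcomes) >= self.min_samples
                and failures / len(self.outcomes) >= self.failure_rate):
            self._open(now)

    def _open(self, now: float) -> None:
        self.state = "open"
        self.opened_at = now
```

Callers would check allow() before each partner request and call record() with the outcome afterwards.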
Safe Mode
Automatic degradation when system health deteriorates. From SafeModeManager:
- Error rate threshold: 10% triggers safe mode
- Latency threshold: 500ms p99 triggers safe mode
- Minimum samples: 100 requests before evaluation
- Behaviors: Use slate, reduced timeouts (200ms), disable non-critical
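As a rough sketch of how these thresholds might combine, the function below evaluates a batch of recent requests. It is not the SafeModeManager implementation: the function names, the nearest-rank p99 calculation, and the choice to treat the thresholds as inclusive are assumptions.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def should_enter_safe_mode(latencies_ms, errors, *, min_samples=100,
                           error_rate=0.10, p99_limit_ms=500.0):
    """Return True if either health threshold from the text is breached.

    latencies_ms holds one latency per request in the evaluation window;
    errors is the number of those requests that failed.
    """
    n = len(latencies_ms)
    if n < min_samples:
        return False  # too few requests to judge health
    if errors / n >= error_rate:
        return True
    return p99(latencies_ms) >= p99_limit_ms


healthy = [100.0] * 100
print(should_enter_safe_mode(healthy, errors=5))
print(should_enter_safe_mode(healthy, errors=10))
```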
Slate Fallback
When ads cannot be decisioned in time, pre-encoded slate content fills the break. The SlateManager maintains ready-to-use slate segments in all supported formats (HLS/TS, HLS/CMAF, DASH/CMAF).
Operational Controls
Live streams require operational controls for incidents and maintenance. Based on the actual implementation in the operability module.
Kill Switches
Granular kill switches via KillSwitchManager:
- Per-tenant disable
- Per-channel disable
- Per-region disable
- Per-partner disable
- Per-format disable (HLS/DASH)
- Per-protocol disable (TS/CMAF)
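One way to model these dimensions is a set of (dimension, value) pairs checked against each request. The sketch below is hypothetical: KillSwitches, RequestContext, and their methods are illustrative names, not the KillSwitchManager API.

```python
from dataclasses import dataclass

# The six dimensions listed above.
DIMENSIONS = ("tenant", "channel", "region", "partner", "format", "protocol")


@dataclass(frozen=True)
class RequestContext:
    tenant: str
    channel: str
    region: str
    partner: str
    format: str    # "HLS" or "DASH"
    protocol: str  # "TS" or "CMAF"


class KillSwitches:
    """Hypothetical registry of disabled (dimension, value) pairs."""

    def __init__(self) -> None:
        self._disabled = set()

    def disable(self, dimension: str, value: str) -> None:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        self._disabled.add((dimension, value))

    def enable(self, dimension: str, value: str) -> None:
        self._disabled.discard((dimension, value))

    def is_blocked(self, ctx: RequestContext) -> bool:
        # A request is blocked if any of its attributes is switched off.
        return any((dim, getattr(ctx, dim)) in self._disabled
                   for dim in DIMENSIONS)


switches = KillSwitches()
switches.disable("region", "eu-west")
ctx = RequestContext("acme", "sports-1", "eu-west", "partner-a", "HLS", "CMAF")
print(switches.is_blocked(ctx))
```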
Rollout Controls
Gradual rollout via RolloutManager:
- Percentage-based rollout (0-100%)
- Geographic targeting
- Device type targeting
- Player version targeting
- A/B experiment support
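Percentage-based rollout is commonly implemented by hashing a stable identifier into a bucket from 0 to 99, so the same session always gets the same answer and raising the percentage only ever adds sessions. The sketch below shows that general technique. It is not taken from RolloutManager, and the function names and the salt value are assumptions.

```python
import hashlib


def rollout_bucket(session_id: str, salt: str = "ssai-live") -> int:
    """Map a session ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{salt}:{session_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(session_id: str, percentage: int, salt: str = "ssai-live") -> bool:
    """True if the session falls inside the rollout percentage (0-100)."""
    return rollout_bucket(session_id, salt) < percentage


# The same session always lands in the same bucket.
print(rollout_bucket("session-123") == rollout_bucket("session-123"))
```

Using a different salt per feature keeps rollouts independent, so the same sessions are not always the first to receive every change.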
Live Session Management
Live sessions have different lifecycle requirements than VOD. Based on the SessionService implementation:
| Behavior | Live | VOD |
|---|---|---|
| Session TTL | Refresh-on-access (sliding window) | Fixed TTL (content duration + buffer) |
| Manifest Caching | private, no-store | private, no-store |
| Ad Decisioning | Just-in-time (per break) | At session creation (all breaks) |
| Retry Storm Protection | Critical (active) | Less critical |
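The two TTL policies in the table can be contrasted with a short sketch. The Session type, the helper names, and the 300s and 600s constants are assumptions chosen for illustration and are not values from SessionService.

```python
from dataclasses import dataclass

LIVE_TTL_S = 300.0    # assumed sliding TTL for live sessions
VOD_BUFFER_S = 600.0  # assumed buffer added to the VOD content duration


@dataclass
class Session:
    kind: str          # "live" or "vod"
    expires_at: float  # absolute expiry time in seconds


def create_session(kind: str, now: float, content_duration_s: float = 0.0) -> Session:
    if kind == "live":
        return Session(kind, now + LIVE_TTL_S)
    # VOD: fixed TTL of content duration plus a buffer.
    return Session(kind, now + content_duration_s + VOD_BUFFER_S)


def touch(session: Session, now: float) -> bool:
    """Return True if the session is still valid at `now`.

    Live sessions slide their expiry forward on every access;
    VOD sessions keep the expiry set at creation.
    """
    if now >= session.expires_at:
        return False
    if session.kind == "live":
        session.expires_at = now + LIVE_TTL_S
    return True


live = create_session("live", now=0.0)
touch(live, now=200.0)
print(live.expires_at)  # expiry moved forward by the access
```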
Retry Storm Prevention
During network flaps, players may request manifests as often as every 250ms. The SessionService detects retry storms via isRetryStorm() and rate-limits the offending requests to prevent service overload. The system stays stable, and retries do not inflate the session count.
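A per-session sliding-window counter is one simple way to detect this pattern. The sketch below is an assumption about how such a check could work, not the actual isRetryStorm() logic: the class name, the limit of 4 requests, and the 2-second window are illustrative.

```python
from collections import deque


class RetryStormDetector:
    """Hypothetical detector: too many manifest requests in a short window."""

    def __init__(self, max_requests: int = 4, window_s: float = 2.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self._hits = {}  # session_id -> deque of request timestamps

    def is_retry_storm(self, session_id: str, now: float) -> bool:
        hits = self._hits.setdefault(session_id, deque())
        hits.append(now)
        # Forget requests that fell out of the window.
        while now - hits[0] > self.window_s:
            hits.popleft()
        return len(hits) > self.max_requests


detector = RetryStormDetector()
# A player retrying every 250ms trips the detector on its fifth request.
flags = [detector.is_retry_storm("player-1", i * 0.25) for i in range(8)]
print(flags)
```

Because state is keyed by an existing session ID, repeated requests update one entry rather than creating new sessions. A player polling at a normal 2-second interval never has more than two requests inside the window and is not flagged.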
Origin Restart Handling
Live origins may restart mid-stream, causing discontinuities. The SSAI system handles this gracefully:
- Media sequence and discontinuity sequence continue from pre-restart values
- Active ad breaks complete even if origin restarts mid-break
- Discontinuity tags inserted at origin restart boundary
Live-Specific Metrics
The SSAIMetrics module exposes Prometheus metrics for live stream monitoring:
Counters
- ssai_sessions_active - Active live sessions
- ssai_breaks_total - Total breaks processed
- ssai_breaks_slate - Breaks served with slate
- ssai_safe_mode_triggers - Safe mode activations
Histograms
- ssai_decision_latency_ms - Pod decision time
- ssai_stitch_latency_ms - Manifest stitch time
- ssai_manifest_latency_ms - Total manifest request time
- ssai_partner_latency_ms - Per-partner response time