LL-HLS encoding

Low Latency HLS (LL-HLS) is an extension of Apple’s standard HTTP Live Streaming (HLS) protocol designed to reduce the “glass-to-glass” delay (the time from a camera capturing a frame to a viewer seeing it) from the traditional 15–30 seconds down to under 5 seconds—often as low as 2 seconds.

Apple introduced it so that live streams can compete with cable TV and meet social media's "real-time" expectations while still using the scalable, reliable architecture of the web.


How LL-HLS Works: The Technical Pillars

Standard HLS is slow because it requires a full video segment (usually 6 seconds) to be completely finished and written to a playlist before a player can download it. LL-HLS sidesteps this waiting game through several key mechanisms:

1. Partial Segments (The “Parts”)

Instead of waiting for a 6-second segment to finish, the encoder breaks it into tiny “Partial Segments” (e.g., 200–500ms chunks). These parts are added to the playlist immediately, allowing the player to start downloading the beginning of a segment while the end of it is still being encoded.
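
For illustration, the tail of a live media playlist with parts advertised might look like this (segment and part names are hypothetical; the tags follow the LL-HLS syntax):

```
#EXTM3U
#EXT-X-VERSION:9
#EXT-X-TARGETDURATION:6
#EXT-X-PART-INF:PART-TARGET=0.334
#EXT-X-MEDIA-SEQUENCE:266
#EXTINF:6.00000,
fileSequence271.mp4
#EXT-X-PART:DURATION=0.334,URI="filePart272.0.mp4",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.334,URI="filePart272.1.mp4"
#EXT-X-PART:DURATION=0.334,URI="filePart272.2.mp4"
```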

2. Blocking Playlist Reload

In standard HLS, the player “polls” the server, asking, “Is there a new segment yet?” If not, it waits and asks again. LL-HLS allows the player to request the next segment before it exists. The server “holds” that request (blocks it) and responds the split-second the new media becomes available.
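
A minimal sketch of the client side, assuming a hypothetical playlist URL; `_HLS_msn` and `_HLS_part` are the standard LL-HLS directives that tell the server which media sequence number and part the client is waiting for:

```python
import requests  # pip install requests

PLAYLIST = "https://example.com/live/stream.m3u8"  # hypothetical URL

# Standard HLS: poll, wait, poll again.
# LL-HLS: ask for a playlist that *will* contain part 2 of segment 272;
# the server holds the request open until that part exists.
resp = requests.get(
    PLAYLIST,
    params={"_HLS_msn": 272, "_HLS_part": 2},
    timeout=10,  # the hold is bounded; give the server room to block
)
resp.raise_for_status()
print(resp.text)  # playlist now includes the requested part
```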

3. Preload Hints

The server tells the player exactly where the next partial segment will be located before it is even created. This allows the player to prepare the connection so that data flows the instant it’s ready, eliminating the round-trip time of a discovery request.
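
In the playlist, the hint looks like this (URIs hypothetical): the player can open the request for part 3 while part 2 is still the newest real part.

```
#EXT-X-PART:DURATION=0.334,URI="filePart272.2.mp4"
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="filePart272.3.mp4"
```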

4. Playlist Delta Updates

To save bandwidth and processing power, the server can send “Delta Updates”—only the parts of the playlist that have changed since the last request—rather than the entire historical list of segments.
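
A sketch of the exchange (counts hypothetical): the client adds the `_HLS_skip` directive, and the server collapses the older history into a single `EXT-X-SKIP` tag.

```
GET /live/stream.m3u8?_HLS_skip=YES

#EXTM3U
#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,CAN-SKIP-UNTIL=36.0,PART-HOLD-BACK=1.002
#EXT-X-SKIP:SKIPPED-SEGMENTS=20
#EXTINF:6.00000,
fileSequence270.mp4
```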


HLS vs. Low Latency HLS

| Feature                | Standard HLS            | Low Latency HLS                   |
| ---------------------- | ----------------------- | --------------------------------- |
| Typical latency        | 15–30 seconds           | 2–5 seconds                       |
| Segment handling       | Full segments only      | Partial segments (parts)          |
| Discovery              | Polling (client-driven) | Blocking reloads & preload hints  |
| Backward compatibility | Universal               | High (falls back to standard HLS) |
| Network overhead       | Low                     | Moderate (frequent updates)       |

Comparing LL-HLS to HESP:

HESP provides sub-second end-to-end latency together with large GOP sizes (10–12 seconds). Thanks to its initialization stream, an ABR quality switch is not limited to GOP boundaries and can happen at any given moment, so HESP is not forced into a small GOP. The GOP can therefore be kept large while the buffer stays small (HESP targets a sub-second buffer), which makes low latency and a smooth quality switch possible at any time without risk of rebuffering.

By setting the same latency target in HESP as in LL-HLS (~3 s), you gain more margin to encode the video more efficiently, resulting in a lower bitrate for the same video quality and therefore lower bandwidth consumption.

As described earlier, LL-HLS cannot really exploit a small part size, because other constraints come into play: no matter how small the parts are, a quality switch under bad network conditions still has to wait for the next keyframe, so the keyframe interval sets the floor on adaptation speed. For example, with a 2-second GOP and 200 ms parts, a player that hits congestion just after a keyframe may still wait close to 2 seconds before it can downswitch. In HESP, by contrast, starting playback is not limited to GOP boundaries, so you do not need to sacrifice video quality (by shrinking the GOP) to reach the lowest end-to-end latency.

While LL-HLS cannot really exploit a small part size to keep latency low under bad network conditions, HESP offers a small buffer, low latency, a large GOP, and higher video quality all at the same time.

Depending on the use case and the desired priorities (e.g. latency, bandwidth consumption, video quality, and network resiliency), encoding and packaging parameters, as well as buffer size, can be configured differently.

Dolby.io (Millicast) Low-Latency Delivery and Ad Insertion

Dolby.io (Millicast technology) achieves sub-second latency and high-performance advertising by shifting from traditional cloud-centralized models to a highly distributed edge-first architecture.

Although CSAI (Client-Side Ad Insertion) is the established client-side technique, Dolby has recently championed SGAI (Server-Guided Ad Insertion) as the superior way to handle ads in low-latency environments.

Here is how these components work together to keep latency low:

1. The Edge Server Network (Real-time Distribution)

Traditional streaming (HLS/DASH) relies on centralized servers and large buffers. Dolby.io uses a global WebRTC-based CDN that operates differently:

  • Proximity: By deploying “Points of Presence” (PoPs) at the network edge, Dolby ensures the “last mile” distance between the server and the viewer is as short as possible.
  • Protocol (WebRTC): Unlike traditional HTTP streaming, which sends video in 2-6 second chunks, WebRTC streams data as a continuous flow of packets, eliminating the "chunk-loading" delay entirely (a minimal subscriber sketch follows this list).
  • No Buffering: Edge servers are optimized to push data to the viewer immediately upon receipt from the broadcaster, typically resulting in <500ms “glass-to-glass” latency.
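
For a concrete sense of the difference from chunked HTTP delivery, here is a minimal subscriber sketch using the aiortc library with WHEP-style signaling (a single HTTP POST exchanging SDP offer and answer, one common way to bootstrap WebRTC playback); the endpoint URL is hypothetical, not a real Dolby.io address:

```python
import asyncio
import requests                      # pip install aiortc requests
from aiortc import RTCPeerConnection, RTCSessionDescription

WHEP_URL = "https://example.com/whep/my-stream"  # hypothetical endpoint

async def subscribe() -> None:
    pc = RTCPeerConnection()
    pc.addTransceiver("video", direction="recvonly")
    pc.addTransceiver("audio", direction="recvonly")

    @pc.on("track")
    def on_track(track):
        # Media arrives as a continuous packet flow, not 2-6 s chunks.
        print("receiving", track.kind)

    offer = await pc.createOffer()
    await pc.setLocalDescription(offer)  # aiortc finishes ICE gathering here

    # WHEP: one HTTP POST carries the SDP offer and returns the SDP answer.
    # (Blocking call; acceptable for a demo.)
    resp = requests.post(WHEP_URL, data=pc.localDescription.sdp,
                         headers={"Content-Type": "application/sdp"})
    resp.raise_for_status()
    await pc.setRemoteDescription(
        RTCSessionDescription(sdp=resp.text, type="answer"))

    await asyncio.sleep(30)  # consume media for a while
    await pc.close()

asyncio.run(subscribe())
```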

2. Why CSAI is used in Low Latency

In a low-latency environment, Client-Side Ad Insertion (CSAI) is often preferred over Server-Side (SSAI) for several technical reasons:

  • Eliminating the “Stitcher” Delay: In SSAI, a server must “stitch” the ad into the video stream in real-time. This processing adds several seconds of latency. CSAI happens entirely on the viewer’s device, so the main video stream remains untouched and “fast.”
  • Instant Personalization: Since the player (the “Client”) makes the call to the ad server, it can include real-time data (location, device type) to get a targeted ad without the delay of a middleman server.

3. SGAI: Server-Guided Ad Insertion

Dolby recently introduced SGAI as the "evolution" of CSAI to solve the common problems of low-latency ads (like black frames or buffering during transitions):

  • Server-Led Signaling: The Dolby edge server sends a “metadata signal” (SCTE-35) to the player exactly when an ad break is coming.
  • Client-Led Execution: The player receives this signal and “pre-fetches” the ad in the background while the live content is still playing.
  • Seamless Handover: Because the player has the ad ready to go, it can swap from the live WebRTC stream to the ad and back again without any "spinning wheel" or sync issues, which are common in traditional CSAI. (A schematic sketch of this prefetch-and-swap flow follows.)
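
The player-side logic reduces to "prefetch on signal, swap at the splice point." A schematic sketch under stated assumptions: the `player` object and its `swap_to` method are hypothetical stand-ins for whatever player API is in use, not a Dolby SDK call:

```python
import asyncio
import aiohttp  # pip install aiohttp

async def on_ad_signal(player, ad_url: str, splice_in_seconds: float):
    """Hypothetical SGAI-style handler: the server signalled an upcoming break."""
    # Prefetch the ad creative while live playback continues undisturbed.
    async with aiohttp.ClientSession() as session:
        async with session.get(ad_url) as resp:
            ad_media = await resp.read()

    # Swap at the splice point; the ad is already local, so no spinner.
    await asyncio.sleep(splice_in_seconds)
    await player.swap_to(ad_media)  # hypothetical player API
    # ...and back to the live stream when the ad finishes.
```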

4. Summary: The Workflow

  1. Source: The broadcaster sends a high-quality stream to the nearest Dolby.io Edge Ingest.
  2. Distribution: The stream is replicated across the global Edge Network via WebRTC.
  3. Ad Trigger: An ad marker is detected in the stream.
  4. Ad Logic (SGAI/CSAI): The player is instructed to call an ad server. Because the player is doing the work (CSAI-style), the main low-latency video feed is never interrupted or delayed by a “cloud stitcher.”
  5. Playback: The viewer sees the ad and the live content with synchronized timing across the globe.

By combining WebRTC at the Edge with Client-managed ad insertion, Dolby.io avoids the 10–30 second delays found in standard streaming platforms.

Live Production Workflow (Sphere-scale) — How Disguise, 7thSense, MA, Q-SYS, Dante, Ravenna & Mellanox fit together

Below is a clean, field-proven way to wire these systems for video playback, audio, broadcast, rigging/automation, production control, and architectural lighting. It’s vendor-agnostic where it should be, but calls out exactly how the named tools interoperate.

1) Spine-Leaf Network & Timing  

Fabric

  • Mellanox (NVIDIA Networking) spine-leaf, 25/100 GbE to servers, 1/10/25 GbE to endpoints.
  • VLANs (hard rule: isolate real-time traffic):
    • VLAN 10 – PTP/2110 (Video-over-IP & PTPv2)
    • VLAN 20 – Dante (PTPv1; Primary) |  VLAN 21 – Dante Secondary
    • VLAN 30 – Ravenna/AES67 (PTPv2; Primary) |  VLAN 31 – Secondary
    • VLAN 40 – MA-Net/sACN/Art-Net (lighting)
    • VLAN 50 – Media mgmt/NAS/backup
    • VLAN 60 – Control/OSC/REST/NMOS
  • QoS: PTP highest → real-time AoIP/2110 → control → file traffic last.
  • IGMP snooping/querier on all multicast VLANs; keep PTP boundary clocks on Mellanox; no jumbo frames for PTP.
  • House clock: PTPv2 Grandmaster (SMPTE 2059 profile).
    • Dante uses PTPv1 → keep Dante on its own VLAN + clock domain (don’t merge).
    • Ravenna/AES67 + ST 2110 share the PTPv2 house clock.

Redundancy

  • Dual spines, dual PSUs/NICs everywhere possible, A/B network for AoIP (Dante & Ravenna) and 2110.
  • Hot-standby routes & MLAG/VPC for leafs.

2) Video Playback & VFX / Media Systems

Servers & Roles

  • disguise (vx/px) — LED and complex projection mapping, content sequencing, xR/IMAG compositing, SockPuppet control from the lighting console. GPU framelock (Quadro Sync) + genlock to house.
  • 7thSense (Delta/Infinity/Juggler) — ultra-high-bandwidth playback, auto-blend/warp, high-bit-depth projection feeds for domes or specialty canvases; genlock to house.
  • Ingest & Storage — high-throughput NAS/SAN (on VLAN 50) with scheduled sync to local SSD RAIDs on servers.

Color & Calibration

  • Content mastered to Rec.2020/PQ or HLG (LED/projection dependent).
  • 7thSense warp/blend for projectors; disguise OmniCal / camera-based alignment for LED surfaces.
  • LUTs applied consistently; keep a color bible per canvas.

Control & Cues

  • Timecode (LTC/MTC) generated centrally (see Q-SYS below).
  • SockPuppet (disguise) lets MA trigger video cues as “virtual fixtures” over sACN/Art-Net.
  • Both disguise & 7thSense listen to LTC or OSC for frame-accurate sequences.

Failover

  • Primary/Backup pairs per canvas with auto-switch on loss of sync/signal; outputs via SDI/12G or HDMI/DP to LED/processors; if using IP video, feed ST 2110-20 through gateways with NMOS control.

3) Audio, DSP, and Show-Control (Q-SYS + Dante + Ravenna)

DSP & Orchestration

  • Q-SYS Core is your audio DSP, routing matrix, paging, timecode generator/bridge, and show-control surface (UCI, Lua scripts).
  • Dante for most show playback, mics, IEMs; Primary/Secondary networks on VLAN 20/21.
  • Ravenna/AES67 for broadcast/2110 interop; Q-SYS bridges AES67 flows between show sound and broadcast when needed (48 kHz, packet time 1 ms/125 µs per interop matrix).

Clocking

  • Dante island: its own leader clock (PTPv1) — typically a stagebox or the Q-SYS Dante I/O.
  • Ravenna/2110: locked to house PTPv2 GM.
  • Do not try to make one PTP domain for all; keep Dante separate and bridge audio with AES67 where appropriate.

Timecode & Triggers

  • Q-SYS generates LTC (or converts MTC↔LTC) and multicasts it via Dante to disguise/7thSense and to Automation (read-only).
  • Q-SYS fires OSC/TCP cues on timeline markers (e.g., trigger pyro-safe macros, lighting presets, media start); a minimal OSC sender sketch follows.
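
As an illustration of the OSC leg (Q-SYS's native scripting is Lua, so this is a generic sketch, not Q-SYS code), here is a minimal sender using the third-party python-osc package; host, port, and OSC addresses are hypothetical and must match the receiving device's OSC map:

```python
from pythonosc.udp_client import SimpleUDPClient  # pip install python-osc

# Hypothetical host/port on the control VLAN (VLAN 60).
media_server = SimpleUDPClient("10.60.0.21", 7401)

# Hypothetical OSC addresses: consult the device's OSC documentation.
media_server.send_message("/showcontrol/cue", 12)  # jump to cue 12
media_server.send_message("/showcontrol/play", 1)  # then roll
```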

Redundancy

  • Redundant Q-SYS Cores with automatic failover; dual Dante & AES67 network paths; mirrored show files.

4) Lighting & Architectural Lighting (MA Lighting)

Control Topology

  • grandMA3 session: one Master, one Backup, separate Tracking console optional; MA-Net isolated on VLAN 40.
  • sACN is recommended for large universes; Art-Net only where required (a test-sender sketch follows this list).
  • Architectural and house-lights via sACN to gateways; MA controls both show and house looks, or MA hands house-light control to Q-SYS via plugin when needed.
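
For bench-testing a gateway or node (in production the console, not a script, originates sACN), a minimal sender sketch with the third-party sacn package; the universe and levels are arbitrary:

```python
import time
import sacn  # pip install sacn

sender = sacn.sACNsender()  # binds an E1.31 (sACN) source
sender.start()
sender.activate_output(1)   # universe 1
sender[1].multicast = True  # multicast, as on the lighting VLAN

# Push a full frame: channel 1 at full, the rest dark.
sender[1].dmx_data = (255,) + (0,) * 511
time.sleep(2)

sender.stop()
```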

Interop with Video

  • SockPuppet (disguise): media layers exposed as fixtures to MA; MA cues advance video.
  • Alternatively, disguise (or 7thSense) is master timecode, and MA follows; pick one master per show.

Redundancy

  • Dual lighting switches, mirrored showfiles, hot-backup console, RDM disabled on show-critical runs unless actively troubleshooting.

5) Broadcast & Content Services

Contribution & Distribution

  • For facility IP: SMPTE ST 2110 for video (-20), audio (-30/-31), data (-40); device discovery via NMOS IS-04/IS-05 on VLAN 60 (an IS-05 routing sketch follows this list).
  • For remote/bonded or OTT paths: SRT/Zixi/RIST contribution encoders pulling program feeds from production switcher or media servers (as SDI/2110).
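
With NMOS in place, routing becomes an HTTP call against the receiver's IS-05 connection API. A minimal sketch; host, port, and UUIDs are hypothetical:

```python
import requests  # pip install requests

NODE = "http://10.60.0.50:8080"                   # hypothetical NMOS node
RECEIVER = "7f9c4a2e-0000-0000-0000-000000000000"  # hypothetical receiver ID

# IS-05: stage a sender onto the receiver, then activate immediately.
patch = {
    "sender_id": "3b1e9d10-0000-0000-0000-000000000000",  # hypothetical
    "master_enable": True,
    "activation": {"mode": "activate_immediate"},
}
url = f"{NODE}/x-nmos/connection/v1.1/single/receivers/{RECEIVER}/staged"
resp = requests.patch(url, json=patch, timeout=5)
resp.raise_for_status()
print(resp.json()["activation"])
```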

Intercom/Comms

  • (If used) AES67-capable intercom (e.g., Riedel/RTS) on the Ravenna/AES67 VLAN; Q-SYS bridges IFBs/utility audio as needed.

Graphics/AR

  • If camera-tracked AR is required, pass genlock + timecode to tracking system; feed AR renderers back to switcher via SDI/2110. Disguise can composite for xR/IMAG while maintaining deterministic delay.

6) Rigging & Automation (Safety-first)

Control Boundary

  • Motion systems live on an isolated safety network with safety PLCs.
  • They listen to timecode and read-only triggers (OSC/TCP), but never accept direct position commands from show control during public operation.
  • E-stops hard-wired; dead-man devices per moving effect.

Integration

  • Q-SYS sends a "go" or "scene safe" trigger; automation executes internally scripted moves pre-validated against interlocks and zones.
  • Status (healthy/fault/at mark) returned to Q-SYS UI for stage manager awareness.

7) Three proven cueing topologies  

  1. Lighting-as-Master (most common)
  • MA3 rolls master timecode (LTC via Q-SYS or internal), advances lighting + sends SockPuppet/sACN to disguise/7thSense; Q-SYS follows for sound beds + SFX.
  • Pros: LD owns pacing; tight busking possible.
  • Cons: Complex if broadcast/XR want different timelines.
  2. Media-as-Master (video-driven spectacles)
  • disguise (or 7thSense) is master timeline + LTC source; MA and Q-SYS chase.
  • Pros: Pixel-perfect sync to picture; ideal for mapped canvases.
  • Cons: Less flexible for last-second lighting improvisation.
  3. Q-SYS-as-Master (operator-centric)
  • Q-SYS timeline fires LTC + OSC to video and lighting, and rolls audio internally.
  • Pros: One brain for SFX, PA, paging, safety triggers.
  • Cons: Keep Q-SYS scripting disciplined; avoid feature creep.

8) Pre-pro to show-day workflow (checklist)

Pre-production

  • Lock canvas specs (pixel maps, refresh, EOTF, bit-depth).
  • Build Content Bible (color space, max-nit levels, safe areas).
  • Calibrate LED/projectors; commit LUTs/warps; smoke-test failover.

Load-in

  • Verify VLANs, QoS, PTP domains; label trunks.
  • Validate genlock everywhere; GPU framelock groups.
  • Dante: confirm Primary/Secondary paths; leader clock; latency uniform.

Rehearsals

  • Run full-length timing with automation safeties armed.
  • Induce failures (pull primary links, kill a server) and confirm seamless failover.
  • Record baseline logs (latency, dropped packets, sync offsets).

Show

  • Freeze configs; version tag showfiles; enable monitoring dashboards (QoS queues, PTP offsets, Dante latency, server health).
  • One comms channel for “GO” and one for “HOLD” only.

Strike

  • Export logs & configs; update the runbook with any deviations.

9) Security & Ops

  • RBAC on switches, servers, and Q-SYS; no shared accounts.
  • Change windows + rollbacks; infra-as-code for switch configs where possible.
  • Separate Mgmt network; disable unused services (mDNS/LLMNR) on show VLANs.
  • Continuous monitoring: PTP offset, multicast group counts, NIC drops, GPU framelock status, Dante xrun counters (a PTP-offset polling sketch follows).
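
One way to poll PTP offset on a Linux host running linuxptp (assumes ptp4l is active and the pmc management client is installed; the alert threshold is illustrative, tune per show spec):

```python
import re
import subprocess

# linuxptp management client: query the local ptp4l instance.
out = subprocess.run(
    ["pmc", "-u", "-b", "0", "GET CURRENT_DATA_SET"],
    capture_output=True, text=True, check=True,
).stdout

match = re.search(r"offsetFromMaster\s+(-?\d+(?:\.\d+)?)", out)
if match:
    offset_ns = float(match.group(1))
    print(f"PTP offset from master: {offset_ns:.0f} ns")
    if abs(offset_ns) > 1000:  # 1 µs threshold, illustrative
        print("WARNING: PTP offset out of tolerance")
```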

10) Quick wiring map (at a glance)

  • PTP GM → Mellanox fabric → (PTPv2) 2110/Ravenna endpoints, disguise, 7thSense, cameras/switcher.
  • Dante Leader (PTPv1) → Dante Primary/Secondary → Q-SYS Dante I/O, stageboxes, amps, RF racks.
  • Q-SYS Core ⇄ AES67/Ravenna bridges (to broadcast) | LTC out (to disguise/7thSense/Automation) | OSC/TCP cues (to media, lighting, automation).
  • grandMA3 → MA-Net (VLAN 40) → sACN to nodes/fixtures + SockPuppet to disguise.
  • disguise & 7thSense → LED processors / projectors (SDI/12G or 2110 via gateways), with Primary/Backup servers.
  • Automation (isolated) ← LTC & read-only triggers; status back to Q-SYS UI.
  • Broadcast (2110) ⇄ NMOS control; contribution encoders for SRT/Zixi/RIST as needed.