SSAI (Server-Side Ad Insertion): stitches ads into the manifest (HLS M3U8 / DASH MPD) and/or media at the origin/edge so the player sees one continuous stream. Handles ad-break timing (from SCTE-35), manifest conditioning, and ad tracking beacons.

DRM systems (Widevine, PlayReady, FairPlay) protect the media. They’re applied by the packager when generating segments. The player fetches licenses during playback.

CPIX: an XML spec for securely exchanging keys, KIDs, and DRM signaling between your key server and packager (think: “how the packager gets the right keys & PSSH/SKD blobs”). AWS’s SPEKE is a closely related API used in that ecosystem.

Ads are pre-transcoded, encrypted, and licensed just like content; I pre-warm licenses or bundle multiple KIDs where supported to avoid mid-pod stalls.

Package to CMAF/fMP4 and encrypt with CENC-cbcs so the same segments serve HLS (FairPlay) and DASH (Widevine/PlayReady).

CPIX/SPEKE provides keys & DRM signaling to the packager; I use key rotation per period and SSAI respects KID boundaries for ad pods.

SCTE-35 in contribution becomes ID3/DATERANGE (HLS) or EventStream (DASH); the SSAI service swaps in ad URIs and handles tracking beacons.
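As a rough illustration of the HLS side of that mapping, here is a minimal sketch that formats an EXT-X-DATERANGE tag carrying the raw SCTE-35 section as a hex string in SCTE35-OUT. The function name, the cue ID, and the two-byte payload in the usage line are placeholders I made up; a real packager would pass the full splice_info_section bytes.

```python
from datetime import datetime, timezone


def daterange_tag(cue_id: str, start: datetime, planned_duration: float, scte35: bytes) -> str:
    """Format an EXT-X-DATERANGE tag that marks an ad break (cue-out).

    `start` must be timezone-aware; it is converted to UTC with millisecond precision.
    """
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    start_utc = start.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return (
        f'#EXT-X-DATERANGE:ID="{cue_id}",'
        f'START-DATE="{start_utc}",'
        f"PLANNED-DURATION={planned_duration:.3f},"
        f"SCTE35-OUT=0x{scte35.hex().upper()}"
    )


# Usage with a placeholder payload (not a complete SCTE-35 section).
tag = daterange_tag("break-1", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), 60.0, bytes([0xFC, 0x30]))
print(tag)
```

The SSAI service would later look for this tag to know where the avail starts and how long it is planned to run.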

For LL-HLS/LL-DASH, I keep small CMAF chunks, tune CDN caching, and ensure SSAI preserves chunk timing.

How they fit with HLS / DASH / CMAF & containers

  • Use CMAF (fMP4) segments so one set of encrypted chunks can serve both HLS and DASH.
  • Use Common Encryption (CENC) with the cbcs scheme:
    • FairPlay requires cbcs.
    • Widevine & PlayReady also support cbcs → one encryption pass works for all three (multi-DRM).
  • HLS advertises encryption via #EXT-X-KEY (and KEYFORMAT for FairPlay/Widevine/PlayReady).
  • DASH advertises encryption with <ContentProtection> elements in the MPD (PSSH, default_KID, etc.).
  • TS vs MP4: modern SSAI + DRM prefers CMAF/fMP4. Legacy HLS with TS works, but you lose low-latency and unified packaging benefits.
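To make the DASH signaling bullet concrete, here is a small sketch that builds the `<ContentProtection>` elements a packager might emit for cbcs multi-DRM: one generic mp4protection element carrying the default_KID, plus one element per DRM system carrying its PSSH. The function name and the sample KID/PSSH strings are invented for illustration; the system IDs are the registered Widevine and PlayReady UUIDs.

```python
import xml.etree.ElementTree as ET

CENC_NS = "urn:mpeg:cenc:2013"
SYSTEM_IDS = {
    "widevine": "edef8ba9-79d6-4ace-a3c8-27dcd51d21ed",
    "playready": "9a04f079-9840-4286-ab92-e65be0885f95",
}


def content_protection_elements(kid: str, pssh_by_system: dict) -> list:
    """Return ContentProtection elements for one KID encrypted with the cbcs scheme."""
    ET.register_namespace("cenc", CENC_NS)
    # Generic element: declares the protection scheme and the default KID.
    generic = ET.Element(
        "ContentProtection",
        {
            "schemeIdUri": "urn:mpeg:dash:mp4protection:2011",
            "value": "cbcs",
            f"{{{CENC_NS}}}default_KID": kid,
        },
    )
    elements = [generic]
    # One element per DRM system, each wrapping that system's base64 PSSH box.
    for name, pssh_b64 in pssh_by_system.items():
        cp = ET.Element("ContentProtection", {"schemeIdUri": f"urn:uuid:{SYSTEM_IDS[name]}"})
        ET.SubElement(cp, f"{{{CENC_NS}}}pssh").text = pssh_b64
        elements.append(cp)
    return elements


# Usage with placeholder values (the PSSH strings are not real boxes).
for element in content_protection_elements(
    "01234567-89ab-cdef-0123-456789abcdef", {"widevine": "AAAA", "playready": "BBBB"}
):
    print(ET.tostring(element, encoding="unicode"))
```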

SSAI + DRM in practice

  • Where SSAI happens: at the packager/origin (e.g., MediaPackage + SSAI service) or at a dedicated SSAI (e.g., MediaTailor/Google DAI/Yospace/FreeWheel SSAI), which rewrites manifests and serves media.
  • Ad pods & encryption:
    • Ads must be pre-transcoded to the same ladder & codecs as program content.
    • Encrypt ads too (same or separate KIDs). If using key rotation, SSAI must switch the manifest to the ad KID then back to content KID cleanly.
    • License timing: ensure the player can acquire ad licenses fast enough (pre-warm, short TTL, or include ad KIDs in the same license if your provider supports it).
  • Signaling:
    • SCTE-35 → mapped to HLS (ID3/EXT-X-DATERANGE) or DASH (EventStream) to mark avail windows.
    • SSAI replaces those windows with ad media URIs in the manifest and fires beacons (server- or edge-side).
  • Low latency: stick to CMAF with partial segments (HLS preload hints / DASH chunked CMAF). SSAI must preserve chunk boundaries so latency targets hold.
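The KID switching around an ad pod can be sketched as a toy HLS stitcher: it emits EXT-X-DISCONTINUITY at each content/ad boundary and a new EXT-X-KEY whenever the key URI changes. This is a simplification under my own assumptions (FairPlay-style key tags only, no EXT-X-MAP, no playlist header, hypothetical `Segment` type and `skd://` URIs), not how any particular SSAI product works.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    uri: str
    duration: float
    key_uri: str  # key/license URI protecting this segment (hypothetical skd:// values)


def stitch(content_before: list, ad_pod: list, content_after: list) -> list:
    """Return playlist body lines with the ad pod spliced between two content runs."""
    if not ad_pod:
        raise ValueError("ad_pod must contain at least one segment")
    lines = []
    current_key = None
    for index, group in enumerate([content_before, ad_pod, content_after]):
        # Mark the timeline/encoding break when entering or leaving the pod.
        if index > 0 and group and lines:
            lines.append("#EXT-X-DISCONTINUITY")
        for seg in group:
            # Re-announce the key only when it actually changes (content KID <-> ad KID).
            if seg.key_uri != current_key:
                lines.append(
                    f'#EXT-X-KEY:METHOD=SAMPLE-AES,URI="{seg.key_uri}",'
                    'KEYFORMAT="com.apple.streamingkeydelivery",KEYFORMATVERSIONS="1"'
                )
                current_key = seg.key_uri
            lines.append(f"#EXTINF:{seg.duration:.3f},")
            lines.append(seg.uri)
    return lines


# Usage: one content segment, one ad segment, one content segment.
body = stitch(
    [Segment("c1.m4s", 6.0, "skd://content")],
    [Segment("ad1.m4s", 6.0, "skd://ad")],
    [Segment("c2.m4s", 6.0, "skd://content")],
)
print("\n".join(body))
```

If ads and content share one KID, the same function simply emits a single key tag and only the discontinuities remain.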

Where CPIX sits in the pipeline

  1. Key Server (DRM provider) issues keys/KIDs + DRM objects (PSSH for Widevine/PlayReady, SKD URL + cert for FairPlay).
  2. Packager requests them via CPIX/SPEKE → receives keys & signaling.
  3. Packager encrypts CMAF segments once (cbcs) and adds the right tags/boxes to HLS/DASH manifests.
  4. Player uses EME/FairPlay APIs to fetch licenses at playback time.
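For a feel of what travels in step 2, here is a heavily simplified sketch of a CPIX-style document: a ContentKeyList with one key per KID and a DRMSystemList tying each KID to the DRM system IDs. The function name and sample values are my own, and the key is written in the clear purely for illustration; real CPIX/SPEKE exchanges encrypt the key material and carry more fields than shown here.

```python
import base64
import xml.etree.ElementTree as ET

CPIX_NS = "urn:dashif:org:cpix"
PSKC_NS = "urn:ietf:params:xml:ns:keyprov:pskc"


def build_cpix(content_id: str, keys: dict, system_ids: list) -> str:
    """Build a minimal CPIX-like XML string. `keys` maps KID (UUID string) to raw key bytes."""
    ET.register_namespace("cpix", CPIX_NS)
    ET.register_namespace("pskc", PSKC_NS)
    root = ET.Element(f"{{{CPIX_NS}}}CPIX", {"contentId": content_id})
    key_list = ET.SubElement(root, f"{{{CPIX_NS}}}ContentKeyList")
    drm_list = ET.SubElement(root, f"{{{CPIX_NS}}}DRMSystemList")
    for kid, key in keys.items():
        content_key = ET.SubElement(key_list, f"{{{CPIX_NS}}}ContentKey", {"kid": kid})
        data = ET.SubElement(content_key, f"{{{CPIX_NS}}}Data")
        secret = ET.SubElement(data, f"{{{PSKC_NS}}}Secret")
        # Illustration only: production documents carry an encrypted value instead.
        plain = ET.SubElement(secret, f"{{{PSKC_NS}}}PlainValue")
        plain.text = base64.b64encode(key).decode("ascii")
        # One DRMSystem entry per (KID, DRM system) pair.
        for system_id in system_ids:
            ET.SubElement(drm_list, f"{{{CPIX_NS}}}DRMSystem", {"kid": kid, "systemId": system_id})
    return ET.tostring(root, encoding="unicode")


# Usage with a dummy 16-byte key and the Widevine / PlayReady system IDs.
print(
    build_cpix(
        "demo-channel",
        {"01234567-89ab-cdef-0123-456789abcdef": bytes(16)},
        ["edef8ba9-79d6-4ace-a3c8-27dcd51d21ed", "9a04f079-9840-4286-ab92-e65be0885f95"],
    )
)
```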

Cloud mapping you can name-drop

  • AWS: MediaLive (encode) → MediaPackage (CMAF HLS/DASH + DRM via SPEKE) → MediaTailor (SSAI) → CloudFront.
  • GCP: Custom/live encoder or partners → Transcoder API (VOD) / packager → Media CDN, SSAI via Google DAI (or partners), Widevine license + multi-DRM via provider.
  • OCI: Media Streams (packaging/origin HLS/DASH + DRM via partners) + Media Flow (processing) + third-party SSAI (Yospace/FreeWheel etc.).

Here’s a structured breakdown of SCTE-35, Dynamic Ad Insertion (DAI), SCTE-224 (ESNI), and SCTE-250 (ESAM / POIS): how they relate, why they exist, and how they’re used together in modern live / OTT / streaming workflows.


Key Terms & Role Overview

Let me first give a quick glossary / roles so you can see the relationships:

  • SCTE-35 — in-band cue / splice signaling in video streams (e.g. ad break markers, splice points).
  • DAI (Dynamic Ad Insertion) — the process of inserting or stitching ads into video streams on the fly, usually in OTT / streaming contexts, often using server / manifest-level logic.
  • SCTE-224 (ESNI: Event Scheduling & Notification Interface) — an out-of-band metadata/notification interface to distribute event schedules, blackout rules, policies, etc. to streaming systems in advance.
  • SCTE-250 (ESAM: Event Signaling & Management API) — an API interface between “signal acquisition systems” (encoders, packagers, etc.) and a “decision system” (which contains rules, business logic, placement opportunity info) — e.g. POIS — for conditioning or reacting to SCTE-35 cues.

“POIS” is shorthand for Placement Opportunity Information Service, a service/database that knows how and when to place (or alter) ads, blackouts, etc., often driven by rules / policies (for different audiences, geos, device categories).

So in a full stack, you often see:

  1. SCTE-35 cues embedded in the source video stream / contribution / transport path
  2. ESAM / SCTE-250 used to query what to do with those cues (i.e., signal conditioning, switching, insertion) via a decision service (POIS)
  3. SCTE-224 / ESNI carrying the policy / schedule / audience metadata out-of-band (preconfigured or dynamically updated) so the decisions (in ESAM / POIS) know what constraints / policies to apply
  4. Then DAI (manifest manipulation, ad stitching, content replacement) uses that combined information to insert the correct ad or alternate content downstream (in the streaming delivery path)


SCTE-35: In-Band Event Signaling

  • SCTE-35 (ANSI/SCTE-35) is the standard for in-stream cue signaling in transport streams (or converted into manifest metadata in HLS / DASH). It’s often described as the “core signaling standard for advertising, program, and distribution control.”
  • The SCTE-35 signal indicates “splice points”: places where alternate content (ads, blackouts, etc.) can be inserted. It may carry metadata like a splice start, splice end, segmentation descriptors, and unique identifiers (UPIDs) to indicate what is being inserted or replaced.
  • In OTT / ABR workflows, those cues are often translated into manifest-level tags (e.g. in HLS, DASH) so that downstream ad stitching / manifest manipulation systems can see them.
  • SCTE-35 is “in-band” (i.e. embedded in the same video / transport stream) so as the video flows through encoders, transport paths, etc., the cues travel along (assuming systems preserve / pass them).

However, SCTE-35 cues alone are not enough to enforce business rules or regional policies — they only say “here is a splice / ad opportunity or boundary” — not what ad or alternate content to put there or who should get what.
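To show how little a cue actually says, here is a minimal sketch that decodes the splice_insert fields from a base64 SCTE-35 section: event ID, out-of-network flag, splice PTS, and break duration (both in 90 kHz ticks). It follows the fixed header layout of an unencrypted splice_info_section and deliberately ignores component-mode splices, descriptors, and the CRC. The function name is mine, and the sample cue is a widely circulated test string rather than anything from this pipeline.

```python
import base64

PTS_MASK = 0x1FFFFFFFF  # PTS and duration values are 33-bit fields


def parse_splice_insert(b64_cue: str) -> dict:
    """Decode the main fields of a program-level splice_insert() command."""
    data = base64.b64decode(b64_cue)
    if data[0] != 0xFC:
        raise ValueError("not a splice_info_section (table_id != 0xFC)")
    if data[4] & 0x80:
        raise ValueError("encrypted sections are not handled in this sketch")
    if data[13] != 0x05:
        raise ValueError("only splice_insert (command type 0x05) is handled")
    result = {"event_id": int.from_bytes(data[14:18], "big"), "cancel": bool(data[18] & 0x80)}
    if result["cancel"]:
        return result
    flags = data[19]
    result["out_of_network"] = bool(flags & 0x80)
    program_splice, has_duration, immediate = bool(flags & 0x40), bool(flags & 0x20), bool(flags & 0x10)
    if not program_splice:
        raise ValueError("component-mode splices are not handled in this sketch")
    pos = 20
    result["pts_ticks"] = None
    if not immediate:
        if data[pos] & 0x80:  # time_specified_flag: 5-byte splice_time with a 33-bit PTS
            result["pts_ticks"] = int.from_bytes(data[pos:pos + 5], "big") & PTS_MASK
            pos += 5
        else:
            pos += 1
    result["break_duration_ticks"] = None
    if has_duration:  # break_duration: auto_return bit, reserved bits, 33-bit duration
        result["auto_return"] = bool(data[pos] & 0x80)
        result["break_duration_ticks"] = int.from_bytes(data[pos:pos + 5], "big") & PTS_MASK
    return result


# Usage: divide tick values by 90000 to get seconds.
cue = parse_splice_insert("/DAvAAAAAAAA///wFAVIAACPf+/+c2nALv4AUsz1AAAAAAAKAAhDVUVJAAABNWLbowo=")
print(cue["event_id"], cue["pts_ticks"] / 90000, cue["break_duration_ticks"] / 90000)
```

Nothing in the decoded fields names an ad, an audience, or a region, which is why the policy and decision layers described next are needed.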


DAI (Dynamic Ad Insertion)

  • DAI refers to the process of dynamically selecting and inserting ads (or alternative content) into a stream, often at manifest or server side, in response to cues and policy decisions.
  • In streaming / OTT, DAI typically works by reading SCTE-35 cues (or equivalent signals) and then manipulating manifests or stitching ad segments (with the ads themselves chosen via VAST / VMAP responses from an ad server, and stitched server-side in the SSAI case) so that the client sees a seamless stream with ads.
  • The DAI engine often needs to know more than just “there’s an ad break here” — it must know which ad(s), which user audience, geolocation, allowed ads, blackout constraints, etc. That’s where policy / decision services (POIS) come into play.
  • So DAI is downstream of SCTE-35 + policy / metadata, and DAI is what actually does the insertion / splicing in manifest or media segments.

SCTE-224 (ESNI): Policy, Scheduling & Audience Metadata

  • SCTE-224 is also known as ESNI (Event Scheduling & Notification Interface). It is a standard (out-of-band, over web / API) interface / protocol for distributing metadata, scheduling, content policies, regionalization rules, audience segmentation, and business logic.
  • ESNI / SCTE-224 provides a structure for telling downstream systems in advance what events will occur (or what policies apply) for a given media (e.g. “On this channel, from 2:00–2:30, blackout for Region A; use alternate content for Region B; local ad insertion allowed in these zones”).
  • The standard defines objects like Media, MediaPoint, ViewingPolicy, Audience, Policy, etc. These describe scheduled events (MediaPoints), what actions (ad insertion, blackout, alternate content) should be taken per audience (e.g. zip code, device type), and so on.
  • Because SCTE-224 is out-of-band (i.e. delivered via APIs / web services), systems (encoders, packagers, manifest stitchers) can preload and understand in advance how to apply policies. This helps in coordination, planning, and ensuring that correct content is chosen when a cue happens.
  • For example, a streaming distributor might receive SCTE-224 metadata from the content owner or rights holder indicating blackout regions for a sports event. Then when the SCTE-35 ad cue arrives in the stream, the DAI / decision logic can use those rules to override or adjust the ad break (e.g. not show specific content in certain geos).

Example: For regional blackout, SCTE-224 can carry that metadata (which region must blackout, what alternate feed to show). Then at runtime, when SCTE-35 signals a splice point, logic can consult that policy and choose to splice in alternate content for affected regions.
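The blackout example can be sketched as a lookup over preloaded policy data. The classes below borrow the MediaPoint / ViewingPolicy names but are a hypothetical in-memory simplification of my own (an audience is just a region label, an action is just a string); the real SCTE-224 objects are XML and considerably richer.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ViewingPolicy:
    audience: str                        # simplified: a region label
    action: str                          # "blackout", "alternate", or "allow"
    alternate_uri: Optional[str] = None  # feed to show instead, if any


@dataclass(frozen=True)
class MediaPoint:
    start: datetime
    end: datetime
    policies: tuple


def resolve_action(points: list, when: datetime, audience: str) -> ViewingPolicy:
    """Return the policy that applies to `audience` at `when`, defaulting to "allow"."""
    for point in points:
        if point.start <= when < point.end:  # end is exclusive
            for policy in point.policies:
                if policy.audience == audience:
                    return policy
    return ViewingPolicy(audience=audience, action="allow")


# Usage: a 14:00-14:30 UTC window with a blackout for region-a and an alternate feed for region-b.
window = MediaPoint(
    start=datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc),
    end=datetime(2024, 1, 1, 14, 30, tzinfo=timezone.utc),
    policies=(
        ViewingPolicy("region-a", "blackout"),
        ViewingPolicy("region-b", "alternate", "https://example.invalid/alt.m3u8"),
    ),
)
cue_time = datetime(2024, 1, 1, 14, 10, tzinfo=timezone.utc)
print(resolve_action([window], cue_time, "region-a").action)
```

Because the policies arrive ahead of time, the lookup at cue time is a cheap local operation rather than a round trip to the rights holder.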


SCTE-250 / ESAM: Signal Conditioning & Messaging with POIS / Decision Engines

  • SCTE-250, also called ESAM (Event Signaling and Management API), defines a standard API / messaging interface between a Signal Acquisition System (SAS: e.g. encoder, packager, splicer) and a Signal Decision System (SDS: e.g. POIS, rule engine) for how to handle / condition SCTE-35 cues.
  • The SAS (e.g. an encoder or live streamer) sees a SCTE-35 cue, then formulates a Signal Processing Event (SPE) request to the SDS (POIS) asking “what should I do with this cue?” The SDS then returns a Signal Processing Notification (SPN) containing instructions on how to modify, drop, enrich, or act on that cue.
  • The API is typically REST / XML based (HTTP POST with XML), where the SPE carries data (e.g. binary / base64-encoded SCTE-35 data, identifiers, timestamps), and the SDS can respond with how to condition (e.g. include / exclude the cue, adjust parameters).
  • ESAM APIs also support unsolicited notifications: rules that are not triggered purely by in-band cues (e.g. schedule-based events) can be pushed to the encoder / SAS ahead of time. That is, SDS might send a decision / instruction not triggered by a direct cue but by an event schedule.
  • In many architectures, when an encoder (or packager) sees a cue, it contacts POIS via ESAM and asks: “do I insert? do I suppress? do I switch? do I modify duration or flags?” The POIS / SDS makes a decision based on SCTE-224 policies, rights, regional rules, ad availability, etc. Then the SAS implements that instruction (modify the cue, drop, or adjust).
  • For example, AWS Elemental Live supports ESAM: when it receives SCTE-35 signals, it can do signal conditioning (filter, remove, re-map) by talking to a POIS via ESAM.
  • The ESAM interface can also be used for input switching (i.e. instructing an encoder / playout engine to switch from one input feed to another) in response to cues or schedule requests.

Thus, ESAM / SCTE-250 is the glue that allows a runtime system to act upon SCTE-35 cues in a policy-driven, dynamic way.
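The SPE → SPN exchange can be modeled in a few lines. This is a toy in-process model of my own rather than the ESAM XML schema: `Cue`, `Notification`, `decide`, and `condition` are hypothetical names, the action strings are loosely modeled on ESAM-style actions, and the two rules (suppress on policy, cap the break length) are invented for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Cue:                       # what the SAS reports in its request (simplified SPE)
    signal_id: str
    pts_ticks: int
    duration_ticks: int


@dataclass(frozen=True)
class Notification:              # what the decision system returns (simplified SPN)
    signal_id: str
    action: str                  # "noop", "delete", or "replace"
    duration_ticks: Optional[int] = None


def decide(cue: Cue, policy_action: str, max_break_ticks: int) -> Notification:
    """Decision-system side: turn policy plus cue data into an instruction."""
    if policy_action == "suppress_ads":
        return Notification(cue.signal_id, "delete")
    if cue.duration_ticks > max_break_ticks:
        return Notification(cue.signal_id, "replace", max_break_ticks)
    return Notification(cue.signal_id, "noop")


def condition(cue: Cue, note: Notification) -> Optional[Cue]:
    """SAS side: apply the instruction; None means the cue is dropped from the output."""
    if note.signal_id != cue.signal_id:
        raise ValueError("notification does not match this cue")
    if note.action == "delete":
        return None
    if note.action == "replace":
        return replace(cue, duration_ticks=note.duration_ticks)
    return cue


# Usage: a 120 s break is capped to 60 s (values in 90 kHz ticks).
cue = Cue("sig-1", 1_936_310_318, 120 * 90_000)
print(condition(cue, decide(cue, "allow", 60 * 90_000)))
```

In a real deployment the two functions live in different systems and the dataclasses are XML bodies of HTTP POSTs, but the division of responsibility is the same.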

Workflow

  1. Pre-distribution (out-of-band policy delivery via SCTE-224 / ESNI)
    • The content owner / rights holder defines policies (blackout regions, alternate streams, ad rules) and publishes them via an ESNI interface (SCTE-224). Distributors / streaming platforms ingest that metadata (MediaPoints, ViewingPolicies, Audiences).
    • This gives the system the “map” of what rules to apply when events occur.
  2. Live video encoding / contribution path with embedded cues (SCTE-35)
    • In the live production chain, SCTE-104 (or other internal cue signals) get mapped to SCTE-35 cues in the compressed / transport stream (e.g. splice_insert, time_signal).
    • These SCTE-35 cues flow downstream embedded with video/audio.
  3. Signal detection & ESAM request
    • The signal acquisition system (SAS) — e.g. an encoder or splicer — detects a SCTE-35 cue in the stream.
    • It packages a Signal Processing Event (SPE) request (including metadata about the cue, identifier, timestamp, etc.) and sends it to the POIS / Signal Decision System (SDS) via ESAM API.
    • The SDS consults the SCTE-224 metadata and the policies / viewing rules to decide the right action for that cue (e.g. include the cue, drop it, rewrite it, switch to alternate content, enforce blackout, etc.). It replies with a Signal Processing Notification (SPN) containing instructions.
  4. Cue conditioning / adaptation
    • The SAS (encoder / packager / splicer) receives the SPN and conditions the cue accordingly:
      • It may drop or suppress the cue if policy says “do not insert ad here for this region.”
      • It may modify metadata, adjust duration, or add segmentation UPIDs.
      • It may switch to an alternate input feed if instructed (e.g. alternate content for regions).
      • It may leave the cue intact for downstream DAI systems to act on.
  5. Streaming / packaging & manifest staging / DAI
    • After cue conditioning, the downstream streaming / packaging / manifest manipulation systems see the modified (or original) SCTE-35 cues.
    • A DAI engine (manifest stitcher / ad server) reads the cue + any metadata (from SCTE-224) and decides which ad(s) or alternate content to insert for that user / region / audience.
    • The manifest is manipulated (e.g. segment pointers, ad segments inserted) so that the client sees a seamless stream with ads in the right spot, per rules.
  6. Client playback
    • The client fetches the manifest (HLS / DASH) and plays the stream, seeing the ad insertion / splice seamlessly, with the right content per region / policy.

If needed, unsolicited ESAM notifications can push new instructions to the pipeline (e.g. schedule changes, emergency overrides) ahead of the cue.

Also, some systems support virtual input switching, i.e. when the decision system instructs the SAS (via ESAM) to switch from one feed to another (for example, national feed to local feed) when a certain cue or policy demands. AWS Elemental Live supports this in its “virtual input switching” mode.

Below is a simplified sequence:

SCTE-224 (ESNI) → distributes rules / schedules ahead  
             ↓  
Live feed with embedded SCTE-35 cues (in transport)  
             ↓  
SAS detects cue → sends SPE (via ESAM / SCTE-250) → SDS / POIS replies with SPN  
             ↓  
SAS conditions / adapts cue (or switches feed)  
             ↓  
DAI / manifest stitching sees cue + policy, inserts correct ad / alternate  
             ↓  
Client plays final stream

So the roles are:

  • SCTE-35: in-band marker / splice event signaling
  • SCTE-224 (ESNI): out-of-band policy / metadata / schedule interface
  • SCTE-250 (ESAM): runtime API / messaging interface for handling / conditioning cues via POIS / decision logic
  • DAI: downstream ad insertion / splice logic using cues + policy

Here’s a signal flow / interaction diagram (conceptual) that shows how SCTE-35, SCTE-224 / ESNI, SCTE-250 / ESAM, POIS / decision logic, and DAI interoperate in a modern live / streaming (OTT) advertisement insertion chain. I’ll annotate the steps afterwards.


Diagram Description (from left to right / top to bottom)

  1. Live Video & Cue Source
    An upstream stream (e.g. encoder, broadcast feed) carries SCTE-35 cues embedded in the video transport or multiplexed stream.
  2. Signal Acquisition / Encoder / Splicer (SAS)
    This block sees the SCTE-35 cues. It also has a connection to a POIS / Decision System (SDS) via ESAM (SCTE-250) API.
    It may also receive SCTE-224 / ESNI metadata (policy, schedule, rules) from a scheduling/metadata server.
  3. ESAM / Decision Interface
    • The SAS issues a Signal Processing Event (SPE) to the SDS / POIS, passing details of the cue (timestamp, cue type, IDs, etc.).
    • The SDS / POIS consults SCTE-224 / ESNI policy metadata + business rules + audience / regional constraints, and returns a Signal Processing Notification (SPN) with instructions (e.g. condition, drop, switch, modify the cue).
    • SAS acts upon the SPN (cue conditioning, input switching, suppression, rewriting, etc.)
  4. Downstream Packaging / DAI / Manifest Stitching
    The conditioned video stream (with modified / approved cues) is passed to the packaging / origin / manifest stitching / DAI engine. The DAI engine uses the cues + rule metadata to insert the correct ad content or alternate content in the manifest or stream back to the client.
  5. Client Playback
    Clients fetch the manipulated manifests / segments (with ads stitched), and playback is seamless—users see the program + targeted ads / alternate content depending on region / policy.
  6. Out-of-Band Policy / Metadata Flow
    The SCTE-224 / ESNI interface (often via web / API) distributes schedule, region rules, blackout policies, audience segmentation metadata ahead of time to POIS / SDS / encoding elements so that the decision logic is ready before cues arrive.
  7. Unsolicited / Schedule-Based Decisions
    The SDS / POIS may also send unsolicited ESAM notifications (not triggered by immediate cues) to SAS—for example, schedule changes, ad swap overrides, etc.

Step-by-Step Interaction & Flow

Here’s a stepwise narrative of how it works in a live event:

| Step | Event / Action | Signal / API Involved | Decision / Data Consulted | Resulting Action |
|------|----------------|-----------------------|---------------------------|------------------|
| 1 | Live feed with SCTE-35 cue arrives at SAS (encoder / splicer) | In-band SCTE-35 | - | SAS notices cue |
| 2 | SAS sends an SPE request to the POIS / SDS via ESAM (SCTE-250) | ESAM (SPE) | Basic cue metadata (timestamp, cue type, identifiers) | Asks “what to do with this cue?” |
| 3 | SDS / POIS processes the request | - | Uses SCTE-224 / ESNI metadata (schedule, regional policies, audience, blackout rules) + business logic + ad availability | SDS formulates SPN (Signal Processing Notification) with instructions |
| 4 | SAS receives SPN and conditions the cue or stream | - | SAS modifies / suppresses / rewrites / switches / filters the cue per instruction | The outgoing stream has a “conditioned” cue (or no cue) |
| 5 | Packaging / DAI engine sees the conditioned cue | - | DAI logic uses cue + metadata (rule, ad inventory, targeting) | Inserts / stitches appropriate ad(s) or alternate content in manifest / segments |
| 6 | Client receives final manifest / segments with ads | - | Playback logic / ABR player sees ad boundaries / segments | Seamless playback of program + ads tailored to policy / region |
| 7 | (Optional) SDS / POIS sends unsolicited ESAM messages | ESAM unsolicited | Based on schedule or changes | SAS may proactively reconfigure cue behavior, switch feeds, etc. |

Key Relationships & Roles (revisited with context)

  • SCTE-35: the in-band cues that mark splice points / ad break boundaries inside the stream. Without them, there’s no timing anchor for DAI.
  • SCTE-224 / ESNI: the out-of-band metadata / schedule / policy layer that informs the decision logic what to do when a cue arrives (blackouts, regional restrictions, alternate content rules, ad eligibility).
  • SCTE-250 / ESAM: the runtime API / control interface between the signal acquisition system and the decision engine (POIS) — enabling dynamic interactions at the moment a cue arrives.
  • POIS / Decision Logic (SDS): the business rules engine that merges incoming cues, policy metadata, audience / regional logic, ad inventory, schedule, etc. to decide how to act.
  • DAI / Manifest Stitching / Ad Insertion: the downstream system that executes on the conditioned cues and actually inserts or splices ads or alternate content into the user-facing stream.

Signal Flow / Interaction Diagram (Conceptual)

 +------------------------------------+
 |  Policy / Metadata Server          |  ← SCTE-224 / ESNI (out-of-band)
 |  (Schedules, Rules, Blackouts)     |
 +-----------------+------------------+
                   |
                   v
 +------------------------------------+
 |  POIS / Decision System (SDS)      |  ← Receives metadata, applies logic
 +---+------------------------+-------+
     |                        ^
     | (ESAM / SCTE-250 API)  | (SPE / SPN: Signal Processing Events / Notifications)
     v                        |
 +------------------------------+        +-----------------------------------------+
 |  Signal Acquisition System   |        |  Packaging / DAI / Stitcher             |
 |  (SAS: encoder / splicer /   | -----> |  (Manifest manipulation, ad insertion)  |
 |   signal processor)          |        +-----------------------------------------+
 +------------------------------+
     ^
     |  (embedded)
     |  SCTE-35 cues flow here
     |
 +------------------------------+
 |  Live / Contribution Feed    |
 |  (with SCTE-35 in stream)    |
 +------------------------------+

Key points / arrows:

  • The Policy / Metadata Server distributes schedule rules, blackout policies, audience segmentation, etc., using SCTE-224 / ESNI (out-of-band).
  • The POIS / Decision System (SDS) uses that metadata and business logic to decide what to do when a cue occurs.
  • The Signal Acquisition System (SAS) (encoder, splicer, signal processor) monitors the incoming feed, detects SCTE-35 cues, and when one comes, it issues a Signal Processing Event (SPE) over ESAM / SCTE-250 to the POIS.
  • The POIS responds with a Signal Processing Notification (SPN) instructing how to condition or handle that cue (e.g. drop, modify, switch, reassign).
  • The SAS acts on that SPN, modifying or passing through the cue(s) accordingly.
  • The conditioned stream (with the modified / filtered / approved cues) goes into DAI / manifest stitching / packaging, which reads cues and inserts the correct ads or alternate content for the user, enforcing rules from policy metadata.