CHANGE REQUEST

 

 

DASH-IF IOP

CR

 

rev

-

Current version:

4.3

 

 

Status:

 

Draft

X

Internal Review

 

Community Review

 

Agreed

 

 

Title:                    

Advanced Ad Insertion in DASH

 

 

Source:

Ad Insertion TF

 

 

Supporting Companies:

Hulu, Qualcomm, Tencent, Unified Streaming, <others to be added>

 

 

Category:

A

 

Date:

2019-08-12

 

Use one of the following categories:
C  (correction)
A  (addition of feature)
B  (editorial modification)

 

 

 

Reason for change:

Ad Insertion is considered one of the most important aspects of online video distribution. With the development of CMAF, some additional aspects have become relevant, such as consistent development of ad content and content insertion into CMAF live content. This document addresses the latest developments in the context of Ad Insertion and maps them to DASH.

 

 

Summary of change:

1)    Description of most relevant use cases

2)    Ad Insertion architecture

3)    Definition of main content requirements and recommendations

4)    Definition of ad content requirements and recommendations

5)    Definition of combined main and ad content

6)    Ad specific metadata

7)    Ad tracking

 

 

Consequences if not approved:

Insufficient Ad Insertion capabilities in DASH

 

 

Sections affected:

References,

The whole clause 8 on DASH Ad Insertion in DASH-IF IOP is replaced

 

 

Other comments:

 

 

Disclaimer:

This document is not yet final. It is provided for public review until the deadline mentioned below. If you have comments on the document, please submit comments by one of the following means:

-          at the github repository https://github.com/Dash-Industry-Forum/AdInsertion/issues, or

-          dashif+iop@groupspaces.com with a subject tag [AdInsertion]

Please add a detailed description of the problem and the comment.

 

Based on the received comments, a final document will be published no later than the expected publication date below and integrated into a new version of DASH-IF IOP, provided that the following additional criteria are fulfilled:

-          All comments from community review are addressed

-          The relevant aspects for the Conformance Software are provided

-          Verified IOP test vectors are provided

 

 

Commenting Deadline:

September 30th, 2019

 

 

Expected Publication:

December 15th, 2019

Contributors:

Zachary Cava (Hulu)

Thomas Stockhammer (Qualcomm)

Iraj Sodagar (Tencent)

Rufael Mekuria (Unified Streaming)

Andy Rosen (DSR)

Gary Hughes (independent)

Nicol So (Arris)

Will Law (Akamai)

Alex Giladi (Comcast)

Cooper Pope (Turner)

And others

 

Add References in yellow

Notes:

1)    If appropriate, the references refer to specific versions of the specifications. However, implementers are encouraged to check later versions of the same specification, if available. Such versions may provide further clarifications and corrections. However, new features added in new versions of specifications are not added automatically.

2)    Specifications not yet officially available are marked in italics.

3)    Specifications considered informative only are marked in Arial

[1]                   DASH-IF DASH-264/AVC Interoperability Points, version 1.0, available at http://dashif.org/w/2013/06/DASH-AVC-264-base-v1.03.pdf

[2]                   DASH-IF DASH-264/AVC Interoperability Points, version 2.0, available at http://dashif.org/w/2013/08/DASH-AVC-264-v2.00-hd-mca.pdf

[3]                   ISO/IEC 23009-1:2012/Cor.1:2013 Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 1: Media presentation description and segment formats.

Note: this document is superseded by reference [4], but maintained as the initial version of this document is provided in the above reference. 

[4]                   ISO/IEC 23009-1:2014 Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 1: Media presentation description and segment formats. Including:

·         ISO/IEC 23009-1:2014/Cor 1:2015

·         ISO/IEC 23009-1:2014/Cor 2:2017

·         ISO/IEC 23009-1:2014/Amd 1:2015 High Profile and Availability Time Synchronization

·         ISO/IEC 23009-1:2014/Amd 2:2015 Spatial relationship description, generalized URL parameters and other extensions

·         ISO/IEC 23009-1:2014/Amd 3:2016 Authentication, MPD linking, Callback Event, Period Continuity and other Extensions.

·         ISO/IEC 23009-1:2014/DAmd 4:2016 Segment Independent SAP Signalling (SISSI), MPD chaining, MPD reset and other extensions.

All the above is expected to be rolled into a third edition of ISO/IEC 23009-1 as:

·         ISO/IEC 23009-1:2018 Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 1: Media presentation description and segment formats. [Note: Expected to be published by end of 2018. The draft third edition is available in the MPEG document m44441.]

In addition, the following documents are under preparation in MPEG:

·         ISO/IEC 23009-1:2014/DCor 3:2018 [Note: Expected to be published by mid of 2019. The study of the COR is available as an output document w17951.]

·         ISO/IEC 23009-1:2014/DAmd 5:2018 Device Information and other extensions. 2018 [Note: Expected to be published by mid of 2019. The DAM is available as an output document w18057.]

[5]             ISO/IEC 23009-2:2014: Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 2: Conformance and Reference.

[6]             ISO/IEC 23009-3:2014: Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 3: Implementation Guidelines.

[7]                   ISO/IEC 14496-12:2015 Information technology -- Coding of audio-visual objects -- Part 12: ISO base media file format. This also includes amendments and corrigenda; for details see https://www.iso.org/standard/68960.html

[8]                   ITU-T Recommendation H.264 (04/2017): "Advanced video coding for generic audiovisual services" | ISO/IEC 14496-10:2010: "Information technology – Coding of audio-visual objects – Part 10: Advanced Video Coding".

[9]                   ISO/IEC 14496-15:2017:2015: Information technology -- Coding of audio-visual objects -- Part 15: Carriage of network abstraction layer (NAL) unit structured video in ISO base media file format.

[10]               IETF RFC 6381, The 'Codecs' and 'Profiles' Parameters for "Bucket" Media Types, August 2011.

[11]               ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio with Corrigendum 1:2009, Corrigendum 2:2011, Corrigendum 3:2012, Amendment 1:2009, Amendment 2:2010, Amendment 3:2012, and Amendment 4:2014.

[12]               ISO/IEC 14496-14:2003/Amd 1:2010 Information technology -- Coding of audio-visual objects -- Part 14: The MP4 File Format

[13]          3GPP (2005-01-04). "ETSI TS 126 401 V6.1.0 (2004-12) - Universal Mobile Telecommunications System (UMTS); General audio codec audio processing functions; Enhanced aacPlus general audio codec; General description (3GPP TS 26.401 version 6.1.0 Release 6)"

[14]               ANSI/CEA-708-E: Digital Television (DTV) Closed Captioning, August 2013

[15]          3GPP TS  26.245: "Transparent end-to-end Packet switched Streaming Service (PSS); Timed text format"

[16]               W3C Timed Text Markup Language 1 (TTML1)  (Second Edition) 24 September 2013.

[17]               SMPTE ST 2052-1:2013 "Timed Text Format (SMPTE-TT)", https://www.smpte.org/standards

[18]          W3C WebVTT - The Web Video Text Tracks, http://dev.w3.org/html5/webvtt/

[19]               ITU-T Recommendation H.265 (02/2018): "High efficiency video coding" | ISO/IEC 23008-2:2015/Amd 1:2015: "High Efficiency Coding and Media Delivery in Heterogeneous Environments – Part 2: High Efficiency Video Coding", downloadable here: http://www.itu.int/rec/T-REC-H.265

[20]               EBU Tech 3350, "EBU-TT, Part 1, Subtitling format definition", July 2012, http://tech.ebu.ch/docs/tech/tech3350.pdf?vers=1.0

[21]               IETF RFC 7230, Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing, June 2014.

[22]               IETF RFC 7231, Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content, June 2014.

[23]               IETF RFC 7232, Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests, June 2014.

[24]               IETF RFC 7233, Hypertext Transfer Protocol (HTTP/1.1): Range Requests, June 2014.

[25]               IETF RFC 7234, Hypertext Transfer Protocol (HTTP/1.1): Caching, June 2014.

[26]               IETF RFC 7235, Hypertext Transfer Protocol (HTTP/1.1): Authentication, June 2014.

[27]          SMPTE RP 2052-10-2013: Conversion from CEA-608 Data to SMPTE-TT  https://www.smpte.org/standards

[28]          SMPTE RP 2052-11-2013: Conversion from CEA 708 to SMPTE-TT  https://www.smpte.org/standards

[29]               ISO/IEC 14496-30:2014, "Timed Text and Other Visual Overlays in ISO Base Media File Format".  Including:

ISO/IEC 14496-30:2014, Cor 1:2015

ISO/IEC 14496-30:2014, Cor 2:2016

[30]               ISO/IEC 23001-7:2016: "Information technology -- MPEG systems technologies -- Part 7: Common encryption in ISO base media file format files".

[31]               DASH Industry Forum, Test Cases and Test Vectors: http://testassets.dashif.org/.

[32]               DASH Industry Forum, "Guidelines for Implementation: DASH-AVC/264 Conformance Software", http://dashif.org/conformance.html.

[33]               DASH Identifiers Repository, available here: http://dashif.org/identifiers

[34]               DTS 9302J81100, “Implementation of DTS Audio in Media Files Based on ISO/IEC 14496”, http://www.dts.com/professionals/resources/resource-center.aspx

[35]               ETSI TS 102 366 v1.2.1, Digital Audio Compression (AC-3, Enhanced AC-3) Standard (2008-08)

[36]               MLP (Dolby TrueHD) streams within the ISO Base Media File Format, version 1.0, September 2009.

[37]               ETSI TS 102 114 v1.3.1 (2011-08), “DTS Coherent Acoustics; Core and Extensions with Additional Profiles”

[38]               ISO/IEC 23003-1:2007 - Information technology -- MPEG audio technologies -- Part 1: MPEG Surround

[39]              DTS 9302K62400, “Implementation of DTS Audio in Dynamic Adaptive Streaming over HTTP (DASH)”, http://www.dts.com/professionals/resources/resource-center.aspx

[40]              IETF RFC 5905, "Network Time Protocol Version 4: Protocol and Algorithms Specification", June 2010.

[41]               IETF RFC 6265: "HTTP State Management Mechanism", April 2011.

[42]               ETSI TS 103 285 v.1.1.1: "MPEG-DASH Profile for Transport of ISO BMFF Based DVB Services over IP Based Networks".

[43]               ANSI/SCTE 128-1 2013: "AVC Video Constraints for Cable Television, Part 1 - Coding", available here: http://www.scte.org/documents/pdf/Standards/ANSI_SCTE%20128-1%202013.pdf

[44]          IETF RFC 2119, "Key words for use in RFCs to Indicate Requirement Levels", April 1997.

[45]               ISO: “ISO 639.2, Code for the Representation of Names of Languages — Part 2: alpha-3 code,” as maintained by the ISO 639/Joint Advisory Committee (ISO 639/JAC), http://www.loc.gov/standards/iso639-2/iso639jac.html; JAC home page: http://www.loc.gov/standards/iso639-2/iso639jac.html; ISO 639.2 standard online: http://www.loc.gov/standards/iso639-2/langhome.html.

[46]               CEA-608-E, Line 21 Data Service, March 2008.

[47]               IETF RFC 5234, “Augmented BNF for Syntax Specifications: ABNF”, January 2008.

[48]               SMPTE ST 2086:2014, “Mastering Display Color Volume Metadata Supporting High Luminance And Wide Color Gamut Images”

[49]               ISO/IEC 23001-8:2016, “Information technology -- MPEG systems technologies -- Part 8: Coding-independent code points”, available here: http://standards.iso.org/ittf/PubliclyAvailableStandards/c069661_ISO_IEC_23001-8_2016.zip

[50]               IETF RFC 7164, “RTP and Leap Seconds”, March 2014.

[51]          void

[52]               IAB Video Multiple Ad Playlist (VMAP), available at https://www.iab.com/guidelines/digital-video-multiple-ad-playlist-vmap-1-0-1/

[53]               IAB Video Ad Serving Template (VAST), available at https://www.iab.com/guidelines/digital-video-ad-serving-template-vast/

[54]               ANSI/SCTE 35 2015, Digital Program Insertion Cueing Message for Cable

[55]               ANSI/SCTE 67 2014, Recommended Practice for SCTE 35 Digital Program Insertion Cueing Message for Cable

[56]               ANSI/SCTE 214-1, MPEG DASH for IP-Based Cable Services, Part 1: MPD Constraints and Extensions

[57]               ANSI/SCTE 214-3, MPEG DASH for IP-Based Cable Services, Part 3: DASH/FF Profile

[58]               EIDR ID Format - EIDR: ID Format, v1.2, March 2014, available at http://eidr.org/documents/EIDR_ID_Format_v1.2.pdf

[59]               Common Metadata, TR-META-CM, ver. 2.0, January 3, 2013, available at http://www.movielabs.com/md/md/v2.0/Common_Metadata_v2.0.pdf

[60]               IETF RFC 4648, "The Base16, Base32, and Base64 Data Encodings", October 2006.

[61]               W3C TTML Profiles for Internet Media Subtitles and Captions 1.0 (IMSC1), Editor’s Draft 03 August 2015, available at: https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml-ww-profiles/ttml-ww-profiles.html  

[62]              W3C TTML Profile Registry, available at: https://www.w3.org/wiki/TTML/CodecsRegistry

[63]               ETSI TS 103 190-1 v1.2.1, “Digital Audio Compression (AC-4); Part 1: Channel based coding”.

[64]               ISO/IEC 23008-3:2018, Information technology -- High efficiency coding and media delivery in heterogeneous environments -- Part 3: 3D audio.

[65]               IETF RFC 5246, “The Transport Layer Security (TLS) Protocol, Version 1.2”, August 2008.

[66]               IETF RFC 4337, “MIME Type Registration for MPEG-4”, March 2006.

[67]               SMPTE: “Digital Object Identifier (DOI) Name and Entertainment ID Registry (EIDR) Identifier Representations,” RP 2079-2013, Society of Motion Picture and Television Engineers, 2013.

[68]               SMPTE: “Advertising Digital Identifier (Ad-ID®) Representations,” RP 2092-1, Society of Motion Picture and Television Engineers, 2015.

[69]               W3C Encrypted Media Extensions - https://www.w3.org/TR/encrypted-media/.

[70]               void

[71]               SMPTE ST 2084:2014, “High Dynamic Range Electro-Optical Transfer Function of Mastering Reference Displays”

[72]               ISO/IEC 23001-8:2013, “Information technology -- MPEG systems technologies -- Part 8: Coding-independent code points”, available here: http://standards.iso.org/ittf/PubliclyAvailableStandards/c062088_ISO_IEC_23001-8_2013.zip

[73]               Recommendation ITU-R BT.709-6 (06/2015): "Parameter values for the HDTV standards for production and international programme exchange".

[74]               Recommendation ITU-R BT.2020-1 (06/2014): "Parameter values for ultra-high definition television systems for production and international programme exchange".

[75]               ETSI TS 101 154 v2.2.1 (06/2015): "Specification for the use of Video and Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream."

[76]               ETSI TS 103 285 v1.1.1 (05/2015): "Digital Video Broadcasting (DVB); MPEG-DASH Profile for Transport of ISO BMFF Based DVB Services over IP Based Networks.”

[77]               3GPP TS 26.116 (03/2016): "Television (TV) over 3GPP services; Video Profiles.”

[78]               DECE (05/2015): “Common File Format & Media Formats Specification”, http://uvcentral.com/sites/default/files/files/PublicSpecs/CFFMediaFormat-2_2.pdf

[79]               Ultra HD Forum: Phase A Guidelines, version 1.1, July 2015

[80]               Recommendation ITU-R BT.2100-1 (07/2016): "Image parameter values for high dynamic range television for use in production and international programme exchange".

[81]               SMPTE ST 2086:2014, “Mastering Display Color Volume Metadata Supporting High Luminance And Wide Color Gamut Images”

[82]               SMPTE ST 2094-1:2016, “Dynamic Metadata for Color Volume Transform – Core Components”

[83]               SMPTE ST 2094-10:2016, “Dynamic Metadata for Color Volume Transform – Application #1”

[84]               Recommendation ITU-R BT.1886: “Reference electro-optical transfer function for flat panel displays used in HDTV studio production”

[85]               ETSI DGS/CCM-001 GS CCM 001 “Compound Content Management”

[86]               VP9 Bitstream & Decoding Process Specification. https://storage.googleapis.com/downloads.webmproject.org/docs/vp9/vp9-bitstream-specification-v0.6-20160331-draft.pdf

[87]               VP Codec ISO Media File Format Binding https://www.webmproject.org/vp9/mp4/

[88]               ETSI TS 103 433-1, “High-Performance Single Layer High Dynamic Range (HDR) System for use in Consumer Electronics devices; Part 1: Directly Standard Dynamic Range (SDR) Compatible HDR System (SL-HDR1)”

[89]               ST 2094-40:2016 - SMPTE Standard - Dynamic Metadata for Color Volume Transform — Application #4, Sept. 2016

[90]               CTA 861-G - CTA Standard: A DTV Profile for Uncompressed High Speed Digital Interfaces, Nov. 2016

[91]               ISO/IEC 23000-19:2018 - Information technology -- Multimedia application format (MPEG-A) -- Part 19: Common media application format (CMAF) for segmented media, Amendment 2: xHE-AAC and other media profiles

[92]               DASH-IF IOP: specification of live ingest, May, 2019, https://dashif-documents.azurewebsites.net/Ingest/master/DASH-IF-Ingest.html

[93]               CableLabs Video-On-Demand Content Specification, Version 1.1, available at https://specification-search.cablelabs.com/cablelabs-video-on-demand-content-specification-version-1-1

[94]               ISO/IEC 13818-1, MPEG-2 Part 1, Systems

[95]               IAB Lab Tech Open Measurement SDK, available at https://iabtechlab.com/standards/open-measurement-sdk/

[96]               W3C Media Source Extensions, available at https://www.w3.org/TR/media-source/

[97]               W3C Encrypted Media Extensions, available at https://www.w3.org/TR/encrypted-media/

[98]               Consumer Technology Association Web Application Video Ecosystem (CTA Wave), https://cta.tech/Research-Standards/Standards-Documents/WAVE-Project/WAVE-Project.aspx

[99]               ISO/IEC 23000-19:2020, "Common Media Application Format", Second Edition FDIS is available as MPEG Output w18636.

[100]           ISO/IEC 23009-1:2020, "Dynamic Adaptive Streaming over HTTP, Media presentation description and segment formats", Fourth Edition FDIS is available as MPEG Output w18609.

[101]           ISO/IEC 23009-1:2020/Amd.1:2020, "Dynamic Adaptive Streaming over HTTP, Media presentation description and segment formats", Working Draft is available as MPEG Output w18641.

[102]           www.videoservicesforum.org/activity_groups/RIST_poster_for_VidTrans2018Feb25.pdf

[103]           ANSI/SCTE 130-3 2013, Digital Program Insertion-Advertising Systems Interface Part 3 https://www.scte.org/documents/pdf/Standards/ANSI_SCTE%20130-3%202013.pdf

 

 

Replace Clause 8 with the following

 

1.  Ad Insertion in DASH

1.1.      Introduction

1.1.1.         Use Cases and Scenarios

1.1.1.1        Overview

This clause provides an overview of guiding use cases considered in the context of ad insertion for DASH. The initial focus is on use cases addressed in clause 1.1.1.3 together with the transition issues in clause 1.1.1.7. 

In a future version of this document, the remaining use cases will be addressed. However, the tools documented in this clause may well be usable for ad insertion in all documented use cases.

1.1.1.2        VoD

In this case content is statically defined and made available on demand to clients. Ad insertion takes place at pre-defined placement opportunities within the content. Opportunities are located at conventional pre-, mid-, and post-roll positions within the content.

No restriction is placed on the duration of the inserted ads. Service providers may choose to fill the opportunities when the client first requests content and/or when the client playout approaches the opportunity location. Service providers may also choose to skip an opportunity, in which case content will seamlessly continue.

If possible, content should be preconditioned such that segment boundaries are created at placement opportunities.
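The preconditioning recommendation above can be checked mechanically. The following is a minimal sketch, not part of any DASH-IF tooling: it assumes a constant segment duration and placement opportunities expressed as exact rational times in seconds. The function names and the example values are hypothetical.

```python
from fractions import Fraction


def segment_boundaries(total_duration, segment_duration):
    """Return all segment start times plus the end of the content."""
    boundaries = []
    t = Fraction(0)
    while t < total_duration:
        boundaries.append(t)
        t += segment_duration
    boundaries.append(Fraction(total_duration))
    return boundaries


def misaligned_opportunities(opportunities, boundaries):
    """Return the placement opportunities that do not fall on a boundary."""
    boundary_set = set(boundaries)
    return [t for t in opportunities if t not in boundary_set]


# Hypothetical VoD asset: 600 s of content in 4 s segments, with a pre-roll,
# two mid-rolls and a post-roll opportunity.
boundaries = segment_boundaries(Fraction(600), Fraction(4))
opportunities = [Fraction(0), Fraction(200), Fraction(450), Fraction(600)]
misaligned = misaligned_opportunities(opportunities, boundaries)
```

In this example the opportunity at 450 s is reported as misaligned, which indicates that the content would need to be re-conditioned (or the opportunity moved) to obtain a segment boundary at that position.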

1.1.1.3        Live

In this case content is being made periodically available to clients as part of a live event. Placement opportunities are signalled by the content author via in-band cues such as SCTE-35. Service providers may have the right to replace a subset or all of the placement opportunities that occur.

Opportunities will have an explicit expected duration announced with them and may come with little to no pre-warning. Inserted advertisements will replace in-stream content and should exactly match the expected opportunity duration to avoid delaying the main content.

While opportunities are generally expected to match the announced duration, in practice opportunities may be terminated early by the content author in response to the occurring event. In this case, the main content will take priority and the inserted advertisement will be cut short at the point of in-stream opportunity termination.

In addition to early termination, opportunities may be extended by the content author in response to the occurring event. In this case, the service provider may elect either to return to the main stream and use the original in-stream content for the remainder of the break, or to treat the extension as a new opportunity and fill the announced extended duration.

Service providers may choose to skip a replacement opportunity entirely, in which case the original in-stream content will be played instead.

If in-band cues are used to signal opportunities, the content encoding should produce exact segment boundaries at the cue points.
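The handling of announced, early-terminated and extended opportunities described in this clause can be summarised in a short sketch. It is illustrative only: the `LiveOpportunity` type, the `plan_break` function, the `fill_extension` flag and the outcome labels are hypothetical names introduced here, and durations are assumed to be known in integer milliseconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LiveOpportunity:
    announced_ms: int                # duration announced with the cue
    actual_ms: Optional[int] = None  # observed break duration, None if as announced


def plan_break(opp, fill_extension=False):
    """Split a live break into inserted-ad time and original in-stream time."""
    actual = opp.announced_ms if opp.actual_ms is None else opp.actual_ms
    if actual < opp.announced_ms:
        # Early termination: main content takes priority, the ad is cut short.
        return {"outcome": "terminated_early", "ad_ms": actual, "in_stream_ms": 0}
    if actual > opp.announced_ms:
        if fill_extension:
            # The extension is treated as a new opportunity and filled with ads.
            return {"outcome": "extension_filled", "ad_ms": actual, "in_stream_ms": 0}
        # Otherwise return to the main stream for the remainder of the break.
        return {
            "outcome": "extension_in_stream",
            "ad_ms": opp.announced_ms,
            "in_stream_ms": actual - opp.announced_ms,
        }
    return {"outcome": "as_announced", "ad_ms": opp.announced_ms, "in_stream_ms": 0}


early = plan_break(LiveOpportunity(announced_ms=60000, actual_ms=45000))
extended = plan_break(LiveOpportunity(announced_ms=60000, actual_ms=90000))
refilled = plan_break(LiveOpportunity(announced_ms=60000, actual_ms=90000), fill_extension=True)
```

A skipped opportunity is not modelled here; in that case no ad time is planned and the original in-stream content is played for the whole break.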

1.1.1.4        Recorded Live

In this case content is a capture of a live stream that is made available on demand to clients. Placement opportunities are the same that occurred during the original live event. Service providers may have the right to replace a subset or all of the placement opportunities that occur.

Opportunities have an explicit duration and default content associated with them. Inserted advertisements will replace the default content and may vary in duration from the original content.

Service providers may choose to skip a replacement opportunity, in which case the default content will be played instead. Service providers may also choose to remove a placement opportunity, in which case the content before the opportunity will seamlessly transition to the content after the opportunity.

1.1.1.5        Pre-Roll into Live

In this case a service provider desires to present an advertisement prior to entering a live stream. The advertisement is a static asset that is available on demand to clients and the live stream is being made periodically available to clients as part of a live event.

The advertisement may be of any duration desired and is not associated with any conditioning or markers in the live stream.

Following the playout of the advertisement, the client will join the live stream and no longer be able to access the original advertisement.

1.1.1.6        Obfuscation of Inserted Ads

In this case a service provider wishes to present an advertisement with a content stream, but does not wish for the advertisement to be detectable by the client. To accomplish this, the advertisement may be stitched into content prior to packaging and manifest generation such that there is a single asset produced containing the stitched assets.

This use case is not currently in the scope of this document, as the single asset will result in fewer interoperability challenges. However, the DASH-IF Working Group is continuing to study effective obfuscation methods and practices within DASH and will provide information in future editions of this document.

1.1.1.7        Changes that may happen at Transitions of Main content and Ads

A complex issue in the playback of ad content in combination with main content is the transition between the two. This transition should be smooth and seamless, such that the user does not observe discontinuities, quality changes, audio glitches, rebuffering or other artefacts. DASH provides different signaling mechanisms to indicate how content is offered. In many cases, it is then a question of whether the underlying playback platform can handle such content in a smooth and seamless manner, or whether problems are to be expected. Several issues may arise at such splice points, some of which are listed below:

·         Timeline discontinuities: in order to avoid rewriting content, the inserted content may not follow the timeline of the main content. However, this can be handled in DASH at Period boundaries.

·         Overlaps or possibly even gaps of content on the master timeline: The content may not exactly match the envisaged insertion instructions, and hence content may overlap at the splice points or there may be gaps. DASH permits signaling of such properties and playback platforms can handle the playback of content with such properties.

·         Encryption and key changes: In case of DRM protected content, changes of the encryption or of keys may happen at splice points. DASH permits signaling of such properties and playback platforms can handle the playback of content with such properties.

·         Codec changes: Ads may be prepared with different codecs than the main content. This may result in complex codec change operations and not all platforms can handle such operations. DASH permits signaling of such changes, but also allows playback platforms to identify their capabilities, whether they can handle such changes or not.

·         Codec profile/level changes: Similarly to the above, ads may be prepared with different codec profiles or levels. Again, DASH permits signaling of such changes, but also allows playback platforms to identify their capabilities, whether they can handle such changes or not.

·         Signal changes (HDR/SDR, 4K/HD, Stereo/5.1): Ad content may be offered with different signal properties, for example the resolution of the video may change, the color space or transfer characteristics may change, or in audio, the channel configuration may change. Again, DASH permits signaling of such changes, but also allows playback platforms to identify their capabilities, whether they can handle such changes or not.

·         Addition or removal of a track (e.g. a language, subtitle): At ad or program boundaries, certain tracks or sub-assets may not be available, for example a specific language may not be available, the content may not provide subtitles, or even the offering in a certain format or codec may not be available. Again, DASH permits signaling of such changes, but also allows playback platforms to identify their capabilities, whether they can handle such changes or not.

This specification addresses three aspects in the context of the above:

1)      The signalling, in DASH-formatted content, of the changes that may happen at splice points

2)      Certain requirements on DASH-formatted content in order to support playback on a majority of devices

3)      The ability to signal the capabilities that a playback platform requires in order to play back the content seamlessly.
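To make the categories of change listed in this clause concrete, the following sketch compares a simplified description of the content before and after a splice point and reports which categories apply. It is an illustration under stated assumptions: the `ContentProperties` fields, the `changes_at_splice` function and the example values are hypothetical, and the codec strings merely follow the general RFC 6381 pattern of a sample entry code followed by profile/level detail. Timeline discontinuities, overlaps and gaps are not modelled.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ContentProperties:
    codecs: str                 # RFC 6381 style string, e.g. "avc1.640028"
    width: int
    height: int
    transfer: str               # e.g. "sdr" or "pq" (labels chosen for this sketch)
    audio_channels: int
    key_id: Optional[str]       # None means unencrypted
    languages: FrozenSet[str]   # languages of the offered audio/subtitle tracks


def changes_at_splice(before, after) -> List[str]:
    """List the categories of change between two contents at a splice point."""
    changes = []
    codec_before, _, detail_before = before.codecs.partition(".")
    codec_after, _, detail_after = after.codecs.partition(".")
    if codec_before != codec_after:
        changes.append("codec")
    elif detail_before != detail_after:
        changes.append("codec profile/level")
    if ((before.width, before.height) != (after.width, after.height)
            or before.transfer != after.transfer
            or before.audio_channels != after.audio_channels):
        changes.append("signal")
    if before.key_id != after.key_id:
        changes.append("encryption/key")
    if before.languages != after.languages:
        changes.append("track addition/removal")
    return changes


main = ContentProperties("avc1.640028", 1920, 1080, "sdr", 2, "key-main",
                         frozenset({"en", "es"}))
ad = ContentProperties("avc1.64001f", 1280, 720, "sdr", 2, None,
                       frozenset({"en"}))
result = changes_at_splice(main, ad)
```

A playback platform could compare such a list against its own capabilities to decide whether a seamless transition is to be expected; how capabilities are actually signalled is outside the scope of this sketch.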

1.1.2.            Definitions

ABR Encoder: live encoder that converts a broadcast stream or mezzanine into a ladder of different bit-rate tracks.

Ad Avail Processor: logical service that, given cue data, determines the placement of advertisement content within a stream and describes the necessary ad decision service communication

Ad Content Server: server storing the ad content and serving it on a per request basis.

Ad Creative: linear visual and auditory asset that represents the content of an advertisement

Ad Decision Service: functional entity that decides which ad(s) will be shown to the user. Its interfaces are deployment-specific and are out of scope for this document.

Ad Insertion MPD Manipulator: functional entity that proxies a DASH MPD and may change it to insert the ad creative in the streaming presentation. It may also embed other ad-related metadata in the MPD, or remove ad-related metadata from it.

Ad Pod: location or point in time where one or more ad slots may be scheduled for delivery; same as ad break, avail, and placement opportunity; pre-, mid-, and post- prefix may be used to denote pod location relative to content as before, during, and after respectively.

Ad Reporting Server: functional entity for collecting viewer impressions of advertisement content.

Ad Slot: single ad creative that is one of possibly many others that make up an ad pod

CDN node: functional entity returning a segment on request from DASH client. There are no assumptions on location of the node.

CMAF packager: functional entity, often residing with the ABR Encoder, which packages the adaptive bit-rate tracks into CMAF tracks.

DASH Access Client: client consuming the DASH stream; it may also contain functionality for client-side ad insertion and viewer impression reporting.

DASH Ad resolver: functional entity which returns one or more remote elements, for example ad creatives in a DASH-formatted construct, on request from a DASH Access Client.

DASH Packager: functional entity that processes conditioned content and produces media segments suitable for consumption by a DASH client. This entity is also known as fragmenter, encapsulater, or segmenter.

DASH-IF Ad Content: content that follows specific restrictions and requirements according to this specification so that it can be independently produced and inserted into well-formatted main content by simple MPD manipulation processes.

MPD Generator: functional entity returning an MPD on request from DASH client. It may be generating an MPD on the fly or returning a cached one.

Origin: functional entity that contains all media segments indicated in the MPD, and is the fallback if CDN nodes are unable to provide a cached version of the segment on client request.

Reference Playback Platform: reference platform for playback (e.g. HTML-5 MSE/EME)

Server-Side Ad Insertion (SSAI): ad serving architecture that interleaves content and ad assets prior to the stream reaching the client.

Server-Guided Ad Insertion (SGAI): ad serving architecture that fully describes ad opportunities within content prior to the stream reaching the client, but has the client resolve opportunities as needed.

Splice Point: point in media content where its stream may be switched to the stream of another content, e.g. to an ad.

Tracking Event: data payload associated with an ad creative that is emitted by an application when a specific time point or criteria is met during the creative playout.

1.1.2    Architectures

In the context of DASH-IF guidelines, primarily two architectures are considered. In the Server-Side Ad Insertion (SSAI) architecture, the ad is inserted in the network before reaching the DASH Client. In the Server-Guided Ad Insertion (SGAI) architecture, information about ad placement and resolution is inserted in the network, but final resolution is done on demand by the DASH client. The architectures share a significant amount of the functions and interfaces documented in Figure 1.

Figure 1 DASH-IF Ad Insertion Architecture

In this document, requirements and recommendations are provided for different interfaces. The main focus of the work is on the interfaces to and from the DASH client. However, network interfaces and functions are also discussed as they impact the processing in certain functions.

Note 1: The above diagram combines the MPD and Segment servers. In a refined version they may be separated.

Note 2: The latency of each function/interface may be provided in a revised version. Input is welcome.

Note 3: The interface names are only numeric. Feedback is welcome on whether DASH-IF should provide more instructive names and, if so, on commonly agreed names for each of the interfaces. This is discussed in https://github.com/Dash-Industry-Forum/AdInsertion/issues/40; please join the discussion.

An overview of the functions and interfaces is provided in clause 1.1.3.

1.1.3    Overview on Interfaces and Functions

The Ad Insertion architectures start with the ingest of an input stream over IF-0, which is processed by an ABR Encoder and output as well-formed CMAF content over the IF-1 interface. A DASH Packager / MPD Generator uses the IF-1 input to generate a conformant DASH content presentation that is sent over IF-2 and additional opportunity metadata that is sent over IF-3.

An Ad Insertion MPD Manipulator uses the inputs of IF-2 and IF-3 to generate a DASH presentation that is a mixture of content and advertisements. In the SSAI architecture, the manipulator uses IF-4 to ask an Ad Decisioning / Content Server to provide advertisement placements for the content stream before generating the final DASH MPD for IF-5, which contains metadata about the inserted ads via IF-6. In the SGAI architecture, the manipulator does not immediately use IF-4; instead, it embeds opportunity information from IF-3 into the DASH MPD IF-5 output so that the DASH Client may later use IF-7 to retrieve the proper ad placements.

The DASH Client utilizes the reference media pipeline provided by IF-9 to perform seamless playout of the mixed content and ad presentation obtained via IF-5. Ad measurement and tracking is enabled in the client by IF-8 utilizing the ad metadata embedded as part of IF-6.

Table 1 details the defined interfaces with section references and some example instantiations. Each interface section provides an informative overview of the interface and, where aspects of the interface fall within the scope of this document, normative requirements.

Table 1 Interfaces identified in the ad insertion architecture, example instantiations and references within the document

| Interface | Function | Example instantiations | Reference |
|---|---|---|---|
| IF-0 | ABR Stream Source | MPEG-2 TS, RIST | 1.2.1 |
| IF-1a | Packager Ingest Media | DASH Ingest interface 1, Azure Smooth ingest, CMAF | 1.2.2 |
| IF-1b | Packager Ingest Metadata | DASH Ingest interface 1 metadata, Azure Smooth ingest metadata | 1.2.2 |
| IF-1c | Configuration Parameters | See for example DASH-IF IOP v4.3 and LL-DASH extensions | 1.2.2 |
| IF-2 | Content Preparation | MPEG DASH, IOP v4.3 | 1.2.4 |
| IF-3 | Ad Avail Signalling | SCTE-214.X, CableLabs | 1.2.5 |
| IF-4a | Ad Decisioning Parameters | This specification | 1.2.6 |
| IF-4b | Ad Content Conditioning Parameters | This specification | 1.2.6 |
| IF-4c | Dynamic Ad Content Format | This specification | 1.2.6 |
| IF-4d | Ad Storage Format | This specification | 1.2.6 |
| IF-4e | Ad Selection Result | VAST/VMAP, SCTE-130 | 1.2.6 |
| IF-5 | MPD and Segments with Ad Placement | MPEG DASH, IOP v4.3, this specification | 1.2.7 |
| IF-6 | Ad Metadata Signalling | MPEG DASH, IOP v4.3 | 1.2.8 |
| IF-7 | Remote Resolution with Decisioning Parameters | MPEG DASH, IOP v4.3 | 1.2.9 |
| IF-8 | Ad Tracking and Measurement | VAST, Open Measurement SDK | 1.2.10 |
| IF-9 | Reference Media Playback and Content Decryption | HTML-5 video, MSE, EME, CTA WAVE Device Playback Platform | 1.2.3 |

 

1.2.      Interface definitions

1.2.1.            IF-0: ABR Stream Source

1.2.1.1            General

The formatting and delivery of media input to the ABR encoder is described by IF-0. The ad insertion architectures in this document are agnostic to the choice of this interface instantiation and as such information in this section shall be considered informational.

Example interface instantiations may differ depending on the type of media input being supplied to the architecture. For example, a VOD workflow may utilize a mezzanine delivery format such as the CableLabs Video-On-Demand Content Specification [93], while a LIVE workflow may utilize a contribution feed delivery format such as MPEG-2 TS [94] or RIST [99].

For any instantiation, it is usually beneficial for the media input to contain descriptive metadata about the media input such that the ABR encoder may provide conditioning of the encoded output and pass-through said information to components later in the ad insertion streaming architecture. As the format of descriptive metadata may be workflow specific, the examples provided below should be considered informational only.

1.2.1.2 LIVE Workflow Descriptive Metadata

In a LIVE workflow, the descriptive metadata may consist of program, segmentation, and splicing information; we will refer to this information as broadcast events. Examples of what broadcast events signal are program start/end, chapter start/end, interstitial, distributor start/end, provider break start/end, content identification, and many others. SCTE-35 and SCTE-104 are examples of standards for inserting such broadcast events aligned with the media presentation in IF-0. In Figure 3, we show a segmentation of a live input based on SCTE-104/35 [54].

Figure 3 shows a live broadcast with segmented broadcast information based on broadcast events. In this case broadcast events are used to segment and can optionally be used to signal ad breaks. Nevertheless, more information is carried about the broadcast streams. The placement opportunities are shown in green. For more information relating to the commands supported we refer to [54].

 

Figure 3: segmented live broadcast with broadcast events [54]

1.2.1.3            VOD Workflow Descriptive Metadata

In a VOD workflow, content is delivered to a service provider by a content provider as a package of various assets and metadata that make up the full description of the content. This package contains mezzanine assets that streamable assets may be produced from, but may also contain still image cover art, promotional assets, and preview trailer assets. The metadata provided alongside assets include basic information such as title, genre, and rating, but also includes advanced metadata such as chapter locations, distribution subscriber requirements, and distributor ad preservation requirements. One format of this package is described by the CableLabs Video-On-Demand Content Specification [93], which we defer to for further information about package structure and data.

1.2.1.4            Abstracted Model

Media provided in mezzanine or ingest is assumed to have a continuous media time, and the timestamp of the media carries through the ABR encoder for each media type as shown in Figure 2. In addition, splice points are defined, and at these splice points, at a specific media time tsplice, the ABR encoder is expected to prepare the content accordingly in order to permit splicing. The reason and details of each splice point and the conditions may be carried through but are irrelevant for the media preparation.

Figure 2 Abstracted Media Model with splice points.

1.2.2.            IF-1: Packager Ingest

The ABR encoder provides encoded variants of the media input and prepares CMAF-conforming headers, chunks and fragments as defined in ISO/IEC 23000-19 [99], organized in CMAF structures such as CMAF Tracks and Switching Sets. The content may also be provided together with an MPD that follows the DASH Profile for CMAF content as defined in ISO/IEC 23009-1 [101]. This reflects what is documented with IF-1a in Figure 1.

The CMAF-prepared content is assumed to be properly annotated through metadata. The metadata carries information that the DASH packager can use in its processing. This reflects what is documented with IF-1b in Figure 1. A recommended protocol for the combination of the two interfaces, IF-1a and IF-1b, is the DASH-IF Ingest Spec [92] (CMAF ingest interface).

In addition, the service follows certain service configuration options that are provided by external means. The configuration may include information such as the nominal CMAF fragment duration (DASH segment duration), CMAF chunk duration, number and bitrates in a CMAF Switching Set, codec configurations and media profiles, etc. This reflects what is documented with IF-1c in Figure 1.

The definition of interface IF-1 is outside the scope of this document, but in the following, several assumptions are made about the generated media provided to the DASH packager, predominantly that the encoder produces well-formed, CMAF-conforming content [99]. Note that these assumptions are not a requirement of this specification, but a service provider should understand the downstream system effects if the packager ingest does not follow them. For example, transcoding or timeline corrections may need to be done in the DASH packager to meet the output requirements of the following interfaces, or a specific addressing scheme may have to be used.

The following assumptions are made:

·         The ABR encoder produces continuous content with a single CMAF Header for each CMAF Track. There may be instances in which not all Tracks/Switching Sets are provided between two potential splice points at media times tsplice,i and tsplice,i+1. However, at least a minimum set of Switching Sets is always present.

·         For those CMAF tracks that are present for the entire program, the media time is continuous, also across splice points. This means that the subset of continuously present CMAF Switching sets of the entire program conforms to a CMAF presentation as defined in ISO/IEC 23000-19, clause 7.3.6.

o   Note: this assumption may be relaxed, but if it is, such a discontinuity needs to be signaled. Input on this subject would be welcome.

·         There are three options for content in between two potential splice points at media times tsplice,i and tsplice,i+1.

o   For Option 1 referred to as "Splice-Conditioned Packaging", the following holds:

§  The output of the ABR encoder in between two potential splice points at media times tsplice,i and tsplice,i+1 is CMAF conforming, i.e. it conforms to a CMAF presentation as defined in ISO/IEC 23000-19 [99], clause 7.3.6.

§  The first splice point at tsplice,i is the timeline origin of all CMAF tracks in the CMAF presentation as defined in ISO/IEC 23000-19 [99], clause 7.3.6.

Note: this does not imply that each splice point resets the timeline. Indeed, this would contradict the first assumption above that media is time-continuous.

§  The ABR encoder provides content that can be converted to conforming DASH content, for example consistent CMAF Fragment duration to enable proper usage of DASH Segment duration signaling, bitrate characteristics for signaling in the MPD, event messages, etc.

§  The ABR encoder creates a CMAF Fragment boundary for all CMAF Tracks at tsplice,i and resets the CMAF Fragment duration from here on.

Note: This permits Period boundary insertion at tsplice,i without modification of the CMAF content.


 

§  The ABR encoder creates media for all samples of all CMAF tracks in between tsplice,i and tsplice,i+1 with tsplice,i being included and tsplice,i+1 being excluded.

Note: This permits to create a Period that is fully covered by content.

§  The ABR encoder creates a CMAF Fragment boundary for all CMAF Tracks at tsplice,i+1 and resets the CMAF Fragment duration from here on.

Note: This permits Period boundary insertion at tsplice,i+1 without modification of the CMAF content.

o   For Option 2 referred to as "Splice-Conditioned Encoding", the following holds:

§  The ABR encoder creates a SAP type 1 or 2 at tsplice,i with TSAP set to tsplice,i. The placement of the SAP type 1 or 2 may not, and typically does not, coincide with a CMAF Fragment boundary.

§  The ABR encoder creates a SAP type 1 or 2 at tsplice,i+1 with TSAP set to tsplice,i+1. The placement of the SAP type 1 or 2 typically does not coincide with a CMAF Fragment boundary.

o   For Option 3 referred to as "Splice Point Signaling", no specific encoding and packaging is done at the splice points.

§  It may be the case that an exact alignment of a SAP type with the splice point is not possible, for example due to the codec or format properties. However, additional SAP types may be available, or the media can be accessed quickly by other means, for example by accelerated decoding.

·         The ABR encoder passes through timed metadata (from contribution/production feed IF-0) related to the provided descriptive metadata and content conditioning, including the signaling and timing of each splice point tsplice,i.

·         The content may be provided at once, for example as part of a VoD asset generation, or the content may be provided by the ABR encoder on a continuous timeline, for which real time and media time advance concurrently.

Note: Splice points are defined independently of whether the content is entered or exited at that point. Please provide feedback if this differentiation should be added in the final version of the document.

It may also be the case that within one content generation workflow, certain media encoding follows option 1 whereas others follow option 2 or option 3. For example, video may follow option 1, and audio may be encoded based on option 3.

The three options for encoder and packager configuration are shown in Figure 3. In option 1, CMAF Fragment boundaries are aligned with splice points, and in option 2, splice points may occur in the middle of a CMAF Fragment, but are supported by a SAP type 1/2 for random access. In option 3, no SAP type 1 or 2 is necessarily provided at the splice point.

Note: As an example, CMAF Fragment #3 in Option 1 may be shorter or even longer than CMAF Fragment #2 in order to align splice points with CMAF Fragment boundaries.

Figure 3 CMAF Encoder and Packager options
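As an informative illustration of Option 1 ("Splice-Conditioned Packaging"), the following Python sketch computes where CMAF Fragment boundaries would fall when the encoder forces a boundary at every splice point and restarts the nominal fragment duration from there. The function name, the 2 s nominal duration and the splice point at 5 s are assumptions chosen for the example; they are not defined by this document.

```python
from fractions import Fraction


def fragment_boundaries(start, end, nominal, splice_points):
    """Return CMAF Fragment start times in [start, end) for Option 1.

    Hypothetical helper: every splice point inside the interval forces a
    fragment boundary, and the nominal fragment grid restarts from it.
    """
    anchors = sorted({start, *[t for t in splice_points if start < t < end]})
    boundaries = []
    for i, anchor in enumerate(anchors):
        # The grid anchored here runs until the next splice point (or the end).
        limit = anchors[i + 1] if i + 1 < len(anchors) else end
        t = anchor
        while t < limit:
            boundaries.append(t)
            t += nominal
    return boundaries


# Assumed example values: 12 s of content, 2 s nominal fragments, splice at 5 s.
bounds = fragment_boundaries(Fraction(0), Fraction(12), Fraction(2), [Fraction(5)])
print([float(b) for b in bounds])
```

In this example the fragment that starts at 4 s lasts only 1 s, which is the shortened fragment that the note above describes. Under Option 2 or 3 the nominal grid would be left unchanged and the splice point would fall inside a fragment.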

1.2.3.            IF-9: Reference Media Playback and Decryption

Another important assumption in the context of this specification is the availability of a reference playback platform that a DASH client can use for media playback and decryption. Without limiting the usage of any DASH player, this assumption permits content to be authored such that platforms with certain restrictions can be used.

The DASH Client interacts with the media pipeline on the reference platform via the IF-9 interface. The definition of this interface is out of the scope of this document, but the general assumption of the DASH-IF IOP is an MSE [96] / EME [97] reference pipeline.

Furthermore, it is assumed, for interoperability and/or robustness of this interface, that the reference playback platform supports the playback requirements defined by the Consumer Technology Association Web Application Video Ecosystem Project (CTA WAVE) Device Playback specification [98].

Specifically, in the context of this specification, a playback platform is expected to support playback requirements as documented in clause 8 of CTA-WAVE 5003 [98] for any content conforming to a CMAF Switching Set according to CMAF media profile included in an MPD, namely

-    8.2 Sequential Track Playback

-    8.3 Random Access to Fragment

-    8.4 Random Access to Time

-    8.5 Switching Set Playback

-    8.8 Playback over WAVE Baseline Splice Constraints

-    8.13 Restricted Splicing of Encrypted Content

-    8.14 Sequential Playback of Encrypted and Non-Encrypted Baseline Content

If a playback platform is to consume content authored according to encoding and packaging option 2 or 3 ("Splice-Conditioned Encoding" and "Splice Point Signaling", respectively) as defined in clause 1.2.2, then for content conforming to a CMAF Switching Set according to a CMAF media profile included in an MPD, it is expected to additionally support the following playback requirements as documented in clause 8 of CTA-WAVE 5003 [98]:

-    8.9 Out-Of-Order Loading

-    8.10 Overlapping Fragments

Finally, it is assumed that the reference playback platform can be queried for its capabilities, such that MPD information can be transformed into capability queries, e.g. whether a codec is supported. Device capability queries are discussed in clause 6.4 of CTA-WAVE 5003 [98].

Note: More details on device capability requirements are expected in an updated version.

1.2.4.            IF-2: Content Preparation

The format and requirements of the DASH manifests and segments output by the DASH Packager / MPD Generator for use later in the ad insertion architecture are described by IF-2. The DASH IOP Guidelines provide the general normative requirements on the DASH output, and we will assume those as a baseline set of requirements. Here we provide further normative requirements for the ad insertion architectures.

Generally, for each known ad splice point, the DASH Packager/MPD Generator should insert a Period boundary.

The recommendation of Period boundary generation at splice points within the DASH Packager / MPD Generator is made such that the downstream Ad Insertion MPD Manipulator can perform replacements and insertions on the MPD level only, without accessing the content segments. Should the DASH Packager / MPD Generator not be aware of which splice points are appropriate for ad insertion, the Period boundaries may be omitted and instead be created by the downstream Ad Insertion MPD Manipulator. Further details of this operation are provided as part of IF-5.

In the following, it is assumed that at least one media type (typically video) follows the content generation according to clause 1.2.2, option 1 ("Splice-Conditioned Packaging"), and furthermore it is assumed that:

·         Each CMAF fragment generates one DASH Segment

·         The content is provided in a live session, i.e. CMAF fragments are made available to the DASH packager once completed.

o   NOTE: Low-Latency operation will be added in the final version of the document in alignment with the Low-Latency DASH extensions.

·         The minimum splice point advance notice time is known, i.e. the DASH packager gets a pre-notification or ad avail for a splice point that will be added to the media. This allows the DASH Packager to configure the minimum update period of the MPD properly, so that the DASH client or MPD proxy requests the MPD at a high enough frequency that none of the announced Periods in the MPD are missed. For example, in SCTE-35, it is recommended to provide an advance notice of at least 4s [54].

o   NOTE: please provide feedback on practicability and other examples during community review phase. This is discussed in https://github.com/Dash-Industry-Forum/AdInsertion/issues/46, please join the discussion.

Under these assumptions, a DASH packager produces content by (i) generating an initial MPD, and (ii) operating dynamically, including MPD processing/updates and Segment offering.

The initial MPD is generated as follows:

·         The CMAF data is mapped to the MPD using the DASH profile for CMAF content as defined in ISO/IEC 23009-1 [101].

·         For every CMAF Switching Set that is known to be offered in the MPD, an Initialization Set as defined in ISO/IEC 23009-1 [101], clause 5.12 should be added that describes all known static parameters for the CMAF Switching Set, preferably based on the information in the CMAF Master Header (i.e. a CMAF Header that is sufficient to initialize the media pipeline for continuous playback, see CTA WAVE 5003 [98] for details) for this CMAF Switching Set.

o   Every Initialization Set gets assigned a unique id.

o   For every CMAF Switching Set that is not known to be offered on a continuous basis, the @inAllPeriods of the Initialization Set is set to false.

o   For every CMAF Switching Set that is known to be offered on a continuous basis, the @inAllPeriods of the Initialization Set is set to true or the attribute is omitted.

·         The MPD@availabilityStartTime is set to an arbitrary value.

·         The @minimumUpdatePeriod is set sufficiently small such that DASH clients and MPD proxies do not miss Periods created for announced splice points, taking into account the minimum splice point advance notice time.

·         The initial MPD follows the Main content in Table 1.
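As an informative illustration of the @minimumUpdatePeriod rule above, the following Python sketch derives an update period from the minimum splice point advance notice time. The safety margin (covering MPD generation and fetch delay) and both function names are assumptions made for this example; the document only requires that the value be sufficiently small.

```python
def minimum_update_period(advance_notice_s, margin_s=1.0):
    """Return an update period in seconds that stays below the advance notice.

    Hypothetical rule of thumb: subtract an assumed safety margin so that a
    client polling at this period still sees the announced Period in time.
    """
    period = advance_notice_s - margin_s
    if period <= 0:
        raise ValueError("advance notice must exceed the safety margin")
    return period


def to_xs_duration(seconds):
    """Format a short duration in seconds as an xs:duration string."""
    return f"PT{seconds:g}S"


# Assumed example: the 4 s advance notice mentioned for SCTE-35, 1 s margin.
mup = minimum_update_period(advance_notice_s=4.0, margin_s=1.0)
print(to_xs_duration(mup))
```

A deployment would pick the margin from its own measurements of publication and CDN delay; the value used here is only a placeholder.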

For every splice point i at time tsplice,i, a new Period is generated as follows:

·         If it is the first Period in the presentation and the media is "starting" to be produced, then

o   @start of the Period is set to NOW - @availabilityStartTime with some possible margins to address different Segment availability times, for example due to publication delay on a CDN.

·         If it is not the first Period in the presentation

o   @start of the Period is set to the sum of the value of Period@start of the previous period and the interval between the two splice points (tsplice,i - tsplice,i-1)

o   Period continuity is signaled across all Adaptation Sets that are continuing across the Period boundary. Preferably the same signaling and track structure is used.

·         Every available CMAF Switching Set in the CMAF Presentation is mapped to one Adaptation Set using the DASH profile for CMAF content as defined in ISO/IEC 23009-1 [101]. Within one Adaptation Set the following parameters are set

o   The @timescale attribute is set to the timescale of the CMAF Track

o   The @presentationTimeOffset is set to tsplice,i normalized by the timescale.

o   The @eptDelta and, if applicable, the @startNumber or SegmentTimeline are set to indicate the placement of the first Segment in the Period. Note that if the content follows option 1, then @eptDelta is 0 and can be absent.

o   Period continuity is signaled to indicate which Adaptation Set continuously follows the previous one.

o   If the CMAF switching set is identical to one for which an Initialization Set was set, then all parameters from the Initialization Set are copied into this Adaptation Set and the @initializationRefId is set to the one referred to.
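As an informative illustration of the Period generation rules above, the following Python sketch computes Period@start and @presentationTimeOffset for a sequence of splice points. The splice times, the 90 kHz timescale, the start value of 0 for the first Period and the function name are assumptions chosen for the example.

```python
from fractions import Fraction


def period_timing(splice_times, first_start, timescale):
    """Return one (Period@start in seconds, @presentationTimeOffset in ticks)
    pair per splice point. Hypothetical helper for illustration only."""
    rows = []
    start = Fraction(first_start)
    for i, t_splice in enumerate(splice_times):
        if i > 0:
            # Period@start advances by the media-time interval between
            # the current and the previous splice point.
            start += Fraction(t_splice) - Fraction(splice_times[i - 1])
        # @presentationTimeOffset is the splice media time in timescale units.
        pto = Fraction(t_splice) * timescale
        rows.append((start, pto))
    return rows


# Assumed example: splice points at media times 1000 s, 1030 s and 1045 s.
rows = period_timing([1000, 1030, 1045], first_start=0, timescale=90000)
for start, pto in rows:
    print(f"Period@start=PT{float(start):g}S presentationTimeOffset={int(pto)}")
```

In a live session the first Period@start would be derived from NOW - @availabilityStartTime as described above, rather than set to 0.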

Note 1: Instead of assumptions, these statements may be changed into requirements. These requirements may then be signalled with a specific profile.

Note 2: This operation does not consider aspects such as inconsistent/variable segment durations within a CMAF Presentation, upstream losses or errors, etc. Any of such occurrences may result in additional Periods that may be added according to the DASH-IF IOP guidelines.

The mapping is shown in Figure 4.

Figure 4 CMAF Fragment to DASH Mapping for Option 1 and 3

NOTE 3: Signaling of content encoding options 1 and 3 is for further study. Examples are how to signal @eptDelta, @presentationTimeOffset, etc.

Table 1 defines the Main live content MPD. More details need to be added.

 

Table 1 DASH-IF Main live content MPD

| Element or Attribute Name | Use | Description |
|---|---|---|
| MPD | | Provides the requirements for DASH-IF main content. Any not specified value is identical to what is provided in ISO/IEC 23009-1 [10x], clause 5.3.1. |
|   ServiceDescription | 0 … N | |
|     Latency@TargetLatency | O | A target latency may be provided |
|   @profiles | M | should include a profile indicator signaling http://dashif.org/guidelines/dashif-main-live-content (hopefully v5 defines this) and should include a profile identifier for the DASH CMAF profile "urn:mpeg:dash:profile:cmaf:2019" if all content is encoded following option 1. |
|   @type | M | Shall be set to dynamic |
|   @minBufferTime | M | Shall be present |
|   @suggestedPresentationDelay | R | Shall not be present |
|   @maxSegmentDuration | R | Shall not be present |
|   InitializationSet | 1 … N | At least one shall be present |
|   ProgramInformation | 0 … N | This should be used to describe information about the main content. More details may be provided |
|   Period | 1 … N | One or more Periods shall be present. Provides the requirements for main live content. Any not specified value is identical to what is provided in ISO/IEC 23009-1, clause 5.3.2. |
|     @xlink:href | R | Shall be absent. |
|     @xlink:actuate | R | Shall be absent. |
|     @start | M | Shall be present. |
|     @duration | R | Shall not be present |
|     BaseURL | 0 | Shall not be present |
|     EventStream | 0 … N | specifies an event stream. <What can we say about this? Event Streams that go across Periods?> This is discussed in https://github.com/Dash-Industry-Forum/AdInsertion/issues/45, please join the discussion. |
|     AdaptationSet | 1 … N | At least one Adaptation Set shall be present. |
|       @xlink:href | R | Shall be absent |
|       @xlink:actuate | R | Shall be absent |
|       SegmentBase@presentationTimeOffset | OD (default: 0) | shall be set to the correct value of the presentation time of the Adaptation Set at the start of the Period, if the presentation time is not equal to 0. |
|       @contentType | M | Shall be present |
|       SegmentList | 0 | Shall be absent |
|       Representation | 1 … N | specifies a Representation. At least one Representation element shall be present in each Adaptation Set. Any not specified value is identical to what is provided in ISO/IEC 23009-1, clause 5.3.3. |
|     Subset | 0 | Shall be absent. |
|     EmptyAdaptationSet | 0 | Shall be absent |
|   UTCTiming | 1 … N | At least one shall be present |

Key

For attributes: M=mandatory, O=optional, R=removed

For elements: <minOccurs>…<maxOccurs> (N=unbounded)

Elements are bold; attributes are non-bold and preceded with an @.

 

1.2.5.            IF-3: Ad Avail Signalling

1.2.5.1            Introduction

Opportunity metadata is made up of the original descriptive metadata of the input media related to signalling of ad opportunities and the content segmentation information generated by the DASH Packager. Carriage of opportunity metadata in the presentation output by the DASH Packager / MPD Generator is done via IF-3.

The following normative statements on opportunity metadata carriage are made:

·         Opportunity metadata shall be carried through DASH MPD Events

The requirement of MPD Events over other carriage mechanisms is made such that the downstream Ad Insertion MPD Manipulator can perform insertions without accessing the content segments.

Note: Please check the above requirement during community review and comment if this overconstrains deployments.

While the carriage method is considered normative, the format of the metadata is workflow dependent. Examples of known schemes are provided in the subsequent sub-sections of this interface. For the purposes of this document, we will consider the carriage of opportunity metadata via SCTE-35 signalling sufficient, but other methods of equivalent means may be used.

1.2.5.2            Opportunity Signalling via SCTE-35

SCTE-35 describes a set of command messages that can be utilized to describe ad opportunities within a presentation. Typically, the broadcast events for a LIVE presentation are already signalled as SCTE-35 commands which may be directly used, but a VOD workflow may optionally synthesize a series of SCTE-35 commands to describe the conditioning and opportunities in a VOD presentation as well.

The SCTE 214 specification defines a set of event schemes for carrying SCTE-35. The appropriate event scheme to use depends on the utilized DASH mechanism; DASH MPD Events may use either urn:scte:scte35:2013:xml or urn:scte:scte35:2014:xml+bin.

An example of carrying SCTE-35 with MPD Events using the urn:scte:scte35:2014:xml+bin scheme is shown in Table 2. The identifier and timing described within the SCTE-35 payload provide the event properties @id, @presentationTime, and @duration. The @id may be used to filter out duplicate events. In this example, the payload is encoded using Base64 and enclosed in the <Binary> tag per SCTE 214.

Table 2 Example of a SCTE-35 message embedded as an MPD event using SCTE 214

<EventStream
  schemeIdUri="urn:scte:scte35:2014:xml+bin"
  timescale="1">
  <Event
    presentationTime="1540809120"
    duration="24"
    id="1999">
    <Signal xmlns="http://www.scte.org/schemas/35/2016">
      <Binary>/DAhAAAAAAAAAP/wEAUAAAfPf+9/fgAg9YDAAAAAAAA/APOv</Binary>
    </Signal>
  </Event>
</EventStream>
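As an informative illustration, the following Python sketch shows how a downstream component such as the Ad Insertion MPD Manipulator might read the MPD Event of Table 2 without touching media segments. It parses the EventStream, Base64-decodes the <Binary> payload and reads a few header fields. The byte offsets reflect our reading of the SCTE-35 splice_info_section layout for an unencrypted section and should be checked against [54]; the function name is hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# The EventStream from Table 2, reproduced here so the sketch is self-contained.
MPD_EVENT = """<EventStream schemeIdUri="urn:scte:scte35:2014:xml+bin" timescale="1">
  <Event presentationTime="1540809120" duration="24" id="1999">
    <Signal xmlns="http://www.scte.org/schemas/35/2016">
      <Binary>/DAhAAAAAAAAAP/wEAUAAAfPf+9/fgAg9YDAAAAAAAA/APOv</Binary>
    </Signal>
  </Event>
</EventStream>"""

SCTE35_NS = {"scte35": "http://www.scte.org/schemas/35/2016"}


def read_splice_info(event):
    """Decode the Base64 payload of one Event and read selected header fields.

    Assumed offsets: table_id at byte 0, section_length in bytes 1-2,
    splice_command_type at byte 13, command body from byte 14.
    """
    binary = event.find("scte35:Signal/scte35:Binary", SCTE35_NS)
    section = base64.b64decode(binary.text.strip())
    info = {
        "table_id": section[0],
        "section_length": ((section[1] & 0x0F) << 8) | section[2],
        "splice_command_type": section[13],
    }
    if info["splice_command_type"] == 0x05:
        # For splice_insert, the command body starts with a 32-bit event id.
        info["splice_event_id"] = int.from_bytes(section[14:18], "big")
    return info


stream = ET.fromstring(MPD_EVENT)
infos = [read_splice_info(event) for event in stream.findall("Event")]
for event, info in zip(stream.findall("Event"), infos):
    print(event.get("id"), event.get("presentationTime"), info)
```

In this example the decoded splice event identifier can be compared with the Event@id attribute, which is one way to detect duplicate or mismatched events.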

 

1.2.6.            IF-4: Ad Decisioning and Exchange Interfaces

1.2.6.1.       Introduction

Information about ad content to insert into a presentation is retrieved from the Ad Decision and Ad Content server(s) via the IF-4 interfaces. The request from the Ad Insertion MPD Manipulator for ad content provides all the information needed to perform ad decisioning, including content metadata and opportunity descriptions. The response is then translated by the Ad Insertion MPD Manipulator into the DASH structures detailed in IF-5.

There are many details corresponding to the functions of ad requests and ad decisioning. As such, this document identifies five sub-interfaces that outline the primary interactions of ad requests and decisioning.

The identified 5 sub-interfaces are:

-       Interface IF-4a: Ad Decision request parameters. For details see 1.2.6.2.

-       Interface IF-4b: Content Conditioning request parameters. For details see 1.2.6.3.

-       Interface IF-4c: Recommended Dynamic Ad Content response format. For details see 1.2.6.7.

-       Interface IF-4d: Recommended Ad Content Storage format. For details see 1.2.6.6.

-       Interface IF-4e: Ad Selection Result format. For details see 1.2.6.5.

1.2.6.2.       IF-4a: Ad Decision request parameters

1.2.6.2.1.        Decisioning Parameters

A decisioning parameter is a piece of information about the content stream, consumption medium, or end user that is used by the Ad Decisioning Server as part of the advertisement qualification and selection process. The Ad Insertion MPD Manipulator collects and sends this information to the Ad Decisioning Server as part of IF-4a.

The transmission of decisioning parameters is highly integration dependent, but examples of commonly used industry parameters are:

·         Content Unique Identifier

·         Content Genre

·         Content Language

·         Service Provider Identifier

·         Device Type (TV, SetTop, Mobile, Computer, etc)

·         Device Manufacturer

·         Device Model

·         End User IP Address

·         End User Zip Code
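As an informative illustration, the following Python sketch shows one way the Ad Insertion MPD Manipulator might serialize decisioning parameters into an IF-4a request. Since the query scheme is not yet defined by this document, the endpoint URL, the parameter names and the function name are all hypothetical placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_decision_request(base_url, params):
    """Append the known decisioning parameters to a base URL as a query string.

    Parameters whose value is None (unknown for this session) are omitted.
    """
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}?{query}"


# Hypothetical endpoint and parameter names, for illustration only.
request_url = build_decision_request(
    "https://ads.example.com/decision",
    {
        "content_id": "show-1234",
        "genre": "sports",
        "language": "en",
        "device_type": "TV",
        "zip": None,
    },
)
print(request_url)
```

Because transmission of these parameters is integration dependent, an actual deployment would follow the parameter names required by its Ad Decisioning Server.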

1.2.6.2.2.        Decisioning Modes

The decisioning mode of an Ad Decisioning Server dictates how the server chooses to fulfill ad requests made by a caller. The Ad Insertion MPD Manipulator must specify the decisioning mode for the Ad Decisioning Server to use via IF-4a based on the implemented ad insertion architecture. There are two general modes of ad decisioning that the SSAI and SGAI architectures respectively enable: stream level decisioning and pod level decisioning.

With stream level decisioning, all advertisement opportunities are decided prior to the DASH client receiving the stream. An SSAI architecture accomplishes this by having the Ad Insertion MPD Manipulator send the IF-3 supplied opportunity metadata to the Ad Decision server via IF-4a. The result of the ad decision request will contain advertisements for the entirety of the stream, which the Ad Insertion MPD Manipulator transforms into an IF-5 manifest with a mixture of content and advertisements.

After a DASH client receives a stream produced from an SSAI architecture, the stream will remain fixed for the duration of the playback session, e.g. the same advertisements will play again should the user choose to rewind the stream.

With pod level decisioning, advertisement opportunities are decided just as the DASH client reaches the opportunity within the stream. An SGAI architecture accomplishes this by having the Ad Insertion MPD Manipulator use the IF-3 supplied opportunity metadata to generate an IF-5 manifest with a mixture of content and remote entities that represent opportunities. As the client reaches remote entities during playout, the client utilizes IF-7 to return the opportunity metadata to the Ad Insertion MPD Manipulator, which then sends the data to the Ad Decision server via IF-4a. The result of the ad decision request will contain advertisements for this single opportunity, which the Ad Insertion MPD Manipulator transforms into an IF-7 response for the client to consume.

After a DASH client receives a stream produced from an SGAI architecture, the stream can continue to change for the duration of the playback session, e.g. the advertisements can be re-decisioned should the user choose to rewind the stream.
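
The difference between the two decisioning modes can be illustrated with a minimal Python sketch. It is not a normative procedure: the function names, the "opp:" and "remote:" markers and the ad identifiers are all hypothetical, and decide() merely stands in for an IF-4a request to the Ad Decision Server.

```python
from itertools import count
from typing import List

_decision_counter = count(1)  # makes every decision return a different ad


def decide(opportunity_id: str) -> str:
    """Hypothetical stand-in for an IF-4a ad decision request."""
    return f"ad-{opportunity_id}-{next(_decision_counter)}"


def build_stream_level(timeline: List[str]) -> List[str]:
    """SSAI: every opportunity is decided before the client gets the stream."""
    return [decide(item[4:]) if item.startswith("opp:") else item
            for item in timeline]


def build_pod_level(timeline: List[str]) -> List[str]:
    """SGAI: opportunities stay in the manifest as remote placeholders."""
    return [f"remote:{item[4:]}" if item.startswith("opp:") else item
            for item in timeline]


def resolve_placeholder(entry: str) -> str:
    """SGAI: invoked (via IF-7) each time the client reaches a placeholder."""
    return decide(entry[len("remote:"):])


timeline = ["content-1", "opp:break1", "content-2"]

ssai_manifest = build_stream_level(timeline)   # ads fixed for the session
sgai_manifest = build_pod_level(timeline)      # ads decided during playout

first_play = resolve_placeholder(sgai_manifest[1])
after_rewind = resolve_placeholder(sgai_manifest[1])  # re-decisioned
```

In this sketch the SSAI manifest never changes once built, whereas resolving the same SGAI placeholder twice yields two independent decisions, which mirrors the rewind behaviour described above.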

1.2.6.3.       IF-4b: Content Conditioning request parameters

A conditioning parameter is a piece of information about the encoding/packaging of the content stream or a client player capability that is used by the Ad Content Server to ensure an ad creative is compatibly encoded for inclusion in the generated presentation. The Ad Insertion MPD Manipulator collects and sends this information to the Ad Content Server as part of IF-4b.

The transmission of conditioning parameters is highly integration dependent, but examples of commonly used industry parameters are:

·         Video / Audio Codecs

·         Player Splice Condition Robustness

·         Encryption Schemes
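
As an illustration only, the parameters listed above could be carried as URL query parameters on the IF-4b request. This document does not yet define a query scheme, so the parameter names ("codecs", "splice", "enc"), the value "period-continuity" and the server URL in the following Python sketch are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from typing import List


def build_conditioning_request(base_url: str,
                               codecs: List[str],
                               splice_robustness: str,
                               encryption_schemes: List[str]) -> str:
    """Append hypothetical conditioning parameters to an ad content URL."""
    query = urlencode({
        "codecs": ",".join(codecs),            # hypothetical parameter name
        "splice": splice_robustness,           # hypothetical parameter name
        "enc": ",".join(encryption_schemes),   # hypothetical parameter name
    })
    return f"{base_url}?{query}"


url = build_conditioning_request(
    "https://ads.example.com/creative/1234",   # hypothetical Ad Content Server
    codecs=["avc1.64001f", "mp4a.40.2"],
    splice_robustness="period-continuity",     # hypothetical value
    encryption_schemes=["cenc", "cbcs"],
)

# The Ad Content Server side would decode the same parameters again.
params = parse_qs(urlsplit(url).query)
```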

1.2.6.4.       DASH-IF Query scheme for Decision (IF-4a) and Conditioning Parameters (IF-4b)

Editor's Note: <this will be normative, but optional/recommended. Expected to be identical/similar to IF-7, so a reference from here to IF-7 is expected>

Note: The final version is expected to include well-defined query parameters. Inputs are welcome. This is discussed in https://github.com/Dash-Industry-Forum/AdInsertion/issues/47, please join the discussion.

1.2.6.5.       IF-4e: Ad Selection Result format

1.2.6.5.1.        Overview

The response of the Ad Decision Server identifies the advertisements decisioned by the server and provides information associated with the advertisement such as general metadata, viewability requirements, media files, mezzanines, and tracking events. The actual ad content is provided by the Ad Content Server, preferably following the DASH-IF Ad Content format as defined in clause 1.2.6.6. Depending on the decisioning mode, the decision response may optionally contain the placement and ordering of advertisements as well.

While the general information carried in the response is described above, the explicit format of this response is workflow dependent. Examples of known industry formats are given in the subsequent sub-sections of this interface. For the purposes of this document we will assume VAST/VMAP is used, but other formats with equivalent data communication may be used.

1.2.6.5.2.        IAB VAST and VMAP

An instantiation of IF-4e is standardized by the Interactive Advertising Bureau (IAB) as the Digital Video Ad Serving Template (VAST) [53] and Video Multiple Ad Playlist (VMAP) [52] specifications.

The VAST specification provides structure definitions for representing a variety of ad types, including linear, non-linear, and companion. A single VAST response may contain a stand-alone ad slot or a whole pod of ad slots, and each ad structure can provide general metadata, viewability requirements, media files, mezzanines, and tracking events.

The VMAP specification is a complement to the VAST specification as it describes a playlist structure that wraps one or more VAST ad responses to provide ad decisions for an entire stream. A VMAP response will provide the order and position that ad pods should occur in the content stream and may also provide additional tracking events for pod level tracking.
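
The relationship between VMAP ad breaks and the VAST ads they wrap can be sketched as follows. This Python example is a simplified illustration, not a complete VMAP or VAST parser: the sample document is invented, and its element and attribute names reflect our reading of the IAB schemas and should be verified against [52] and [53].

```python
import xml.etree.ElementTree as ET

VMAP_NS = {"vmap": "http://www.iab.net/videosuite/vmap"}

# Invented sample: a pre-roll with one ad and a mid-roll pod with two ads.
SAMPLE_VMAP = """
<vmap:VMAP xmlns:vmap="http://www.iab.net/videosuite/vmap" version="1.0">
  <vmap:AdBreak timeOffset="start" breakType="linear" breakId="preroll">
    <vmap:AdSource id="src-1"><vmap:VASTAdData><VAST version="3.0">
      <Ad id="ad-1"><InLine><Creatives><Creative><Linear>
        <Duration>00:00:15</Duration>
      </Linear></Creative></Creatives></InLine></Ad>
    </VAST></vmap:VASTAdData></vmap:AdSource>
  </vmap:AdBreak>
  <vmap:AdBreak timeOffset="00:10:00.000" breakType="linear" breakId="mid-1">
    <vmap:AdSource id="src-2"><vmap:VASTAdData><VAST version="3.0">
      <Ad id="ad-2" sequence="1"><InLine><Creatives><Creative><Linear>
        <Duration>00:00:30</Duration>
      </Linear></Creative></Creatives></InLine></Ad>
      <Ad id="ad-3" sequence="2"><InLine><Creatives><Creative><Linear>
        <Duration>00:00:15</Duration>
      </Linear></Creative></Creatives></InLine></Ad>
    </VAST></vmap:VASTAdData></vmap:AdSource>
  </vmap:AdBreak>
</vmap:VMAP>
"""


def list_ad_breaks(vmap_xml: str) -> list:
    """Return (breakId, timeOffset, [(ad id, duration), ...]) per ad break."""
    root = ET.fromstring(vmap_xml)
    breaks = []
    for ad_break in root.findall("vmap:AdBreak", VMAP_NS):
        # VAST elements carry no namespace, so plain names are used here.
        ads = [(ad.get("id"), ad.findtext(".//Duration"))
               for ad in ad_break.findall(".//Ad")]
        breaks.append((ad_break.get("breakId"), ad_break.get("timeOffset"), ads))
    return breaks


ad_breaks = list_ad_breaks(SAMPLE_VMAP)
```

The VMAP layer supplies the position of each break in the content stream, while the embedded VAST documents supply the ads of the pod and their durations.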

1.2.6.5.3.        SCTE-130

Another instantiation of IF-4e is standardized by the Society of Cable Telecommunications Engineers (SCTE) as the response of the Ad Decision Service (ADS). The ADS is responsible for determining how advertising content is combined with non-advertising content. The exact format and schema of the ADS response is normatively defined within SCTE-130 Part 3 [103], which we defer to for further information.

1.2.6.6.       IF-4d: Recommended Ad Content Storage format

This interface provides a recommended content format for ad content that is expected to be dynamically inserted into a DASH live or on demand Media Presentation.

Ad content is recommended to follow the DASH-IF Ad content format as defined in the following. This specification does not exclude the use of other content, but the content author should be aware of any differences to the DASH-IF Ad Content format.

DASH-IF Ad Content follows the restrictions and requirements of this specification and may be produced independently of the main content for insertion into well-formatted main content by simple MPD manipulation processes.

If content is offered conforming to the DASH-IF Ad Content format and follows the requirements and recommendations below, then it may be annotated with the @profiles parameter "http://dashif.org/guidelines/dashif-ad-content".

The following requirements for DASH-IF Ad Content apply:

-          The content shall be provided as a DASH Media Presentation, i.e. a complete MPD with referenced Segments, and shall follow the semantics in Table 2.

-          The DASH Media Presentation shall conform to the DASH profile for CMAF content as defined in ISO/IEC 23009-1 [101].

Note: An important assumption for the above profile is the availability of content for CMAF Tracks over the entire Period. Content may be overlapping at the start of the Period or at the end of the Period.

-          The DASH Media Presentation shall contain exactly one Period.

-          The MPD@type shall be set to 'static'.

NOTE: Please provide comments if additional restrictions and requirements would be considered useful. Please join the discussion here: https://github.com/Dash-Industry-Forum/AdInsertion/issues/48.
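
A packaging tool could verify the MPD-level requirements above with a check along the following lines. This Python sketch is illustrative only: the function name and the sample MPDs are hypothetical, the CMAF profile identifier is the one given in Table 2, and a real conformance check would cover considerably more than these three rules.

```python
import xml.etree.ElementTree as ET

DASH_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}
CMAF_PROFILE = "urn:mpeg:dash:profile:cmaf:2019"  # identifier from Table 2


def check_ad_mpd(mpd_xml: str) -> list:
    """Return the violated requirements (an empty list if none were found)."""
    root = ET.fromstring(mpd_xml)
    problems = []
    if root.get("type") != "static":
        problems.append("MPD@type shall be 'static'")
    if len(root.findall("mpd:Period", DASH_NS)) != 1:
        problems.append("exactly one Period shall be present")
    # MPD@profiles is a comma-separated list of profile identifiers.
    profiles = [p.strip() for p in root.get("profiles", "").split(",")]
    if CMAF_PROFILE not in profiles:
        problems.append("@profiles shall signal the DASH CMAF profile")
    return problems


GOOD_MPD = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static" minBufferTime="PT2S"
     profiles="urn:mpeg:dash:profile:cmaf:2019,http://dashif.org/guidelines/dashif-ad-content">
  <Period duration="PT15S"/>
</MPD>
"""

BAD_MPD = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic"
     profiles="urn:mpeg:dash:profile:isoff-live:2011">
  <Period id="1"/>
  <Period id="2"/>
</MPD>
"""

good_result = check_ad_mpd(GOOD_MPD)
bad_result = check_ad_mpd(BAD_MPD)
```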

The following recommendations for DASH-IF Ad Content apply:

-          The MPD should contain a profile indicator signaling "http://dashif.org/guidelines/dashif-ad-content"

-          The content should be offered using the Segment timeline.

-          The Segment durations within one Adaptation Set should be approximately identical.

-          The content should typically include multiple variants of the same ad, for example different codecs, formats and resolutions, so that the Dynamic Conditioning, the MPD proxy or a DASH client can adjust the ad to the current playback conditions.

NOTE: Please provide comments if additional recommendations would be considered useful or if any of the recommendations should be removed or made a requirement. Examples are guidelines for exact duration signaling or encrypted content. Please join the discussion here: https://github.com/Dash-Industry-Forum/AdInsertion/issues/48.
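
The recommendation on Segment durations can be checked directly from a Segment timeline. The following Python sketch is one possible interpretation: the 10% tolerance and the choice to ignore the final (often shorter) Segment are assumptions of this example, not values defined by this document, and the sample Adaptation Set is invented.

```python
import xml.etree.ElementTree as ET
from statistics import median

DASH_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

ADAPTATION_SET = """
<AdaptationSet xmlns="urn:mpeg:dash:schema:mpd:2011" contentType="video">
  <SegmentTemplate timescale="90000" media="v-$Time$.cmfv"
                   initialization="v-init.cmfv">
    <SegmentTimeline>
      <S t="0" d="180000" r="6"/>
      <S d="90000"/>
    </SegmentTimeline>
  </SegmentTemplate>
</AdaptationSet>
"""


def segment_durations(adaptation_set_xml: str) -> list:
    """Expand the S elements of a SegmentTimeline into a list of durations.

    A negative @r (repeat until the end of the Period) is not handled here.
    """
    root = ET.fromstring(adaptation_set_xml)
    durations = []
    for s in root.findall(".//mpd:SegmentTimeline/mpd:S", DASH_NS):
        repeat = int(s.get("r", "0"))
        durations.extend([int(s.get("d"))] * (repeat + 1))
    return durations


def roughly_equal(durations: list, tolerance: float = 0.1,
                  ignore_last: bool = True) -> bool:
    """True if all considered durations lie within tolerance of the median."""
    body = durations[:-1] if ignore_last and len(durations) > 1 else durations
    reference = median(body)
    return all(abs(d - reference) <= tolerance * reference for d in body)


durations = segment_durations(ADAPTATION_SET)
consistent = roughly_equal(durations)
strict = roughly_equal(durations, ignore_last=False)
```

With the sample above, the seven 2-second Segments pass the check, while including the final 1-second Segment in the comparison makes it fail.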

Table 2 DASH-IF Ad content MPD

| Element or Attribute Name | Use | Description |
|---|---|---|
| MPD | | Provides the requirements for ad insertion content. Any value not specified is identical to what is provided in ISO/IEC 23009-1 [10x], clause 5.3.1. |
|   @profiles | M | Should include a profile indicator signaling "http://dashif.org/guidelines/dashif-ad-content" and shall include the profile identifier for the DASH CMAF profile, "urn:mpeg:dash:profile:cmaf:2019". This also means that the content follows the CMAF Profile. |
|   @type | M | Shall be set to static. |
|   @mediaPresentationDuration | R | Shall not be present. |
|   @minimumUpdatePeriod | R | Shall not be present, implied by type static. |
|   @minBufferTime | M | Shall be present. |
|   @timeShiftBufferDepth | R | Shall not be present. |
|   @suggestedPresentationDelay | R | Shall not be present. |
|   @maxSegmentDuration | R | Shall not be present. |
|   @maxSubsegmentDuration | R | Shall not be present. |
|   ProgramInformation | 0…N | This should be used to describe information about the ad. More details may be provided. |
|   BaseURL | 0 | Shall not be present. If a Base URL is present, then it is part of the Period. |
|   Period | 1 | Exactly one Period shall be present. Provides the requirements for ad insertion content. Any value not specified is identical to what is provided in ISO/IEC 23009-1, clause 5.3.2. |
|     @xlink:href | R | Shall be absent. |
|     @xlink:actuate | R | Shall be absent. |
|     @start | R | Shall be absent, i.e. assumed to be 0. |
|     @duration | O | Is set to the duration of the ad content. |
|     BaseURL | 1…N | At least one shall be present and refer to the BaseURL of the ad content. |
|     EventStream | 0…N | Event Streams are permitted in ad content, for example for beaconing. Note: more details need to be added on specific types. Please join the discussion here: https://github.com/Dash-Industry-Forum/AdInsertion/issues/49 |
|     AdaptationSet | 1…N | At least one Adaptation Set shall be present. |
|       @xlink:href | R | Shall be absent. |
|       @xlink:actuate | R | Shall be absent. |
|       InbandEventStream | 0…N | Inband Event Streams are permitted in ad content, for example for beaconing. Note: more details need to be added on specific types. Please join the discussion here: https://github.com/Dash-Industry-Forum/AdInsertion/issues/49. |
|       CommonAttributesElements | | Specifies the common attributes and elements (attributes and elements from base type RepresentationBaseType). For details, see subclause |
|       SegmentBase@presentationTimeOffset | OD (default: 0) | Shall be set to the correct value of the presentation time of the Adaptation Set at the start of the Period, if the presentation time is not equal to 0. <we make a recommendation that we make it zero, ask for community review.> |
|       @contentType | M | Shall be present. |
|       SegmentList | 0 | Shall be absent. |
|       Representation | 1…N | Specifies a Representation. At least one Representation element shall be present in each Adaptation Set. Any value not specified is identical to what is provided in ISO/IEC 23009-1, clause 5.3.3. |
|     Subset | 0 | Shall be absent. |
|     EmptyAdaptationSet | 0 | Shall be absent. |
|   UTCTiming | 0 | Shall not be present. |
|   LeapSecondInformation | 0 | Shall not be present. |

Key

For attributes: M=mandatory, O=optional, R=removed

For elements: <minOccurs>…<maxOccurs> (N=unbounded)

Elements are bold; attributes are non-bold and preceded with an @.
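
The presence and absence rules of Table 2 lend themselves to a mechanical check. The following Python sketch covers a subset of the table and is illustrative only: the function name, the sample MPD and its URLs are hypothetical, and the check does not validate attribute values or the Segments themselves.

```python
import xml.etree.ElementTree as ET

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}
XLINK = "{http://www.w3.org/1999/xlink}"
REMOVED_MPD_ATTRS = ("mediaPresentationDuration", "minimumUpdatePeriod",
                     "timeShiftBufferDepth", "suggestedPresentationDelay",
                     "maxSegmentDuration", "maxSubsegmentDuration")
FORBIDDEN_MPD_CHILDREN = ("BaseURL", "UTCTiming", "LeapSecondInformation")


def check_table2(mpd_xml: str) -> list:
    """Return violations of a subset of the Table 2 rules."""
    root = ET.fromstring(mpd_xml)
    problems = []
    for attr in REMOVED_MPD_ATTRS:
        if root.get(attr) is not None:
            problems.append(f"MPD@{attr} shall not be present")
    if root.get("minBufferTime") is None:
        problems.append("MPD@minBufferTime shall be present")
    for name in FORBIDDEN_MPD_CHILDREN:
        if root.find(f"mpd:{name}", NS) is not None:
            problems.append(f"MPD/{name} shall not be present")
    for period in root.findall("mpd:Period", NS):
        for label, key in (("xlink:href", XLINK + "href"),
                           ("xlink:actuate", XLINK + "actuate"),
                           ("start", "start")):
            if period.get(key) is not None:
                problems.append(f"Period@{label} shall be absent")
        if period.find("mpd:BaseURL", NS) is None:
            problems.append("Period/BaseURL shall be present")
        for aset in period.findall("mpd:AdaptationSet", NS):
            if aset.get("contentType") is None:
                problems.append("AdaptationSet@contentType shall be present")
            if aset.find("mpd:SegmentList", NS) is not None:
                problems.append("SegmentList shall be absent")
            if aset.find("mpd:Representation", NS) is None:
                problems.append("at least one Representation shall be present")
    return problems


# Hypothetical ad MPD that satisfies the rules checked above.
CONFORMING_MPD = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static" minBufferTime="PT2S"
     profiles="urn:mpeg:dash:profile:cmaf:2019,http://dashif.org/guidelines/dashif-ad-content">
  <Period duration="PT15S">
    <BaseURL>https://ads.example.com/ad-1234/</BaseURL>
    <AdaptationSet contentType="video">
      <Representation id="v1" bandwidth="1000000"/>
    </AdaptationSet>
  </Period>
</MPD>
"""

violations = check_table2(CONFORMING_MPD)
```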

 

Figure 5 provides an overview of the DASH-IF Ad content format.

Figure 5 Recommended Ad Content Format

1.2.6.7.       IF-4c: Recommended Dynamic Ad Content Response format based on conditioning parameters

The response format should follow the DASH-IF Ad Content format as defined in clause 1.2.6.6.

Note: Please provide feedback if the dynamic response should permit multiple Periods. If no clear use case is provided, the response will be restricted to a single Period.

If no conditioning parameters are provided, the response should include multiple content variants, e.g. multiple codecs, resolutions, etc.

If codec conditioning parameters are provided, the response should include content options including at least one of the codecs.

If format conditioning parameters are provided, the response should include content options including at least one of the supported formats.

If encryption conditioning parameters are provided, the response should include content options including at least one of the supported encryption modes.

If the content needs to be obfuscated/blocked, then the response should be adjusted to the main content.

If the content needs to be served through xlink with Period, then only the Period of the main content is extracted.

If the content is used on live and especially low-latency live services, then the content in the Period should be adjusted to enable consistent playback including consistent join times.
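
The codec and encryption rules above can be read as a filter over the available ad variants. The following Python sketch shows one possible interpretation: the AdVariant type and the select_variants function are hypothetical, and returning only the matching variants is a choice made for this example (the text merely requires that at least one matching option is included). Format conditioning is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class AdVariant:
    """Hypothetical description of one encoded variant of an ad creative."""
    codec: str
    encryption: Optional[str]  # None means the variant is unencrypted


def select_variants(variants: List[AdVariant],
                    codecs: Optional[Iterable[str]] = None,
                    encryption_schemes: Optional[Iterable[str]] = None
                    ) -> List[AdVariant]:
    """Keep the variants that match the supplied conditioning parameters.

    A parameter left as None means it was not provided, in which case it
    does not restrict the response. Assumption of this sketch: when
    encryption schemes are given, unencrypted variants are dropped as well.
    """
    codec_set = None if codecs is None else set(codecs)
    scheme_set = None if encryption_schemes is None else set(encryption_schemes)
    return [v for v in variants
            if (codec_set is None or v.codec in codec_set)
            and (scheme_set is None or v.encryption in scheme_set)]


variants = [
    AdVariant("avc1", None),
    AdVariant("avc1", "cenc"),
    AdVariant("hvc1", "cbcs"),
]

all_variants = select_variants(variants)                     # nothing provided
hevc_only = select_variants(variants, codecs=["hvc1"])
cenc_only = select_variants(variants, encryption_schemes=["cenc"])
```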

Note: This clause will need more refinements and comments are welcome.

1.2.7.            IF-5: MPD and Segments with Ad Placements