Low Latency with Chunked CMAF
In the first part of this content series we discussed latency: what it is and why it is important, and we gave a general overview of the current industry approaches. As media companies are on a quest to lower their latency to only a few seconds (or even lower), we will dedicate some special attention to those protocols which are currently being rolled out at scale most often. The first one to discuss is Low Latency CMAF or Chunked CMAF.
What is CMAF?
In 2016, Apple announced that it would start supporting CMAF (Common Media Application Format) within HLS. As the name suggests, the format aims to bring the HLS and MPEG-DASH formats together in order to simplify online video delivery. Contrary to popular belief, CMAF by itself does not reduce latency. It does, however, provide a low latency mode in which media segments can be divided into even smaller chunks.
CMAF uses a fragmented MP4 (fMP4) container based on ISOBMFF (ISO/IEC 14496-12:201). In the past, however, HLS relied on Transport Streams, which have served the broadcast and cable industries well in delivering continuous streams of data. Segmented media delivery is not one of their strengths, however: they incur overhead ratios between 5% and 15%, far higher than fMP4. On top of that, the fMP4 format is easily extensible and is already widely used in DASH, Smooth and HDS implementations.
Besides bringing segments into a common format, CMAF is also linked to the Common Encryption (CENC – ISO/IEC 23001-7: 2016) standard, which dramatically simplifies the protection of content through encryption and DRM systems. These systems are still converging towards a common approach, but there are indications that this is going in the right direction.
CMAF requires segments to start with keyframes, which must be precisely aligned across bitrates. This should allow for faster playback start, as playback can begin at any individual segment, which is independent of every other segment in the stream. Aligning the keyframes has a second advantage: it simplifies bitrate switching. When the player receives a keyframe, it knows it can safely switch to a different bitrate, as a keyframe from which to start decoding will be available there as well.
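The alignment requirement can be illustrated with a small check. This is a minimal sketch, not part of any CMAF tooling: the function name `keyframes_aligned`, the rendition labels and the timestamps are all hypothetical, and it assumes the keyframe timestamps (in seconds) of each rendition have already been extracted.

```python
from typing import Dict, List


def keyframes_aligned(keyframe_times: Dict[str, List[float]],
                      tolerance: float = 1e-6) -> bool:
    """Return True if every rendition has keyframes at the same timestamps."""
    renditions = list(keyframe_times.values())
    if not renditions:
        return True
    reference = renditions[0]
    for times in renditions[1:]:
        # A different number of keyframes can never be aligned.
        if len(times) != len(reference):
            return False
        # Each keyframe must sit at the same timestamp as in the reference.
        if any(abs(a - b) > tolerance for a, b in zip(reference, times)):
            return False
    return True


# Hypothetical bitrate ladder with a keyframe every 2 seconds.
ladder = {
    "1080p": [0.0, 2.0, 4.0, 6.0],
    "720p": [0.0, 2.0, 4.0, 6.0],
    "360p": [0.0, 2.0, 4.0, 6.0],
}
# Same ladder, but the lowest rendition has one keyframe shifted.
misaligned = {**ladder, "360p": [0.0, 2.5, 4.0, 6.0]}

print(keyframes_aligned(ladder))
print(keyframes_aligned(misaligned))
```

With an aligned ladder, every keyframe timestamp is a safe switching point for the player; in the misaligned case, a switch to the lowest rendition at 2.0 seconds would find no keyframe to start decoding from.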
HTTP Adaptive Streaming & Chunked Transfer Encoding
CMAF comes with a low latency mode, which allows you to split an individual segment into smaller chunks. One could ask: why split a segment instead of simply making segments shorter? CMAF requires a keyframe at the start of every segment; this is not the case for a chunk. A keyframe tends to be significantly larger than a non-keyframe. Shortening segments would therefore mean more keyframes per second of video, which increases the bandwidth needed to deliver the same quality.
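The bandwidth cost of shorter segments can be estimated with some simple arithmetic. The sketch below is purely illustrative: it assumes exactly one keyframe per segment, a constant frame rate, and a keyframe that is ten times the size of any other frame. Both the ratio and the function name `relative_bitrate` are assumptions made for this example; real encoders distribute bits differently.

```python
def relative_bitrate(segment_seconds: float,
                     fps: int = 30,
                     keyframe_ratio: float = 10.0) -> float:
    """Bitrate relative to a stream made only of non-keyframes.

    Assumes one keyframe per segment that is `keyframe_ratio` times
    larger than a non-keyframe (an illustrative assumption).
    """
    frames = round(segment_seconds * fps)
    if frames < 1:
        raise ValueError("segment must contain at least one frame")
    # One keyframe plus (frames - 1) regular frames, measured in units
    # of "one regular frame", averaged over the segment.
    return (keyframe_ratio + (frames - 1)) / frames


for seconds in (6.0, 2.0, 0.5):
    print(f"{seconds:>4} s segments -> {relative_bitrate(seconds):.2f}x")
```

Under these assumptions, 6-second segments cost about 1.05x the keyframe-free bitrate, 2-second segments about 1.15x, and 0.5-second segments about 1.60x. Chunks avoid this penalty because they do not need to start with a keyframe.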
Segments usually have a duration ranging from 2 to 6 seconds. Most streaming protocols have determined that a buffer of about three segments, with a fourth segment usually being buffered, is optimal for avoiding playback stalls. The reasoning here is that each segment needs to be encoded, listed in the manifest, downloaded and added to the buffer as a whole. This often results in latencies of 10 to 30 seconds.
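A rough back-of-the-envelope estimate shows where these numbers come from. The sketch below is a simplified model and not a measurement: the function name `segment_latency_s` and the `overhead_s` parameter (standing in for encoding and delivery delays) are assumptions introduced for this example.

```python
def segment_latency_s(segment_s: float,
                      buffered_segments: int = 3,
                      in_flight_segments: int = 1,
                      overhead_s: float = 0.0) -> float:
    """Estimate latency when whole segments are buffered before playback.

    `overhead_s` is a hypothetical allowance for encoding, packaging
    and CDN delays, which this simple model does not break down further.
    """
    return (buffered_segments + in_flight_segments) * segment_s + overhead_s


for seconds in (2.0, 6.0):
    print(f"{seconds} s segments -> about {segment_latency_s(seconds):.0f} s behind live")
```

With three buffered segments and a fourth in flight, 2-second segments give roughly 8 seconds and 6-second segments roughly 24 seconds before any encoding or delivery overhead is counted, which is consistent with the range quoted above.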
By splitting segments into chunks, a streaming server can make the chunks within a segment available while the segment as a whole has not been completed yet. This changes the situation: players can download the individual chunks ahead of time, and buffers can be filled faster. As a result, the end-to-end latency can be reduced significantly.
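The effect on availability can be sketched as follows, assuming a real-time encoder that emits each chunk as soon as its media has been produced. The function name `chunk_availability_ms` and the durations used are hypothetical; times are in milliseconds, measured from the start of the segment.

```python
from typing import List


def chunk_availability_ms(segment_ms: int, chunk_ms: int) -> List[int]:
    """Return the time at which each chunk of a segment becomes available.

    Assumes real-time encoding: a chunk is ready once all of its media
    has been captured and encoded, i.e. at the end of its own duration.
    """
    if chunk_ms <= 0 or segment_ms % chunk_ms != 0:
        raise ValueError("segment duration must be a multiple of chunk duration")
    return [chunk_ms * (i + 1) for i in range(segment_ms // chunk_ms)]


times = chunk_availability_ms(segment_ms=2000, chunk_ms=500)
print("chunks available at:", times)
print("first media available after", times[0], "ms instead of", times[-1], "ms")
```

In this example the first media of a 2-second segment can be requested after 500 ms, where a player working with whole segments would have to wait the full 2000 ms.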
Splitting segments into chunks is not enough on its own to reduce latency. Producing chunks allows the packager to publish media sooner (and to list a segment in the manifest as soon as its first chunk is ready), but the components further down the pipeline must be ready as well. In practice, this means the origin should expose the chunks using HTTP/1.1 chunked transfer encoding (or a similar technique over an alternative protocol). Similarly, the CDN, which allows the stream to scale to a larger number of viewers, should mimic this behaviour and expose the chunks to the player in the same way.
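To make the wire format concrete, the sketch below frames a sequence of byte strings the way HTTP/1.1 chunked transfer encoding does: each piece is preceded by its length in hexadecimal and followed by CRLF, and a zero-length chunk ends the body. The function name `encode_chunked` and the placeholder payloads are assumptions for this example; a real origin would rely on its HTTP server for this framing, and HTTP chunk boundaries do not have to coincide with CMAF chunk boundaries.

```python
from typing import Iterable, Iterator


def encode_chunked(pieces: Iterable[bytes]) -> Iterator[bytes]:
    """Frame byte strings as an HTTP/1.1 chunked message body."""
    for piece in pieces:
        if not piece:
            # A zero-length chunk marks the end of the body, so skip
            # empty pieces instead of terminating the response early.
            continue
        # <size in hex> CRLF <data> CRLF
        yield b"%X\r\n" % len(piece) + piece + b"\r\n"
    # Last chunk: size 0, no trailer fields, final CRLF.
    yield b"0\r\n\r\n"


# Placeholder payloads standing in for CMAF chunks as they are produced.
body = b"".join(encode_chunked([b"moof", b"mdat-data"]))
print(body)
```

Because the total length does not have to be known up front, the origin can start sending a segment while the packager is still producing its later chunks.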
On the player side, support for chunked transfer encoding should be available as well, accompanied by an internal media pipeline which allows for chunks of media to be added to the buffer and played out. The player should also be able to identify the chunks being made available, and have intelligence to modify its buffers and optimise for reducing latency. If one of the elements within the streaming pipeline is not modified, there will be no benefit from splitting up segments.
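On the player side, identifying chunks in a partially downloaded segment comes down to parsing ISOBMFF boxes as bytes arrive. The sketch below is a simplified illustration, not THEOplayer's implementation: the class name `ChunkAssembler` and the helper `box` are hypothetical, it treats every completed `mdat` box as the end of a chunk, and it only handles 32-bit box sizes.

```python
import struct
from typing import List


def box(box_type: bytes, payload: bytes) -> bytes:
    """Build a minimal ISOBMFF box: 32-bit size, 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


class ChunkAssembler:
    """Collect incoming bytes and emit a chunk once its mdat box is complete."""

    def __init__(self) -> None:
        self._pending = bytearray()  # bytes that do not yet form a full box
        self._current = bytearray()  # complete boxes of the chunk in progress

    def feed(self, data: bytes) -> List[bytes]:
        self._pending += data
        complete: List[bytes] = []
        while len(self._pending) >= 8:
            size, box_type = struct.unpack(">I4s", self._pending[:8])
            if size < 8:
                # size 0 (box runs to end of file) and size 1 (64-bit
                # size follows) are valid ISOBMFF but not handled here.
                raise ValueError("unsupported box size in this sketch")
            if len(self._pending) < size:
                break  # wait for the rest of this box
            self._current += self._pending[:size]
            del self._pending[:size]
            if box_type == b"mdat":
                # moof + mdat received: the chunk can go to the buffer.
                complete.append(bytes(self._current))
                self._current.clear()
        return complete


# Two placeholder chunks, delivered to the player five bytes at a time.
stream = (box(b"moof", b"\x00" * 4) + box(b"mdat", b"frames-1")
          + box(b"moof", b"\x00" * 4) + box(b"mdat", b"frames-2"))
assembler = ChunkAssembler()
chunks: List[bytes] = []
for offset in range(0, len(stream), 5):
    chunks.extend(assembler.feed(stream[offset:offset + 5]))
print(len(chunks), "chunks ready for the media buffer")
```

Each emitted chunk could then be appended to the media buffer without waiting for the rest of the segment, which is the behaviour the paragraph above describes.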
Learn how your business can benefit from CMAF
THEOplayer has been improved to identify servers that deliver segments in chunks, across all platforms and devices. On top of this, additional intelligence has been added so that the player can be configured to best suit your business needs, including how aggressively it should reduce latency. These capabilities work seamlessly together with the other THEOplayer features such as analytics, subtitles, content protection and DRM, preloading, and much more, bringing the best viewer experience with the lowest latency possible.
If you have any questions, don’t hesitate to reach out: our technical experts will be happy to help you to overcome your challenges and to discuss your future projects.