Core technologies for streaming workflows: a 2026 architectural reassessment // Part 2
The first part of this reassessment examined the consolidation and refinement of the HTTP streaming stack. CMAF has largely unified HLS and DASH at the media layer, while low-latency profiles such as LL-HLS and L3D have matured into operationally viable deployments. Observability has expanded through CMCD and CMSD, extending delivery telemetry into runtime control logic. Even HTTP/3 and QUIC were considered primarily through the lens of operational economics and deployment constraints. The overall picture is one of incremental optimization: HTTP-based ABR delivery is not being displaced, but progressively refined within a stable and deeply entrenched ecosystem.
This second part shifts the perspective. Rather than asking how far HTTP streaming can be optimized, it examines what happens if the delivery abstraction itself changes. Media over QUIC (MoQ) proposes a session-native, object-oriented distribution model built around long-lived subscriptions rather than stateless request cycles. The discussion that follows explores MoQ’s transport semantics, encapsulation approaches, operational implications, and ecosystem maturity, and considers whether streaming may be approaching a more structural transition beyond the HTTP delivery paradigm.
Media over QUIC: A Session-Native Architecture
In parallel with the continued refinement of HTTP-based streaming, a different line of work has explored whether media delivery should remain request-driven at all. Media over QUIC (MoQ) has emerged as the primary focal point for this transport-layer experimentation. Rather than incrementally optimizing HTTP-based adaptive bitrate delivery, MoQ reconsiders the delivery abstraction itself, replacing stateless request/response exchanges with long-lived, session-oriented object transport.
The IETF Media over QUIC Transport (MoQT) specification defines an application-layer publish/subscribe protocol operating over QUIC. MoQT introduces media-native abstractions – namespaces, tracks, groups, and objects – and maps them onto QUIC streams and datagrams. Instead of repeatedly retrieving segments through independent HTTP requests, a client establishes a session, subscribes to one or more tracks, and receives objects as they are published through relays. Delivery becomes push-oriented rather than request-driven.
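The naming hierarchy described above can be made concrete with a small sketch. The Python below is illustrative only: the class names, field names, and the example namespace are assumptions made for exposition, not the MoQT wire format or the API of any existing implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObjectLocation:
    """Locates one object: namespace -> track -> group -> object (hypothetical layout)."""
    namespace: str
    track: str
    group_id: int
    object_id: int


@dataclass
class Track:
    """A named sequence of groups; each group holds objects in publish order."""
    namespace: str
    name: str
    groups: dict[int, list[bytes]] = field(default_factory=dict)

    def publish(self, group_id: int, payload: bytes) -> ObjectLocation:
        # Objects are numbered by their position within the group.
        objects = self.groups.setdefault(group_id, [])
        objects.append(payload)
        return ObjectLocation(self.namespace, self.name, group_id, len(objects) - 1)


video = Track("example.com/live/event1", "video-720p")
first = video.publish(group_id=0, payload=b"keyframe")
second = video.publish(group_id=0, payload=b"delta-1")
third = video.publish(group_id=1, payload=b"keyframe")
```

A subscriber in this toy model would name a track once and then receive each `ObjectLocation` and payload as it is published, rather than requesting segment URLs one by one.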
This session model changes system behavior. Multiple subscribed tracks share congestion context within the QUIC connection, altering fairness dynamics relative to independent HTTP transactions. MoQT exposes prioritization primitives that influence delivery ordering across tracks and objects, but these mechanisms are not equivalent to the standardized adaptive bitrate logic used in HTTP ABR streaming. In HTTP delivery, the player selects a bitrate representation by requesting different segment URLs. In MoQ, prioritization instead determines which objects are transmitted first within an active session, for example favoring the next video frame over lower-priority metadata or alternative tracks. Stream-based and datagram-based delivery introduce different reliability and latency tradeoffs. QUIC streams provide reliable, ordered delivery suited for objects that must arrive intact, while QUIC datagrams allow objects to be transmitted without retransmission, enabling lower latency at the cost of potential loss. Authorization, token enforcement, and revocation are handled within the session itself rather than inferred from manifest expiry or URL invalidation, with emerging mechanisms such as Common Access Tokens for MoQT (C4M) formalizing this model. Object integrity and encryption can also be applied directly at the object layer. The transport abstraction therefore becomes explicitly stateful and policy-aware.
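The difference between prioritization and ABR selection can be sketched as a sender-side scheduling decision. The following Python is a minimal sketch under stated assumptions: the numeric priority convention (lower value sent first), the per-opportunity byte budget, and the `droppable` flag standing in for datagram-style delivery are all hypothetical and not taken from the MoQT drafts.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Pending:
    priority: int                      # assumption: lower value is sent first
    seq: int                           # tie-breaker preserving publish order
    name: str = field(compare=False)
    size: int = field(compare=False)
    droppable: bool = field(compare=False, default=False)


def schedule(pending, budget_bytes):
    """Return (sent, dropped, deferred) object names for one send opportunity."""
    heap = list(pending)
    heapq.heapify(heap)
    sent, dropped, deferred = [], [], []
    while heap:
        item = heapq.heappop(heap)
        if item.size <= budget_bytes:
            budget_bytes -= item.size
            sent.append(item.name)
        elif item.droppable:
            dropped.append(item.name)   # datagram-like: never retransmitted
        else:
            deferred.append(item.name)  # stream-like: waits for the next opportunity
    return sent, dropped, deferred


queue = [
    Pending(2, 0, "video-enhancement", 900, droppable=True),
    Pending(0, 1, "audio", 100),
    Pending(1, 2, "video-base", 600),
    Pending(3, 3, "metadata", 50, droppable=True),
]
sent, dropped, deferred = schedule(queue, budget_bytes=800)
```

With an 800-byte budget, audio and the base video layer go out first, the enhancement layer is dropped because it no longer fits, and the small metadata object still fits in what remains. No representation switch is involved; the session simply decides what to send first.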
In browser environments and CDN-integrated deployments, MoQT sessions are typically established via WebTransport over HTTP/3. WebTransport exposes QUIC capabilities through a standardized browser API while remaining compatible with existing HTTP/3 infrastructure, lowering deployment friction and aligning with current CDN routing and security models. Direct QUIC operation remains possible in controlled environments, but WebTransport over HTTP/3 is the practical near-term path for broad client integration. The economic and operational constraints associated with QUIC deployment – discussed in Part 1 – continue to shape adoption pacing. QUIC-based delivery still carries higher CPU overhead than highly optimized TCP-based HTTP delivery, and rising CDN delivery costs across the industry further increase the importance of transport efficiency in large-scale deployments. Beyond the core transport draft, the emerging MoQ ecosystem now includes work on media formats, authorization, relay behavior, and operational telemetry, indicating a protocol family evolving around a common transport foundation rather than a single specification.
Media Encapsulation: CMSF Baseline and LOC for Advanced Workflows
MoQ separates delivery semantics from media encapsulation. Current interoperability guidance converges on CMAF-based object carriage (CMSF, a CMAF-compliant implementation of the MoQT Streaming Format) as a pragmatic baseline within the broader MoQ Streaming Format (MSF) framework. CMSF preserves compatibility with established encoding, packaging, encryption, and DRM workflows. Existing CMAF assets can be mapped into MoQ objects without redefining the production pipeline, enabling incremental migration rather than architectural replacement.
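One plausible way to map CMAF output into MoQ objects is sketched below. The rule used here, that a chunk beginning with a keyframe opens a new group and every following chunk becomes the next object in that group, is an assumption chosen for illustration and should not be read as the normative CMSF packaging rule. The `Chunk` type is likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A CMAF chunk as seen by a hypothetical packager hook."""
    starts_with_keyframe: bool
    payload: bytes


def assign_ids(chunks):
    """Yield (group_id, object_id, payload); a keyframe chunk opens a new group."""
    group_id, object_id = -1, 0
    for chunk in chunks:
        if chunk.starts_with_keyframe or group_id < 0:
            group_id += 1
            object_id = 0
        yield group_id, object_id, chunk.payload
        object_id += 1


chunks = [
    Chunk(True, b"k0"), Chunk(False, b"d1"), Chunk(False, b"d2"),
    Chunk(True, b"k1"), Chunk(False, b"d3"),
]
mapped = list(assign_ids(chunks))
```

The encoder and packager are untouched in this sketch; only the addressing applied to their output changes, which is the sense in which migration can be incremental.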
Alongside CMSF, the Low Overhead Container (LOC) defines an object-native encapsulation model aligned with MoQ’s abstractions. LOC does not assume traditional track hierarchies, GOP-aligned fragment boundaries, or segment-level timeline rigidity. Objects can represent sub-GOP units with explicit dependency signaling, enabling finer-grained composition and enhancement-layer separation. This model aligns well with workflows where media processing occurs at frame or sub-frame granularity, including AI-assisted encoding chains, VC-6–based compression pipelines, frame-level transformation systems, multi-view composition, and WebCodecs-driven rendering paths. In these contexts, CMAF’s segment-oriented structure can introduce constraints that LOC avoids. However, in 2026 LOC should be understood as a forward-looking container optimized for object-native and specialized workflows rather than a near-term replacement for CMAF in mainstream OTT distribution.
Complementing transport and encapsulation, the emerging WARP streaming format defines how catalogs, tracks, and associated metadata are structured prior to subscription. WARP introduces a catalog model describing available tracks, timelines, and adaptation relationships before a client subscribes to delivery. Extensions such as CARP build on this foundation, retaining WARP’s catalog, timeline, ABR switching model, and LOC support while introducing CMAF-oriented streaming profiles. In layered terms: WARP catalogs describe what content is available, MoQT defines how it is delivered, and QUIC provides the underlying transport. While conceptually clean, these catalog and adaptation models remain less semantically mature than DASH MPDs or HLS playlists and are still evolving.
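To illustrate what describing content before subscription means in practice, the sketch below models a catalog as a plain dictionary and lets the client pick a video track before subscribing. The field names (`kind`, `bitrate`, `alt_group`), the namespace, and the headroom factor are hypothetical and are not taken from the WARP draft; the selection rule is a simple client-driven heuristic, not a standardized ABR model.

```python
catalog = {
    "namespace": "example.com/live/event1",
    "tracks": [
        {"name": "audio", "kind": "audio", "bitrate": 128_000},
        {"name": "video-360p", "kind": "video", "bitrate": 800_000, "alt_group": 1},
        {"name": "video-720p", "kind": "video", "bitrate": 2_500_000, "alt_group": 1},
        {"name": "video-1080p", "kind": "video", "bitrate": 5_000_000, "alt_group": 1},
    ],
}


def choose_video(catalog, estimated_bps, headroom=0.8):
    """Pick the highest-bitrate video track that fits within the estimate."""
    videos = sorted(
        (t for t in catalog["tracks"] if t["kind"] == "video"),
        key=lambda t: t["bitrate"],
    )
    affordable = [t for t in videos if t["bitrate"] <= estimated_bps * headroom]
    # Fall back to the lowest rung when nothing fits.
    return (affordable[-1] if affordable else videos[0])["name"]


mid_choice = choose_video(catalog, 4_000_000)
low_choice = choose_video(catalog, 500_000)
high_choice = choose_video(catalog, 10_000_000)
```

The catalog answers "what can I subscribe to"; the subscription itself, and everything after it, belongs to MoQT.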
Stateful Relays and Edge Responsibility
Operationally, MoQ shifts more responsibility toward the edge. Relays maintain per-session subscription state, track delivery progress, and coordinate congestion across active viewers. This contrasts with traditional CDN caching, which minimizes per-viewer state and infers demand from request patterns. Push-based fan-out enables tighter latency control and more efficient object dissemination under certain workloads, but it also changes scaling characteristics and cost structures. Relay infrastructure begins to resemble publish/subscribe systems rather than stateless HTTP object caches. Operational considerations around relay behavior, denial-of-service protection, and provisioning are being explored in ongoing MoQ drafts. In this context, the OpenMOQ Software Consortium is developing an open, vendor-neutral MoQ stack centered on a high-performance C++ edge relay suitable for multi-CDN and infrastructure adoption.
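The contrast with a stateless cache is easiest to see in code. The toy relay below keeps explicit per-track subscription state and pushes each published object to current subscribers. It is an in-memory sketch with hypothetical names, and it ignores congestion control, authorization, and relay-to-relay forwarding, all of which a real relay would need.

```python
from collections import defaultdict


class Relay:
    """Toy in-memory relay: tracks who subscribed to what and pushes objects."""

    def __init__(self):
        self.subscribers = defaultdict(set)   # track name -> session ids
        self.delivered = defaultdict(list)    # session id -> received objects

    def subscribe(self, session_id, track):
        self.subscribers[track].add(session_id)

    def unsubscribe(self, session_id, track):
        self.subscribers[track].discard(session_id)

    def publish(self, track, group_id, object_id, payload):
        """Fan one object out to every session currently subscribed to the track."""
        for session_id in sorted(self.subscribers[track]):
            self.delivered[session_id].append((track, group_id, object_id, payload))
        return len(self.subscribers[track])


relay = Relay()
relay.subscribe("alice", "video")
relay.subscribe("bob", "video")
relay.subscribe("bob", "audio")
first_fanout = relay.publish("video", 0, 0, b"k0")
relay.unsubscribe("alice", "video")
second_fanout = relay.publish("video", 0, 1, b"d1")
```

The `subscribers` map is exactly the per-viewer state that an HTTP cache avoids holding, which is why relay scaling and cost behave more like a publish/subscribe system than like an object cache.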
As streaming experiences become more interactive and personalized, edge logic shifts from cache orchestration toward per-session computation. Fine-grained personalization, synchronization, object prioritization, and bidirectional participation require programmable control surfaces at the edge rather than static object retrieval. Traditional HTTP CDNs are optimized for stateless caching; customization is typically expressed through manifest mutation or URL indirection. WebRTC provides session semantics but concentrates scaling logic in SFUs rather than CDN-native infrastructure. MoQ’s relay model introduces a different execution surface: relays maintain explicit subscription state and operate on named tracks and objects, enabling per-session policy, prioritization, suppression, or transformation without redefining transport semantics. In this sense, MoQ functions less as a transport upgrade and more as a toolkit for session-aware edge execution. This does not guarantee adoption, but it clarifies why MoQ aligns naturally with advanced, edge-customized experiences that strain the HTTP segment model.
Ecosystem Activity and Hybrid Deployment
The ecosystem has moved beyond conceptual experimentation into early commercial positioning. Player support is beginning to emerge: Bitmovin has demonstrated MoQ playback in preview form, and Shaka Player is actively integrating MoQ capabilities. Red5 maintains a public tracker of player implementation efforts, reflecting the pace of activity across vendors and open source projects. End-to-end low-latency platforms such as Vindral and Nanocosmos have incorporated MoQ transport into their distribution models. At the CDN layer, Cloudflare has publicly previewed MoQ relay capabilities. Norsk is also supporting post-encoding MoQ outputs.
Open-source initiatives – including moq-go, moq-rs, moq.js, MOQtail, moxygen and moq-dev – are exploring reference implementations across multiple languages and deployment models. On the client side, Safari 26.4 beta introduces WebTransport support, joining Chrome and Firefox in exposing the browser primitives required for MoQ-based delivery. Additional commercial signals, including roadmap announcements from Red5 ahead of NAB 2026, indicate accelerating ecosystem interest, even if large-scale production standardization remains in progress.
Hybrid coexistence is therefore the most realistic near-term deployment model. CMAF remains the production media artifact. HLS and DASH continue to serve as compatibility and legacy delivery layers, including in environments constrained by device support or DRM requirements such as FairPlay. MoQ can encapsulate existing CMAF assets at the transport layer while parallel HTTP delivery paths persist where required. Packaging pipelines and origin workflows remain largely intact, with MoQ introduced as an alternative delivery substrate rather than a wholesale replacement of the existing stack.
| Reality check: MoQ streaming stack maturity |
|---|
| Despite visible ecosystem progress, MoQ remains a transport framework rather than a complete streaming stack. Several layers that HTTP-based streaming implicitly standardized over the past decade are still evolving. At the content description layer, catalog structures defined in emerging streaming formats such as WARP provide namespace description and track discovery, but they do not yet match the semantic richness of DASH MPDs or HLS playlists. Adaptation modeling, ladder metadata semantics, alternate presentation composition, and multi-period behavior remain comparatively immature. Adaptive bitrate behavior is likewise undefined at the ecosystem level. The balance between client-driven adaptation and server-assisted control remains an open design question. MoQT exposes prioritization and flow-control primitives, yet no interoperable ABR control model or cross-vendor guidance exists for switching logic, layered stream coordination, or fairness under shared congestion windows. Behavioral convergence comparable to HTTP ABR has not yet emerged. Ad insertion workflows are also still in flight. While CMAF-based SSAI workflows can be delivered using MoQ transport, no standardized object-native splicing model or timeline-level signaling convention exists for MoQ-native delivery. Server-guided insertion semantics equivalent to DASH 6th Edition are not yet defined in catalog or subscription terms. Operationally, relay abstractions are specified, but MoQ still lacks a CDN-grade operating profile. Unlike stateless object caches, relays maintain per-session subscription state and congestion coordination. Federation models, relay-to-relay peering conventions, multi-CDN interoperability, and failover semantics remain ecosystem-specific. Cache semantics for object reuse across sessions are not yet standardized. Media encapsulation shows partial convergence. CMAF carriage via CMSF provides a pragmatic interoperability baseline and preserves existing encoding and DRM workflows. LOC offers a standardized object-native container aligned with MoQ’s abstraction. However, convergence around dominant carriage profiles, encryption granularity conventions, and object-to-timeline composition practices is still emerging across implementations. Authorization and entitlement models are technically compatible but not yet normalized. Authorization artifacts such as CAT or Privacy Pass–based tokens (cryptographic tokens issued through a privacy-preserving blind-signature mechanism) can be bound to session-oriented delivery. The MoQ architecture itself does not mandate a single authorization framework, allowing deployments to integrate existing control systems. However, standardized guidance for token rotation, revocation, and DRM mapping in MoQ-native contexts is absent. Browser integration and telemetry are feasible but not yet standardized at ecosystem scale. WebTransport provides the transport surface, but CDN-native support and operational tooling remain uneven. A MoQ player can emit CMCD v2 in Request and Event modes, yet no integration profile defines how session-based transport should expose structured observability data. Taken together, MoQ demonstrates transport-layer viability and accelerating ecosystem interest. However, the layers above transport – adaptation semantics, composition models, federation patterns, operational profiles, and telemetry conventions – remain in active definition. Independent analyses comparing MoQ with WebRTC across different application categories reach similar conclusions about the current maturity gap. MoQ is therefore best characterized in 2026 as a defined transport abstraction with a still-evolving streaming stack built on top of it. |
Is Streaming Approaching a Deeper Architectural Transition?
The evolution of HLS and DASH over recent years has made certain structural boundaries more visible. Both protocols were designed around a pull-based, asset-centric delivery model derived from VOD workflows. Live streaming, low-latency modes, ad signaling, content steering, and dynamic timeline manipulation were introduced incrementally. Mechanisms such as low-latency profiles, patch manifests, interstitial signaling, server-guided ad insertion, and expanded control metadata address genuine operational requirements. Collectively, however, they expand the role of manifests beyond media description toward increasingly control-oriented functions within the delivery workflow. At scale, this introduces tighter cross-component coupling and greater coordination overhead. Incremental semantic expansion remains effective, but the architectural leverage of each additional extension appears to decrease.
CMAF followed a different trajectory. It succeeded where earlier convergence attempts did not, aligning HLS and DASH ecosystems and establishing itself as the common media container across devices, CDNs, and tooling. CMAF is now infrastructural for mainstream OTT distribution. At the same time, its design reflects structural assumptions around track hierarchies, fragment boundaries, and GOP-oriented framing. Emerging representation models – enhancement-layer codecs, multi-view compositions, scene-adaptive coding, and potential hybrid or neural approaches – do not always map cleanly onto a strictly segment- and GOP-oriented abstraction. CMAF remains foundational, but it embodies the architectural priorities of the asset-centric era in which it was defined.
These considerations become more relevant as streaming requirements shift from asset-centric delivery toward session-level control models. Personalization, interaction, synchronized multi-view experiences, dynamic content substitution, and tighter feedback loops between client, edge, and backend introduce bidirectional signaling needs and finer-grained timeline control than segment-based workflows natively provide. These are not merely incremental feature requests; they reflect a different abstraction centered on sessions rather than files. In current architectures, such requirements are typically addressed by combining multiple delivery stacks – most commonly WebRTC for interaction and HLS or DASH for scalable distribution. While operationally viable, this approach increases system complexity, orchestration overhead, and architectural fragmentation across the delivery stack.
Media over QUIC challenges the assumption that discovery, transport, and control must primarily be expressed through HTTP manifests. Subscription-based object delivery, bidirectional communication, and timeline-oriented semantics allow contribution, ingest, processing, and distribution to be modeled within a single protocol family. This does not require replacing existing encodings or containers, but it reduces the need for distinct transport abstractions across workflow stages. Object-oriented encapsulation such as LOC aligns naturally with this model. Although MoQ can transport existing CMAF assets, LOC avoids several structural constraints inherent to segment- and GOP-based representations and is therefore more adaptable to emerging codec and layout paradigms. In this framing, MoQ addresses delivery semantics, while LOC addresses representation structure.
The architectural implications become clearer when examined through concrete user experience categories and deployment scenarios. The table below illustrates several experience classes that remain structurally complex under legacy HTTP ABR and WebRTC hybrids, and how a session-oriented, object-based transport model changes those constraints.
| Experience Category | Why It Is Hard with HLS/DASH + WebRTC | How MoQ Changes the Model |
|---|---|---|
| Broadcast-Scale Interactive Experiences | Requires HLS/DASH for downstream distribution and WebRTC for upstream interaction. Separate congestion domains, SFU scaling constraints, TURN overhead, and dual orchestration stacks increase complexity at scale. | Unified publish/subscribe model. Viewers publish lightweight tracks (camera, mic, reactions, metadata) within the same session. Relay-based fan-out replaces SFU bottlenecks and removes transport-layer fragmentation. |
| Object-Level Personalization | Per-viewer ads, overlays, commentary, or substitutions require manifest rewriting or segment splicing. The playlist becomes a control bottleneck, and cache efficiency degrades under personalization density. | Object-level subscription and suppression without manifest regeneration. Overlays and alternates become first-class tracks, enabling per-session variation without structural playlist mutation. |
| Fine-Grained Adaptation and Congestion Management | Adaptation occurs at segment boundaries. In-flight segments cannot be selectively dropped, and enhancement-layer suppression requires ladder switching. | Object prioritization allows selective dropping of non-critical objects. Audio can be preserved while video degrades, and enhancement layers can be suppressed dynamically within the same session. |
| Real-Time Multi-Angle and Tiled Experiences | Parallel playlists per angle or region introduce buffer alignment challenges, request overhead, and drift risk. Switching is coarse and segment-boundary–dependent. | Parallel track subscription within a single session enables instant switching without playlist reload. Shared timeline semantics reduce drift, and only visible regions need to be delivered. |
| Large-Scale Synchronization | Pull-based clients drift over time due to independent request cadence and buffer variance. Synchronization across large audiences remains probabilistic. | Push-based object propagation with shared congestion context enables tighter skew control across relays, improving cross-viewer alignment without external synchronization overlays. |
These examples illustrate where session-oriented transport simplifies system design. However, translating this architectural potential into production deployments still faces several practical constraints.
Several constraints prevent a fully session-native MoQ delivery architecture from being deployed end to end today. QUIC has not yet achieved parity with legacy transports in delivery efficiency at scale. Hardware offload for QUIC encryption remains limited, and only a subset of platforms operate HTTP/3 with efficiencies comparable to HTTP/1.1 or HTTP/2. CDN architectures are still adapting to QUIC-native workload patterns. WebTransport illustrates the gap: although implemented in modern browsers, the specification is not yet finalized and CDN-native support remains uneven. Protocol maturity is also a factor. MoQ remains in draft status, with open questions around buffering coordination, ABR interaction, error recovery semantics, and ad insertion workflows.
Ecosystem constraints further limit immediate deployment. Apple FairPlay remains tied to HLS, requiring parallel delivery paths for services targeting Apple devices. This is a structural platform constraint rather than a short-term inconvenience. Timing also plays a role. CMAF Ingest and REAP workflows are only now stabilizing and entering broader adoption at scale, with Netflix as one example, while CDNs are concurrently progressing on MoQ implementations. These transitions overlap but do not align, making a clean architectural break impractical.
The near-term architecture is therefore likely to remain hybrid across most deployments. CMAF persists as the production artifact, encapsulated into MoQ streams and transported to the edge. From there, the same media can be exposed over HTTP for legacy clients while MoQ-native delivery is enabled for capable devices. In such a model, MoQ catalogs define the authoritative timeline, while HLS and DASH manifests function as compatibility layers generated at origin or edge. This approach avoids duplicating ingest and encoding pipelines while enabling gradual client and CDN migration.
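The compatibility-layer idea can be sketched as a function that renders an HLS media playlist from an authoritative group timeline. The assumptions here are loud ones: one MoQ group is treated as one HLS segment, the timeline is a list of `(group_id, duration_seconds)` pairs, and the URI template and init segment name are hypothetical. Only the playlist tags themselves are standard HLS.

```python
import math


def hls_playlist(groups, init_uri="init.mp4", uri_template="seg-{group}.m4s"):
    """Render a live HLS media playlist from (group_id, duration_seconds) pairs."""
    # Target duration must be at least as long as the longest segment.
    target = math.ceil(max(duration for _, duration in groups))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:7",
        f"#EXT-X-TARGETDURATION:{target}",
        f"#EXT-X-MEDIA-SEQUENCE:{groups[0][0]}",
        f'#EXT-X-MAP:URI="{init_uri}"',
    ]
    for group_id, duration in groups:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri_template.format(group=group_id))
    return "\n".join(lines) + "\n"


timeline = [(41, 2.0), (42, 2.0), (43, 1.92)]
playlist = hls_playlist(timeline)
```

Because the playlist is derived from the timeline rather than maintained separately, the HTTP path stays a projection of the MoQ-side state instead of a second source of truth.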
Large-scale, fully end-to-end MoQ workflows are not imminent. They represent a convergence trajectory rather than an immediate replacement path. The HTTP-based architecture remains viable, but it increasingly relies on layered extensions to accommodate requirements that are fundamentally session-centric. MoQ does not guarantee architectural convergence, but it is the first transport abstraction designed around those requirements from the outset.
Conclusion
Reassessing the 2021 stack does not reveal structural failure; it reveals consolidation and maturation across the delivery stack. CMAF became infrastructural to mainstream OTT distribution. HLS and DASH evolved incrementally rather than fragmenting. Low-latency modes stabilized into deployable profiles. Observability moved from request enrichment toward structured feedback mechanisms. Trust mechanisms expanded beyond DRM into edge-validated tokens, provenance frameworks, and standardized watermarking. AI-oriented codecs demonstrated measurable efficiency gains in emerging workflows.
The system did not collapse under new requirements. It expanded through additional layers and capabilities. What changed between 2021 and 2026 is less the existence of these layers than the definition of what constitutes the architectural core. Five years ago, core technologies were primarily understood as codecs, containers, adaptive protocols, and DRM reach. In 2026, those components remain foundational, but they no longer define where the streaming architecture is evolving most rapidly.
The center of gravity is shifting: from asset retrieval toward session semantics; from fixed segment abstraction toward object-level granularity; from exclusively origin-defined logic toward edge-resident control; from passive telemetry toward closed-loop observability. These shifts do not invalidate the existing model, but they change how streaming architectures are evaluated.
The HTTP segment paradigm continues to function and can still evolve. However, as requirements expand – higher personalization density, synchronized interaction, object-level substitution, AI-driven selective access, tighter steering feedback – the model increasingly relies on layered extensions to absorb fundamentally session-centric demands. Incremental manifest expansion remains effective, but its capacity to accommodate new requirements becomes progressively more limited.
Media over QUIC and object-oriented encapsulation like LOC should therefore be interpreted not as simple protocol substitutions, but as structural alternatives. They align transport semantics, representation granularity, and session-level control within a single conceptual framework. That framework is not yet dominant in operational terms – it remains shaped by ecosystem inertia, device lifecycles, CDN economics, and protocol maturity.
The near-term trajectory is consequently hybrid. CMAF remains the production artifact. HTTP ABR persists as the compatibility surface. MoQ catalogs and session-based delivery evolve alongside it. Edge logic becomes progressively state-aware. Migration proceeds through staged coexistence rather than abrupt replacement.
Streaming in 2026 is not undergoing a fundamental break; its architecture is gradually evolving. The stack that defined the previous decade remains operational, but it is increasingly evaluated against requirements it was not originally designed to absorb. The deeper transition outlined here reflects accumulated technical pressure across representation, delivery, and control layers. The shift will be gradual. It is already underway.




