Media Streaming Mesh is a new concept for supporting real-time applications (such as media production and multiplayer online gaming) in Kubernetes.
The goal of Kubernetes is to provide a “platform for automating deployment, scaling, and operations of application containers across clusters of hosts”. Most applications deployed in Kubernetes today are web-based, and so much of the effort around networking in Kubernetes is optimised for web applications. One example of this is the service mesh architecture (exemplified by Istio), where applications communicate with each other via web proxies rather than directly over IP.
Media Streaming Mesh will enable developers of real time applications to focus on their business logic whilst the Media Streaming Mesh infrastructure facilitates real-time connectivity for microservices.
Today’s service meshes generally only support TCP-based applications (and in fact are optimised for HTTP-based web applications). Any support for UDP that is added to service meshes is likely to be focussed on enabling QUIC (since HTTP/3 runs over QUIC).
Real-time applications generally run over UDP rather than TCP. Media streaming applications typically rely on RTP (the Real-time Transport Protocol), which runs on top of UDP, so RTP will be the initial focus of Media Streaming Mesh.
Service meshes bring many benefits to web applications, such as observability, load balancing, traffic management and mutual-TLS security.
Our goal is to extend all of these benefits to real-time media streaming applications, whilst also enabling additional capabilities such as per-stream measurement of loss and jitter, and URL/URI-based routing of media streams.
Interactive real-time apps (e.g. games) generally use de facto standard protocols (such as RakNet, KCP and netcode) which run over UDP. UDP itself is connectionless, so to support these protocols we can either rely on heuristics such as idle timers, or implement per-protocol proxies.
Streaming apps are generally RTP-based as noted above. RTP enables measurement of loss and jitter as it carries sequence numbers and timestamps in the packet header.
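Those sequence numbers and timestamps sit in the fixed 12-byte RTP header defined by RFC 3550, and can be decoded with a short Go sketch (the sample packet bytes below are synthetic):

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// rtpHeader holds the fixed RTP header fields (RFC 3550) relevant to
// loss and jitter measurement.
type rtpHeader struct {
	Version     uint8
	PayloadType uint8
	SeqNum      uint16
	Timestamp   uint32
	SSRC        uint32
}

// parseRTP decodes the 12-byte fixed RTP header from a UDP payload.
func parseRTP(b []byte) (rtpHeader, error) {
	if len(b) < 12 {
		return rtpHeader{}, errors.New("packet too short for RTP header")
	}
	h := rtpHeader{
		Version:     b[0] >> 6,                        // top two bits
		PayloadType: b[1] & 0x7F,                      // low seven bits
		SeqNum:      binary.BigEndian.Uint16(b[2:4]),  // gaps here => loss
		Timestamp:   binary.BigEndian.Uint32(b[4:8]),  // basis for jitter
		SSRC:        binary.BigEndian.Uint32(b[8:12]), // stream identifier
	}
	if h.Version != 2 {
		return rtpHeader{}, errors.New("not RTP version 2")
	}
	return h, nil
}

func main() {
	// A synthetic packet: version 2, payload type 96, sequence 4660.
	pkt := []byte{0x80, 0x60, 0x12, 0x34, 0x11, 0x22, 0x33, 0x44,
		0x00, 0x00, 0x00, 0x01}
	h, err := parseRTP(pkt)
	if err != nil {
		panic(err)
	}
	fmt.Printf("seq=%d pt=%d\n", h.SeqNum, h.PayloadType) // seq=4660 pt=96
}
```

A proxy that tracks `SeqNum` gaps per SSRC can report loss directly, and comparing `Timestamp` deltas against arrival-time deltas yields the interarrival jitter that RTCP reports.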
One challenge with RTP is that it often runs on ephemeral UDP ports which are negotiated over a TCP-based control protocol such as SIP or RTSP. However, proxying these TCP-based control protocols will enable us to implement URL/URI-based routing and to avoid relying on timer heuristics.
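To sketch why proxying the control channel helps: an RTSP SETUP request carries a Transport header naming the client's UDP ports, so a control-plane proxy can learn the ephemeral ports directly rather than guessing. The `clientPorts` helper below is an illustrative name of our own, not part of any existing codebase:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// clientPorts extracts the RTP/RTCP client port pair from an RTSP
// Transport header value (RFC 2326), e.g.
//
//	RTP/AVP;unicast;client_port=8000-8001
//
// These are the ephemeral UDP ports a proxy would otherwise have to
// discover heuristically.
func clientPorts(transport string) (rtp, rtcp int, err error) {
	for _, param := range strings.Split(transport, ";") {
		param = strings.TrimSpace(param)
		if !strings.HasPrefix(param, "client_port=") {
			continue
		}
		r := strings.SplitN(strings.TrimPrefix(param, "client_port="), "-", 2)
		if rtp, err = strconv.Atoi(r[0]); err != nil {
			return 0, 0, err
		}
		if len(r) == 2 {
			if rtcp, err = strconv.Atoi(r[1]); err != nil {
				return 0, 0, err
			}
		}
		return rtp, rtcp, nil
	}
	return 0, 0, fmt.Errorf("no client_port in %q", transport)
}

func main() {
	rtp, rtcp, err := clientPorts("RTP/AVP;unicast;client_port=8000-8001")
	if err != nil {
		panic(err)
	}
	fmt.Println(rtp, rtcp) // 8000 8001
}
```

With the port pair in hand, the proxy can open pinholes for exactly those flows instead of holding timers on arbitrary UDP traffic.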
Many cloud-native applications involve a mixture of ‘east-west’ traffic between microservices (generally within the same cluster) and ‘north-south’ traffic between the application and external entities.
This will be equally true for real-time applications.
For example in a game there might be traffic between game players and the game infrastructure running in the cloud. However for large game instances the game itself might be spread over multiple compute nodes (possibly even distributed geographically), and these will need to communicate with each other.
Equally for media applications there might be multiple camera feeds into a news-room where one feed is selected, various data (e.g. breaking news) is overlaid, and then the resulting stream is sent out for broadcast.
The exact architecture for Media Streaming Mesh is still very much up for discussion.
Our current demo implementation relies on a simple Go-based proxy that runs as a pod sidecar (plus an init container that directs RTSP, RTP and RTCP traffic into the proxy).
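A minimal sketch of the kind of redirect rules such an init container might install, assuming illustrative port numbers (RTSP on TCP 554 sent to a proxy listening on 8554, and an example RTP/RTCP port pair on UDP 5004-5005 sent to 8000); the real Media Streaming Mesh configuration may differ:

```shell
#!/bin/sh
# Illustrative init-container script. Runs once, with NET_ADMIN, in the
# pod's network namespace before the app and sidecar start. All port
# numbers here are assumptions for the sketch.

# RTSP control channel -> local sidecar proxy
iptables -t nat -A PREROUTING -p tcp --dport 554 \
         -j REDIRECT --to-ports 8554

# Example RTP/RTCP media ports -> local sidecar proxy
iptables -t nat -A PREROUTING -p udp --dport 5004:5005 \
         -j REDIRECT --to-ports 8000
```

Because the rules live in the pod's own network namespace, the application needs no changes to have its media traffic flow through the proxy.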
Longer term our expectation is that we'll implement per-node RTP/RTSP proxies for the data plane, together with a per-cluster control plane.
With that baseline we will then be able to implement other protocols (such as SIP, RIST, SMPTE 2110, "raw" RTP etc.).
To keep the footprint light, one key will be to deploy only the components required for the service being implemented.
For inter- and extra-cluster traffic, the per-node RTP/RTSP proxies will act as data-plane gateways, and the per-cluster proxies will act as control-plane gateways.
We’re looking for potential users of Media Streaming Mesh to help us define the solution, and for developers to help us create it!
Please do join our Slack channel.