Scaling Real-Time Video on AWS: How We Keep WebRTC Latency Below 150ms with Kubernetes Autoscaling | HackerNoon
Briefly

A planet-scale WebRTC Selective Forwarding Unit (SFU) was built on AWS using Kubernetes, achieving end-to-end latency below 150 ms. Its key building blocks are latency-based DNS, multiple EKS clusters, and geo-sticky JWTs. The SFU autoscales on video track count and coped with adverse network conditions while keeping CPU usage under control at high load. The architecture significantly reduced global data egress costs and enforces security through encryption and compliance measures. Single-region SFUs, by contrast, suffer from high round-trip times and higher egress bandwidth costs, which makes them unsustainable for a global service.
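To make "autoscaling on video track count" concrete, here is a minimal sketch of the replica calculation the Kubernetes Horizontal Pod Autoscaler documents, applied to a per-pod track metric. The function name, the target of 200 tracks per pod, and the replica bounds are illustrative assumptions, not values from the article; only the formula (ceil of current replicas times the metric ratio, with a default 10% tolerance) comes from the Kubernetes documentation.

```python
import math


def desired_replicas(
    current_replicas: int,
    tracks_per_pod: float,         # observed average video tracks per SFU pod
    target_tracks_per_pod: float,  # assumed target, e.g. 200 (hypothetical)
    min_replicas: int = 1,
    max_replicas: int = 100,
    tolerance: float = 0.1,        # HPA default: skip scaling within 10% of target
) -> int:
    """Return the replica count an HPA-style controller would ask for."""
    ratio = tracks_per_pod / target_tracks_per_pod
    # Inside the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods averaging 300 tracks each against a target of 200 tracks per pod.
    print(desired_replicas(4, 300, 200))
```

Scaling on track count rather than CPU ties capacity to the workload the SFU actually forwards, so in principle pods can be added before CPU saturates.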
When a WebRTC SFU is hosted in only one region, long-distance calls suffer high round-trip time (RTT) and jitter, degrading quality.
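A rough way to see why distance alone hurts is to compute the physical floor on RTT. The sketch below assumes light travels through fiber at roughly 200 km per millisecond (about two thirds of its speed in vacuum) and that the path follows the great circle; real routes are longer, so actual RTT is higher. The function names and coordinates are illustrative, not from the article.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # assumption: ~2/3 of c, i.e. about 200,000 km/s


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time: out and back at fiber speed."""
    return 2 * distance_km / FIBER_KM_PER_MS


if __name__ == "__main__":
    # Two points a quarter of the way around the equator (illustrative).
    d = great_circle_km(0.0, 0.0, 0.0, 90.0)
    print(f"{d:.0f} km, RTT floor {min_rtt_ms(d):.0f} ms")
```

A path of about 10,000 km already implies roughly 100 ms of round-trip propagation delay before any queuing, encoding, or jitter buffering, which is why placing SFUs close to users matters.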
Geo-sharding cut our global data egress costs by ~70%. We also enforce end-to-end DTLS/SRTP encryption, mutual TLS between nodes, private ALBs, and regular cert rotation for compliance.
Read at Hackernoon