Netflix • 9 min read • October 21, 2025

Behind the Streams: Real-Time Recommendations for Live Events Part 3

Summary

The article details Netflix's engineering approach to delivering real-time recommendations for live events, highlighting the unique challenges posed by simultaneous viewership demands. It describes a two-phase system that includes prefetching data to mitigate traffic spikes and broadcasting real-time updates to connected devices. The authors emphasize the importance of balancing constraints such as time, request throughput, and compute cardinality to ensure a seamless user experience during high-stakes live events. Additionally, they discuss the implications of traffic management strategies and the need for adaptive prioritization to handle unpredictable load patterns effectively.
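The adaptive prioritization mentioned above can be illustrated with a small admission check that sheds lower-priority traffic as load rises. This is a minimal sketch and not Netflix's implementation: the priority tiers, the utilization thresholds (0.7 and 0.9), and the function names `shed_threshold` and `admit` are all assumptions made for illustration.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Hypothetical request tiers; a lower value means more important."""
    CRITICAL = 0     # e.g. updates tied to the live event itself (assumed)
    DEGRADED = 1     # useful, but can be served stale or delayed (assumed)
    BEST_EFFORT = 2  # background or speculative work (assumed)


def shed_threshold(utilization: float) -> Priority:
    """Return the least important tier still admitted at this utilization.

    The cut-offs are illustrative assumptions, not values from the article.
    """
    if utilization < 0.7:
        return Priority.BEST_EFFORT  # plenty of headroom: admit everything
    if utilization < 0.9:
        return Priority.DEGRADED     # getting busy: drop best-effort work
    return Priority.CRITICAL         # near saturation: critical traffic only


def admit(priority: Priority, utilization: float) -> bool:
    """Admit a request only if its tier is at or above the current cut-off."""
    return priority <= shed_threshold(utilization)


if __name__ == "__main__":
    for load in (0.5, 0.8, 0.95):
        admitted = [p.name for p in Priority if admit(p, load)]
        print(f"utilization={load:.2f} admits {admitted}")
```

Feeding `utilization` from a live measurement is what would make the policy adapt to unpredictable load. A fixed per-tier rate limit would not react to a surge in the same way.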

Key Learnings

1. Real-time recommendations for live events require a two-phase approach to manage data prefetching and dynamic updates effectively.
2. Balancing constraints such as request throughput and compute cardinality is crucial for optimizing system performance during peak loads.
3. Implementing adaptive traffic prioritization can help manage unexpected surges in demand, ensuring critical updates are delivered reliably.
4. Jittering cache expiration times can smooth out traffic spikes, preventing system overload during high-traffic events.
5. A robust pub/sub architecture is essential for minimizing latency and managing communication between services and devices.
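The cache-expiration jitter in the fourth learning can be sketched as follows. This is a generic illustration of the technique, assuming a uniform jitter of plus or minus 20% around a base TTL. The function name `jittered_ttl` and its parameters are hypothetical and do not come from the article.

```python
import random
from typing import Optional


def jittered_ttl(
    base_ttl_s: float,
    jitter_fraction: float = 0.2,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a TTL drawn uniformly from base_ttl_s * (1 +/- jitter_fraction).

    Spreading expirations over a window keeps entries that were cached at
    the same moment from all expiring, and being refetched, at once.
    """
    if base_ttl_s <= 0:
        raise ValueError("base_ttl_s must be positive")
    if not 0 <= jitter_fraction < 1:
        raise ValueError("jitter_fraction must be in [0, 1)")
    source = rng if rng is not None else random
    offset = source.uniform(-jitter_fraction, jitter_fraction)
    return base_ttl_s * (1 + offset)


if __name__ == "__main__":
    # Five clients caching the same item get different expiry times.
    rng = random.Random(42)
    for client in range(5):
        print(f"client {client}: ttl = {jittered_ttl(300.0, 0.2, rng):.1f}s")
```

With a 300-second base and 20% jitter, expirations land between 240 and 360 seconds, so refetches from clients that cached at the same time are spread over a two-minute window.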

Who Should Read This

Senior Distributed Systems Engineers designing scalable architectures for real-time data delivery in high-traffic environments.

Test Your Knowledge

What are the trade-offs between prefetching data and real-time broadcasting in the context of live event recommendations?

How does the system ensure high availability and reliability during peak loads without overwhelming cloud services?

What design decisions were made to handle the thundering herd problem, and why were they necessary?

In what scenarios might the adaptive traffic prioritization strategy fail, and how could those failures be mitigated?

How does the use of a GraphQL schema enhance the efficiency of device queries and broadcast payloads?

Topics

Backpressure, High Availability, Load Shedding, Service Discovery, Traffic Management
