Embedding-based Retrieval with Two-Tower Models in Spotlight
Summary
The article details Snap's implementation of an embedding-based retrieval (EBR) system for its Spotlight video platform, utilizing a two-tower model architecture to enhance video recommendations based on user interests and engagement history. It describes the challenges faced in real-time content retrieval and the optimization strategies employed, including the use of dense and sparse features, ResNet-style neural networks, and advanced training techniques like cosine annealing. The system is designed to efficiently generate user and story embeddings, ensuring low-latency responses while maintaining high personalization in content delivery.
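The two-tower idea in the summary can be sketched in a few lines. This is a toy illustration, not Snap's implementation: the single-layer `tower` function, the weight matrices, the feature vectors, and the story ids are all hypothetical, and a brute-force scan stands in for whatever retrieval index the production system uses. It shows story embeddings being computed ahead of time from story features only, and a user embedding being computed at request time and matched against them by dot product.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def tower(features, weights):
    """Hypothetical one-layer tower: linear projection, then L2 normalization."""
    projected = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return l2_normalize(projected)

def retrieve_top_k(user_embedding, story_index, k):
    """Score every indexed story by dot product and return the k best story ids."""
    scored = [
        (sum(u * s for u, s in zip(user_embedding, emb)), story_id)
        for story_id, emb in story_index.items()
    ]
    scored.sort(reverse=True)
    return [story_id for _, story_id in scored[:k]]

# Toy weights (assumed for illustration): each tower sees only its own features.
USER_WEIGHTS = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
STORY_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]

# Ahead of time: embed every story from story features alone.
story_features = {
    "story_a": [1.0, 0.0],
    "story_b": [0.0, 1.0],
    "story_c": [1.0, 1.0],
}
story_index = {sid: tower(f, STORY_WEIGHTS) for sid, f in story_features.items()}

# At request time: embed the user once, then look up the nearest stories.
user_embedding = tower([0.0, 1.0, 1.0], USER_WEIGHTS)
top_stories = retrieve_top_k(user_embedding, story_index, k=2)
```

Because neither tower reads the other's features, the story side can be indexed in advance and only the user tower has to run per request, which is what keeps the lookup cheap.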
Key Learnings
1. The two-tower model architecture allows scalable and flexible generation of user and story embeddings.
2. Combining dense and sparse features gives a richer representation of user interests, which improves recommendation accuracy.
3. Training with the Adam optimizer and a cosine-annealing learning-rate schedule speeds up convergence and improves model performance.
4. In-batch negative sampling lets the model learn from user-story combinations within each training batch, improving retrieval quality.
5. Separating feed processing from the retrieval service improves scalability and lets the system handle multiple request types efficiently.
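In-batch negative sampling, mentioned in the list above, is commonly paired with a softmax cross-entropy loss over the batch's user-story similarity matrix. The sketch below assumes that pairing; the summary does not state which loss the article uses, and the function name and the `temperature` parameter are illustrative. For each user in the batch, the story that user engaged with is the positive, and every other story in the same batch is treated as a negative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def in_batch_softmax_loss(user_embs, story_embs, temperature=1.0):
    """Mean cross-entropy where user i's positive is story i and all other
    stories in the batch act as negatives (assumed loss, for illustration)."""
    losses = []
    for i, user in enumerate(user_embs):
        logits = [dot(user, story) / temperature for story in story_embs]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch: two stories, and users whose embeddings either match their own
# positive story (aligned) or match the other user's story (swapped).
stories = [[1.0, 0.0], [0.0, 1.0]]
aligned_users = [[1.0, 0.0], [0.0, 1.0]]
swapped_users = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = in_batch_softmax_loss(aligned_users, stories)
swapped_loss = in_batch_softmax_loss(swapped_users, stories)
```

The loss is lower when each user embedding sits closer to its own story than to the other stories in the batch, so minimizing it pulls positives together and pushes in-batch negatives apart without a separate negative-mining step.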
Who Should Read This
Senior Machine Learning Engineers developing scalable recommendation systems using embedding techniques.
Test Your Knowledge
What are the trade-offs between using dense and sparse features in the two-tower model architecture?
How does the cosine annealing technique contribute to the training efficiency of the model?
What failure scenarios could arise from the embedding generation process, and how can they be mitigated?
Why is it important to keep user and story features independent in the two-tower model?
How does the implementation of in-batch negative sampling affect the model's learning process?
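As background for the cosine annealing question, the standard schedule decays the learning rate along half a cosine wave from a maximum to a minimum value. The sketch below implements that textbook formula for a single cycle without restarts; the step count and learning-rate bounds are made-up values, not settings reported in the article.

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate at `step` on one cosine decay from lr_max down to lr_min."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

# Hypothetical settings: 100 steps, decaying from 1e-3 to 1e-5.
schedule = [
    cosine_annealing_lr(s, total_steps=100, lr_max=1e-3, lr_min=1e-5)
    for s in range(101)
]
```

The rate changes slowly near the start and end of training and fastest in the middle, so early steps keep a large learning rate while the final steps make small adjustments.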