Snap (Snapchat)
8 min read

Embedding-based Retrieval with Two-Tower Models in Spotlight

Summary

The article describes Snap's embedding-based retrieval (EBR) system for its Spotlight video platform, which uses a two-tower model to recommend videos based on user interests and engagement history. It covers the challenges of real-time content retrieval and the optimization strategies employed, including combined dense and sparse features, ResNet-style neural networks, and training techniques such as cosine annealing of the learning rate. The system is designed to generate user and story embeddings efficiently, keeping response latency low while maintaining a high degree of personalization.
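As a rough illustration of the cosine annealing mentioned in the summary, the sketch below computes a learning rate that decays from a maximum to a minimum along half a cosine wave. The function name `cosine_annealing_lr` and the default values for `lr_max` and `lr_min` are assumptions made for this example; the summary does not give Snap's actual schedule or hyperparameters.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    """Learning rate at `step` for a single cosine decay from lr_max to lr_min.

    Hypothetical helper: the default rates are placeholders, not values
    taken from the article.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay at the endpoints.
    progress = min(max(step, 0), total_steps) / total_steps
    # Half a cosine period: the factor goes from 1 (start) to 0 (end).
    cosine_factor = 0.5 * (1.0 + math.cos(math.pi * progress))
    return lr_min + (lr_max - lr_min) * cosine_factor


if __name__ == "__main__":
    for step in (0, 25, 50, 75, 100):
        print(step, cosine_annealing_lr(step, total_steps=100))
```

The rate falls slowly at first, fastest in the middle of training, and slowly again near the end, which is the usual motivation for preferring this shape over a linear or step decay.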

Key Learnings

  1. The two-tower architecture allows scalable and flexible embedding generation for users and stories.
  2. Combining dense and sparse features gives a richer representation of user interests, which improves recommendation accuracy.
  3. Training with the Adam optimizer and cosine annealing speeds up convergence and improves model performance.
  4. In-batch negative sampling helps the model learn effectively from user-story combinations, improving retrieval quality.
  5. Separating feed processing from the retrieval service improves scalability and allows multiple request types to be handled efficiently.
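To make the in-batch negative sampling idea concrete, here is a minimal sketch of one common formulation: each user embedding is scored against every story embedding in the batch, the story paired with that user is the positive, and all other stories in the batch act as negatives under a softmax cross-entropy loss. The L2 normalization, the `temperature` parameter, and all function names are assumptions for illustration; the summary does not specify the exact loss Snap uses.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale v to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def in_batch_softmax_loss(user_embs, story_embs, temperature=0.1):
    """Mean cross-entropy where story i is the positive for user i and
    every other story in the batch serves as a negative.

    Assumed formulation: normalization and temperature are illustrative
    choices, not details confirmed by the article.
    """
    if len(user_embs) != len(story_embs) or not user_embs:
        raise ValueError("need equally sized, non-empty batches")
    users = [l2_normalize(u) for u in user_embs]
    stories = [l2_normalize(s) for s in story_embs]
    total = 0.0
    for i, user in enumerate(users):
        # Similarity of this user to every story in the batch.
        logits = [dot(user, story) / temperature for story in stories]
        # Numerically stable log-sum-exp over the batch.
        max_logit = max(logits)
        log_z = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
        # Negative log-probability of the positive (diagonal) story.
        total += log_z - logits[i]
    return total / len(users)


if __name__ == "__main__":
    users = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    mismatched = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:   ", in_batch_softmax_loss(users, matched))
    print("mismatched pairs:", in_batch_softmax_loss(users, mismatched))
```

Because the negatives come from the same batch, no separate negative-mining step is needed: the loss is small when each user is closest to its own story and large when another story in the batch scores higher.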

Who Should Read This

Senior Machine Learning Engineers developing scalable recommendation systems using embedding techniques.

Test Your Knowledge

  • What are the trade-offs between using dense and sparse features in the two-tower model architecture?
  • How does the cosine annealing technique contribute to the training efficiency of the model?
  • What failure scenarios could arise from the embedding generation process, and how can they be mitigated?
  • Why is it important to keep user and story features independent in the two-tower model?
  • How does the implementation of in-batch negative sampling affect the model's learning process?

Read Full Article at Snap (Snapchat)