Identify User Journeys at Pinterest
Summary
The article outlines how Pinterest identifies user journeys with machine learning in order to improve its recommendation systems. It defines a user journey as a sequence of interactions that reflects a user's intent and context, going beyond immediate interests. The authors describe an engineering philosophy of starting small with high-quality datasets and relying on pretrained models for efficiency. The system architecture combines dynamic keyword extraction and clustering, journey naming with LLMs, and a ranking model that keeps journey recommendations diverse and relevant. The article reports significant improvements in user engagement from journey-aware notifications.
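As a rough illustration of the keyword clustering step, the sketch below groups keywords whose embeddings point in similar directions. Everything in it is an assumption made for illustration: the summary does not say which clustering algorithm or embedding model Pinterest uses, so the greedy centroid method, the `cluster_keywords` helper, the 0.8 threshold, and the three-dimensional toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_keywords(embeddings, threshold=0.8):
    """Greedy single-pass clustering (an illustrative choice, not Pinterest's).

    Each keyword joins the existing cluster whose centroid is most similar,
    provided the similarity reaches `threshold`; otherwise it starts a new one.
    """
    clusters = []  # each entry: {"keywords": [...], "centroid": [...]}
    for keyword, vec in embeddings.items():
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"keywords": [keyword], "centroid": list(vec)})
        else:
            n = len(best["keywords"])
            # Update the running mean of the cluster's member vectors.
            best["centroid"] = [
                (mean * n + v) / (n + 1) for mean, v in zip(best["centroid"], vec)
            ]
            best["keywords"].append(keyword)
    return [cluster["keywords"] for cluster in clusters]


# Hypothetical toy embeddings; real ones would come from a pretrained model.
TOY_EMBEDDINGS = {
    "kitchen remodel": [0.9, 0.1, 0.0],
    "cabinet paint colors": [0.8, 0.2, 0.1],
    "wedding dress": [0.0, 0.9, 0.2],
    "bridal bouquet": [0.1, 0.95, 0.1],
    "sourdough starter": [0.1, 0.1, 0.9],
}

journeys = cluster_keywords(TOY_EMBEDDINGS)
for group in journeys:
    print(group)
```

Each resulting group is a candidate journey that a later step could name and rank. Because the keywords come from user activity rather than a fixed taxonomy, new groups can appear without any schema change.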
Key Learnings
- User journeys are defined as sequences of interactions that reveal user intent and context, enabling more personalized recommendations.
- Dynamic keyword extraction allows for greater flexibility and adaptability in identifying user journeys compared to predefined taxonomies.
- Leveraging pretrained models and LLMs enhances the efficiency and effectiveness of the journey identification process.
- Journey ranking and diversification strategies are crucial to prevent monotony in recommendations and ensure relevance.
- The integration of LLMs in journey naming and expansion can significantly improve personalization and user experience.
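The ranking and diversification point can be illustrated with a small greedy re-ranker. This is a sketch under stated assumptions, not the article's method: the summary does not describe how the diversifier works, so the `diversify` helper, the per-topic `penalty` factor, and the candidate scores and topic labels below are hypothetical.

```python
def diversify(candidates, k=3, penalty=0.5):
    """Greedily pick up to k journeys, discounting topics already chosen.

    `candidates` is a list of (name, relevance, topic) tuples. A candidate's
    effective score is relevance * penalty ** (times its topic was selected),
    so repeated topics become progressively less attractive.
    """
    selected = []
    remaining = list(candidates)
    topic_counts = {}
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda c: c[1] * penalty ** topic_counts.get(c[2], 0),
        )
        selected.append(best[0])
        topic_counts[best[2]] = topic_counts.get(best[2], 0) + 1
        remaining.remove(best)
    return selected


# Hypothetical ranked journeys: (name, relevance score, topic).
CANDIDATES = [
    ("Kitchen remodel ideas", 0.90, "home"),
    ("Cabinet paint colors", 0.85, "home"),
    ("Backyard patio plans", 0.80, "home"),
    ("Wedding dress shopping", 0.60, "wedding"),
    ("Sourdough baking", 0.50, "food"),
]

top = diversify(CANDIDATES)
print(top)
```

Sorting by relevance alone would return three "home" journeys here. With the penalty, the second and third slots go to other topics, which is the kind of monotony the diversification step is meant to avoid.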
Who Should Read This
Senior Machine Learning Engineers designing adaptive recommendation systems using user interaction data
Test Your Knowledge
What are the trade-offs between using a predefined journey taxonomy versus dynamic keyword extraction in user journey identification?
How does the choice of clustering algorithm impact the accuracy of journey extraction?
What failure scenarios might arise from relying solely on user engagement data for journey classification?
Why is it important to balance personalization and simplicity in journey naming, and how can LLMs assist in this process?
How does the implementation of a diversifier in the journey ranking model affect user engagement metrics?