My Starter Project on the Lyft Rider Data Science Team

Summary

The article outlines a data science project undertaken by a new hire at Lyft, focusing on the Rider Experience Score (RES) tool to analyze the long-term effects of rider experiences on retention. It discusses the challenges of estimating causal effects without A/B testing and introduces the Augmented Inverse Propensity Score Weighting (AIPW) methodology to mitigate bias in observational data. The author shares insights gained from working with internal data sources, selecting confounders, and collaborating with colleagues to enhance the RES pipeline, ultimately contributing to improved rider experiences.
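To make the AIPW idea concrete, here is a minimal sketch of the standard estimator for an average treatment effect. It is not the RES pipeline itself: the function names, the toy data, and the stratum-based nuisance estimates are all assumptions made for illustration. In practice the propensity scores and outcome predictions would come from fitted machine learning models. The sketch combines an outcome-model prediction with an inverse-propensity-weighted correction of that model's residuals, so the estimate stays consistent if either the propensity model or the outcome model is correct.

```python
from statistics import mean


def naive_difference(y, t):
    """Difference in mean outcome between treated and untreated units."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    return mean(treated) - mean(control)


def aipw_ate(y, t, e_hat, mu1_hat, mu0_hat, clip=0.01):
    """AIPW estimate of the average treatment effect.

    y        observed outcomes
    t        binary treatment indicators (0 or 1)
    e_hat    estimated propensity scores P(T=1 | X)
    mu1_hat  predicted outcomes under treatment, E[Y | T=1, X]
    mu0_hat  predicted outcomes under control,   E[Y | T=0, X]
    clip     bound that keeps propensities away from 0 and 1 (an assumed default)
    """
    scores = []
    for yi, ti, ei, m1, m0 in zip(y, t, e_hat, mu1_hat, mu0_hat):
        ei = min(max(ei, clip), 1 - clip)
        # Outcome-model contrast plus weighted residual corrections.
        score = (m1 - m0) + ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1 - ei)
        scores.append(score)
    return mean(scores)


# Hypothetical toy data: one binary confounder x drives both treatment and outcome.
# Outcomes follow y = 1 * t + 2 * x, so the true effect is 1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
t = [1, 0, 0, 0, 1, 1, 1, 0]
y = [1, 0, 0, 0, 3, 3, 3, 2]

# Nuisance estimates computed per stratum of x (stand-ins for fitted models).
e_hat = [0.25 if xi == 0 else 0.75 for xi in x]
mu1_hat = [1 if xi == 0 else 3 for xi in x]
mu0_hat = [0 if xi == 0 else 2 for xi in x]

naive = naive_difference(y, t)
aipw = aipw_ate(y, t, e_hat, mu1_hat, mu0_hat)

# Deliberately wrong outcome model (all zeros) with a correct propensity model.
zeros = [0] * len(y)
aipw_bad_outcome_model = aipw_ate(y, t, e_hat, zeros, zeros)
```

On this toy data the naive comparison overstates the effect because units with x = 1 are both more likely to be treated and have higher outcomes, while the AIPW estimate recovers the true effect even when the outcome model is replaced with zeros.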

Key Learnings

1. Understanding the limitations of A/B testing for long-term effect measurement and the necessity of alternative methodologies like AIPW.
2. The importance of identifying and managing confounders in causal inference to avoid biased estimates.
3. How to leverage machine learning models to estimate treatment effects in complex scenarios.
4. The role of collaboration and internal resources in enhancing data science projects within an organization.

Who Should Read This

Data scientists with experience in causal inference who want to see how these methods are applied in a real-world setting.

Test Your Knowledge

What are the trade-offs between using A/B testing and observational data for estimating causal effects?

How does the AIPW methodology address the issue of selection bias in observational studies?

What challenges might arise when selecting confounders, and how can they impact the validity of causal estimates?

Why is it important to model complex non-linear relationships in causal inference, and how can machine learning facilitate this?

What steps can be taken to ensure that the treatment effect estimation is robust to errors in model assumptions?

Topics

Causal Inference, Machine Learning, Data Governance, Data Quality

