Building a Next-Generation Key-Value Store at Airbnb
Summary
The article describes the complete rearchitecture of Airbnb's storage engine, Mussel, from version 1 to version 2. It covers the problems with the original architecture, such as operational complexity and consistency limitations, and the solutions built on the new NewSQL backend. Key features of Mussel v2 include dynamic range sharding, a stateless Dispatcher service, and a migration strategy that uses Kafka for data consistency. The article also walks through the migration itself, emphasizing the blue/green strategy and dual-write mechanisms that ensured zero data loss and no downtime during the transition.
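As a rough illustration of the dual-write idea mentioned above, the sketch below writes every update to both an old and a new store, reads from one designated primary, and compares against the other as a shadow check before cutting reads over. It is not Airbnb's implementation: the class `DualWriteStore`, its methods, and the in-memory dictionaries standing in for Mussel v1 and v2 are all hypothetical, and the Kafka-based replication described in the article is omitted entirely.

```python
class DualWriteStore:
    """Hypothetical dual-write wrapper; plain dicts stand in for the v1 and v2 backends."""

    def __init__(self, v1: dict, v2: dict, read_from: str = "v1"):
        self.v1 = v1
        self.v2 = v2
        self.read_from = read_from   # which backend currently serves reads
        self.mismatches = []         # keys where the two backends disagreed on read

    def put(self, key, value):
        # Dual write: every update goes to both backends.
        self.v1[key] = value
        self.v2[key] = value

    def get(self, key):
        if self.read_from == "v1":
            primary, shadow = self.v1, self.v2
        else:
            primary, shadow = self.v2, self.v1
        value = primary.get(key)
        # Shadow read: record disagreements instead of failing the request.
        if shadow.get(key) != value:
            self.mismatches.append(key)
        return value

    def cut_over(self):
        # Switch reads to v2 only if no disagreement has been observed.
        if self.mismatches:
            raise RuntimeError(f"unresolved mismatches: {self.mismatches}")
        self.read_from = "v2"


store = DualWriteStore(v1={}, v2={})
store.put("listing:42", "available")
store.get("listing:42")   # served by v1, compared against v2
store.cut_over()          # reads now come from v2
```

In this simplified form, a key that existed in v1 before dual writes began would show up as a mismatch until it is backfilled into v2, which is the kind of gap a real migration has to close before switching traffic.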
Key Learnings
1. Mussel v2 addresses operational complexity by leveraging Kubernetes for automated deployments, significantly reducing manual overhead.
2. Dynamic range sharding in v2 mitigates latency spikes and improves performance for large datasets compared to static hash partitioning in v1.
3. The migration strategy employed a blue/green rollout with dual writes, allowing for seamless data transition and consistency checks without impacting service availability.
4. Kafka plays a critical role in maintaining data consistency during migration, serving as a reliable replication log.
5. The architecture of Mussel v2 integrates features of various systems, providing a scalable and efficient solution for handling both real-time and bulk data workloads.
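To make the contrast between static hash partitioning and dynamic range sharding concrete, here is a minimal sketch under stated assumptions. The function `hash_partition` and the class `RangeShardMap` are hypothetical names chosen for illustration; the summary does not describe how Mussel actually routes keys or decides when to split a shard.

```python
import bisect
import hashlib


def hash_partition(key: str, num_partitions: int) -> int:
    """Static hash partitioning: a key's partition is fixed by its hash and the partition count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class RangeShardMap:
    """Range sharding: sorted split points divide the key space into contiguous shards.

    Shard 0 holds keys below the first split point; shard i holds keys in
    [split_points[i-1], split_points[i]); the last shard holds the rest.
    """

    def __init__(self, split_points=None):
        self.split_points = sorted(split_points or [])

    def shard_for(self, key: str) -> int:
        # Number of split points <= key gives the index of the owning shard.
        return bisect.bisect_right(self.split_points, key)

    def split(self, at_key: str) -> None:
        # Adding a split point divides one range in two (e.g. when it grows hot or large).
        if at_key not in self.split_points:
            bisect.insort(self.split_points, at_key)

    def num_shards(self) -> int:
        return len(self.split_points) + 1


shards = RangeShardMap(["m"])
shards.shard_for("apple")   # shard 0
shards.shard_for("zebra")   # shard 1
shards.split("t")           # the upper range is now two shards
shards.shard_for("zebra")   # shard 2
```

The design point this illustrates is that splitting a range only affects keys inside that range, whereas changing `num_partitions` in the hashed scheme can remap keys across every partition.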
Who Should Read This
Senior Database Engineers implementing scalable key-value stores and managing complex data migrations.
Test Your Knowledge
What are the key operational challenges that Mussel v1 faced, and how does v2 address them?
How does the dynamic range sharding in Mussel v2 improve performance over the static hash partitioning used in v1?
What specific features were introduced to handle write conflicts during the migration from v1 to v2?
How does the blue/green migration strategy ensure zero downtime and data loss during the transition?
In what ways does Kafka enhance the reliability of the migration process, and what role does it play in the architecture of Mussel v2?