Next Gen Data Processing at Massive Scale At Pinterest With Moka (Part 2 of 2)

Summary

The article discusses Pinterest's transition from a Hadoop-based data processing platform to Moka, a next-generation system designed for massive-scale data processing. It highlights the deployment of Moka on AWS Elastic Kubernetes Service (EKS), detailing the use of Terraform for infrastructure management and the implementation of a comprehensive logging and observability framework. The article also covers the challenges and solutions in managing container images and ensuring effective metrics collection and analysis for operational efficiency.

Key Learnings

1. Moka's deployment on AWS EKS is structured into multiple environments (test, dev, staging, production) to ensure isolation and security.
2. The logging infrastructure leverages Fluent Bit for efficient log management, enabling the aggregation of Spark application logs and system pod logs in Amazon S3.
3. Observability is enhanced through a combination of Prometheus and OpenTelemetry, allowing for detailed insights into the performance of EKS clusters.
4. The article emphasizes the importance of containerization in Moka, ensuring full isolation and compatibility across different architectures (Intel and ARM).
5. The use of Terraform modules facilitates a modular and reusable approach to infrastructure as code, streamlining the deployment process.
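
To make the logging point concrete, here is a minimal Python sketch of how log records might be split between Spark application logs and system pod logs under per-environment S3 prefixes. Only the environment names and the two log categories come from the summary. The prefix layout, the LogRecord type, and the s3_prefix function are hypothetical; the summary does not describe Moka's actual S3 key layout or its Fluent Bit configuration.

```python
from dataclasses import dataclass
from typing import Optional

# Environment names are taken from the summary; everything else is assumed.
ENVIRONMENTS = ("test", "dev", "staging", "production")


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical description of where a log line came from."""
    environment: str                     # one of ENVIRONMENTS
    namespace: str                       # Kubernetes namespace of the pod
    pod: str                             # pod name
    spark_app_id: Optional[str] = None   # set only for Spark driver/executor pods


def s3_prefix(record: LogRecord) -> str:
    """Return an illustrative S3 key prefix for a log record.

    The layout is an assumption made for this sketch, not Moka's real one.
    """
    if record.environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {record.environment}")
    if record.spark_app_id is not None:
        # Spark application logs are grouped by application ID.
        return f"{record.environment}/spark/{record.spark_app_id}/{record.pod}/"
    # Anything without a Spark application ID is treated as a system pod log.
    return f"{record.environment}/system/{record.namespace}/{record.pod}/"


if __name__ == "__main__":
    print(s3_prefix(LogRecord("dev", "spark-jobs", "exec-1", "app-123")))
    print(s3_prefix(LogRecord("production", "kube-system", "coredns-abc")))
```

Keeping the environment as the leading path component mirrors the per-environment isolation described in the first learning, so access to each environment's logs could be scoped by prefix.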

Who Should Read This

Senior Data Engineers implementing scalable data processing solutions on cloud platforms like AWS

Test Your Knowledge

What are the trade-offs of using AWS EKS for deploying Moka compared to traditional Hadoop clusters?

How does Fluent Bit enhance the logging capabilities of Spark applications running on Moka?

What design decisions were made to ensure observability in the Moka platform, and what challenges did they address?

In what ways does the containerization strategy in Moka differ from the previous Monarch platform, and why is this significant?

How does the architecture of Moka support scalability and reliability in data processing workloads?

Topics

AWS, Terraform, Kubernetes, Logging, Observability
