Engineering posts about Data Lake

Curated summaries and key learnings for engineers working with Data Lake.

Databricks
6m

How World Bank Group uses Databricks to eradicate poverty through shared knowledge

The World Bank Group has developed a unified data and AI platform on Databricks to integrate structured operational data with unstructured documents, thereby eliminating manual research bottlenecks....

Databricks
7m

Scaling for MHHS: how Octopus Energy achieved a 50x cost reduction in margin data engineering

The article discusses the significant data engineering challenges faced by Octopus Energy as the UK transitions to a Market-wide Half-Hourly Settlement (MHHS) model, which increases the frequency of...

Databricks
7m

Unlock seamless and cost-effective marketing campaigns with Lakebase

The article discusses the implementation and benefits of Lakebase, an architecture that combines the advantages of transactional databases with the flexibility of data lakes. It highlights the...

Databricks
10m

How to Build Real-Time Fraud Detection using Spark Real-Time Mode and Lakebase

This article discusses the implementation of a real-time fraud detection system leveraging Apache Spark's Real-Time Mode (RTM) and Lakebase on the Databricks platform. It highlights the challenges of...

Airbnb
8m

Scaling Airbnb’s identity graph with a unified knowledge graph infrastructure

The article outlines Airbnb's shift from a PaaS model to an internally managed knowledge graph infrastructure, focusing on the identity graph that captures user relationships. It details the...

Databricks
4m

Announcing the Databricks analytics engineer learning pathway

The Databricks Analytics Engineer Learning Pathway is designed to equip SQL practitioners with the skills necessary to transform raw data into governed, AI-ready semantic models and metrics. The...

Databricks
8m

Backstage with Lakebase, part 2

In this second part of the series, the article discusses the integration of Backstage with Databricks Lakebase, emphasizing the transformation of database management from a complex, multi-service...

Databricks
6m

Expanded interoperability with Unity Catalog Open APIs

The article elaborates on the advancements brought by Unity Catalog's Open APIs, which enhance interoperability in data management by allowing enterprises to maintain a single copy of data while...

Databricks
8m

Clinical operations intelligence belongs on the Lakehouse

The article presents the Site Feasibility Workbench, an open-source application designed to enhance clinical operations intelligence by integrating data, models, and applications within a single...

Databricks
11m

The Rosetta stone of CPS: Claroty’s AI-powered library

The article presents Claroty's AI-Powered CPS Library, a groundbreaking solution designed to address the identity crisis in Cyber-Physical Systems (CPS). It highlights the challenges faced by...

Databricks
10m

Data quality is the AI strategy

The article emphasizes the critical role of data quality in leveraging AI effectively within healthcare systems. It highlights NYU Langone Health's strategic approach to data management, where the...

Databricks
9m

How CFOs in consulting can recover margin with Databricks

The article outlines the financial challenges faced by consulting firms, particularly in managing data across disparate systems, which leads to inefficiencies and margin pressures. It emphasizes the...

Databricks
11m

The Rise of Sports Intelligence: How the Lakehouse Turns Tracking Data into Competitive Advantage

The article explores the transformative impact of the Databricks Data Intelligence Platform on professional sports through the integration of vast amounts of tracking and biomechanical data. It...

AWS
4m

Amazon Redshift introduces AWS Graviton-based RG instances with an integrated data lake query engine

Amazon Redshift has launched RG instances powered by AWS Graviton, enhancing performance for data warehouse workloads and integrating a data lake query engine. This new instance type offers up to...

Meta (Facebook)
11m

Migrating Data Ingestion Systems at Meta Scale

The article outlines the comprehensive migration of Meta's data ingestion system, which was essential for maintaining the efficiency and reliability of their social graph data processing. It details...

Databricks
6m

Growth Analytics Is What Comes After Growth Hacking

The article explores the evolution of growth analytics as a critical component in modern user acquisition strategies. It highlights the shift from tactical growth hacking to a more analytical...

Databricks
3m

Why telecom churn prediction misses the intervention window

The article explores the challenges faced by telecommunications companies in effectively predicting and intervening in customer churn. Despite the sophistication of churn propensity models,...

Databricks
3m

Operating room utilization is hiding in your scheduling data

The article highlights the critical importance of operating room (OR) utilization in healthcare systems, emphasizing that underutilized ORs represent significant revenue losses and unmet patient...

Databricks
3m

Energy trading analytics in a real-time market

The article highlights the challenges faced in energy trading analytics due to the fast-paced nature of price changes and the limitations of traditional batch processing methods. It emphasizes the...

Databricks
6m

Peril Predicts: Precision Payouts for a Volatile World

The article explores the implementation of parametric insurance, which automates payouts based on predefined conditions triggered by objective event data. It highlights the role of modern catastrophe...