Engineering posts about Transformers

Curated summaries and key learnings for engineers working with Transformers.

How We Built DigitalOcean Inference Router

This article details the development and functionality of DigitalOcean's Inference Router, a system designed to optimize AI model selection based on specific task requirements. It highlights the...

Google

Announcing Genkit Middleware: Intercept, extend, and harden your agentic apps

The article introduces Genkit, an open-source framework designed for building full-stack, AI-powered applications across multiple programming languages, including TypeScript, Go, Dart, and Python. It...

Databricks

MCP Marketplace Brings Real-Time Intelligence to Agentic Applications

The MCP Marketplace serves as a pivotal platform for integrating real-time intelligence into agentic applications, allowing them to leverage external data sources to enhance decision-making...

Databricks

How Superhuman and Databricks built a 200K QPS inference platform together

The article describes the collaboration between Superhuman and Databricks in developing a high-performance inference platform capable of handling over 200,000 queries per second (QPS) with stringent...

Apple

SpecMD: A Comprehensive Study on Speculative Expert Prefetching

The article presents SpecMD, a standardized framework designed for benchmarking caching strategies in Mixture-of-Experts (MoE) models. It highlights the importance of an expert caching mechanism to...

Databricks

Databricks partners with OpenAI on GPT-5.5

Databricks has announced a partnership with OpenAI to leverage GPT-5.5, the latest iteration of OpenAI's frontier model, which significantly enhances capabilities in enterprise tasks, including complex...

Cloudflare

26m

Orchestrating AI Code Review at scale

The article discusses the implementation of an AI-driven code review system at Cloudflare, addressing the inefficiencies of traditional code review processes. By leveraging a composable plugin...

Cloudflare

Cloudflare’s AI Platform: an inference layer designed for agents

Cloudflare’s AI Platform introduces an inference layer that allows developers to access multiple AI models from various providers through a unified API. This platform addresses the challenges of...

AWS

Introducing Anthropic’s Claude Opus 4.7 model in Amazon Bedrock

The article introduces Anthropic's Claude Opus 4.7 model, which is integrated into Amazon Bedrock, emphasizing its advancements in coding, long-running tasks, and professional workflows. The model...

Google

Subagents have arrived in Gemini CLI

The article introduces subagents in the Gemini CLI, a feature that allows the CLI to delegate complex tasks to specialized agents. Each subagent operates independently with its own context, tools,...

Figma

10m

Turning prompts into five scalable workflows with Figma Weave

The article introduces Figma Weave, a platform that combines generative AI with professional editing tools to enhance media production workflows. It outlines five scalable workflows that leverage AI...

Meta (Facebook)

17m

KernelEvolve: How Meta’s Ranking Engineer Agent Optimizes AI Infrastructure

KernelEvolve is an advanced AI system developed by Meta to optimize the performance of machine learning models across diverse hardware platforms, including NVIDIA and AMD GPUs, as well as Meta's...

Google

ADK Go 1.0 Arrives!

ADK Go 1.0 introduces significant enhancements for developing AI agents, emphasizing observability, security, and extensibility. Key features include native integration with OpenTelemetry for tracing...

Google

Closing the knowledge gap with agent skills

The article explores the limitations of large language models (LLMs) in keeping up with rapidly changing software engineering practices and introduces agent skills as a solution to close the...

Databricks

Databricks recognized as a Gartner® Peer Insights™ Customers’ Choice for Analytics and BI

Databricks has been recognized as a Customers’ Choice in the Gartner Peer Insights Voice of the Customer for Analytics and Business Intelligence Platforms, achieving a high customer rating of 4.8 out...

Cloudflare

11m

Powering the agents: Workers AI now runs large models, starting with Kimi K2.5

The article introduces Cloudflare's Workers AI platform, which now supports the Kimi K2.5 model, a large language model (LLM) designed for agentic tasks. It highlights the importance of a robust...

Dropbox

13m

How we optimized Dash's relevance judge with DSPy

The article details how Dropbox Dash optimized its relevance judging system using DSPy, an open-source framework designed for systematic prompt optimization. It highlights the challenges faced when...

Snap (Snapchat)

12m

Engineering posts about Transformers

How We Built DigitalOcean Inference Router

Building the Spatial Interaction and Interface Frameworks for Specs

DigitalOcean Gradient™ AI Platform Now Integrates with LlamaIndex

Run Multiple OpenClaw AI Agents with Elastic Scaling and Safe Defaults — without Managing Infrastructure