Engineering posts about Transformers
Curated summaries and key learnings for engineers working with Transformers.
How We Built DigitalOcean Inference Router
This article details the development and functionality of DigitalOcean's Inference Router, a system designed to optimize AI model selection based on specific task requirements. It highlights the...
Announcing Genkit Middleware: Intercept, extend, and harden your agentic apps
The article introduces Genkit, an open-source framework designed for building full-stack, AI-powered applications across multiple programming languages, including TypeScript, Go, Dart, and Python. It...
MCP Marketplace Brings Real-Time Intelligence to Agentic Applications
The MCP Marketplace is a platform for integrating real-time intelligence into agentic applications, allowing them to leverage external data sources to enhance decision-making...
How Superhuman and Databricks built a 200K QPS inference platform together
The article describes the collaboration between Superhuman and Databricks in developing a high-performance inference platform capable of handling over 200,000 queries per second (QPS) with stringent...
SpecMD: A Comprehensive Study on Speculative Expert Prefetching
The article presents SpecMD, a standardized framework designed for benchmarking caching strategies in Mixture-of-Experts (MoE) models. It highlights the importance of an expert caching mechanism to...
Databricks partners with OpenAI on GPT-5.5
Databricks has announced a partnership with OpenAI to leverage GPT-5.5, the latest iteration of OpenAI's frontier model, which significantly enhances capabilities in enterprise tasks, including complex...
Orchestrating AI Code Review at scale
The article discusses the implementation of an AI-driven code review system at Cloudflare, addressing the inefficiencies of traditional code review processes. By leveraging a composable plugin...
Cloudflare’s AI Platform: an inference layer designed for agents
Cloudflare’s AI Platform introduces an inference layer that allows developers to access multiple AI models from various providers through a unified API. This platform addresses the challenges of...
Introducing Anthropic’s Claude Opus 4.7 model in Amazon Bedrock
The article introduces Anthropic's Claude Opus 4.7 model in Amazon Bedrock and emphasizes its advancements in coding, long-running tasks, and professional workflows. The model...
Subagents have arrived in Gemini CLI
The article introduces subagents in the Gemini CLI, a feature that allows the CLI to delegate complex tasks to specialized agents. Each subagent operates independently with its own context, tools,...
Turning prompts into five scalable workflows with Figma Weave
The article introduces Figma Weave, a platform that combines generative AI with professional editing tools to enhance media production workflows. It outlines five scalable workflows that leverage AI...
KernelEvolve: How Meta’s Ranking Engineer Agent Optimizes AI Infrastructure
KernelEvolve is an advanced AI system developed by Meta to optimize the performance of machine learning models across diverse hardware platforms, including NVIDIA and AMD GPUs, as well as Meta's...
ADK Go 1.0 Arrives!
ADK Go 1.0 introduces significant enhancements for developing AI agents, emphasizing observability, security, and extensibility. Key features include native integration with OpenTelemetry for tracing...
Closing the knowledge gap with agent skills
The article explores the limitations of large language models (LLMs) in keeping up with rapidly changing software engineering practices and introduces agent skills as a solution to close the...
Databricks recognized as a Gartner® Peer Insights™ Customers’ Choice for Analytics and BI
Databricks has been recognized as a Customers’ Choice in the Gartner Peer Insights Voice of the Customer for Analytics and Business Intelligence Platforms, achieving a high customer rating of 4.8 out...
Powering the agents: Workers AI now runs large models, starting with Kimi K2.5
The article introduces Cloudflare's Workers AI platform, which now supports the Kimi K2.5 model, a large language model (LLM) designed for agentic tasks. It highlights the importance of a robust...
How we optimized Dash's relevance judge with DSPy
The article details how Dropbox Dash optimized its relevance judging system using DSPy, an open-source framework designed for systematic prompt optimization. It highlights the challenges faced when...
Building the Spatial Interaction and Interface Frameworks for Specs
The article provides an in-depth exploration of the Spectacles Interaction Kit (SIK) and Spectacles UI Kit (UIKit), two frameworks designed for building spatial interactions and interfaces in...
DigitalOcean Gradient™ AI Platform Now Integrates with LlamaIndex
DigitalOcean has announced the integration of its Gradient AI Platform with LlamaIndex, a framework designed for building Retrieval-Augmented Generation (RAG) applications. This integration allows...
Run Multiple OpenClaw AI Agents with Elastic Scaling and Safe Defaults — without Managing Infrastructure
The article discusses the deployment of OpenClaw, an open-source framework for building AI assistants, on DigitalOcean's App Platform. It highlights the challenges of managing multiple AI agents in...