Engineering posts about Transformers

Curated summaries and key learnings for engineers working with Transformers.

DigitalOcean
17m

How We Built DigitalOcean Inference Router

This article details the development and functionality of DigitalOcean's Inference Router, a system designed to optimize AI model selection based on specific task requirements. It highlights the...

Google
5m

Announcing Genkit Middleware: Intercept, extend, and harden your agentic apps

The article introduces Genkit, an open-source framework designed for building full-stack, AI-powered applications across multiple programming languages, including TypeScript, Go, Dart, and Python. It...

Databricks
6m

MCP Marketplace Brings Real-Time Intelligence to Agentic Applications

The MCP Marketplace is a platform for integrating real-time intelligence into agentic applications, letting them draw on external data sources to improve decision-making...

Databricks
9m

How Superhuman and Databricks built a 200K QPS inference platform together

The article describes the collaboration between Superhuman and Databricks in developing a high-performance inference platform capable of handling over 200,000 queries per second (QPS) with stringent...

Apple
3m

SpecMD: A Comprehensive Study on Speculative Expert Prefetching

The article presents SpecMD, a standardized framework designed for benchmarking caching strategies in Mixture-of-Experts (MoE) models. It highlights the importance of an expert caching mechanism to...

Databricks
3m

Databricks partners with OpenAI on GPT-5.5

Databricks has announced a partnership with OpenAI to leverage GPT-5.5, the latest iteration of OpenAI's frontier model, which significantly enhances capabilities in enterprise tasks, including complex...

Cloudflare
26m

Orchestrating AI Code Review at scale

The article discusses the implementation of an AI-driven code review system at Cloudflare, addressing the inefficiencies of traditional code review processes. By leveraging a composable plugin...

Cloudflare
8m

Cloudflare’s AI Platform: an inference layer designed for agents

Cloudflare’s AI Platform introduces an inference layer that allows developers to access multiple AI models from various providers through a unified API. This platform addresses the challenges of...

AWS
5m

Introducing Anthropic’s Claude Opus 4.7 model in Amazon Bedrock

The article introduces Anthropic's Claude Opus 4.7 model, which is integrated into Amazon Bedrock, emphasizing its advancements in coding, long-running tasks, and professional workflows. The model...

Google
6m

Subagents have arrived in Gemini CLI

The article introduces subagents in the Gemini CLI, a feature that allows the CLI to delegate complex tasks to specialized agents. Each subagent operates independently with its own context, tools,...

Figma
10m

Turning prompts into five scalable workflows with Figma Weave

The article introduces Figma Weave, a platform that combines generative AI with professional editing tools to enhance media production workflows. It outlines five scalable workflows that leverage AI...

Meta (Facebook)
17m

KernelEvolve: How Meta’s Ranking Engineer Agent Optimizes AI Infrastructure

KernelEvolve is an advanced AI system developed by Meta to optimize the performance of machine learning models across diverse hardware platforms, including NVIDIA and AMD GPUs, as well as Meta's...

Google
4m

ADK Go 1.0 Arrives!

ADK Go 1.0 introduces significant enhancements for developing AI agents, emphasizing observability, security, and extensibility. Key features include native integration with OpenTelemetry for tracing...

Google
4m

Closing the knowledge gap with agent skills

The article explores the limitations of large language models (LLMs) in keeping up with rapidly changing software engineering practices and introduces agent skills as a solution to close the...

Databricks
7m

Databricks recognized as a Gartner® Peer Insights™ Customers’ Choice for Analytics and BI

Databricks has been recognized as a Customers’ Choice in the Gartner Peer Insights Voice of the Customer for Analytics and Business Intelligence Platforms, achieving a high customer rating of 4.8 out...

Cloudflare
11m

Powering the agents: Workers AI now runs large models, starting with Kimi K2.5

The article introduces Cloudflare's Workers AI platform, which now supports the Kimi K2.5 model, a large language model (LLM) designed for agentic tasks. It highlights the importance of a robust...

Dropbox
13m

How we optimized Dash's relevance judge with DSPy

The article details how Dropbox Dash optimized its relevance judging system using DSPy, an open-source framework designed for systematic prompt optimization. It highlights the challenges faced when...

Snap (Snapchat)
12m

Building the Spatial Interaction and Interface Frameworks for Specs

The article provides an in-depth exploration of the Spectacles Interaction Kit (SIK) and Spectacles UI Kit (UIKit), two frameworks designed for building spatial interactions and interfaces in...

DigitalOcean
3m

DigitalOcean Gradient™ AI Platform Now Integrates with LlamaIndex

DigitalOcean has announced the integration of its Gradient AI Platform with LlamaIndex, a framework designed for building Retrieval-Augmented Generation (RAG) applications. This integration allows...

DigitalOcean
7m

Run Multiple OpenClaw AI Agents with Elastic Scaling and Safe Defaults — without Managing Infrastructure

The article discusses the deployment of OpenClaw, an open-source framework for building AI assistants, on DigitalOcean's App Platform. It highlights the challenges of managing multiple AI agents in...