Engineering posts about PyTorch
Curated summaries and key learnings for engineers working with PyTorch.
Google Tensor SDK Beta with LiteRT
The Google Tensor ML SDK has transitioned from an Experimental Access Program to Beta, enabling developers to leverage the capabilities of the Google Tensor System-on-Chip (SoC) and its dedicated...
Accelerating on-device AI: A look at Arm and Google AI Edge optimization
The article explores advancements in on-device AI through the integration of Arm's Scalable Matrix Extension 2 (SME2) and Google's AI Edge framework. It highlights how SME2 enhances CPU performance...
Announcing Genkit Middleware: Intercept, extend, and harden your agentic apps
The article introduces Genkit, an open-source framework designed for building full-stack, AI-powered applications across multiple programming languages, including TypeScript, Go, Dart, and Python. It...
Speeding Up AI: Bringing Google Colossus to PyTorch via GCSFS and Rapid Bucket
This article announces a significant performance enhancement for AI/ML workloads within the PyTorch ecosystem on Google Cloud, achieved through the integration of Rapid Storage powered by Google's...
How we built the most performant DeepSeek V3.2, MiniMax-M2.5 and Qwen 3.5 397B on DigitalOcean Serverless Inference
The article discusses the launch of DeepSeek V3.2, MiniMax-M2.5, and Qwen 3.5 397B on DigitalOcean's Serverless Inference platform, highlighting their performance benchmarks and the engineering...
Mastering the 600B+ Frontier: Optimizing Large Model Deployments on the Inference Cloud
The article explores the challenges and solutions associated with deploying large AI models, particularly those exceeding 600 billion parameters, in cloud environments. It highlights the importance...
TorchTPU: Running PyTorch Natively on TPUs at Google Scale
The article discusses TorchTPU, an integration that allows PyTorch to run natively on Google's Tensor Processing Units (TPUs). It emphasizes the challenges of modern AI infrastructure and the need...
ADK Go 1.0 Arrives!
ADK Go 1.0 introduces significant enhancements for developing AI agents, emphasizing observability, security, and extensibility. Key features include native integration with OpenTelemetry for tracing...
Powering the agents: Workers AI now runs large models, starting with Kimi K2.5
The article introduces Cloudflare's Workers AI platform, which now supports the Kimi K2.5 model, a large language model (LLM) designed for agentic tasks. It highlights the importance of a robust...
Introducing AI Runtime: Scalable, Serverless NVIDIA GPUs on Databricks for Training and Finetuning
The article introduces AI Runtime, a new offering from Databricks that enables scalable, serverless access to NVIDIA GPUs for training and fine-tuning various AI models, including computer vision and...
Announcing the Colab MCP Server: Connect Any AI Agent to Google Colab
The article introduces the Colab MCP Server, an open-source tool designed to enhance the integration of AI agents with Google Colab. By allowing any MCP-compatible agent to access Colab's cloud...
DigitalOcean at NVIDIA GTC 2026: Building the AI Factory for the Agentic Era
DigitalOcean is positioning itself as a leader in AI infrastructure by launching an AI Factory designed for dynamic, long-running agentic workflows. The partnership with NVIDIA aims to enhance...
Building the Spatial Interaction and Interface Frameworks for Specs
The article provides an in-depth exploration of the Spectacles Interaction Kit (SIK) and Spectacles UI Kit (UIKit), two frameworks designed for building spatial interactions and interfaces in...
A Declarative Standard for AI Agents
The article outlines the development of a declarative standard for AI agents, inspired by Kubernetes, to address the fragmentation and inefficiencies in agent implementation across different...
What's new in TensorFlow 2.21
TensorFlow 2.21 introduces significant enhancements, particularly with the LiteRT stack, which is designed for high-performance on-device inference. This new runtime offers improved GPU performance,...
Supercharge your AI agents: The New ADK Integrations Ecosystem
The article introduces significant enhancements to the Agent Development Kit (ADK), an open-source framework designed for building and deploying AI agents. It highlights new integrations with various...
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers
The article introduces depyf, a tool designed to demystify the PyTorch compiler, which operates at the Python bytecode level. This tool allows machine learning researchers to decompile bytecode...
DigitalOcean Gradient™ AI GPU Droplets Optimized for Inference: Increasing Throughput at Lower the Cost
The article discusses the development of DigitalOcean's Inference Optimized Image for GPU Droplets, specifically designed to enhance the performance of large language model (LLM) inference. It...
Run Multiple OpenClaw AI Agents with Elastic Scaling and Safe Defaults — without Managing Infrastructure
The article discusses the deployment of OpenClaw, an open-source framework for building AI assistants, on DigitalOcean's App Platform. It highlights the challenges of managing multiple AI agents in...
LiteRT: The Universal Framework for On-Device AI
LiteRT is a modern on-device AI framework that builds upon the foundations of TensorFlow Lite, offering significant enhancements in performance, simplicity, and flexibility for deploying AI models...