Engineering posts about Prompt Engineering
Curated summaries and key learnings for engineers working with Prompt Engineering.
Accelerating LLM Inference with Prompt Caching for Open-Source Models on Databricks
The article outlines the significance of prompt caching in accelerating inference for large language models (LLMs) on Databricks. It explains how repeated prompts can lead to inefficiencies in...
How to safeguard AI workloads with Unity AI Gateway Guardrails
The article outlines the importance of implementing guardrails in AI applications to protect sensitive information and ensure compliance with security standards. It details how Unity AI Gateway...
Databricks context engineer associate: the industry’s first certification for reliable AI agent systems
The article introduces the Databricks Certified Context Engineer Associate certification, the first of its kind aimed at enhancing the reliability of AI agent systems through effective context...
The JavaScript AI Build-a-thon Season 2 starts today!
The JavaScript AI Build-a-thon is a comprehensive program aimed at bridging the gap in AI development for JavaScript and TypeScript developers. Spanning four weeks, the event includes self-paced...
Securing MCP: A Control Plane for Agent Tool Execution
The Model Context Protocol (MCP) is emerging as a standard for AI agents to access tools, but it lacks governance mechanisms to ensure secure execution. This article outlines the risks associated...
Amazon Bedrock introduces new advanced prompt optimization and migration tool
Amazon Bedrock has introduced an advanced prompt optimization tool that allows users to enhance their prompts for various models simultaneously. This tool facilitates migration to new models or...
What the design-to-code loop unlocks
The article explores the evolving relationship between design and code facilitated by AI technologies, particularly within the Figma platform. It emphasizes how AI is transforming traditional...
Build Long-running AI agents that pause, resume, and never lose context with ADK
This article presents a comprehensive guide to building long-running AI agents that can pause, resume, and maintain context using the Agent Development Kit (ADK). It highlights the limitations of...
Generative AI for Business: A Complete Strategy and Implementation Guide
The article discusses the transformative potential of generative AI in business, highlighting its ability to create significant economic value across various sectors. It emphasizes the importance of...
LLM Vs AI: A Practical Guide to Differences, Use Cases, and Tools
This article serves as a comprehensive guide to understanding the distinctions between large language models (LLMs) and the broader field of artificial intelligence (AI). It outlines the scope, core...
PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning
The article introduces PORTool, an importance-aware policy optimization algorithm designed for multi-tool-integrated reasoning in agents powered by large language models (LLMs). It addresses the...
Building with Gemini Embedding 2: Agentic multimodal RAG and beyond
The article introduces Gemini Embedding 2, a multimodal embedding model that integrates various data types, including text, images, video, and audio, into a unified embedding space. This model...
AI App Development: Guide To Building AI-Powered Apps
This article serves as a detailed guide for developers looking to build AI-powered applications, emphasizing the importance of structured planning and execution. It outlines the phases of AI app...
How to transform document activation workflows with Genie and Agent Bricks
The article outlines the challenges organizations face in managing document workflows, emphasizing the need for a unified data foundation to leverage AI effectively. It introduces Databricks'...
Production-Ready AI Agents: 5 Lessons from Refactoring a Monolith
The article outlines the challenges of developing production-ready AI agents, particularly focusing on the transition from monolithic architectures to orchestrated sub-agents. It details a case study...
Orchestrating AI Code Review at scale
The article discusses the implementation of an AI-driven code review system at Cloudflare, addressing the inefficiencies of traditional code review processes. By leveraging a composable plugin...
Introducing Genie Agent Mode
The article introduces Agent mode, a new feature in Genie that enhances data analysis capabilities by allowing users to ask complex questions and receive meaningful insights. Agent mode operates...
Building the foundation for running extra-large language models
The article discusses the foundational work required to run extra-large language models (LLMs) effectively, particularly focusing on Cloudflare's Workers AI platform. It highlights the challenges of...
Load Balancing and Scaling LLM Serving
The article explores the unique challenges of load balancing in large language model (LLM) serving, emphasizing the importance of prompt caching to optimize resource utilization and reduce latency....
The TL;DR on MCP: Why context matters and how to put it to work
The article introduces the Model Context Protocol (MCP), a framework designed to enhance the integration of design systems with AI tools, particularly in the context of product development. MCP...