Engineering posts about Reinforcement Learning
Curated summaries and key learnings for engineers working with Reinforcement Learning.
Making User-Sequence Data More Cost-Efficient, Faster, and Easier to Use
This article discusses the redesign of a user-sequence platform aimed at improving the efficiency, speed, and usability of user data for machine learning applications. It addresses the challenges...
How Salesforce Built an AI Security Agent for Autonomous Threat Triage
The article outlines how Salesforce developed the SATA agent, an AI-driven system designed to enhance cybersecurity by autonomously triaging threats across complex environments. It highlights the...
From manual to autonomous: how AI agents are transforming electric grid operations
The electric utility industry is facing unprecedented operational challenges due to increasing demand and aging infrastructure, necessitating the adoption of AI agents to enhance grid reliability and...
Creating a Multi-Tenant AI Agent Platform Handling 7K+ Sessions Without Cross-Team Interference
The article outlines the development of the Bring Your Own Planner (BYOP), a multi-tenant AI agent platform designed to enhance team autonomy and scalability within Salesforce. It addresses the...
The JavaScript AI Build-a-thon Season 2 starts today!
The JavaScript AI Build-a-thon is a comprehensive program aimed at bridging the gap in AI development for JavaScript and TypeScript developers. Spanning four weeks, the event includes self-paced...
Reel Friends: Building Social Discovery that Scales to Billions
In the Meta Tech Podcast episode featuring Pascal Hartig, the engineering intricacies behind the 'Friend Bubbles' feature of Facebook Reels are explored. The discussion highlights the evolution of...
Build Long-running AI agents that pause, resume, and never lose context with ADK
This article presents a comprehensive guide to building long-running AI agents that can pause, resume, and maintain context using the Agent Development Kit (ADK). It highlights the limitations of...
Enhancing Ad Relevance: Integrating Real-Time Context into Sequential Recommender Models
The article presents a novel approach to enhancing ad relevance by integrating real-time context into sequential recommender models. It highlights the limitations of previous models that relied...
PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning
The article introduces PORTool, an importance-aware policy optimization algorithm designed for multi-tool-integrated reasoning in agents powered by large language models (LLMs). It addresses the...
Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents
The article introduces the concept of a Reinforced Agent that enhances tool-calling agents by incorporating inference-time feedback. This approach aims to address the limitations of traditional...
How AI-Driven Kubernetes Optimization Reclaimed Millions from 47% Idle Capacity
The article discusses Salesforce's challenges with infrastructure scaling on its Hyperforce platform, particularly regarding over-provisioning and idle capacity in Kubernetes services. It introduces...
DSO: Direct Steering Optimization for Bias Mitigation
The article presents Direct Steering Optimization (DSO), a novel approach aimed at mitigating bias in vision-language models (VLMs) and large language models (LLMs). It highlights the challenges...
From Clicks to Conversions: Architecting Shopping Conversion Candidate Generation at Pinterest
The article discusses Pinterest's development of a shopping conversion candidate generation model aimed at optimizing offsite conversion events, which are typically sparse and noisy. It details the...
Beyond the Abyss: Project Poseidon’s Quest for Zero-Downtime Reliability
The article outlines the development of Project Poseidon, a predictive monitoring system designed to enhance reliability in large-scale cloud environments by leveraging machine learning and...
ParaRNN: Large-Scale Nonlinear RNNs, Trainable in Parallel
The article presents ParaRNN, a novel framework developed by Apple researchers that significantly enhances the training efficiency of Recurrent Neural Networks (RNNs) by enabling parallelization....
How to transform document activation workflows with Genie and Agent Bricks
The article outlines the challenges organizations face in managing document workflows, emphasizing the need for a unified data foundation to leverage AI effectively. It introduces Databricks'...
Are LLM agents good at join order optimization?
This article explores using large language models (LLMs) to improve join order optimization in SQL queries, a long-standing challenge in database management. Traditional...
Production-Ready AI Agents: 5 Lessons from Refactoring a Monolith
The article outlines the challenges of developing production-ready AI agents, particularly focusing on the transition from monolithic architectures to orchestrated sub-agents. It details a case study...
Agents that remember: introducing Agent Memory
The article introduces Agent Memory, a managed service designed to enhance AI agents by providing them with persistent memory capabilities. This service addresses the challenge of context management...
International Conference on Learning Representations (ICLR) 2026
The International Conference on Learning Representations (ICLR) 2026 showcases significant advancements in deep learning research, with Apple presenting multiple papers and technical demos. The...