Building an AI Agent to Remove Feature Flags
Summary
The article describes how Duolingo built an AI agent that automates the removal of feature flags, using Temporal for workflow orchestration and Codex CLI for AI-driven code changes. The agent starts a workflow that interacts with GitHub to clone a repository, run the removal task, and open a pull request. Key design considerations include how the work is divided into Temporal activities so that the cloned code stays accessible to the steps that need it, and the difficulties caused by Codex CLI's output format. The article also emphasizes how quickly the agent was developed and the room for future improvements to its capabilities.
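One common way to cope with free-form agent output, as a minimal sketch rather than the article's actual approach: assume the prompt asks the agent to end its response with a single JSON object, then pull that object out of the surrounding text. The function name `extract_trailing_json` and the keys `title` and `body` are hypothetical and chosen only for illustration.

```python
import json


def extract_trailing_json(agent_output: str, required_keys=("title", "body")) -> dict:
    """Return the last JSON object in the text that has all required keys.

    Assumption (not from the article): the prompt instructed the agent to
    finish its response with one JSON object containing these keys.
    """
    decoder = json.JSONDecoder()
    # Scan "{" positions from right to left so the last matching object wins.
    start = agent_output.rfind("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(agent_output, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and set(required_keys).issubset(obj):
            return obj
        start = agent_output.rfind("{", 0, start)
    raise ValueError("no JSON object with the required keys found")


if __name__ == "__main__":
    sample = 'Removed the flag.\n{"title": "Remove flag", "body": "Deletes {old} path"}\n'
    print(extract_trailing_json(sample)["title"])
```

Raising an error when nothing parses, instead of returning a default, lets the surrounding workflow treat malformed output as a failed attempt and try again.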
Key Learnings
1. Utilizing Temporal for orchestrating workflows allows for easy local testing and strong retry logic, essential for handling AI non-determinism.
2. Codex CLI can be effectively used for agentic operations, although its current limitations in output formatting require careful prompt engineering.
3. The design pattern of separating tasks into activities is crucial for maintaining code accessibility within Temporal's architecture.
4. Rapid prototyping and iterative development can lead to quick successes in AI tool development, as demonstrated by the agent's deployment timeline.
5. Future improvements should focus on integrating testing frameworks to ensure the reliability of code changes made by the AI agent.
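To illustrate why retry logic matters for a non-deterministic step, here is a minimal sketch in plain Python. It is a stand-in for the idea only: Temporal provides its own configurable retry policies for activities, and this code does not use or imitate the Temporal SDK. The names `run_with_retries`, `step`, and `validate` are hypothetical.

```python
import time


def run_with_retries(step, validate, max_attempts=3,
                     initial_delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call step() until validate(result) is true or attempts run out.

    step:     zero-argument callable for the unreliable work (for example,
              one invocation of an AI agent).
    validate: returns True when the result is acceptable.
    sleep:    injectable so the backoff can be observed without waiting.
    """
    delay = initial_delay
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = step()
            if validate(result):
                return result
            last_error = ValueError(f"attempt {attempt}: result failed validation")
        except Exception as exc:  # treat any failure as retryable in this sketch
            last_error = exc
        if attempt < max_attempts:
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # exponential backoff
    raise RuntimeError(f"step failed after {max_attempts} attempts") from last_error


if __name__ == "__main__":
    outputs = iter(["", "", "diff --git a/flags.py b/flags.py"])
    # The first two attempts return empty output; the third succeeds.
    print(run_with_retries(lambda: next(outputs), bool, sleep=lambda d: None))
```

Validating the result, not just catching exceptions, matters here because an agent can exit cleanly while still producing unusable output.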
Who Should Read This
Senior Software Engineers specializing in AI tool development and workflow automation
Test Your Knowledge
What are the trade-offs of using Temporal for workflow orchestration compared to traditional CI/CD tools like Jenkins?
How does the separation of activities in Temporal impact the overall design and execution of the AI agent?
What failure scenarios could arise from using Codex CLI in its current form, and how might they be mitigated?
Why is it important to sandbox the Codex CLI operations, and what are the implications of bypassing approvals?
How can the integration of testing frameworks enhance the robustness of the AI agent's output before creating pull requests?