Pull request intervention for infrastructure-as-code risks with Bitbucket custom merge checks
Summary
The article describes how Atlassian mitigates infrastructure-as-code risks with Bitbucket custom merge checks. It stresses progressive deployments, which limit the impact of any single change, and the monitoring of key metrics to keep services reliable. Anomaly detection and a structured review process for pull requests surface potential issues early, reducing the likelihood of service disruptions. Building these checks into the developer workflow is meant to make deployments more reliable while adding as little friction as possible.
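To make the link between progressive deployment and anomaly detection concrete, here is a minimal sketch, not Atlassian's implementation. It assumes a rollout proceeds through percentage stages and that an error-rate metric is compared against a baseline with a simple z-score test. The names `is_anomalous`, `progressive_rollout`, and `read_error_rate`, the stage percentages, and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, stdev
from typing import Callable, Sequence


def is_anomalous(baseline: Sequence[float], observed: float,
                 z_threshold: float = 3.0) -> bool:
    """Return True if `observed` sits more than `z_threshold` standard
    deviations above the baseline mean (hypothetical detection rule)."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline samples
    if sigma == 0:
        # Flat baseline: any increase counts as anomalous.
        return observed > mu
    return (observed - mu) / sigma > z_threshold


def progressive_rollout(stages: Sequence[int],
                        baseline: Sequence[float],
                        read_error_rate: Callable[[int], float]) -> dict:
    """Walk through rollout stages (percent of traffic) and stop at the
    first stage whose error rate looks anomalous.

    `read_error_rate` is an assumed callback that returns the error rate
    observed after deploying to the given stage.
    """
    completed = []
    for stage in stages:
        observed = read_error_rate(stage)
        if is_anomalous(baseline, observed):
            # A real system would trigger an automated rollback here.
            return {"status": "rolled_back", "failed_stage": stage,
                    "completed": completed}
        completed.append(stage)
    return {"status": "deployed", "failed_stage": None,
            "completed": completed}
```

Because the check runs after every stage, a bad change is caught while it affects only a fraction of traffic, which is the "blast radius" argument in the summary above.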
Key Learnings
1. Implementing progressive deployments can significantly reduce the blast radius of changes in a cloud-native environment.
2. Custom merge checks can effectively identify and mitigate known risks associated with infrastructure-as-code changes before they are merged.
3. Anomaly detection is critical for monitoring the impact of changes and automating rollbacks in case of failures.
4. Understanding the trade-offs between risk management and developer experience is essential for maintaining a productive workflow.
5. Regular audits of known risks can help in refining deployment strategies and improving overall system reliability.
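The idea of checking a pull request against known risks before merge can be sketched as below. This is an illustrative model only: it does not use the Bitbucket custom merge check API, and the `KnownRisk` and `ChangedLine` types, the `evaluate_merge_check` function, and the example rule are all hypothetical. The summary does not say which risks Atlassian actually tracks.

```python
import re
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class KnownRisk:
    """A hypothetical known-risk rule: applies when both the file path
    and an added line match the given regular expressions."""
    name: str
    path_pattern: str
    line_pattern: str


@dataclass(frozen=True)
class ChangedLine:
    """One added line from a pull request diff."""
    path: str
    text: str


def evaluate_merge_check(changes: Iterable[ChangedLine],
                         risks: Iterable[KnownRisk]) -> dict:
    """Return a pass/fail result plus the list of matched risks."""
    risks = list(risks)  # allow multiple passes over the rules
    findings = []
    for change in changes:
        for risk in risks:
            if (re.search(risk.path_pattern, change.path)
                    and re.search(risk.line_pattern, change.text)):
                findings.append(f"{risk.name}: {change.path}")
    # The check passes only when no known risk matched.
    return {"success": not findings, "findings": findings}


# Example rule (an assumption, for illustration only): flag
# world-open CIDR ranges added to Terraform files.
EXAMPLE_RISKS = [
    KnownRisk(name="open-ingress",
              path_pattern=r"\.tf$",
              line_pattern=r"0\.0\.0\.0/0"),
]
```

A failing result would block the merge or route the pull request to additional review, while unrelated changes pass without extra friction, which reflects the trade-off between risk oversight and developer experience noted in the list above.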
Who Should Read This
Senior Site Reliability Engineers implementing infrastructure-as-code practices to enhance deployment reliability and manage risks effectively.
Test Your Knowledge
What are the key characteristics that define a 'Known Risk' in the context of infrastructure-as-code?
How does the integration of Bitbucket custom merge checks enhance the reliability of deployments at Atlassian?
What trade-offs must be considered when implementing progressive deployment strategies for less frequent changes?
In what ways can anomaly detection contribute to minimizing the impact of change-related incidents?
Why is it important to balance risk oversight with developer experience in a high-frequency deployment environment?