Pull request intervention for infrastructure-as-code risks with Bitbucket custom merge checks

Summary

The article describes how Atlassian uses Bitbucket custom merge checks to mitigate risks in infrastructure-as-code changes. It explains why progressive deployments matter for limiting the impact of a change, and why key metrics must be monitored to keep services reliable. Anomaly detection and a structured review process for pull requests surface potential issues early, which reduces the likelihood of service disruptions. Building these checks into the developer workflow is intended to make deployments more reliable while adding as little friction as possible to development.

Key Learnings

1. Implementing progressive deployments can significantly reduce the blast radius of changes in a cloud-native environment.
2. Custom merge checks can effectively identify and mitigate known risks associated with infrastructure-as-code changes before they are merged.
3. Anomaly detection is critical for monitoring the impact of changes and automating rollbacks in case of failures.
4. Understanding the trade-offs between risk management and developer experience is essential for maintaining a productive workflow.
5. Regular audits of known risks can help in refining deployment strategies and improving overall system reliability.
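
The second learning, catching known risks before a pull request is merged, can be illustrated with a minimal sketch. It is not Atlassian's implementation and does not use the Bitbucket API: the `KnownRisk` and `Finding` types, the `evaluate_merge_check` function, and the example rule are all hypothetical, and the sketch assumes the lines added by the pull request have already been extracted from its diff.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
import re


@dataclass(frozen=True)
class KnownRisk:
    """Hypothetical rule describing a risky infrastructure-as-code change."""
    risk_id: str
    description: str
    path_glob: str  # which files the rule applies to
    pattern: str    # regex searched for in each added line


@dataclass(frozen=True)
class Finding:
    """One added line that matched a known risk."""
    risk_id: str
    path: str
    line: str


def evaluate_merge_check(added_lines_by_path, known_risks):
    """Return (passed, findings) for the lines a pull request adds.

    added_lines_by_path maps a file path to the list of lines added to
    that file. Obtaining those lines from the pull request diff is out
    of scope for this sketch.
    """
    findings = []
    for path, lines in added_lines_by_path.items():
        for risk in known_risks:
            # Skip rules that do not apply to this file.
            if not fnmatch(path, risk.path_glob):
                continue
            regex = re.compile(risk.pattern)
            for line in lines:
                if regex.search(line):
                    findings.append(Finding(risk.risk_id, path, line.strip()))
    # The check passes only when no known risk was matched.
    return (not findings, findings)


# Hypothetical rule: flag configs that opt out of progressive rollout.
EXAMPLE_RISKS = [
    KnownRisk(
        risk_id="KR-1",
        description="Deployment strategy bypasses progressive rollout",
        path_glob="*.yaml",
        pattern=r"strategy:\s*all-at-once",
    )
]
```

A failing result would block the merge and show the findings to the author, who can then fix the change or request a review. Keeping the rules as data rather than code is one way to support the regular audits of known risks mentioned in the fifth learning.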

Who Should Read This

Senior Site Reliability Engineers implementing infrastructure-as-code practices to enhance deployment reliability and manage risks effectively.

Test Your Knowledge

What are the key characteristics that define a 'Known Risk' in the context of infrastructure-as-code?

How does the integration of Bitbucket custom merge checks enhance the reliability of deployments at Atlassian?

What trade-offs must be considered when implementing progressive deployment strategies for less frequent changes?

In what ways can anomaly detection contribute to minimizing the impact of change-related incidents?

Why is it important to balance risk oversight with developer experience in a high-frequency deployment environment?

Topics

Incident Management, Service Level Objectives, Resilience Engineering, Continuous Integration, Deployment

Read Full Article at Atlassian
