Atlassian
17 min read

Removing dependency tangles in the Atlassian Platform for increased reliability and recoverability

Summary

The article outlines Atlassian's Continuous PaaS Recovery (CPR) program, which aims to enhance platform reliability and recoverability by addressing complex service dependencies. It details the identification and elimination of circular dependencies and other architectural tangles that impede recovery efforts. The CPR initiative involved surveying service owners, categorizing dependencies, and implementing a layered architecture to minimize hard dependencies. By rearchitecting the platform and fostering a culture of dependency awareness, Atlassian has significantly improved its cloud resilience and operational practices.

Key Learnings

  1. Understanding the impact of circular dependencies on platform reliability and the necessity of addressing them for effective recovery.
  2. The importance of categorizing dependencies into hard and soft types to prioritize risk reduction efforts.
  3. Implementing a layered architecture to isolate dependencies and improve recoverability across services.
  4. The role of education and cultural shifts in minimizing future dependency tangles within engineering teams.
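
The first three learnings can be illustrated with a small sketch. It is not Atlassian's tooling: the service names, the layer numbers, and the rule that a hard dependency must point to a strictly lower layer are all assumptions made for illustration. The sketch tags each dependency as hard or soft, searches the hard-dependency graph for a cycle, and lists hard dependencies that break the assumed layering rule.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical services and layer numbers (lower number = more foundational).
LAYER: Dict[str, int] = {
    "compute": 0,
    "config-store": 1,
    "identity": 1,
    "deploy-service": 2,
}

# (consumer, provider, kind). "hard" means the consumer cannot start or
# recover without the provider; "soft" means it can run degraded.
Dependency = Tuple[str, str, str]
DEPENDENCIES: List[Dependency] = [
    ("deploy-service", "config-store", "hard"),
    ("config-store", "compute", "hard"),
    ("compute", "deploy-service", "hard"),   # closes a hard cycle
    ("identity", "deploy-service", "soft"),  # soft edge, ignored by the checks
]


def hard_graph(deps: List[Dependency]) -> Dict[str, Set[str]]:
    """Build an adjacency map that contains only the hard dependencies."""
    graph: Dict[str, Set[str]] = {}
    for consumer, provider, kind in deps:
        graph.setdefault(consumer, set())
        graph.setdefault(provider, set())
        if kind == "hard":
            graph[consumer].add(provider)
    return graph


def find_cycle(graph: Dict[str, Set[str]]) -> List[str]:
    """Return one cycle as a node list (first node repeated last), or []."""
    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}
    path: List[str] = []

    def visit(node: str) -> List[str]:
        state[node] = in_progress
        path.append(node)
        for nxt in sorted(graph[node]):
            if state[nxt] == in_progress:
                # Back edge: the cycle is the path from nxt to here.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == unvisited:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = done
        return []

    for node in sorted(graph):
        if state[node] == unvisited:
            found = visit(node)
            if found:
                return found
    return []


def layer_violations(deps: List[Dependency],
                     layer: Dict[str, int]) -> List[Tuple[str, str]]:
    """List hard dependencies whose provider is not in a lower layer."""
    return [(consumer, provider)
            for consumer, provider, kind in deps
            if kind == "hard" and layer[provider] >= layer[consumer]]


if __name__ == "__main__":
    print("hard cycle:", find_cycle(hard_graph(DEPENDENCIES)))
    print("layer violations:", layer_violations(DEPENDENCIES, LAYER))
```

In this toy data, the edge from `compute` up to `deploy-service` is both the edge that closes the cycle and the only layering violation. Reclassifying it as soft, for example by letting `compute` recover without the deploy service, clears both checks.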

Who Should Read This

Senior Platform Engineers focusing on enhancing service reliability and recoverability in cloud architectures.

Test Your Knowledge

What are the trade-offs involved in prioritizing hard dependencies over soft dependencies in a large-scale platform?

How does the layered architecture approach mitigate the risks associated with circular dependencies?

What specific strategies were employed to educate engineers about dependency risks and foster a culture of awareness?

In what scenarios might 'break glass' solutions be necessary, and how can they be integrated into normal operations?

What metrics or indicators can be used to assess the effectiveness of the CPR program in improving platform reliability?
