Why Your Architecture Is Falling Apart: Drift vs. Erosion Explained
A deep dive into an often-overlooked dimension of technical debt.
👋 Hi, this is Thomas, with a new issue of “Beyond Code: System Design and More”, where I geek out on all things system design, software architecture, distributed systems and… well, more.
QUOTE OF THE WEEK:
“When building a software system, engineers recognize three fundamental truths:
(1) All software has an architecture, whether it is designed and implemented intentionally, or it emerges by coincidence.
(2) System complexity grows over time as requirements shift, use cases evolve, team members come and go, and technologies advance.
(3) Without deliberate management of the system’s evolution, its architecture may stray from its original objectives and blueprint, leading to unintended architectural drift.” - Vladi Stevanovic
In my previous issue, I explored the real causes behind technical debt. Today, let’s focus on a specific category that often flies under the radar: architectural technical debt.
Architectural debt refers to compromises and trade-offs made in designing or evolving a system's architecture. Over time, these decisions compound, accruing “interest” in the form of increased complexity, reduced performance, scaling challenges, and maintainability issues.
For example, a decision to postpone implementing a modular design to save time may later require significant refactoring when new features need to be added.
There are two subcategories of architectural technical debt: architectural drift and architectural erosion.
Architectural Drift: A Gradual Deviation
Architectural drift refers to the gradual, unintentional deviation of a software system's architecture from its original design over time.
These deviations create inconsistencies like tangled dependencies, redundant components, or misaligned coding practices. As undocumented decisions accumulate, the system becomes harder to understand and maintain. And if teams rely on manual docs and diagrams, the architecture plans and documentation no longer accurately reflect the state of the system.
Ultimately, this erodes trust in the architecture and its documentation.
Example:
Imagine an e-commerce platform initially designed for online transactions. As new features like inventory or warehouse management are added under tight deadlines, the system accumulates ad-hoc fixes, shortcuts, and duplicate components. While these changes don’t directly conflict with the core design, they lead to inefficiencies that hinder scalability and maintenance.
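One way to make drift visible is to compare the documented dependency map with what the code actually does. The Python sketch below is only an illustration under stated assumptions: the component names (`checkout`, `inventory_v2`, and so on), both dictionaries, and the `find_drift` helper are hypothetical and not taken from any real platform or tool. In practice, the `actual` map would be extracted from imports, build files, or runtime traces rather than written by hand.

```python
# Hypothetical "documented" view: what the architecture diagram claims.
documented = {
    "checkout": {"payments", "catalog"},
    "catalog": set(),
    "payments": set(),
}

# Hypothetical "actual" view: what the code does today.
actual = {
    "checkout": {"payments", "catalog", "inventory"},
    "catalog": set(),
    "payments": set(),
    "inventory": {"catalog"},
    "inventory_v2": {"catalog"},  # a duplicate component added under deadline
}


def find_drift(documented, actual):
    """Report where the running system differs from its documentation."""
    # Components that exist in the code but appear in no diagram.
    undocumented_components = sorted(set(actual) - set(documented))
    # Dependencies present in the code but missing from the documentation.
    undocumented_dependencies = sorted(
        (source, target)
        for source, targets in actual.items()
        for target in targets - documented.get(source, set())
    )
    # Dependencies the documentation still lists but the code no longer has.
    stale_documentation = sorted(
        (source, target)
        for source, targets in documented.items()
        for target in targets - actual.get(source, set())
    )
    return {
        "undocumented_components": undocumented_components,
        "undocumented_dependencies": undocumented_dependencies,
        "stale_documentation": stale_documentation,
    }


report = find_drift(documented, actual)
for kind, items in report.items():
    print(kind, items)
```

None of the findings in this toy data break a core design rule, which is what makes them drift rather than erosion: the diagram is simply out of date, and a duplicate component has crept in.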
Architectural Erosion: Undermining the Foundations
Erosion is more severe than drift, as it directly compromises the architecture’s foundational principles. This happens when teams introduce tightly coupled modules, bypass security standards, or ignore scalability constraints, leading to a fragile, inefficient system.
Example:
A microservices-based platform might start with isolated services communicating via APIs 👇
But urgent feature requests lead to shortcuts, such as services directly accessing each other’s databases, breaking modularity and causing ripple effects across the system. 👇
These shortcuts violate the fundamental principle of isolation and lead to tightly coupled services, so changes in one service impact others. The result? Increased likelihood of system failures and bugs.
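Because erosion breaks an explicit rule, it can often be expressed as an automated check. The Python sketch below assumes a rule of "each database is accessed only by the service that owns it". The service and database names, the `DATABASE_OWNERS` map, the `observed_access` list, and the `find_isolation_violations` helper are all hypothetical. A real check would derive the access list from connection configs or query logs.

```python
# Hypothetical ownership map: each database belongs to exactly one service.
DATABASE_OWNERS = {
    "orders_db": "orders",
    "inventory_db": "inventory",
}

# Hypothetical observed access pairs: (service, database it connects to).
observed_access = [
    ("orders", "orders_db"),
    ("inventory", "inventory_db"),
    ("inventory", "orders_db"),  # the shortcut: reading another service's data
]


def find_isolation_violations(owners, accesses):
    """Return every (service, database) pair where the service is not the owner.

    Databases missing from the ownership map are flagged as well, since
    nobody is known to own them.
    """
    return [
        (service, database)
        for service, database in accesses
        if owners.get(database) != service
    ]


violations = find_isolation_violations(DATABASE_OWNERS, observed_access)
for service, database in violations:
    print(f"{service} accesses {database} directly; it should go through the owner's API")
```

Run in a build pipeline, a check like this would fail as soon as the shortcut is introduced, instead of leaving it to surface later as a ripple effect.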
In Short: The Difference Between Drift and Erosion
Drift: Gradual misalignment with the intended design, introducing inefficiencies but maintaining the system's core integrity.
Erosion: Direct violations of architectural principles, leading to structural fragility and significant risks.
Architectural drift and erosion are natural byproducts of system evolution. By understanding their distinctions and adopting proactive measures, engineering teams can mitigate their effects. This isn’t just about technical decisions; it’s about empowering teams to align short-term goals with long-term system health.
I originally wrote about this topic in this article:
I explored these topics:
What is technical debt?
Types of technical debt (e.g. Architectural debt, Code-level debt, Test debt, Documentation debt)
Real-world technical debt example
📚 Interesting Articles & Resources
10 Data Structures That Make Databases Fast and Scalable -
Scalable databases rely on foundational data structures like B-Trees for balanced indexing, LSM trees for write-heavy loads, and hash-based structures for efficient lookups. The article explains how these structures optimize read and write operations, enabling high performance for large-scale systems.
Predicting the Future of Distributed Systems - Colin Breck
This article explores the trajectory of distributed systems, emphasizing the growing complexity in managing state, consistency, and coordination as systems scale. It highlights trends like serverless computing, edge architectures, and AI's role in optimizing distributed workflows.
What’s an event and what should be inside? -
Events are the backbone of event-driven systems, encapsulating state changes in distributed architectures. This article breaks down event structure, covering metadata like timestamps and payloads, and emphasizes designing events to be idempotent, structured, and self-contained for robust system interactions.