Seven Key Distributed Systems Design Patterns
An overview of the Ambassador, Circuit Breaker, Pub/Sub, CQRS, Event Sourcing, Leader Election, and Sharding Patterns.
👋 Hi, this is Thomas, with a new issue of “Beyond Code: System Design and More”, where I geek out on all things system design, software architecture, distributed systems and… well, more.
As with many IT terms there is some ambiguity as to what exactly the term “design patterns” refers to. For example, it’s often used as a synonyms to “architecture styles”.
I’m opting to distinguish between these two terms and I define them as follows:
Architectural Design Styles are the overall structure and organization of a software system from a 10k feet view (i.e. monolithic architecture, microservices, etc.). This term answers the fundamental question of how the system components communicate, how data flows, and how the system is divided into modules or layers.
Architectural Design Patterns indicate proven, frequently used solutions to common problems related to data storage, messaging, system management, and compute capability. They allow developers to leverage best practices and avoid reinventing the wheel, streamlining the development process.
NB. In this article I focus on system architectural design patterns, which I consider different from software architectural design patterns (i.e. the latter are specific coding and implementation solutions for issues with the design of classes, objects, and their interactions).
Ambassador Pattern
The Ambassador design pattern utilizes a dedicated component, known as the ambassador, to act as a mediator for communication between services.
This pattern improves the system's resilience by implementing retry mechanisms and circuit breakers. The ambassador component can also handle tasks such as logging, monitoring, and retry handling, allowing individual services to focus on their core business logic. By employing the Ambassador pattern, developers can create more modular and maintainable systems.
Circuit Breaker Pattern
The Circuit Breaker pattern enhances fault tolerance and resilience by preventing failures from cascading through the system.
It monitors the number of failures within a given time frame and, if the number of failures exceeds a predefined threshold, redirects requests to a fallback mechanism for a specified timeout period before allowing requests to reach the system once again.
This pattern helps prevent the system from being overwhelmed by a high volume of failed requests, ensuring better stability and performance.
Publisher/Subscriber Pattern
The Publisher/Subscriber (Pub/Sub) pattern enables asynchronous communication between system components by decoupling message producers (publishers) from message consumers (subscribers).
Messages from publishers are routed and delivered to the appropriate subscribers via a message broker. This pattern promotes better scalability, flexibility, and modularity within distributed systems, making it easier to develop and maintain complex applications.
Command Query Responsibility Segregation (CQRS)
The CQRS pattern separates the read and write operations for data storage. The command model handles updates and writes, while the query model handles read operations.
By decoupling commands from queries, CQRS enables more robust and flexible system design, facilitates parallel development, and enhances performance, especially in complex and high-traffic applications.
Event Sourcing
Event Sourcing captures all changes to an application state as a sequence of events, rather than storing only the current state. Each event represents a state change and is stored immutably.
This pattern provides a detailed audit log and the ability to reconstruct past states. By replaying these events, the current state can be rebuilt at any point in time. Event Sourcing enhances traceability, simplifies debugging, and supports complex business processes with historical context.
Leader Election
The Leader Election pattern is used in distributed systems to designate a single node as the leader to coordinate tasks and manage state.
This ensures consistency and avoids conflicts. The leader node is dynamically chosen through an election process, which is typically triggered during system initialization or when the current leader fails.
This pattern improves fault tolerance and helps maintain system stability, as the elected leader can make authoritative decisions on behalf of the cluster.
Sharding
Sharding is a database partitioning technique that divides a large dataset into smaller, more manageable pieces called shards. Each shard is stored on a separate database server, which distributes the load and improves performance. This pattern enhances scalability and allows for parallel processing of queries. By evenly distributing data and queries across multiple servers, sharding minimizes response times and reduces the risk of bottlenecks, making it ideal for handling large-scale, high-traffic applications.
However, the data model must be able to support sharding and cross-shard operations can be more complex.
By integrating these design patterns into the planning and implementation phases of system development, developers can create more modular, maintainable, and resilient systems.
I originally wrote about this topic in this article:
I explored these topics:
Foundations of system analysis and design
System analysis phases and steps
System analysis and design patterns
Architecture evaluation
The CAP theorem
Best practices in system analysis
📚 Interesting Articles & Resources
“Distributed System Design Patterns” - ByteByteGo
6 minute video that covers the seven design patterns I listed above with visuals.
“Design Patterns” - Refactoring Guru
A complete list of software design patterns, which are reusable solutions to commonly occurring problems in software design- e.g. pre-made blueprints that you can customize to solve a recurring design problem in your code.
“5 Strategies for High-Availability Systems” - Saurabh Dashora
A well explained overview of Load Balancing, Data Redundancy with Isolation, Failover, Auto Scaling, and Rate Limiting.