Architecture Decision Record
ADR-0042: Adopt Event Sourcing for Order Management
This sample demonstrates ADR documentation: context, decision drivers, options analysis, consequences, and implementation guidance.
Metadata
- Status: Accepted
- Date: July 15, 2024
- Decision makers: Platform Architecture Team
- Consulted: Backend Team, Data Team, Product
- Informed: Engineering All-Hands
- Supersedes: ADR-0018 (CRUD-based order storage)
Context
Our order management system currently uses a traditional CRUD approach with a relational database. As the business has grown, we've encountered several pain points:
- Audit requirements: Compliance now requires a complete history of all order state changes, including who made them and when
- Debugging complexity: When orders end up in unexpected states, we lack visibility into how they got there
- Analytics limitations: Business intelligence needs access to historical order states, not just current state
- Integration challenges: Multiple systems need to react to order changes, leading to tightly coupled synchronous calls
- Temporal queries: Customer support frequently needs to see "what did this order look like at time X?"
Current Architecture
┌─────────────┐      ┌──────────────┐      ┌─────────────┐
│     API     │─────▶│    Order     │─────▶│ PostgreSQL  │
│   Gateway   │      │   Service    │      │  (orders)   │
└─────────────┘      └──────────────┘      └─────────────┘
                            │
                     ┌──────┴──────┐
                     ▼             ▼
               ┌──────────┐  ┌──────────┐
               │ Payments │  │ Inventory│
               │ Service  │  │ Service  │
               └──────────┘  └──────────┘
Decision Drivers
- DR-1: Must maintain complete, immutable audit trail for compliance (SOC 2, PCI-DSS)
- DR-2: Must support temporal queries for customer support workflows
- DR-3: Should enable loosely coupled integration with downstream services
- DR-4: Should improve debugging capabilities for order state issues
- DR-5: Must not significantly degrade write latency (<100ms p99)
- DR-6: Should leverage existing team expertise where possible
Options Considered
Option A: Enhanced Audit Logging
Add a separate audit_log table that records all changes to the orders table via database triggers.
- Pros: Minimal changes to existing code; team already familiar with approach
- Cons: Doesn't solve integration coupling; audit table becomes very large; temporal queries require complex joins; triggers add write latency
Option B: Change Data Capture (CDC)
Use Debezium to capture changes from the orders table and publish to Kafka for downstream consumers.
- Pros: Decouples downstream services; works with existing schema; operational tooling exists
- Cons: CDC events reflect database rows, not business events; temporal queries still difficult; adds operational complexity (Kafka, Debezium, Schema Registry)
Option C: Event Sourcing
Store order state as a sequence of immutable events. Derive current state by replaying events.
- Pros: Complete audit trail by design; natural temporal queries; business-meaningful events; enables event-driven architecture; supports CQRS for read optimization
- Cons: Paradigm shift for team; eventual consistency complexity; requires event store infrastructure; migration effort
Option D: Bi-temporal Database
Use a database with native bi-temporal support (e.g., PostgreSQL with temporal tables extension).
- Pros: SQL-based; temporal queries built-in; minimal code changes
- Cons: Still CRUD-based (loses business event semantics); doesn't solve integration coupling; limited ecosystem support
Decision
We will adopt Event Sourcing (Option C) for the Order Management domain.
This decision best addresses our core requirements around audit trails, temporal queries, and loose coupling, while also providing a foundation for future event-driven capabilities.
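The core mechanic of the chosen option can be sketched in a few lines of Python. This is an illustrative sketch only: the event types and state fields below are hypothetical, not the final Order aggregate design.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    event_type: str
    version: int
    payload: dict

@dataclass
class OrderState:
    status: str = "new"
    items: list = field(default_factory=list)

def apply(state: OrderState, event: Event) -> OrderState:
    # Each handler returns a new state; unknown event types are ignored,
    # which keeps replay tolerant of event types added later.
    if event.event_type == "OrderCreated":
        return OrderState(status="created", items=list(event.payload["items"]))
    if event.event_type == "OrderItemAdded":
        return OrderState(status=state.status, items=state.items + [event.payload["item"]])
    if event.event_type == "OrderShipped":
        return OrderState(status="shipped", items=state.items)
    return state

def replay(events: list[Event]) -> OrderState:
    # Current state is a left fold over the ordered event stream.
    state = OrderState()
    for event in sorted(events, key=lambda e: e.version):
        state = apply(state, event)
    return state
```

Replaying a prefix of the same stream yields the state as of any past version, which is what makes the temporal-query requirement (DR-2) cheap.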
Decision Matrix
                     Audit   Temporal   Loose      Debug     Latency   Team
                     Trail   Queries    Coupling   Support   Impact    Familiarity
─────────────────────────────────────────────────────────────────────────────────
A: Audit Logging       ◐        ○          ○          ◐         ◐          ●
B: CDC + Kafka         ◐        ○          ●          ◐         ●          ◐
C: Event Sourcing      ●        ●          ●          ●         ◐          ○
D: Bi-temporal DB      ●        ●          ○          ◐         ●          ◐
● Fully meets | ◐ Partially meets | ○ Does not meet
Architecture
Proposed Design
┌─────────────┐      ┌──────────────┐      ┌─────────────┐
│     API     │─────▶│    Order     │─────▶│ EventStore  │
│   Gateway   │      │   Service    │      │  (events)   │
└─────────────┘      └──────────────┘      └─────────────┘
                                                  │
                     ┌──────────────┐             │
                     │  Read Model  │◀────────────┘
                     │  Projector   │  (subscribes)
                     └──────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │  PostgreSQL  │
                     │ (read model) │
                     └──────────────┘

                     Event Bus (Kafka)
                            ▲
                            │ (publishes)
                            │
             ┌──────────────┼──────────────┐
             ▼              ▼              ▼
        ┌──────────┐   ┌──────────┐   ┌──────────┐
        │ Payments │   │ Inventory│   │ Analytics│
        │ Service  │   │ Service  │   │ Service  │
        └──────────┘   └──────────┘   └──────────┘
Event Schema
// Example events for an order lifecycle
{
  "eventId": "evt_abc123",
  "eventType": "OrderCreated",
  "aggregateId": "order_xyz789",
  "aggregateType": "Order",
  "version": 1,
  "timestamp": "2024-07-22T10:30:00Z",
  "payload": {
    "customerId": "cust_123",
    "items": [
      {"sku": "WIDGET-001", "quantity": 2, "price": 29.99}
    ],
    "shippingAddress": {...}
  },
  "metadata": {
    "correlationId": "req_abc",
    "causationId": null,
    "userId": "user_456"
  }
}
// Subsequent events
OrderItemAdded (version 2)
PaymentReceived (version 3)
OrderShipped (version 4)
OrderDelivered (version 5)
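The per-aggregate version field above doubles as an optimistic-concurrency check on writes: an append that names a stale expected version is rejected. The in-memory store below is a toy sketch of that check only; EventStoreDB provides an equivalent expected-revision guard on its append operation.

```python
import datetime

class ConcurrencyError(Exception):
    pass

class InMemoryEventStore:
    """Toy stand-in for a real event store, showing how the 'version'
    field rejects concurrent writers. Illustration only."""

    def __init__(self):
        self._streams: dict[str, list[dict]] = {}

    def append(self, aggregate_id: str, event: dict, expected_version: int) -> None:
        stream = self._streams.setdefault(aggregate_id, [])
        # Reject the write if another writer appended first.
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {len(stream)}")
        event = {**event,
                 "version": expected_version + 1,
                 "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat()}
        stream.append(event)

    def load(self, aggregate_id: str) -> list[dict]:
        return list(self._streams.get(aggregate_id, []))
```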
Consequences
Positive
- Complete audit trail: Every state change is recorded as an immutable event with timestamp and actor
- Temporal queries: "Show me this order as of last Tuesday" becomes trivial—replay events up to that point
- Loose coupling: Downstream services subscribe to events; no synchronous dependencies
- Debugging: Can replay events to understand exactly how an order reached its current state
- Business insight: Events capture business intent, not just data changes
- Future flexibility: Can build new read models, analytics, or integrations by replaying history
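The temporal-query consequence follows directly from the stream structure: select the prefix of events at or before the requested moment, then run the ordinary replay fold over it. A sketch, assuming events carry the ISO-8601 timestamp field from the schema above:

```python
from datetime import datetime, timezone

def events_as_of(events: list[dict], as_of: datetime) -> list[dict]:
    """Return the prefix of an event stream at or before 'as_of'.

    Feeding this prefix into the normal replay fold yields the order
    exactly as it looked at that moment. Timestamps here use explicit
    '+00:00' offsets; a trailing 'Z' is only accepted by
    datetime.fromisoformat on Python 3.11+.
    """
    return [e for e in sorted(events, key=lambda e: e["version"])
            if datetime.fromisoformat(e["timestamp"]) <= as_of]
```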
Negative
- Learning curve: Team needs training on event sourcing patterns and eventual consistency
- Eventual consistency: Read models may be milliseconds behind; must design for this
- Schema evolution: Changing event schemas requires careful versioning strategy
- Storage growth: Event store grows indefinitely; need snapshot strategy for long-lived aggregates
- Operational complexity: New infrastructure components (event store, projectors, read models)
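The snapshot strategy for long-lived aggregates amounts to: persist the folded state every N events, then on load start from the newest snapshot and replay only the tail. A sketch with hypothetical data shapes:

```python
SNAPSHOT_INTERVAL = 50  # matches the ">50 events" threshold used in this ADR

def should_snapshot(version: int) -> bool:
    # Take a new snapshot every SNAPSHOT_INTERVAL events.
    return version > 0 and version % SNAPSHOT_INTERVAL == 0

def load_state(snapshots: dict[int, object], events: list[dict], replay_fn, initial_state):
    """Start from the newest snapshot (keyed by stream version), then
    replay only the tail, bounding replay cost for long-lived orders."""
    if snapshots:
        base_version = max(snapshots)
        state = snapshots[base_version]
    else:
        base_version, state = 0, initial_state
    tail = [e for e in events if e["version"] > base_version]
    return replay_fn(state, tail)
```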
Risks and Mitigations
- Risk: Team unfamiliar with paradigm → Mitigation: Training sessions, pair programming, start with non-critical aggregate
- Risk: Event schema mistakes → Mitigation: Schema review process, use schema registry, design events carefully upfront
- Risk: Read model consistency issues → Mitigation: Implement idempotent projectors, monitor lag metrics
- Risk: Performance degradation → Mitigation: Implement snapshots for orders with >50 events, benchmark before launch
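The idempotent-projector mitigation can be sketched as follows: the projector records the last applied version per aggregate, so redelivered events are skipped and gaps are surfaced rather than silently applied. The projection logic itself is a hypothetical placeholder.

```python
class IdempotentProjector:
    """Sketch of a read-model projector that tolerates at-least-once
    delivery: duplicates are skipped, out-of-order events are deferred."""

    def __init__(self):
        self.positions: dict[str, int] = {}    # aggregate_id -> last applied version
        self.read_model: dict[str, dict] = {}  # aggregate_id -> projected row

    def handle(self, event: dict) -> bool:
        agg = event["aggregateId"]
        last = self.positions.get(agg, 0)
        if event["version"] <= last:
            return False  # duplicate delivery: already applied, safe to ignore
        if event["version"] != last + 1:
            raise RuntimeError("gap in stream; defer this event and retry")
        row = self.read_model.setdefault(agg, {})
        row["status"] = event["eventType"]  # trivial projection, for illustration
        self.positions[agg] = event["version"]
        return True
```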
Implementation Plan
Phase 1: Foundation (Weeks 1-4)
- Set up EventStoreDB cluster in staging
- Implement event sourcing framework (aggregate base, event handlers)
- Create Order aggregate with core events
- Build read model projector
- Write migration tool for existing orders
Phase 2: Integration (Weeks 5-8)
- Integrate with Kafka for downstream event publishing
- Update Payments and Inventory services to consume events
- Implement temporal query API for customer support
- Load testing and performance tuning
Phase 3: Migration (Weeks 9-12)
- Migrate historical orders (bulk event import)
- Parallel run: write to both old and new systems
- Validate data consistency
- Cutover to event-sourced system
- Decommission legacy order tables
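The parallel-run validation step can be as simple as diffing each legacy row against the corresponding read-model row and alerting on mismatches. A minimal sketch (field names are hypothetical):

```python
def compare_orders(legacy_row: dict, projected_row: dict, fields: list[str]) -> list[str]:
    """Return the names of fields that disagree between the legacy CRUD
    row and the event-sourced read model during the parallel run."""
    return [f for f in fields if legacy_row.get(f) != projected_row.get(f)]
```

Cutover proceeds only once the mismatch rate over a full validation window is zero.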
Compliance Notes
- Events are append-only and immutable, satisfying audit trail requirements
- Event metadata includes actor (userId), timestamp, and correlation IDs
- GDPR: Customer data in events will be encrypted; deletion handled via crypto-shredding
- Retention: Events retained for 7 years per financial regulations; archival strategy TBD
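The crypto-shredding pattern for GDPR works as follows: each customer's payloads are encrypted with a per-customer key, and "deleting" the customer means deleting that key, leaving the immutable events in place but unreadable. The sketch below shows only the key-lifecycle pattern. The XOR keystream is a toy stand-in for a real AEAD cipher (e.g. AES-GCM from a vetted library), and the in-process dict stands in for a proper KMS; do not use this as-is.

```python
import hashlib
import secrets

_keys: dict[str, bytes] = {}  # customer_id -> key (a real system would use a KMS)

def _keystream(key: bytes, length: int) -> bytes:
    # TOY keystream via hashing a counter; illustration only, not real crypto.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_payload(customer_id: str, plaintext: bytes) -> bytes:
    key = _keys.setdefault(customer_id, secrets.token_bytes(32))
    return bytes(a ^ b for a, b in zip(plaintext, _keystream(key, len(plaintext))))

def decrypt_payload(customer_id: str, ciphertext: bytes) -> bytes:
    if customer_id not in _keys:
        raise KeyError("key shredded: payload is unrecoverable")
    return encrypt_payload(customer_id, ciphertext)  # XOR is its own inverse

def shred(customer_id: str) -> None:
    # Events stay immutable and append-only; the data simply becomes noise.
    _keys.pop(customer_id, None)
```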
Related Decisions
- ADR-0018: CRUD-based order storage (superseded by this ADR)
- ADR-0035: Kafka as enterprise event bus
- ADR-0039: EventStoreDB for event storage
- ADR-0041: CQRS pattern for high-read services
References
- Martin Fowler: Event Sourcing
- Greg Young: CQRS and Event Sourcing
- Internal RFC: Event Sourcing Evaluation (2024-Q2)
- EventStoreDB documentation