8/18/2021 | By Misbahul Munir | 8 min read | 1625 words

Distributed Transactions: 2PC & Saga Pattern

Software Architecture

microservices

event driven architecture

In a Microservices Architecture, each service typically manages its own database to ensure loose coupling and autonomy. However, this creates a challenge when a business process spans multiple services — how do you ensure data consistency across multiple databases? This is where distributed transactions come into play.

What is a Distributed Transaction?

A distributed transaction is a transaction that involves multiple independent resources or services, often with separate databases. The main goal is to ensure atomicity — either all operations succeed or all fail — even when spread across multiple services.

Why are Distributed Transactions Difficult in Microservices?

In a monolith:

You can use a traditional ACID-compliant transaction that spans multiple tables.

In microservices:

Services may use different databases, data stores, or even different technologies.
Services might run in different networks or regions.
Coordinating rollback across multiple services is hard.

Approaches to Distributed Transactions

1. Two-Phase Commit (2PC) – Not Recommended

A classic method using a transaction coordinator.
Phase 1: Ask each service to prepare to commit.
Phase 2: If all agree, send a commit; otherwise, send a rollback.
✅ Ensures strong consistency.
❌ Blocking, poor fault tolerance, performance bottlenecks.

Client → Transaction Coordinator (TC)

Then:

Phase 1 – Prepare

TC sends prepare to Hotel Booking Service
→ It reserves the hotel, logs it, but does not confirm
→ Sends vote: YES
TC sends prepare to Flight Booking Service
→ It reserves the flight seat, logs it, but does not confirm
→ Sends vote: YES
TC sends prepare to Payment Service
→ It verifies card + authorizes amount, but does not charge yet
→ Sends vote: YES

Phase 2 – Commit/Rollback

If all said YES: TC sends commit to all services → they finalize actions.
If any said NO: TC sends rollback → each service undoes/cleans up.

The success path models 2PC Phase 1 (prepare + vote) and Phase 2 (commit):

✔ Transaction Coordinator sends prepare and receives YES votes.
✔ Sends COMMIT to all services.
✔ Notification is sent after successful commits.

The failure path shows a rollback triggered by any NO vote, as expected in 2PC:

✔ One service returns NO.
✔ Coordinator issues ROLLBACK to all.
✔ Notification Service is informed of failure.
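
The two phases above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production coordinator: the `Participant` class, its `can_prepare` flag, and the `two_phase_commit` function are hypothetical names made up for this example, and the sketch assumes no crashes, no network, and no durable log.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical service that supports prepare/commit/rollback."""
    name: str
    can_prepare: bool = True   # simulates the vote this service will cast
    state: str = "idle"        # idle -> prepared -> committed | rolled_back

    def prepare(self) -> bool:
        # Reserve resources and log the intent, but do not finalize yet.
        if self.can_prepare:
            self.state = "prepared"
        return self.can_prepare

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(participants: list[Participant]) -> bool:
    # Phase 1: ask every participant to prepare and collect the votes.
    votes = [p.prepare() for p in participants]

    # Phase 2: commit only if every vote was YES, otherwise roll back all.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


services = [
    Participant("hotel"),
    Participant("flight"),
    Participant("payment", can_prepare=False),  # this one votes NO
]
print(two_phase_commit(services), [p.state for p in services])
```

Note that every participant stays in the `prepared` state until the coordinator decides, which is where the blocking behaviour of 2PC comes from.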

Problems with 2PC

Services must support prepare/commit semantics.
Locks are held from prepare until the final commit or rollback = low scalability.
If the TC crashes mid-phase → prepared services stay blocked until it recovers, and recovery is hard.
Doesn't work well with external services/APIs (e.g., airline or bank APIs).

2. Saga Pattern – Recommended in Microservices

A Saga is a sequence of local transactions. Each step has a compensating action to undo its effect if the saga fails.

a. Orchestration (Command-Driven Saga)

A central service (orchestrator) coordinates the saga.
Sends commands to each service and awaits replies.
✅ Easier to understand, debug, monitor.
❌ Tightly couples logic to the orchestrator.

Flow in Orchestrated Saga

Let’s say we use an orchestrator service:

Step-by-Step:

Orchestrator → Hotel Booking Service
- Try booking hotel.
- If success, proceed.
- Else: exit saga.
Orchestrator → Flight Booking Service
- Try booking flight.
- If success, proceed.
- Else:
  - Send CancelHotel to Hotel Service.
  - End saga.
Orchestrator → Payment Service
- Charge customer.
- If success, proceed.
- Else:
  - Send CancelFlight, CancelHotel.
  - End saga.
Orchestrator → Notification Service
- Send booking confirmation.

If something fails, run Compensating Transactions

E.g., if payment fails:

Call CancelFlightBooking()
Then CancelHotelBooking()

Orchestrated Saga with sequential calls and success flow:

✔ Orchestrator sends Hotel → Flight → Payment, waits for success at each step.
✔ Notification Service is only triggered at the end.

This is how the orchestrator coordinates compensating actions (cancel steps) on failure:

✔ Hotel and Flight are successful.
❌ Payment fails → Orchestrator sends cancel commands to Flight and Hotel.
✔ Notification Service is informed of failure.
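
The orchestrated flow can be sketched as a loop over steps, each paired with its compensating action. This is a minimal sketch under stated assumptions: `run_saga`, `StepFailed`, and the stub functions are hypothetical names for illustration, the calls are plain in-process functions rather than remote commands, and the failure notification step is left out for brevity.

```python
from typing import Callable, List, Tuple


class StepFailed(Exception):
    """Raised by a step to signal a business failure (e.g. card declined)."""


# (step name, action, compensating action)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except StepFailed:
            log.append(f"{name}: failed")
            # Undo in reverse order: the most recent success is undone first.
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"{done_name}: compensated")
            return log
        log.append(f"{name}: ok")
        completed.append(step)
    return log


def succeed() -> None:
    """Stub for a step (or compensation) that works."""


def decline_card() -> None:
    raise StepFailed("card declined")


demo_log = run_saga([
    ("hotel", succeed, succeed),
    ("flight", succeed, succeed),
    ("payment", decline_card, succeed),
    ("notification", succeed, succeed),
])
print(demo_log)
```

With the payment step failing, the log shows hotel and flight succeeding, then flight and hotel being compensated in that order, and the notification step never runs.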

b. Choreography (Event-Driven Saga)

Services emit events and react to events.
No central orchestrator.
✅ Simple for small systems.
❌ Hard to monitor or manage flow in complex systems.

Flow in Choreographed Saga (Event-Based)

Each service emits events and listens to previous ones:

Hotel Booking emits HotelBooked
Flight Booking listens to HotelBooked, does flight booking, emits FlightBooked
Payment listens to FlightBooked, charges card, emits PaymentSuccess
Notification listens to PaymentSuccess, sends confirmation

🧯 On failure, a service emits a Failed event, and the services that already completed their steps trigger their own compensation.

Services are loosely coupled, react via events, success flow is clear:

✔ Each service emits a SuccessEvent to the event bus (HotelBooked, FlightBooked, PaymentSuccess).
✔ Next service listens and triggers itself.
✔ Notification service listens at the end.

Failure propagates via event chain, and compensating events are triggered:

✔ A service (e.g. Payment) emits a Failed event.
✔ Services that already completed their steps listen and emit Canceled events.
✔ Notification Service responds to failure.
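
To make the event chain concrete, here is a small sketch with an in-memory, synchronous event bus. It is only an illustration: a real system would typically use an asynchronous message broker, and `EventBus`, `wire_services`, and the event names `BookingRequested`, `PaymentFailed`, `FlightCanceled`, `HotelCanceled`, `ConfirmationSent`, and `FailureNotified` are assumptions added for this example (only `HotelBooked`, `FlightBooked`, and `PaymentSuccess` come from the flow above).

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventBus:
    """Tiny synchronous pub/sub bus that records every published event."""

    def __init__(self) -> None:
        self.handlers: DefaultDict[str, List[Callable[[], None]]] = defaultdict(list)
        self.history: List[str] = []

    def subscribe(self, event: str, handler: Callable[[], None]) -> None:
        self.handlers[event].append(handler)

    def publish(self, event: str) -> None:
        self.history.append(event)
        for handler in list(self.handlers[event]):
            handler()


def wire_services(bus: EventBus, payment_succeeds: bool) -> None:
    # Forward flow: each service reacts to the previous service's event.
    bus.subscribe("BookingRequested", lambda: bus.publish("HotelBooked"))
    bus.subscribe("HotelBooked", lambda: bus.publish("FlightBooked"))
    bus.subscribe(
        "FlightBooked",
        lambda: bus.publish("PaymentSuccess" if payment_succeeds else "PaymentFailed"),
    )
    bus.subscribe("PaymentSuccess", lambda: bus.publish("ConfirmationSent"))

    # Compensation flow: services that already did their work undo it.
    bus.subscribe("PaymentFailed", lambda: bus.publish("FlightCanceled"))
    bus.subscribe("FlightCanceled", lambda: bus.publish("HotelCanceled"))
    bus.subscribe("PaymentFailed", lambda: bus.publish("FailureNotified"))


bus = EventBus()
wire_services(bus, payment_succeeds=False)
bus.publish("BookingRequested")
print(bus.history)
```

No service calls another directly; the only shared knowledge is the event names, which is also why the overall flow is harder to see in one place.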

Compensating Transactions

Instead of rolling back a distributed transaction, a compensating transaction performs the logical opposite. For example:

Booking a flight → cancel the booking.
Charging a credit card → issue a refund.

This is eventual consistency — the system is temporarily inconsistent but will become consistent after the saga finishes.

Saga steps are typically sequential — especially when:

There's a dependency (e.g., don’t pay unless hotel+flight are available).
You want simpler rollback (e.g., cancel hotel only if payment failed).

You can do them in parallel if:

The operations are independent.
You're willing to tolerate a more complex compensation flow.
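
As a rough sketch of the parallel case, the snippet below runs independent steps concurrently and then compensates only those that succeeded if any of them failed. The function `run_parallel_steps` and the stub bookings are hypothetical, and threads stand in for what would normally be concurrent remote calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# step name -> (action, compensating action)
Steps = Dict[str, Tuple[Callable[[], object], Callable[[], object]]]


def run_parallel_steps(steps: Steps) -> Tuple[bool, List[str]]:
    """Run all actions concurrently; return (success, names compensated)."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(action) for name, (action, _) in steps.items()}
    # Leaving the `with` block waits for every submitted action to finish.

    succeeded = [name for name, f in futures.items() if f.exception() is None]
    if len(succeeded) == len(steps):
        return True, []

    # At least one step failed: undo every step that did succeed.
    for name in succeeded:
        steps[name][1]()
    return False, succeeded


canceled: List[str] = []


def book_hotel() -> str:
    return "hotel booked"


def book_flight() -> str:
    raise RuntimeError("no seats left")


result = run_parallel_steps({
    "hotel": (book_hotel, lambda: canceled.append("hotel")),
    "flight": (book_flight, lambda: canceled.append("flight")),
})
print(result, canceled)
```

The extra complexity shows up in the compensation: the set of steps to undo is no longer "everything before the failed step" but whatever happened to succeed in that run.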

Patterns to Support Distributed Transactions

Idempotency: Ensure repeated requests (e.g., retries) don't cause duplicate side effects.
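Outbox Pattern: Store events in a local table in the same transaction as the state change, then publish them asynchronously, so the update and its event are recorded atomically.
Retry & Timeout Handling: Bound remote calls with timeouts and retry transient failures, so a slow or failed service does not stall the saga.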
Message Deduplication: Prevent double processing in asynchronous flows.

When to Use Distributed Transactions?

When a business operation truly spans multiple services and must be consistent.
But prefer redefining service boundaries to minimize the need for them.

Summary

Aspect	2PC	Saga Pattern
Consistency	Strong	Eventual
Performance	Slow (blocking)	High (non-blocking)
Fault Tolerance	Poor	Good
Complexity	Simple to implement	Complex business logic
Best for	Tightly coupled systems	Microservices