Distributed Transaction Patterns

Handling transactions correctly is definitely not an easy task, and adding a distributed part to the mix certainly does not make it easier. In my opinion, distributed transactions are one of the things that make distributed systems so complex. Thus, distributed transaction patterns are probably the most important topic to cover in a distributed system.

In this blog I will cover the theory behind distributed transactions, the patterns used to handle them, and when and why you should follow a particular approach. As I have already covered distributed systems at great length in my previous articles, I will not spend too much time on them here.

Nonetheless, if you would like to get more familiar with them before starting, this text is a good starting point for you.

What Is A Distributed Transaction?

I would say it is just a normal transaction but, well – distributed. Jokes aside, this statement is both true and not descriptive enough at the same time. In my opinion, the following is the best definition of a Distributed Transaction.

A Distributed Transaction is a single logical unit of work which spans multiple components (several databases or microservices) and aims to keep their state consistent despite failures.

The goal is to keep the whole system consistent: either all the updates take effect, or none – atomicity. We have to avoid “half‑done” outcomes like charging a customer but not creating their order.

This – despite failures – is the root of all the problems. When one starts to think about this part, we notice a multitude of things that may break, and each of these failures makes managing the transaction harder and harder. Getting several independent components to either commit or roll back together may prove impossible.

Core traits of Distributed Transactions:

  • Atomicity requirement
  • Global consistency
  • Partial progress

Core challenges in Distributed Transactions:

  • Failures are normal
  • Coordination is slow
  • There is no such thing as exactly-once
  • Rollbacks are non-trivial

Distributed Transaction Patterns

Let's start our dive into the patterns with plain old 2PC.

2PC

Two Phase Commit or 2PC is probably the oldest way to handle distributed transactions. It is a distributed atomic commit protocol that makes a single transaction spanning multiple resources (databases/services) either commit everywhere or roll back everywhere. It does this by using a coordinator that talks to all participants in two ordered phases: prepare and commit (or abort).

The phases look like the following:

  • Phase 1 – The Prepare Phase – the transaction coordinator starts the process by sending a prepare request to all participating nodes. Each participant checks if it can complete the transaction and responds to the coordinator with a vote. There are two possible responses: YES and NO.
  • Phase 2 – The Commit/Abort Phase – the coordinator collects the votes from all participants. If all participants voted YES, the coordinator decides to commit the transaction. If any participant voted NO, the coordinator decides to abort the transaction. Based on the votes, the coordinator sends commit or abort requests to all the participants. Each participant performs the required action and releases any acquired locks. After that, each participant has to inform the coordinator that it has completed the commit or abort operation.
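
To make the flow concrete, here is a minimal, in-memory sketch of the two phases in Python. The Participant class and its prepare/commit/abort methods are purely illustrative stand-ins, not any particular library's API.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit

    def prepare(self) -> bool:
        # Acquire locks, persist the pending change, then vote YES or NO.
        return self.can_commit

    def commit(self):
        print(f"{self.name}: committed")

    def abort(self):
        print(f"{self.name}: aborted")


def two_phase_commit(participants) -> bool:
    # Phase 1 - Prepare: the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 - Commit/Abort: commit only if every single vote was YES.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


print(two_phase_commit([Participant("orders"), Participant("payments", can_commit=False)]))
```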

While it sounds pretty simple, 2PC actually has a lot of drawbacks. The most significant one is that 2PC is a completely blocking pattern.

Each of the participants needs to lock all the resources involved in the transaction during the Prepare Phase. These resources are only released after the Commit Phase, which may take a while – the time increases with the number of transaction participants.

As you may have already guessed, such an approach will not scale no matter how hard we try.

Another drawback is the 2PC coordinator itself. It introduces a deliberate single point of failure into the system, but compared to the problems introduced by locking, this one is the lesser of two evils.

Outbox pattern

The outbox pattern, or transactional outbox pattern, is probably the simplest pattern here. It is a reliability-related pattern that lets you update your local database and publish an event/message as a single atomic action.

Of course, event publishing is just an example. The transactional outbox can be used for any other operation following the same pattern – a local ACID transaction mixed with an external call, for example sending HTTP requests to other services.

To work properly, the transactional outbox requires two components:

  • Local store – handles the service's normal writes and is used to store outbox rows describing the event/message (type, payload, other metadata) to send. The crucial part is that both the data and the related outbox events must be saved in a single DB transaction.
  • Background publishing job – repeatedly fetches unsent outbox rows and publishes them, usually with some kind of retry mechanism. After successfully publishing the event, the job marks the related outbox row as published/sent, effectively excluding it from further processing.
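
For illustration, here is a minimal sketch of both components using SQLite as the local store. The table layout, the place_order operation, and the print-based "publisher" are all hypothetical stand-ins for your own schema and broker client.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY, type TEXT, payload TEXT, sent INTEGER DEFAULT 0)"
)

def place_order(total: float) -> None:
    # Local store: the business write and the outbox row go into ONE local transaction.
    with conn:
        cur = conn.execute("INSERT INTO orders (total) VALUES (?)", (total,))
        conn.execute(
            "INSERT INTO outbox (type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": cur.lastrowid, "total": total})),
        )

def publish_pending() -> None:
    # Background publishing job: fetch unsent rows, publish them, mark them as sent.
    rows = conn.execute("SELECT id, type, payload FROM outbox WHERE sent = 0").fetchall()
    for row_id, event_type, payload in rows:
        print(f"publishing {event_type}: {payload}")  # stand-in for a real broker call
        conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    conn.commit()

place_order(42.0)
publish_pending()
```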

Following this two-step pattern, we can easily cover a distributed-transaction-like problem without having to deal with all of its challenges.

To add a pinch of salt to this tasty cake, this pattern is not without difficulties. Higher cognitive complexity is chief among them.

As for the other disadvantages we have:

  • Eventual publish – potential delay between DB commit and event publishing
  • Additional boilerplate – managing new DB model and background job

Inbox pattern

The inbox pattern (Idempotent Consumer), as the name suggests, stands in opposition to the outbox pattern. At least in terms of naming – conceptually it is much the same.

The inbox pattern is also focused on reliability, but it serves a somewhat different purpose. In the case of the outbox pattern, the problem in question was: a local ACID transaction mixed with an external call.

In the case of the inbox pattern, the problem in question is: fault-tolerant processing of incoming messages. Simply ACK-ing the message does not provide enough of a reliability guarantee.

Even if we move ACK-ing of the message to the last step of processing – after all the needed side effects and failure-prone steps are done – we may still end up with partial results and potential duplication issues. Not to mention that the producer will probably try to deliver the message at least once more while we are processing it, depending on how long that processing takes on our side.

Same as the transactional outbox, the inbox also consists of two steps working in a similar fashion. We have:

  • The first step – we save just the "inbox row" and all related data to the local store and ACK the message. The important part is that we do not perform any side effects or other operations on our side in this step.
  • The second step – a background job periodically picks and processes the rows from the "inbox" table. Same as in the case of the outbox pattern, after successful processing the job marks the related inbox row as processed, excluding it from further processing.
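
A minimal sketch of both steps, again using SQLite as the local store. The message_id-based deduplication, the stubbed broker ACK, and the print-based processing are illustrative assumptions rather than a particular framework's API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inbox (message_id TEXT PRIMARY KEY, payload TEXT, processed INTEGER DEFAULT 0)"
)

def receive(message_id: str, payload: str) -> None:
    # Step 1: persist the message and ACK it; no side effects happen here.
    # The PRIMARY KEY on message_id turns redelivered duplicates into no-ops.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO inbox (message_id, payload) VALUES (?, ?)",
            (message_id, payload),
        )
    # broker.ack(message_id) would go here

def process_pending() -> None:
    # Step 2: a background job picks unprocessed rows, runs the side effects,
    # and marks each row so it is excluded from further processing.
    rows = conn.execute("SELECT message_id, payload FROM inbox WHERE processed = 0").fetchall()
    for message_id, payload in rows:
        print(f"processing {message_id}: {payload}")  # actual business logic goes here
        conn.execute("UPDATE inbox SET processed = 1 WHERE message_id = ?", (message_id,))
    conn.commit()

receive("msg-1", '{"order_id": 1}')
receive("msg-1", '{"order_id": 1}')  # a duplicate delivery is absorbed by the inbox
process_pending()
```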

Due to the small difference between the inbox and outbox patterns, the pros and cons of both are almost the same. You can find them in the outbox pattern section or in the summary at the end of this text.

Saga

Saga is probably the most famous of the microservice patterns. What is more, describing Saga is probably among the top 50 most frequently asked questions in senior software engineer interviews. The Saga pattern describes a way to implement a long-running transaction that spans multiple services without distributed locks or 2PC, making it a right-away pick for almost any use case in a system following a microservice architecture.

In a Saga, each transaction is split into a sequence of locally executed transactions. Each component involved in the Saga executes its own local transaction. If any of the steps in the Saga fails, all the participants up to that point must execute compensating transactions. A compensating transaction is a transaction that effectively rolls back the change made by that step's normal Saga transaction.

Think of it as follows: a Saga consists of four participants A, B, C, and D.
Participant A completes, Participant B completes, Participant C fails – thus Participant B runs its compensating transaction, then Participant A runs its compensating transaction.
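
A toy sketch of this A-B-C-D flow: each participant is modelled as a pair of a local transaction and its compensating transaction, and a failure triggers the compensations in reverse order. All names here are made up for illustration.

```python
def run_saga(steps) -> bool:
    # steps: list of (name, local transaction, compensating transaction) triples
    completed = []
    for name, action, compensate in steps:
        try:
            action()
            print(f"{name} done")
            completed.append((name, compensate))
        except Exception:
            print(f"{name} failed, compensating previous steps")
            # Run compensating transactions in reverse order of completion.
            for done_name, undo in reversed(completed):
                print(f"compensating {done_name}")
                undo()
            return False
    return True

def ok():
    pass

def fail():
    raise RuntimeError("step failed")

run_saga([
    ("A", ok, lambda: None),
    ("B", ok, lambda: None),
    ("C", fail, lambda: None),  # C fails, so B and then A are compensated
    ("D", ok, lambda: None),
])
```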

It sounds nice and easy but in fact implementing a correct Saga can generate an infinite number of headaches. Let's consider a few things:

  • What do we do if running the compensating transaction for Participant B fails? Should we retry right away or should we wait?
  • What if the next Saga already started with data in A that would be rolled back?
  • What if the previous step in the Saga was business critical – like a payment?

The list goes on and on; in Saga we can see eventual consistency at its finest – or at its lowest, depending on who you ask. Despite these problems, Saga still keeps the system in a coherent state and provides ACID-like behavior.

There are two main approaches to building Sagas: Orchestration and Choreography. Let's dive into both of them, starting with Orchestration.

Orchestration

In Orchestration-based Sagas we have the central brain cell – the Orchestrator. It essentially tells each service what to do, either by triggering normal or compensating transactions, and waits for replies. Besides that, the orchestrator also keeps track of the Saga state. We can think of Orchestrator-based Sagas as finite state machines, with each step representing a new state inside the machine.
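
A rough sketch of an orchestrator modelled as a finite state machine; the services, commands, and state names are hypothetical, and the request/reply exchange is stubbed out with a print.

```python
class OrderSagaOrchestrator:
    """Tracks the saga state and tells each service what to do next."""

    def __init__(self):
        self.state = "STARTED"

    def call(self, service: str, command: str) -> None:
        # Stand-in for a request/reply exchange with the given service.
        print(f"orchestrator -> {service}: {command}")

    def run(self) -> None:
        try:
            self.call("inventory", "reserve stock")
            self.state = "STOCK_RESERVED"
            self.call("payments", "charge customer")
            self.state = "PAYMENT_DONE"
            self.state = "COMPLETED"
        except RuntimeError:
            self.compensate()

    def compensate(self) -> None:
        # Which compensations run depends on how far the saga has progressed.
        if self.state == "PAYMENT_DONE":
            self.call("payments", "refund customer")
        if self.state in ("STOCK_RESERVED", "PAYMENT_DONE"):
            self.call("inventory", "release stock")
        self.state = "ABORTED"

OrderSagaOrchestrator().run()
```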

Choreography

In Choreography-based Sagas there is no central processor. The Saga's progress is the composition of independent handlers. Usually, Saga participants communicate via events to exchange information about the result of their step. For Choreography-based Sagas we can also use a state-machine-like analogy; however, the closest one would be a distributed state machine.
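
A toy in-memory event bus illustrating choreography: each handler reacts to the previous step's event and publishes the next one. The event names and handlers are made up for the example; a real system would use a message broker instead of this dictionary.

```python
from collections import defaultdict

handlers = defaultdict(list)

def subscribe(event_type):
    # Register a handler for an event type (a stand-in for a broker subscription).
    def register(fn):
        handlers[event_type].append(fn)
        return fn
    return register

def publish(event_type, payload):
    # Deliver the event to every subscribed handler.
    for handler in handlers[event_type]:
        handler(payload)

@subscribe("OrderPlaced")
def reserve_stock(payload):
    print("inventory: stock reserved")
    publish("StockReserved", payload)

@subscribe("StockReserved")
def charge_customer(payload):
    print("payments: customer charged")
    publish("PaymentCompleted", payload)

publish("OrderPlaced", {"order_id": 1})
```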

Practical takeaways:

  • orchestration – complex/branching flows, many compensations, strict SLAs
  • choreography – small/linear flows

Consensus Commit

Consensus commit is quite similar in its inception to 2PC; however, it is much more reliable and fault tolerant. Instead of relying on locks, blocking communication, and asking every participant twice as 2PC does, the system writes the commit decision itself into a replicated log protected by a consensus algorithm such as Paxos or Raft. Once the transaction result is accepted by a quorum, the transaction is considered committed everywhere. Typically, the quorum size is equal to n/2+1, where n is the number of participants.

What is more, we can tune the performance of Consensus Commit – at least to a degree – by adjusting the quorum size. To simplify it greatly: the bigger the quorum, the longer it takes to commit a change; on the other hand, the smaller the quorum, the faster we are able to handle commits.
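
A tiny sketch of the quorum arithmetic described above, assuming a simple majority quorum over n participants:

```python
def quorum_size(n: int) -> int:
    # Majority quorum: floor(n / 2) + 1 participants.
    return n // 2 + 1

def is_committed(acks: int, n: int) -> bool:
    # The decision is durable once a quorum has accepted it.
    return acks >= quorum_size(n)

print(quorum_size(5))      # 3
print(is_committed(2, 5))  # False
print(is_committed(3, 5))  # True
```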

The main pros of this pattern are:

  • High reliability – even in case of leader failure we are still able to access data,
  • Non-blocking nature – locking is much more fine grained than in the case of 2PC.

Nonetheless, it is probably the most complex of all the patterns here, carrying the biggest ops overhead – even bigger than Orchestrator-based Sagas. Despite existing Raft or Paxos implementations, fine-tuning and managing such a system is still not an easy task.

TCC – Try, Confirm, Cancel

Next we have TCC, or Try, Confirm, Cancel. In this approach we split the work into three idempotent operations each participant must support. Every operation forms a separate phase in the overall pattern.

Going from the start:

  • Try – each participant performs all necessary validation of the operation and reserves the required resources (pseudo-isolation from ACID), making no changes visible to the user.
  • Confirm – if the Try phase was successful, the participant actually performs the operation, using the resources reserved in the Try phase. This phase needs to be idempotent, as we should be able to retry it in case of failure.
  • Cancel – this phase is a kind of catch/finally block. If for some reason one of the previous phases failed, we release all the reserved resources and do the cleanup. Same as with Confirm, this phase should also be idempotent to be easily retryable.
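
A minimal sketch of the three operations for a hypothetical inventory participant; the reservation model and class name are illustrative assumptions, not a prescribed API.

```python
class InventoryParticipant:
    """A hypothetical participant reserving stock for a transaction."""

    def __init__(self, stock: int):
        self.stock = stock
        self.reservations = {}  # tx_id -> reserved quantity

    def try_(self, tx_id: str, quantity: int) -> bool:
        # Try: validate and reserve resources; nothing is visible to the user yet.
        if tx_id in self.reservations:  # idempotent retry of Try
            return True
        if self.stock < quantity:
            return False
        self.stock -= quantity
        self.reservations[tx_id] = quantity
        return True

    def confirm(self, tx_id: str) -> None:
        # Confirm: finalize using the reserved resources; safe to retry.
        self.reservations.pop(tx_id, None)

    def cancel(self, tx_id: str) -> None:
        # Cancel: release whatever was reserved in Try; also safe to retry.
        self.stock += self.reservations.pop(tx_id, 0)


inventory = InventoryParticipant(stock=10)
if inventory.try_("tx-1", quantity=3):
    inventory.confirm("tx-1")
else:
    inventory.cancel("tx-1")
```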

The main selling point of this approach is non-blocking resource reservation. It scales much better than 2PC while providing acceptable isolation guarantees. On the other hand, the main drawback comes from the required idempotency of the steps, which is usually not a feat we can achieve easily.

Which, When and Why

2-Phase Commit (2PC)

  • Pros: strong atomicity across participants; simple mental model
  • Cons: coordinator is a single point of failure; participants hold locks until commit/abort
  • Shines when: few participants on a "reliable" network; strict consistency is a hard requirement
  • Avoid when: services need high availability or reliability; workflows span many microservices

Outbox Pattern

  • Pros: guarantees "external request + DB write" happen together; leverages existing DB durability
  • Cons: extra storage and a cleanup job; adds latency between write and read; possible doubling of data in flight
  • Shines when: a service must execute an action whenever it commits to its own DB
  • Avoid when: very high throughput would fill up the storage

Inbox Pattern

  • Pros: easy failure recovery – replay is safe; localizes idempotency logic inside the consumer
  • Cons: extra storage and a cleanup job; adds latency between write and read; possible doubling of data in flight
  • Shines when: the downstream service needs an at-most-once processing guarantee
  • Avoid when: occasional duplicates are acceptable; message volume is extreme; deduplication cost dominates

Saga Orchestration

  • Pros: single place to model a long-running workflow; easier to enforce ordering, time-outs, and retries; clear monitoring/tracing and compensation logic
  • Cons: orchestrator can become a bottleneck or SPOF; tighter coupling between orchestrator and steps; more chatty traffic (request/response per step)
  • Shines when: complex, multi-step business transactions with strict ordering
  • Avoid when: you favor full service autonomy and loose coupling; workflows are small/simple enough that events alone suffice

Saga Choreography

  • Pros: no central brain – high service independence; naturally scales with the number of services; aligns with event-driven design
  • Cons: harder to see "the whole transaction"; risk of event storms or cyclic dependencies; compensation logic scattered across services
  • Shines when: each step can act on its own event and publish the next one; the business flow is straightforward and tolerant of eventual consistency
  • Avoid when: roll-back rules are intricate or you need global ordering/time-outs; you need a single view of progress for auditing/regulation

Consensus Commit

  • Pros: non-blocking; tolerates coordinator/replica failure; keeps strong consistency across shards/regions
  • Cons: very complex to build/operate; higher write latency; often ties you to a specific vendor
  • Shines when: cross-shard, strongly consistent transactions are essential
  • Avoid when: eventual consistency is acceptable; simpler techniques are good enough

TCC

  • Pros: no locks during long-running work; high availability; flexible, fine-grained compensation per service; operations are idempotent by design
  • Cons: every service must expose compatible APIs; window of "tentative" state until Confirm; complex to reason about a cascade of cancels
  • Shines when: you need, and are able, to reserve resources and finalize later; latencies are human-scale, not "real-time"
  • Avoid when: actions can't be safely compensated

Summary

With this big (hopefully readable) summary, it is time to part ways for today. However, before we move forward, some takeaways for you:

  • Distributed Transactions are really hard.
  • 2PC is blocking and will kill your availability.
  • Outbox and Inbox patterns will help us greatly in certain use-cases.
  • Orchestration Sagas for complex/branching flows, many compensations, strict SLAs.
  • Choreography Sagas for small/linear flows.
  • TCC is much simpler to operate than Consensus Commit.
  • When you need high availability use TCC or Consensus Commit instead of 2PC.

Thank you for your time.
