There are two main approaches to handling transactions: we follow either ACID or BASE principles. All other approaches are just variations of the two; we can even say that, to a certain degree, BASE is a variation of ACID. Furthermore, some databases may choose to support ACID transactions for some operations while not providing the same guarantee for others; MongoDB is one example.
In today’s text, I will describe both abbreviations and their use cases, closing with a summary of the differences between them. For now, let’s say that the biggest difference between the two is that ACID prioritizes consistency over availability, while BASE prioritizes availability over consistency.
Why Are Databases Important?
Databases are all around us, just like distributed systems. Whether a particular database is relational or non-relational does not matter much, as we barely see them. We only see our graph of friends on Facebook, our history of messages in Messenger, or “just” our money in a bank account. We do not see all the processing and storage behind them. Essentially, we are focused on some pretty things that are not always stored and processed in such a pretty fashion.
Transactions, and more specifically the transaction models that we will cover today, are the cornerstone of how our data are processed and, to a degree, stored. But what even is a transaction? The definition here is straightforward. From the database perspective, a transaction is any set of operations that is performed as a single logical unit of work. Besides the classic case of a bank transfer between accounts, processing an order or submitting a review are also valid transaction examples.
Transaction models are one of the most important things for database engines. They are a kind of very low-level (or, depending on the point of view, high-level) API of our database. Following a particular set of principles will result in our database exposing a certain set of features and behaviors. In the end, picking the wrong approach can limit our use cases and make our system more complex than it should be. Knowing the trade-offs of transaction models is an advantage, as we can pick the best tool for the job.
Moreover, while ACID is quite common and most software engineers have at least heard of it, BASE is quite the opposite – and knowledge is power.
What Is ACID?
ACID is an abbreviation for Atomicity, Consistency, Isolation, Durability.
Atomicity
A transaction must either succeed completely or fail completely, with no intermediate results. If a transaction changes state in multiple tables, then in case of failure none of those changes should be persisted.
Following the example from above, assume that we want to process an order for a customer. To do this, we need to perform two operations: update the quantity of the ordered items in stock and create a new order record. If, after updating the item quantity, we fail to create the new order record, then the first part of the transaction should be rolled back and the item quantity should remain unchanged.
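The order example above can be sketched with Python's built-in `sqlite3` module. This is only a minimal illustration: the `items` and `orders` tables, the `place_order` helper, and the `fail_on_insert` switch are my assumptions for the demo, not part of any particular system.

```python
import sqlite3


def place_order(conn, item, qty, fail_on_insert=False):
    """Decrement the stock and record the order as one atomic unit."""
    try:
        # "with conn" commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE items SET quantity = quantity - ? WHERE name = ?",
                (qty, item),
            )
            if fail_on_insert:
                # Hypothetical failure between the two steps of the transaction.
                raise RuntimeError("simulated failure while creating the order")
            conn.execute(
                "INSERT INTO orders (item, quantity) VALUES (?, ?)", (item, qty)
            )
        return True
    except RuntimeError:
        return False


def stock(conn, item):
    row = conn.execute(
        "SELECT quantity FROM items WHERE name = ?", (item,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT PRIMARY KEY, quantity INTEGER)")
conn.execute("CREATE TABLE orders (item TEXT, quantity INTEGER)")
conn.execute("INSERT INTO items VALUES ('A', 7)")
conn.commit()

failed = place_order(conn, "A", 5, fail_on_insert=True)
stock_after_failure = stock(conn, "A")   # still 7: the UPDATE was rolled back
succeeded = place_order(conn, "A", 5)
stock_after_success = stock(conn, "A")   # 2: both steps were committed together
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The failed attempt leaves no trace: neither the reduced quantity nor an order record survives the rollback.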
Consistency
Each transaction should preserve all integrity constraints, entity relationships, and business rules present in the database. All these mechanisms should work regardless of the number of concurrent transactions or possible failures that may occur in the meantime.
It means that any transaction should move the database from one valid state to another.
For example, before the order creation, the quantity of items was A=7 and B=10, and the customer ordered 5 pieces of A and 3 pieces of B. After the transaction, we should have 2 pieces of A and 7 pieces of B in store. For each item, the quantity left in store plus the quantity ordered should equal the quantity from before the transaction.
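To make the stock example more concrete, here is a small sketch in which the database itself enforces one business rule: the quantity can never go below zero. The `CHECK` constraint and the `order` helper are illustrative assumptions; real schemas usually carry many more rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    " name TEXT PRIMARY KEY,"
    " quantity INTEGER NOT NULL CHECK (quantity >= 0))"
)
conn.executemany("INSERT INTO items VALUES (?, ?)", [("A", 7), ("B", 10)])
conn.commit()


def order(conn, lines):
    """Apply all order lines in one transaction; reject the whole order
    if any line would violate the CHECK constraint."""
    try:
        with conn:
            for name, qty in lines.items():
                conn.execute(
                    "UPDATE items SET quantity = quantity - ? WHERE name = ?",
                    (qty, name),
                )
        return True
    except sqlite3.IntegrityError:
        return False


def stock(conn):
    return dict(conn.execute("SELECT name, quantity FROM items ORDER BY name"))


first_ok = order(conn, {"A": 5, "B": 3})
stock_after_first = stock(conn)    # {"A": 2, "B": 7}

# Only 2 pieces of A are left, so this order would break the rule.
second_ok = order(conn, {"B": 1, "A": 5})
stock_after_second = stock(conn)   # unchanged: {"A": 2, "B": 7}
```

Note that the second order is rejected as a whole: the already applied change to B is rolled back together with the invalid change to A, so the database never leaves a valid state.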
Isolation
In a concurrent environment, transactions should not interfere with one another. It means that changes from one transaction will not be visible to another transaction before it commits. Such an approach gives us the impression that the transactions are executed one by one, while underneath they are executed simultaneously.
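The visibility rule can be observed with two connections to the same SQLite file. Treat it as a sketch under assumptions: the file location and the table are made up for the demo, and the exact behavior depends on the database engine and its configuration (here, SQLite in its default journal mode).

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shop.db")

# isolation_level=None switches the driver to autocommit mode,
# so the BEGIN and COMMIT statements below are issued explicitly.
writer = sqlite3.connect(path, isolation_level=None)
reader = sqlite3.connect(path, isolation_level=None)

writer.execute("CREATE TABLE items (name TEXT PRIMARY KEY, quantity INTEGER)")
writer.execute("INSERT INTO items VALUES ('A', 7)")


def read_quantity(conn):
    # fetchall() exhausts the cursor, so no read lock is left behind.
    rows = conn.execute("SELECT quantity FROM items WHERE name = 'A'").fetchall()
    return rows[0][0]


writer.execute("BEGIN")
writer.execute("UPDATE items SET quantity = 2 WHERE name = 'A'")

seen_by_writer = read_quantity(writer)       # the writer sees its own change
seen_before_commit = read_quantity(reader)   # the reader sees the committed value

writer.execute("COMMIT")
seen_after_commit = read_quantity(reader)    # now the change is visible

writer.close()
reader.close()
```

Until the `COMMIT`, the second connection keeps reading the old quantity, which is exactly the "no interference" behavior described above.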
Additionally, there is one more thing that needs to be mentioned here: the Isolation Level. It is a setting that describes how much a transaction is affected by other transactions. In most databases, we can tune the Isolation Level up or down, often per session or even per transaction. Typically, the higher the Isolation Level, the more we degrade database performance, and vice versa.
The SQL standard describes four different Isolation Levels:
- READ UNCOMMITTED – A transaction can view data that was changed but not yet committed by other transactions. In such a case, when a rollback occurs, we might end up using data that never existed in our database. Usually, you should not use this Isolation Level. It only makes sense when you are querying a dataset that will not change at all in any way.
- READ COMMITTED – Transactions can only read data that are committed at the moment they are read. It offers a good balance between consistency and performance in most of the use cases. Thus, it is the most common default Isolation Level used in multiple databases.
- REPEATABLE READ – This Isolation Level aims to address the issue of the same query returning different values within a single transaction. It is an ideal Isolation Level for read-only transactions, as it guarantees that if a row is read twice in the same transaction, it will return the same value each time.
- SERIALIZABLE – It is the highest Isolation Level; here all the transactions are completely isolated from one another, supporting the most demanding consistency guarantees. However, it makes transactions behave as if they were executed serially and may produce more transaction retry errors caused by interference between them. In most cases, the combination of these two factors results in a significant performance decrease.
Each Level above READ UNCOMMITTED aims to prevent specific read phenomena, with the higher Levels also preventing the ones covered by their predecessors.
Isolation Level | Read Phenomena Prevented |
---|---|
READ COMMITTED | Dirty Read |
REPEATABLE READ | Dirty Read, Non-repeatable Read |
SERIALIZABLE | Dirty Read, Non-repeatable Read, Phantom Read |
The detailed description of these phenomena is out of the scope of this text, but you can read more here.
Durability
After the commit of the transaction, its results are guaranteed to be permanent. Even in the case of system failure, crash, or power loss, there should be no after-commit data loss. Usually, it is done by saving the results on some form of persistent storage; in most cases, it is a plain old hard drive.
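One common way to get this guarantee is to append every committed change to a log and force it to disk before acknowledging the commit. Below is a toy sketch of that idea; the `DurableStore` class and its JSON log format are my assumptions for illustration, and real engines also deal with torn writes, checksums, and log compaction.

```python
import json
import os
import tempfile


class DurableStore:
    """Toy key-value store: a commit returns only after the change is on disk."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild the in-memory state from the log after a restart.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def commit(self, key, value):
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())  # ask the OS to push the bytes to storage
        self.data[key] = value        # only now is the commit acknowledged

    def close(self):
        self._log.close()


log_path = os.path.join(tempfile.mkdtemp(), "commit.log")

store = DurableStore(log_path)
store.commit("order:1", {"item": "A", "quantity": 5})
store.commit("order:2", {"item": "B", "quantity": 3})
store.close()
del store  # simulate a restart: the in-memory state is gone

recovered = DurableStore(log_path)
recovered_data = dict(recovered.data)
recovered.close()
```

After the simulated restart, the new instance rebuilds the same state purely from what was written to the log.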
What Is BASE?
BASE is an abbreviation for Basically Available, Soft State, and Eventually Consistent.
The acronym serves to highlight the difference between the two: in chemistry, an acid is the opposite of a base.
Basically Available
In a concurrent environment, the database guarantees availability. It will try to respond to the request even with somewhat stale or incomplete data. What this means is that in the case of a sudden spike of requests, it may choose to focus on handling read requests while slowing down updates and inserts.
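A rough sketch of what "respond even with stale data" may look like from the client side is shown below. The `Node` class and the `available_read` function are hypothetical names, and the failover logic is heavily simplified.

```python
class Node:
    """A hypothetical database node holding its own copy of the data."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.up = True

    def get(self, key):
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.data.get(key)


def available_read(key, nodes):
    """Return (value, node name) from the first node that answers,
    even if that node holds a stale copy."""
    for node in nodes:
        try:
            return node.get(key), node.name
        except ConnectionError:
            continue
    raise ConnectionError("no node reachable")


primary = Node("primary", {"likes": 11})
replica = Node("replica", {"likes": 10})  # has not received the latest update yet

primary.up = False  # simulate an outage of the primary
value, served_by = available_read("likes", [primary, replica])
```

The read succeeds despite the outage, but the caller gets 10 instead of 11: availability was chosen over freshness.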
Soft State
The state of the system may change even without an immediate reason, as a consequence of the eventual consistency model. When multiple applications update the database, a particular record (or records) may be in an intermediate state. The final consistent state will be calculated when all transactions for a particular record are complete. All intermediate states are also visible to any interleaving reads or writes.
Eventually Consistent
The database will eventually reach a consistent state once no more updates arrive. At any given point, we may see some inconsistencies, but they should be resolved over a longer time span. Consistency is not guaranteed at the transaction level.
For example, in the case of a geo-distributed database, when some record is updated, the new state may not initially be visible to users in different regions due to network latency. However, after some time, the new state will be propagated to all nodes and thus the change becomes visible.
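The geo-distributed example can be modeled in a few lines. This is a simplified sketch: the `Replica` class, the integer versions, and the "highest version wins" rule are assumptions made for the demo, while real databases tend to use timestamps, vector clocks, or similar mechanisms.

```python
class Replica:
    """A hypothetical regional replica storing (version, value) per key."""

    def __init__(self, region):
        self.region = region
        self.store = {}

    def write(self, key, value, version):
        current = self.store.get(key)
        # Keep the entry with the highest version; ignore older updates.
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """One replication pass: both replicas exchange their entries."""
    for src, dst in ((a, b), (b, a)):
        for key, (version, value) in list(src.store.items()):
            dst.write(key, value, version)


eu, us = Replica("eu"), Replica("us")
for replica in (eu, us):
    replica.write("profile:42", "Hello", version=1)

# A user in Europe updates the record; the US replica has not heard of it yet.
eu.write("profile:42", "Hello, world", version=2)
stale_read = us.read("profile:42")   # "Hello"

sync(eu, us)                         # the update is propagated
fresh_read = us.read("profile:42")   # "Hello, world"
```

Between the write and the `sync` call the two regions disagree, and once updates stop and replication catches up, they converge to the same value.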
ACID vs BASE
Data Integrity
- ACID properties provide a stricter guarantee as to the consistency of our state. Random occurrences of data inconsistency are very rare. The chances of data loss are also relatively low: they still exist, but it is quite hard to cause them, at least without some bad luck.

- BASE, on the other hand, provides little to no guarantees as to the state of our database. We only know that at some point in the future it will be consistent, but when that will be, nobody knows. All intermediate state changes are still visible and may impact other queries. BASE also does not guarantee data durability by principle. Without such a guarantee, partial or total data loss is more likely to happen.
Integration Complexity
- ACID, because of its high consistency guarantees, takes a lot of burden off software engineers’ shoulders. With ACID, we usually do not have to worry too much about inconsistency between transactions, “seemingly random” data changes, or losing part of our data on the fly.
- With BASE, we have to cover most of such cases by ourselves. We have to know that some inconsistencies may, and thus will, occur, so we must implement their resolution accordingly.
Performance
- ACID-compliant databases usually use locks to synchronize access to particular resources in a concurrent environment. As the transactions on conflicting records are processed in a strict order, you should be ready for some delays in transaction resolution. This gives better guarantees as to the final state of objects and, on a small scale, has a reasonably fast resolution time. However, that time tends to grow with the scale: the bigger the scale, the longer it may take to resolve concurrent transactions.
- BASE-compliant systems are somewhat simpler, as they typically do not rely on locks. The database synchronizes the operations eventually, without hard time guarantees. Nevertheless, the time of transaction resolution may also become quite long.
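The lock-based approach described for ACID-compliant databases can be illustrated with plain threads. It is only an analogy under assumptions: the `LockedAccount` class stands in for a single database record, and a real engine uses far more sophisticated locking (or multi-versioning) than one mutex.

```python
import threading


class LockedAccount:
    """A single 'record' whose updates are serialized by a lock."""

    def __init__(self, balance=0):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # Conflicting updates wait here and are applied one at a time.
        with self._lock:
            current = self.balance
            self.balance = current + amount


def run(workers=8, deposits=1000):
    account = LockedAccount()

    def work():
        for _ in range(deposits):
            account.deposit(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return account.balance


final_balance = run()  # 8 workers * 1000 deposits = 8000, no update is lost
```

The result is always correct, but every writer touching the same record has to wait for its turn, which is where the growing resolution time at scale comes from.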
Scalability
In principle, BASE-compliant databases scale noticeably better than ACID ones, mostly due to a more relaxed consistency model. In general, most BASE databases are designed to support horizontal scaling, while ACID ones have traditionally relied on vertical scaling, with the possibility of adding read replicas.
BASE And ACID In A Single Database
Following both models at once is virtually impossible due to the vastly different consistency models of both approaches. Additionally, as the CAP theorem states, in the case of a network partition we can choose to be either available or consistent; we cannot be both at the same time.
Despite this fact, there are a couple of NoSQL databases that try to mix both paradigms: they expose ACID properties in certain parts, or for certain features. The list is quite long, so I will not include it in full, but it contains databases like MongoDB, Cassandra (ACID support for single-row operations), and Couchbase.
Summary
As always, the summary is quick and simple: just look below.
Feature | ACID | BASE |
---|---|---|
Data Integrity | High consistency and durability | Eventual consistency, no durability guarantee |
Integration Complexity | Lower, most of the inconsistencies are handled on the DB side | Higher, we have to be aware of inconsistencies and their impact |
Performance | May degrade rapidly in a highly concurrent environment | Should handle more load without performance degradation |
Scalability | Vertical scaling with read replicas | Horizontal scaling |
Please keep in mind that this comparison is only focused on the theoretical difference between both approaches. It aims to deepen your knowledge of both of them.
When choosing an exact database, I would recommend focusing more on a product-to-product comparison. Most of the features and behaviors may differ depending on the exact database implementation.
Thank you for your time.