In an earlier post (Eventual State Consistency vs. Eventual Path Consistency), I wrote about the difference between two types of Eventual Consistency. While Eventual State Consistency is the more familiar variety, there are applications that require Eventual Path Consistency.
We are of course discussing Eventually Consistent application configurations that relax Consistency in favor of Availability and Partition tolerance per the CAP theorem. If either of Availability and Partition tolerance can be relaxed, Consistency is easily provided, for example, by ensuring that a transaction is committed at the remote instance as part of committing it at the local instance. The case where Consistency can be relaxed is not just more desirable in an “always on” business environment where the competition is just mouse clicks away - it is also technically more challenging (and hence more interesting to a geek like me).
In order to provide Eventual Consistency, an application has to (a) detect when two instances have diverged - i.e., are mutually Inconsistent - and (b) bring them back to a Consistent state.
The distinction between Eventual State Consistency and Eventual Path Consistency applies to all types of databases, relational (“SQL”) as well as non-relational (“NoSQL”). Consistency, however, is relevant only to databases that support Atomic, Consistent, Isolated and Durable (ACID) transactions where a transaction has a well defined beginning (transaction start) and a well defined end (transaction commit). Without transactions, every update is in theory Consistent and relies on all data read from the very first update to a newly created database - in other words, while Consistency may be theoretically meaningful without transactions, it is not a useful concept in practice.
The problem to be solved is summarized thus:
- Two instances, A and B, independently execute logically equivalent application code and commit Durable updates to their local databases without checking for Consistency with respect to the state of the other instance.
- In conjunction with committing updates locally, each instance transmits information to the other instance to replicate the transaction. The local commit is required (otherwise Durability is not satisfied), but Eventual Consistency means that remote receipt and commit is not required for a local commit.
- The remote instance must determine from the transmitted information whether applying the transaction would make its database Inconsistent.
The brute force approach is simply to re-execute application code for the transaction on the remote instance. If the results on the remote instance are different from those originally computed, an Inconsistency has occurred. But brute force has its limitations and is overkill for most real world problems that I know of.
To determine Inconsistency in a replicated environment, an application needs to look at relative times. Consider a transaction an computed and committed by A and replicated to B. In reverse chronological order:
- T0 is the time at which B performs the check for Consistency to commit an.
- T-1 is the time at which an was committed on A. A provides ACID properties for an at T-1.
- At time T-1 the database on A is current and Consistent with transactions committed on B up to time T-2.
In order for B to determine whether an can be safely committed, it must compare an with transactions it computed and committed between T-2 and T0. Transactions committed before T-2 do not need to be examined because they were replicated to A by T-1, and hence application logic A would have ensured Consistency of an with respect to them. Transactions replicated from A and committed before T0 also do not need to be examined, because database serialization on A would have ensured their Consistency with respect to all updates computed and committed on A prior to an.
In order to determine Consistency, the instances therefore need a common measure of time. This does not need to be time as we intuitively understand it - for example, they could have a common understanding of the number of steps through state space that they have taken. But the understanding must be common (fuzziness should be bounded and its limits known), and the measure must increase monotonically.
Detecting Inconsistency for Eventual State Consistency is a straightforward proposition: check whether any update to the database state in B computed and committed by transactions between T-2 and T0 conflicts with any update in an (a “Consistent update check”). Updates in the state of B replicated from A and further modified by an should be ignored because (a) the application logic on A would have ensured Consistency and (b) the fact that the replicated updates were committed means that they did not conflict with any update on B.
Detecting Inconsistency for Eventual Path Consistency is a little harder. Instance B must check:
- whether any elements of the database state that A used or read to compute an were changed by B between T-2 and T-1 (this “Consistent use check” is not required for Eventual State Consistency), and
- whether any update to the database state in B computed and committed by transactions between T-2 and T0 conflicts with any update in an (this “Consistent update check” is the same as for Eventual State Consistency).
- The Consistent use check needs to ensure Consistency not just of data whose values were read, but also data whose absence affects results.
- Any transactions from A following a replicated transaction that cannot be committed on B can also not be committed, because that would violate the serialization of those transactions.
- Generalizing from two mutually replicating instances to three is possible in theory but hard in practice. Depending on replication delays, it is possible for an instance C, which has computed and committed no transactions of its own to detect Inconsistency in transactions computed by instances A and B, neither of which detects Inconsistency in the replication stream from the other.
When Inconsistency is detected, steps must be taken to restore Consistency.
The problem to be solved is summarized thus:
- An instance A computed and committed a transaction an (perhaps with an externally visible action, an ATM withdrawal authorization or a ticket issued to a concert or sporting event).
- Transaction an was replicated to instance B.
- Instance B has detected that committing the transaction would violate Consistency with respect to at least one transaction in its database, bp (if there are multiple such transactions, bp is the earliest).
A requirement for any solution is that every transaction must have a correcting (or inverse) transaction. Otherwise, the application is deadlocked because A and B have both committed conflicting updates that may be externally visible. Ensuring an inverse transaction for every transaction may require changes to application Consistency rules.
The simplest approach to achieving Eventual State Consistency is for B to compute and commit a transaction BnX which corrects the committing of an by instance B. In turn, BnX will get replicated to instance A. Such an approach should suffice for many Eventual State Consistency needs.
But the simplest approach will not produce Eventual Path Consistency since instances A and B will have different paths through state space. Even for State Consistency, approach may not always work. For example, in some applications, an Inconsistency with an may mean that the probability increases that an+1, an+2, etc. will also be detected as Inconsistent, causing an upward spiral of correcting entries.
A more useful approach is to designate one instance as the liege and the other as the vassal - the designations can be swapped at any time, of course. To resolve Inconsistencies, the liege is always right. In this example, assume that A is the liege and B the vassal. Corrective action occurs thus:
- Using its journal files, B rolls back its database to the state it was in when transaction bp-1 was committed. Because of update serialization, transactions rolled off would include transactions computed and committed by B as well as those replicated from A. The rolled off transactions are saved.
- The database state of B is now no longer Inconsistent with the state of A. Replication can now resume, catching up from the rollback point, as can normal transaction processing on B.
- B creates a reconciliation log from the file with the rolled off transactions with transactions originally computed and committed by A and replicated to B filtered out - these are already processed and committed on A and do not need to be reprocessed. It sends this reconciliation log to A.
- A reads and reprocesses the transactions in the reconciliation log. If the result of A reprocessing a transaction generates the same result, it simply commits the transaction to its database. If the result of A reprocessing a transaction results in a different result, A computes and commits the correcting transaction.
- The results of A reprocessing the reconciliation log are replicated to B in the normal course.
It is of course not always the case that B detects an Inconsistency. It is just as possible for A to detect an Inconsistency when replicating from B. In this case, A as the liege does not rollback its database. Instead, A commands B to rollback its database to the state immediately prior to the transaction that would cause an Inconsistency at A if committed. In either case, B rolls back its database - the only difference is whether it determines the rollback point or whether A sends it the rollback point. Also, A discards the replication stream from B until after B rolls back its database.
For Eventual Path Consistency to be successfully deployed, the application and database engine must cooperate. The application must be well behaved: in particular, the probability of Inconsistencies detected on replication must be low. It may be possible to reconfigure some otherwise ill behaved applications to make them behave better, for example, by preferential routing to the liege instance of transactions that are likely to “collide”, since transactions processed and committed by the liege are not rolled back and reprocessed.
In a future blog post, I will describe the pragmatic trade-offs of a real-world application that implements Eventual Consistency.
- The application code at the two instances does not have to be identical. What is required is that given the same set of inputs, they generate database state changes that are logically identical (they must be identical for state elements material to the application, such as the balance in an account; there may be state that is not material, such as the process id on the instance that computed the state change).
- The absence of data may be as important as its presence. Consider the case where a database has nodes (or rows) where the first subscript (or primary key component) is the last name of US Presidents. An application may test for the presence of “Obama” (testing for the presence of data) or it may check whether or not the successor to “Nixon” is “Pierce” (testing for the absence of data).
- While this is a sufficient condition, it is not a necessary condition. In theory, only those transactions committed on A that relied on the database updates committed by the conflicting transaction cannot be committed. In practice, tracking the dependencies and would create a forest of dependency trees that could in theory spiral out of control and lead to computational meltdown.
- In order to implement Eventual Consistency, changes to application Consistency rules are likely to be mandatory. Consider an example where the application implements a Consistency rule such as “no seat to a concert can have more than one owner”. If instances A and B sell the same seat at the same time, there is no way to restore Consistency. So, application Consistency rule has to be changed to something like “no seat to a concert can have more than one owner unless an action item to correct the situation has been put in the Customer Service work queue.” If two concurrent ATM withdrawals have depleted the balance in an account, the inverse transaction may be to put the account into overdrawn status and perhaps charge the owner(s) a service fee.
- As an example of an upward spiral of correcting entries, in the hypothetical sports seating application, the probability of “collisions” increases as the venue starts to sell out, and if the correcting action is to generate a ticket for Customer Service, the volume of calls needed could soon get out of hand, let alone the problem of irate customers.