Transactions and Replication
One Copy Serializability
- Replicated transactional service
- Each replica manager provides concurrency control and recovery of its own data items in the same way as it would for non-replicated data
- Effects of transactions performed by various clients on replicated data items are the same as if they had been performed one at a time on a single set of data items
- Additional complications
- Failures should be serialized w.r.t. transactions, i.e., any failure observed by a transaction must appear to have happened before the transaction started
Replication Scheme
- Primary copy
- All client requests are directed to a single primary server
- Read one -- write all
- cannot handle network partitions
- Each write operation sets a write lock at each replica manager
- Each read sets a read lock at one replica manager
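The read-one/write-all locking rule above can be sketched in a few lines. This is a minimal illustration, not a full scheme: the `ReplicaManager` class, the use of a plain re-entrant lock in place of a true read/write lock, and the choice of the first replica for reads are all simplifying assumptions.

```python
import threading

class ReplicaManager:
    """One replica manager: holds its own lock and its copy of the item.
    (Hypothetical class; a plain RLock stands in for a read/write lock.)"""
    def __init__(self, value=0):
        self.lock = threading.RLock()
        self.value = value

def read_one(replicas):
    """Read one: set a read lock at a single replica manager and read it."""
    rm = replicas[0]                  # any available replica would do
    with rm.lock:
        return rm.value

def write_all(replicas, value):
    """Write all: set a write lock at every replica manager, then update."""
    for rm in replicas:
        rm.lock.acquire()             # write lock at each replica manager
    try:
        for rm in replicas:
            rm.value = value
    finally:
        for rm in replicas:
            rm.lock.release()

replicas = [ReplicaManager() for _ in range(3)]
write_all(replicas, 42)
print(read_one(replicas))            # 42
```

Because every write locks all replicas, any read quorum of one replica necessarily conflicts with a concurrent write, which is exactly why this scheme cannot make progress across a network partition.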
- Schemes that can handle network partitions
- Available copies with validation
- Quorum consensus
- Virtual Partition
Available Copies Replication
- Can handle the case when some replica managers are unavailable, either because they have failed or because of a communication failure
- Reads can be performed by any available replica manager but writes must be performed by all available replica managers
- Normal case is like read one/write all
- As long as the set of available replica managers does not change during a transaction
- RM failure case
- One copy serializability requires that failures and recoveries be serialized w.r.t. transactions
- This is not achieved when different transactions make conflicting failure observations
- Examples show that local concurrency control is not enough
- Additional concurrency control procedure (called local validation) has to be performed to ensure correctness
- Available copies with local validation assumes no network partitions
- i.e., functioning replica managers can communicate with one another.
- Local validation
- Before a transaction commits it checks for failures (and recoveries) of replica managers of data items it has accessed
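The commit-time check above can be sketched as follows. This is a deliberately simplified view, assuming local validation reduces to comparing the set of available replica managers at transaction start with the set at commit time; the function name and data representation are illustrative, not from the source.

```python
def local_validation(available_at_start, available_at_commit):
    """Sketch of local validation: commit only if no failure or recovery
    of a relevant replica manager was observed during the transaction.
    (Hypothetical helper: real schemes track only the RMs the
    transaction actually accessed.)"""
    return available_at_start == available_at_commit

# At transaction start, replica managers X, Y, Z are available.
seen = frozenset({"X", "Y", "Z"})
# ... transaction executes its reads and writes ...
now = frozenset({"X", "Z"})          # Y failed mid-transaction
print(local_validation(seen, now))   # False: the transaction must abort
```

Aborting whenever the observed set changes serializes the failure (or recovery) either before or after the transaction, which restores one-copy serializability.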
Handling Network Partitions
- Network partitions separate replica managers into two or more subgroups, in such a way that the members of a group can communicate with one another but members of different subgroups cannot communicate
- Optimistic approaches
- Available copies with validation
- Pessimistic approaches
Available Copies with Validation
- Available copies algorithm applied within each partition
- Maintains availability for read operations
- When partition is repaired, possibly conflicting transactions in separate partitions are validated
- The effects of a committed transaction that is aborted on validation will have to be undone
- Only feasible for applications where such compensating actions can be taken
- Validation
- Versions vector (write-write conflicts)
- Precedence graphs (each partition maintains a log of data items affected by the Read and Write operations of transactions)
- Log used to construct a precedence graph whose nodes are transactions and whose edges represent conflicts between Read and Write operations
- The graph corresponding to each single partition contains no cycles
- If the combined graph across partitions contains a cycle, validation fails.
Quorum Consensus
- A quorum is a subgroup of replica managers whose size gives it the right to carry out operations
- Majority voting is one instance of a quorum consensus scheme
- R + W > total number of votes in the group
- W > half of the total votes
- Ensures that each read quorum intersects each write quorum, and that any two write quorums intersect
- Each replica has a version number that is used to detect if the replica is up to date.
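The two quorum constraints and the version-number rule can be sketched together. This is a simplified model, assuming one vote per replica and replicas represented as plain dictionaries; function names are illustrative.

```python
def valid_quorum(r, w, n):
    """Check the quorum consensus constraints for n total votes:
    every read quorum meets every write quorum (r + w > n), and
    any two write quorums intersect (w > n/2)."""
    return r + w > n and 2 * w > n

def quorum_read(replicas, r):
    """Contact r replicas and take the copy with the highest version;
    the intersection property guarantees it reflects the latest write."""
    contacted = replicas[:r]                  # any r replicas would do
    return max(contacted, key=lambda rep: rep["version"])

replicas = [
    {"version": 2, "value": "new"},           # updated by the last write
    {"version": 1, "value": "old"},
    {"version": 1, "value": "old"},
]
print(valid_quorum(2, 2, 3))                  # True for R = W = 2, N = 3
print(quorum_read(replicas, 2)["value"])      # "new"
```

With R = W = 2 out of 3, every read quorum overlaps every write quorum in at least one replica, and the version number identifies which copy in the quorum is up to date.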
Virtual Partition Scheme
- Combines available copies and quorum consensus
- Virtual partition = set of replica managers that have a read and write quorum
- If a virtual partition can be formed, available copies is used
- Improves the performance of reads
- If a failure occurs, and virtual partition changes during a transaction, it is aborted
- Have to ensure virtual partitions do not overlap.
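The condition for forming a virtual partition can be sketched as a simple membership check. This is an assumption-laden sketch: it treats one replica manager as one vote and ignores the overlap-prevention protocol; the function name is hypothetical.

```python
def can_form_virtual_partition(reachable, read_quorum, write_quorum):
    """A virtual partition can be formed only if the reachable replica
    managers contain both a read quorum and a write quorum."""
    return len(reachable) >= read_quorum and len(reachable) >= write_quorum

# 5 replicas total, read quorum 2, write quorum 3:
print(can_form_virtual_partition({"A", "B", "C"}, 2, 3))   # True
print(can_form_virtual_partition({"A", "B"}, 2, 3))        # False
```

Inside a formed virtual partition the cheaper available copies algorithm runs (reads go to one replica), while the quorum requirement for forming the partition in the first place preserves correctness across real network partitions.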
CAP Conjecture
- Is it possible to achieve consistency, availability, and partition tolerance?
- Classic distributed systems: focused on ACID semantics
- Atomic
- Consistent
- Isolated
- Durable
- Modern internet systems: focused on BASE
- Basically Available
- Soft-state (or scalable)
- Eventually consistent
- ACID vs. BASE
Why the Divide
- What goals might you want for a shared-data system?
- consistency
- availability
- partition tolerance
- Strong consistency
- all clients see the same view, even in the presence of updates
- High availability
- All clients can find some replica of the data, even in the presence of failures
- Partition-tolerance
- The system properties hold even when the system is partitioned.
CAP Conjecture
- You can only have two out of these three properties.
- The choice of which feature to discard determines the nature of your system.
Consistency and Availability
- Comment
- Providing transaction semantics requires all nodes to be in contact with each other
- Example
- Single-site clustered databases
- Typical features
- Two-phase commit
- Cache invalidation protocol
- Classic distributed system style
Consistency and Partition Tolerance
- Comment
- If one is willing to tolerate system-wide blocking, then can provide consistency even when there are temporary partitions
- Examples
- Distributed databases
- Distributed locking
- Quorum (majority) protocols
- Typical features
- Pessimistic locking
- Minority partitions unavailable
- Also common distributed system
- voting vs. primary replicas
Partition-Tolerance and Availability
- Comment
- Examples
- Typical features
- TTLs and lease cache management
- Optimistic updating with conflict resolution
Techniques
- Expiration-based caching: AP
- Quorum/majority algorithm: PC
- Two phase commit: AC
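The first technique in the list, expiration-based caching, can be sketched as a TTL cache that keeps serving a possibly stale entry until it expires, trading consistency for availability (the AP corner). The class and its interface are illustrative assumptions, not from the source.

```python
import time

class TTLCache:
    """Expiration-based cache (hypothetical sketch): an entry is served
    until its TTL lapses, so reads stay available during a partition at
    the cost of possibly returning stale data."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}                       # key -> (value, expiry time)

    def put(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        value, expiry = self.store.get(key, (None, 0.0))
        if time.monotonic() < expiry:
            return value                      # possibly stale, but available
        self.store.pop(key, None)             # expired: caller must refetch
        return None

cache = TTLCache(ttl_seconds=0.05)
cache.put("x", 1)
print(cache.get("x"))    # 1 (served from cache)
time.sleep(0.06)
print(cache.get("x"))    # None (entry expired)
```

The TTL bounds the staleness window, which is the "eventually consistent" half of the bargain: once the entry expires, the client is forced back to the origin and picks up the latest value.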