Serial Execution, Serializability, Locking, Inconsistent Analysis

<< Incremental Log with Deferred, Immediate Updates, Concurrency Control

Locking Idea, DeadLock Handling, Deadlock Resolution, Timestamping rules >>

Database Management System (CS403)

Lecture No. 44

Reading Material

"Database Systems Principles, Design and Implementation"

Chapter 12

written by Catherine Ricardo, Maxwell Macmillan.

"Database Management Systems" Second edition written by

Chapter 18

Raghu Ramakrishnan, Johannes Gehkre.

Overview of Lecture

o Concurrency control problems

o Serial and interleaved schedules

o Serializability theory

o Introduction to locking

We are studying the concurrency control (CC) mechanism. In the previous lecture we

discussed that concurrent access means the simultaneous access of data from database

by multiple users. The concurrency control (CC) concerns maintaining the

consistency of the database during concurrent access. We also studied that if

concurrent access is not controlled properly then database may encounter three

different problems which may turn database into an inconsistence state. One of these

problems we discussed in the previous lecture, now we will discuss the remaining two.

Uncommitted Update Problem

It reflects a situation when a transaction updates an object and another transaction

reads this updated value but later the first transaction is aborted. The problem in this

situation is that the second transaction reads the value updated by an aborted

transaction. This situation is shown in the table below:

310

Database Management System (CS403)

TIME

BAL

Read (BAL)

.......

1000

BAL=BAL+100

........

1000

Write (BAL)

........

1100

.......

Read (BAL)

1100

.......

BAL=BAL*1.5

1100

.......

Write (BAL)

2650

ROLLBACK

........

Table 1: Uncommitted update problem

As is shown in the table, at time t2, transaction TA updates the value of object `BAL'

by adding 100 in its initial value 1000 and at time t3 it writes this updated value to

database. So `BAL' is written as 1100. Now at time t4, TB reads the value of `BAL'

and it gets 1100 then TB updates the value multiplying it by 1.5 and writes it at time

t6. So the new value written for `BAL' is 2650. However at time t7 transaction TA is

rolled back as a result of this rollback the changes made by TA stand cancelled, but

TB has already performed certain processing on the value set by TA. This is an

inconsistent state of the database.

Inconsistent Analysis

This problem of concurrent access occurs when one transaction operates on many

records, meanwhile another modifies some of them effecting result of first. This

problem is expressed in the following table:

311

Database Management System (CS403)

TIME

Read (BAL_A) (5000)

.......

INT = BAL_A * .05

........

TOT = TOT + INT

........

.......

Read (BAL_A)

.......

BAL_A=BAL_A 1000

.......

Write (BAL_A)

.......

Read (BAL_E) (5000)

.......

BAL_E = BAL_E + 1000

t10

.......

Write (BAL_E)

t11

Read (BAL_E) (6000)

........

t12

INT = BAS_E * .05

........

t13

TOT = TOT + INT

........

Table 2: Inconsistent analysis problem

Suppose Tr-A computes interest on all accounts balances. For this, TA reads the

balance of each account multiplies it with 0.05 and adds this interest amount into a

variable TOT. Now by time t5, TA has read the balance of account `A' (BAL_A) and

computed interest on it, it has to perform same process on all the accounts. In the

meanwhile from time t5 to t10 another transaction TB subtracts a value from balance

of account `A' and adds this value to the balance of account `E'. When transaction TA

reaches the account `E' to compute interest on it, the value added in it by TB is also

considered which as a matter of fact is being considered second time. First time from

account `A' and now from account `E'. The analysis obtained through transaction TA

will be wrong at the end.

We have discussed three problems of concurrent access; the CC mechanism has to

make sure that these problems do not occur in the database. In the following, we will

discuss how it is done. We will start by studying some basic concepts.

Serial Execution

Serial execution is an execution where transactions are executed in a sequential order,

that is, one after another. A transaction may consist of many operations. Serial

execution means that all the operations of one transaction are executer first, followed

by all the operations of the next transaction and like that. A Schedule or History is a

list of operations from one or more transactions. A schedule represents the order of

execution of operations. The order of operations in a schedule should be the same as

in the transaction. Schedule for a serial execution is called a serial schedule, so in a

serial schedule all operations of one transactions are listed followed by all the

312

Database Management System (CS403)

operations of another transactions and so on. With a given set of transactions, we can

have different serial schedules. For example, if we have two transactions, then we can

have two different serial schedules as is explained in the table below:

BAL

Serial Sched 1

Serial Sched 2

BAL

TA, TB

TB, TA

Read (BAL)A

Read (BAL)B

(BAL = BAL 10)A

(BAL=BAL * 2)B

Write (BAL)A

Write (BAL)B

100

Read (BAL)B

Read (BAL)A

100

(BAL=BAL * 2)B

(BAL = BAL 10)A

100

Write (BAL)B

Write (BAL)A

Table 3: Two different serial schedules, TA, TB and TB, TA

The table shows two different schedules of two transactions TA and TB. The

subscript with each operation shows the transaction to which the operation belongs.

For example, Read (BAL)A means the read operation of TA to read the object `BAL'.

By looking at the table that in each serial schedule; all the operations of one

transaction are executed first followed by those of the other transaction. An important

point to be noted here is that different serial schedules of the same transactions may

result in different final state of the database. As we can see in the table 3, final value

of the object `BAL' is 80 with the serial schedule TA, TB and it is 90 with serial

schedule TB, TA. However, it is guaranteed that a serial schedule always leaves the

database in a consistent state. In other words, the three problems of concurrency that

we studies earlier will not occur if a serial schedule is followed.

The serial schedule ensures the consistency of the database, so should we prefer serial

schedule? The answer is no. Because serial execution is badly under utilization of

resources and users will not like it at all since they will have to wait a lot for the

transactions' execution. Suppose we are following a serial execution and a transaction

Tn has to executed completely before any other transaction may start. Lets say Tn

needs an input from the user who initiated it, and by chance users gets stuck

somewhere or is thinking, unless and until the users does not enter a value, transaction

Tn cannot proceed and all other transactions are waiting for their turn to execute. If

313

Database Management System (CS403)

allowed to happen, this will be a much disliked situation. Therefore serial schedules

are not preferred for execution.

Contrary to serial schedule is an interleaved schedule in which transactions' execution

is interleaved, that is, operations of different transactions are intermix with each other.

However, the internal order of the operations is not changed during this intermixing.

Interleave schedule provides more efficient execution since control is switched among

transaction, so on one side it makes a better use of the resources and on the other hand

if one transaction is stuck or halted due to some reasons, the processing can go on

with other transactions. Following figure presents serial and interleaved schedules

among two transactions TA and TB

TA = {Read(X), Write(X), commit}

TB = {Read(Y), Read(X), Write(Y), commit}

S1 = {RA(X), WA(X), CA, RB(Y), RB(X), WB(Y), CB}

S2 = {RB(Y), RB(X), WB(Y), CB, RA(X), WA(X), CA}

S3 = {RB(Y), RB(X), RA(X), WB(Y), CB, WA(X), CA}

S4 = {RA(X), RB(Y), RB(X), WA(X), CA, WB(Y), CB}

Fig. 1: Two transactions and different schedules

In figure 1, we have two transactions TA and TB consisting of different operations.

S1, S2, S3 and S4 are four different schedules of these transactions. S1 and S2 are two

serial schedules where as S3 and S4 are two interleaved schedules. Obviously we can

have just two serial schedules of two transactions, but we can have many other

interleaved schedules of these transactions.

Interleaved schedules are preferred but they may generate three of the concurrent

access problems if not controlled properly. Before discussing the solution to this

problem, lets see what is the actual problem in concurrent access.

There are different situations during concurrent access of the data:

Different transactions accessing (reading or writing) different objects

Different transactions reading same object

314

Database Management System (CS403)

Different transactions accessing same object and one or all writing it

The first two of these situations do not create any problem during concurrent access,

so we do not need to worry in these situations. However, third situation is precisely

the one that creates the concurrency problems and we have to control this situation

properly. Third situation introduces the concept of conflicting operations; the

operations that are accessing the same object and one of them writing the object. The

transactions that contain conflicting operations are called conflicting transactions.

Basically, in concurrency control we have to take care of the conflicting transactions

and for that we have to study the concept of serializability.

Serializability

An interleaved schedule is said to be serializable if its final state is equivalent to some

serial schedule. Serializable schedule can also be defined as a schedule in which

conflicting operations are in a serial order. The serializable schedule ensures the

consistency of the database. However, this is worthwhile to mention here again that

serializability takes care of the inconsistencies only due to the interleaving of

transactions. The serializability concerns generating serializable schedules or we can

say that the purpose of the CC mechanism is to ensure serializability of the schedules.

Generating serializable schedule involves taking care of only conflicting operations.

We have to adopt a particular serial schedule among the conflicting operations; the

non-conflicting operations can be interleaved to any order, it will not create any

problem at all. There are two major approaches to implement the serializability;

Locking and Timestamping. We will discuss both in detail.

Locking

The basic idea of locking is that an object is locked prior to performing any operation.

When the operation has been performed the lock is released. Locking is maintained by

the transaction manager or more precisely lock manager. Transactions apply locks on

objects. During the time when an object is locked it is available for operations only to

the transaction that has got the lock on that object through lock manager. When an

item is locked and during that time another transaction needs to perform some

315

Database Management System (CS403)

operation on that object, the transaction applies for the lock on the object but since the

object is already locked, this transaction will have to wait. So it enters into a wait state.

When first transaction finishes its job, it releases locks on items then the second item

gets the lock on the object and then it can proceed.

Transactions perform two types of operations on objects; read or write. That is, a

transaction either performs a read operation on an object or it writes it. Accordingly

there are two types of locks as well. A read or shared lock and a write or exclusive

lock a transaction applies lock according to the nature of operation that it wants to

perform on a particular object. If a transaction TB applies a lock on an object O, the

lock manager first checks if O is free, it is free then TB gets the desired lock on O.

However, if O is already locked by another transaction say TA then lock manager

checks the compatibility of the locks; the one that has already been assigned and the

one that has been applied now. The compatibility of locks means that if the two locks

from two different transactions may exist at the same time. The compatibility of locks

is checked according to the following table:

Transaction A

Read Write

Transaction B

Read

Yes

Write

Table shows that if a transaction A has got a read lock on an object and transaction B

applies for the read lock, B will be granted this lock. It means two read locks are

compatible with each other, that is, they can exist at the same time. Therefore two or

even more than two transactions can read an object at the same time. Other cells

contain `No'; it means these locks are not compatible. So if a transaction has got a

`write' lock on an object and another transaction applies for a `read' or `write' lock,

the lock will not be granted and the transaction applying for the lock later, will have

to wait.

That is all for today's lecture. The locking mechanism will be discussed in detail in

the next lecture.

Summary

In this lecture we discussed two of the three CC problems. Then we discussed that

there are different types of schedules that determine the sequence of execution of

316

Database Management System (CS403)

operations form different transactions. A serial schedule maintains the consistency of

the database for sure, but serial schedule is not much useful. Interleaved schedule is

preferred but if not controlled properly an interleaved schedule may cause consistency

problems. So CC is based on the serializability theory that controls the order of

conflicting operations. The serializability is applied using locking or Timestamping.

Locking is based on two types of locks; shared and exclusive. Only two shared locks

are compatible with each other, none of the remaining combinations are compatible.

317

Table of Contents: