ZeePedia

Serial Execution, Serializability, Locking, Inconsistent Analysis

<< Incremental Log with Deferred, Immediate Updates, Concurrency Control
Locking Idea, DeadLock Handling, Deadlock Resolution, Timestamping rules >>
img
Database Management System (CS403)
VU
Lecture No. 44
Reading Material
"Database Systems Principles, Design and Implementation"
Chapter 12
written by Catherine Ricardo, Maxwell Macmillan.
"Database Management Systems" Second edition written by
Chapter 18
Raghu Ramakrishnan, Johannes Gehkre.
Overview of Lecture
o Concurrency control problems
o Serial and interleaved schedules
o Serializability theory
o Introduction to locking
We are studying the concurrency control (CC) mechanism. In the previous lecture we
discussed that concurrent access means the simultaneous access of data from database
by multiple users. The concurrency control (CC) concerns maintaining the
consistency of the database during concurrent access. We also studied that if
concurrent access is not controlled properly then database may encounter three
different problems which may turn database into an inconsistence state. One of these
problems we discussed in the previous lecture, now we will discuss the remaining two.
Uncommitted Update Problem
It reflects a situation when a transaction updates an object and another transaction
reads this updated value but later the first transaction is aborted. The problem in this
situation is that the second transaction reads the value updated by an aborted
transaction. This situation is shown in the table below:
310
img
Database Management System (CS403)
VU
TIME
TA
TB
BAL
t1
Read (BAL)
.......
1000
t2
BAL=BAL+100
........
1000
t3
Write (BAL)
........
1100
t4
.......
Read (BAL)
1100
t5
.......
BAL=BAL*1.5
1100
t6
.......
Write (BAL)
2650
t7
ROLLBACK
........
?
Table 1: Uncommitted update problem
As is shown in the table, at time t2, transaction TA updates the value of object `BAL'
by adding 100 in its initial value 1000 and at time t3 it writes this updated value to
database. So `BAL' is written as 1100. Now at time t4, TB reads the value of `BAL'
and it gets 1100 then TB updates the value multiplying it by 1.5 and writes it at time
t6. So the new value written for `BAL' is 2650. However at time t7 transaction TA is
rolled back as a result of this rollback the changes made by TA stand cancelled, but
TB has already performed certain processing on the value set by TA. This is an
inconsistent state of the database.
Inconsistent Analysis
This problem of concurrent access occurs when one transaction operates on many
records, meanwhile another modifies some of them effecting result of first. This
problem is expressed in the following table:
311
img
Database Management System (CS403)
VU
TIME
TA
TB
t1
Read (BAL_A) (5000)
.......
t2
INT = BAL_A * .05
........
t3
TOT = TOT + INT
........
t4
.......
.......
t5
.......
Read (BAL_A)
t6
.......
BAL_A=BAL_A ­ 1000
t7
.......
Write (BAL_A)
t8
.......
Read (BAL_E) (5000)
t9
.......
BAL_E = BAL_E + 1000
t10
.......
Write (BAL_E)
t11
Read (BAL_E) (6000)
........
t12
INT = BAS_E * .05
........
t13
TOT = TOT + INT
........
Table 2: Inconsistent analysis problem
Suppose Tr-A computes interest on all accounts balances. For this, TA reads the
balance of each account multiplies it with 0.05 and adds this interest amount into a
variable TOT. Now by time t5, TA has read the balance of account `A' (BAL_A) and
computed interest on it, it has to perform same process on all the accounts. In the
meanwhile from time t5 to t10 another transaction TB subtracts a value from balance
of account `A' and adds this value to the balance of account `E'. When transaction TA
reaches the account `E' to compute interest on it, the value added in it by TB is also
considered which as a matter of fact is being considered second time. First time from
account `A' and now from account `E'. The analysis obtained through transaction TA
will be wrong at the end.
We have discussed three problems of concurrent access; the CC mechanism has to
make sure that these problems do not occur in the database. In the following, we will
discuss how it is done. We will start by studying some basic concepts.
Serial Execution
Serial execution is an execution where transactions are executed in a sequential order,
that is, one after another. A transaction may consist of many operations. Serial
execution means that all the operations of one transaction are executer first, followed
by all the operations of the next transaction and like that. A Schedule or History is a
list of operations from one or more transactions. A schedule represents the order of
execution of operations. The order of operations in a schedule should be the same as
in the transaction. Schedule for a serial execution is called a serial schedule, so in a
serial schedule all operations of one transactions are listed followed by all the
312
img
Database Management System (CS403)
VU
operations of another transactions and so on. With a given set of transactions, we can
have different serial schedules. For example, if we have two transactions, then we can
have two different serial schedules as is explained in the table below:
BAL
Serial Sched 1
Serial Sched 2
BAL
TA, TB
TB, TA
Read (BAL)A
Read (BAL)B
50
50
(BAL = BAL ­ 10)A
50
(BAL=BAL * 2)B
50
Write (BAL)A
40
Write (BAL)B
100
40
Read (BAL)B
Read (BAL)A
100
(BAL=BAL * 2)B
80
(BAL = BAL ­ 10)A
100
Write (BAL)B
80
Write (BAL)A
90
Table 3: Two different serial schedules, TA, TB and TB, TA
The table shows two different schedules of two transactions TA and TB. The
subscript with each operation shows the transaction to which the operation belongs.
For example, Read (BAL)A means the read operation of TA to read the object `BAL'.
By looking at the table that in each serial schedule; all the operations of one
transaction are executed first followed by those of the other transaction. An important
point to be noted here is that different serial schedules of the same transactions may
result in different final state of the database. As we can see in the table 3, final value
of the object `BAL' is 80 with the serial schedule TA, TB and it is 90 with serial
schedule TB, TA. However, it is guaranteed that a serial schedule always leaves the
database in a consistent state. In other words, the three problems of concurrency that
we studies earlier will not occur if a serial schedule is followed.
The serial schedule ensures the consistency of the database, so should we prefer serial
schedule? The answer is no. Because serial execution is badly under utilization of
resources and users will not like it at all since they will have to wait a lot for the
transactions' execution. Suppose we are following a serial execution and a transaction
Tn has to executed completely before any other transaction may start. Lets say Tn
needs an input from the user who initiated it, and by chance users gets stuck
somewhere or is thinking, unless and until the users does not enter a value, transaction
Tn cannot proceed and all other transactions are waiting for their turn to execute. If
313
img
Database Management System (CS403)
VU
allowed to happen, this will be a much disliked situation. Therefore serial schedules
are not preferred for execution.
Contrary to serial schedule is an interleaved schedule in which transactions' execution
is interleaved, that is, operations of different transactions are intermix with each other.
However, the internal order of the operations is not changed during this intermixing.
Interleave schedule provides more efficient execution since control is switched among
transaction, so on one side it makes a better use of the resources and on the other hand
if one transaction is stuck or halted due to some reasons, the processing can go on
with other transactions. Following figure presents serial and interleaved schedules
among two transactions TA and TB
TA = {Read(X), Write(X), commit}
TB = {Read(Y), Read(X), Write(Y), commit}
S1 = {RA(X), WA(X), CA, RB(Y), RB(X), WB(Y), CB}
S2 = {RB(Y), RB(X), WB(Y), CB, RA(X), WA(X), CA}
S3 = {RB(Y), RB(X), RA(X), WB(Y), CB, WA(X), CA}
S4 = {RA(X), RB(Y), RB(X), WA(X), CA, WB(Y), CB}
Fig. 1: Two transactions and different schedules
In figure 1, we have two transactions TA and TB consisting of different operations.
S1, S2, S3 and S4 are four different schedules of these transactions. S1 and S2 are two
serial schedules where as S3 and S4 are two interleaved schedules. Obviously we can
have just two serial schedules of two transactions, but we can have many other
interleaved schedules of these transactions.
Interleaved schedules are preferred but they may generate three of the concurrent
access problems if not controlled properly. Before discussing the solution to this
problem, lets see what is the actual problem in concurrent access.
There are different situations during concurrent access of the data:
·
Different transactions accessing (reading or writing) different objects
·
Different transactions reading same object
314
img
Database Management System (CS403)
VU
·
Different transactions accessing same object and one or all writing it
The first two of these situations do not create any problem during concurrent access,
so we do not need to worry in these situations. However, third situation is precisely
the one that creates the concurrency problems and we have to control this situation
properly. Third situation introduces the concept of conflicting operations; the
operations that are accessing the same object and one of them writing the object. The
transactions that contain conflicting operations are called conflicting transactions.
Basically, in concurrency control we have to take care of the conflicting transactions
and for that we have to study the concept of serializability.
Serializability
An interleaved schedule is said to be serializable if its final state is equivalent to some
serial schedule. Serializable schedule can also be defined as a schedule in which
conflicting operations are in a serial order. The serializable schedule ensures the
consistency of the database. However, this is worthwhile to mention here again that
serializability takes care of the inconsistencies only due to the interleaving of
transactions. The serializability concerns generating serializable schedules or we can
say that the purpose of the CC mechanism is to ensure serializability of the schedules.
Generating serializable schedule involves taking care of only conflicting operations.
We have to adopt a particular serial schedule among the conflicting operations; the
non-conflicting operations can be interleaved to any order, it will not create any
problem at all. There are two major approaches to implement the serializability;
Locking and Timestamping. We will discuss both in detail.
Locking
The basic idea of locking is that an object is locked prior to performing any operation.
When the operation has been performed the lock is released. Locking is maintained by
the transaction manager or more precisely lock manager. Transactions apply locks on
objects. During the time when an object is locked it is available for operations only to
the transaction that has got the lock on that object through lock manager. When an
item is locked and during that time another transaction needs to perform some
315
img
Database Management System (CS403)
VU
operation on that object, the transaction applies for the lock on the object but since the
object is already locked, this transaction will have to wait. So it enters into a wait state.
When first transaction finishes its job, it releases locks on items then the second item
gets the lock on the object and then it can proceed.
Transactions perform two types of operations on objects; read or write. That is, a
transaction either performs a read operation on an object or it writes it. Accordingly
there are two types of locks as well. A read or shared lock and a write or exclusive
lock a transaction applies lock according to the nature of operation that it wants to
perform on a particular object. If a transaction TB applies a lock on an object O, the
lock manager first checks if O is free, it is free then TB gets the desired lock on O.
However, if O is already locked by another transaction say TA then lock manager
checks the compatibility of the locks; the one that has already been assigned and the
one that has been applied now. The compatibility of locks means that if the two locks
from two different transactions may exist at the same time. The compatibility of locks
is checked according to the following table:
Transaction A
Read Write
Transaction B
Read
Yes
No
Write
No
No
Table shows that if a transaction A has got a read lock on an object and transaction B
applies for the read lock, B will be granted this lock. It means two read locks are
compatible with each other, that is, they can exist at the same time. Therefore two or
even more than two transactions can read an object at the same time. Other cells
contain `No'; it means these locks are not compatible. So if a transaction has got a
`write' lock on an object and another transaction applies for a `read' or `write' lock,
the lock will not be granted and the transaction applying for the lock later, will have
to wait.
That is all for today's lecture. The locking mechanism will be discussed in detail in
the next lecture.
Summary
In this lecture we discussed two of the three CC problems. Then we discussed that
there are different types of schedules that determine the sequence of execution of
316
img
Database Management System (CS403)
VU
operations form different transactions. A serial schedule maintains the consistency of
the database for sure, but serial schedule is not much useful. Interleaved schedule is
preferred but if not controlled properly an interleaved schedule may cause consistency
problems. So CC is based on the serializability theory that controls the order of
conflicting operations. The serializability is applied using locking or Timestamping.
Locking is based on two types of locks; shared and exclusive. Only two shared locks
are compatible with each other, none of the remaining combinations are compatible.
317
Table of Contents:
  1. Introduction to Databases and Traditional File Processing Systems
  2. Advantages, Cost, Importance, Levels, Users of Database Systems
  3. Database Architecture: Level, Schema, Model, Conceptual or Logical View:
  4. Internal or Physical View of Schema, Data Independence, Funct ions of DBMS
  5. Database Development Process, Tools, Data Flow Diagrams, Types of DFD
  6. Data Flow Diagram, Data Dictionary, Database Design, Data Model
  7. Entity-Relationship Data Model, Classification of entity types, Attributes
  8. Attributes, The Keys
  9. Relationships:Types of Relationships in databases
  10. Dependencies, Enhancements in E-R Data Model. Super-type and Subtypes
  11. Inheritance Is, Super types and Subtypes, Constraints, Completeness Constraint, Disjointness Constraint, Subtype Discriminator
  12. Steps in the Study of system
  13. Conceptual, Logical Database Design, Relationships and Cardinalities in between Entities
  14. Relational Data Model, Mathematical Relations, Database Relations
  15. Database and Math Relations, Degree of a Relation
  16. Mapping Relationships, Binary, Unary Relationship, Data Manipulation Languages, Relational Algebra
  17. The Project Operator
  18. Types of Joins: Theta Join, Equi–Join, Natural Join, Outer Join, Semi Join
  19. Functional Dependency, Inference Rules, Normal Forms
  20. Second, Third Normal Form, Boyce - Codd Normal Form, Higher Normal Forms
  21. Normalization Summary, Example, Physical Database Design
  22. Physical Database Design: DESIGNING FIELDS, CODING AND COMPRESSION TECHNIQUES
  23. Physical Record and De-normalization, Partitioning
  24. Vertical Partitioning, Replication, MS SQL Server
  25. Rules of SQL Format, Data Types in SQL Server
  26. Categories of SQL Commands,
  27. Alter Table Statement
  28. Select Statement, Attribute Allias
  29. Data Manipulation Language
  30. ORDER BY Clause, Functions in SQL, GROUP BY Clause, HAVING Clause, Cartesian Product
  31. Inner Join, Outer Join, Semi Join, Self Join, Subquery,
  32. Application Programs, User Interface, Forms, Tips for User Friendly Interface
  33. Designing Input Form, Arranging Form, Adding Command Buttons
  34. Data Storage Concepts, Physical Storage Media, Memory Hierarchy
  35. File Organizations: Hashing Algorithm, Collision Handling
  36. Hashing, Hash Functions, Hashed Access Characteristics, Mapping functions, Open addressing
  37. Index Classification
  38. Ordered, Dense, Sparse, Multi-Level Indices, Clustered, Non-clustered Indexes
  39. Views, Data Independence, Security, Vertical and Horizontal Subset of a Table
  40. Materialized View, Simple Views, Complex View, Dynamic Views
  41. Updating Multiple Tables, Transaction Management
  42. Transactions and Schedules, Concurrent Execution, Serializability, Lock-Based Concurrency Control, Deadlocks
  43. Incremental Log with Deferred, Immediate Updates, Concurrency Control
  44. Serial Execution, Serializability, Locking, Inconsistent Analysis
  45. Locking Idea, DeadLock Handling, Deadlock Resolution, Timestamping rules