|
|||||
Database
Management System
(CS403)
VU
Lecture No.
01
Reading
Material
"Database
Systems Principles, Design and
Implementation"
Chapter
1.
written
by Catherine Ricardo, Maxwell
Macmillan.
Overview of
Lecture
o Introduction
to the course
o Database
definitions
o Importance
of databases
o Introduction
to File Processing Systems
o Advantages
of the Database
Approach
Introduction
to the course
This
course is first (fundamental) course on
database management systems. The
course
discusses
different topics of the
databases. We will be covering both
the theoretical and
practical
aspects of databases. As a student to
have a better understanding of
the subject,
it is
very necessary that you
concentrate on the concepts discussed in
the course.
Areas to be
covered in this
Course:
o Database
design and application
development: How do we
represent a real-world
system in
the form of a database? This
is one major topic covered in
this course. It
comprises of
different stages, we will discuss
all these stages one by
one.
o Concurrency
and robustness: How
does a DBMS allow many
users to access data
concurrently,
and how does it protect against
failures?
o Efficiency
and Scalability: How
does the database cope
with large amounts of
data?
9
Database
Management System
(CS403)
VU
o Study of
tools to manipulate databases:
In
order to practically implement,
that is, to
perform
different operations on databases
some tools are required.
The operations
on
databases include right from
creating them to add, remove
and modify data in
the
database and to access by different
ways. The tools that we will
be studying are
a
manipulation language (SQL) and a
DBMS (SQL Server).
Database
definitions:
Definitions
are important, especially in
technical subjects because
definition describes
very
comprehensively the purpose and the core
idea behind the thing.
Databases have
been
defined differently in literature. We
are discussing different
definitions here, if we
concentrate
on these definitions, we find
that they support each
other and as a result of
the
understanding of these definitions, we
establish a better understanding of
use,
working
and to some extent the
components of a database.
Def 1:
A
shared collection of logically
related data, designed to meet
the information
needs of
multiple users in an organization.
The term database is often
erroneously
referred
to as a synonym for a "database
management system (DBMS)".
They are
not
equivalent and it will be explained in
the next section.
Def 2:
A
collection of data: part numbers,
product codes, customer
information, etc. It
usually
refers to data organized and stored on a
computer that can be searched
and
retrieved
by a computer program.
Def 3:
A
data structure that stores
metadata, i.e. data about
data. More generally we
can
say an
organized collection of
information.
Def 4:
A
collection of information organized and
presented to serve a specific
purpose.
(A
telephone book is a common
database.) A computerized database is an
updated,
organized
file of machine readable
information that is rapidly
searched and
retrieved
by computer.
Def 5:
An
organized collection of information in
computerized format.
Def 6:
A
collection of related information
about a subject organized in a
useful manner
that
provides a base or foundation
for procedures such as retrieving
information,
drawing
conclusions, and making decisions.
Def 7:
A
Computerized representation of any
organizations flow of information
and
storage
of data.
Each of
the above given definition is
correct, and describe database
from slightly variant
perspectives.
From exam point of view,
anyone will do. However,
within this course, we
will be
referring first of the above
definitions more frequently, and concepts
discussed in
the
definition like, logically
related data, shared
collection should be clear.
Another
10
Database
Management System
(CS403)
VU
important
thing that you should be
very clear about is the
difference between
database
and the
database management system
(DBMS). See, the database is
the collection of
data
about
anything, could be anything.
Like cricket teams,
students, busses,
movies,
personalities,
stars, seas, buildings,
furniture, lab equipment, hobbies,
hotels, pets,
countries,
and many more anything about
which you want to store
data. What we mean
by data;
simply the facts or figures.
Following table shows the
things and the data that
we
may
want to store about
them:
Thing
Data
(Facts or figures)
Cricket
Player
Country,
name, date of birth, specialty, matches
played, runs etc.
Scholars
Name,
data of birth, age, country,
field, books published
etc.
Movies
Name,
director, language (Punjabi is
default in case )
etc.
Food
Name,
ingredients, taste, preferred
time, origin, etc.
Vehicle
Registration
number, make, owner, type,
price, etc.
There
could be infinite examples, and
please note that the
data that is listed
about
different
things in the above table is
not the only data
that can be defined or stored
about
these
things. As has been
explained in the definition one above,
there could be so
many
facts
about each thing that we
are storing data about;
what exactly we will store
depends
on the
perspective of the person or organization
who wants to store the data.
For example,
if you
consider food, data required
to be stored about the food
from the perspective of a
cook is
different from that of a person
eating it. Think of a food,
like, Karhahi Ghost,
the
facts
about Karhahi ghosht that a
cook will like to store may
be, quantity of salt,
green
and red
chilies, garlic, water, time
required to cook and like
that. Where as the
customer
is interested in
chicken or meat, then black
or red chilies, then weight,
then price and like
that.
Well, definitely there are
some things common but
some are different as well.
The
thing is
that the perspective or
point of view creates the
difference in what we
store;
however,
the main thing is that
the database stores the
data.
The
database management system
(DBMS), on the other hand is
the software or tool
that
is used
to manage the database and
its users. A DBMS consist of
different components or
subsystem
that we will study about
later. Each subsystem or component of the
DBMS
performs
different function(s), so a DBMS is
collection of different programs
but they all
work
jointly to manage the data
stored in the database and its
users. In many books and
may be in
this course sometimes database and
database management system
are used
interchangeably
but there is a clear
difference and we should be clear
about them.
Sometimes
another term is used, that
is, the database system,
again, this term has
been
used
differently by different people, however
in this course we use the
term database
system as
a combination of database and the
database management system. So
database is
collection
of data, DBMS is tool to
manage this data, and both
jointly are called
database
system.
11
Database
Management System
(CS403)
VU
Importance
of the Databases
Databases
are important; why?
Traditionally computer applications
are divided into
commercial
and scientific (or engineering) ones.
Scientific applications involve
more
computations,
that is, different type of
calculations that vary from
simple to very
complex.
Today
such applications exist,
like in the fields of space,
nuclear, medicine that
take
hours or
days of computations on even computers of
the modern age. On the
other hand,
the
applications that are termed
as commercial or business applications do
not involve
much
computations, rather minor
computation but mainly they
perform the
input/output
operations.
That is, these applications
mainly store the data in the
computer storage, then
access
and present it to the users in different
formats (also termed as data
processing) for
example,
banks, shopping, production, utilities
billing, customer services and
many
others.
As is clear from the example
systems mentioned, the commercial
applications
exist in
the day to day life and
are related directly with
the lives of common people.
In
order to
manage the commercial
applications more efficiently
databases are the
ultimate
choice
because efficient management of data is
the sole objective of the
databases. So
such
applications are being
managed by databases even in a
developing country
like
Pakistan,
yet to talk about the
developed countries. This
way databases are
related
directly
or indirectly almost every person in
society.
Databases
are not only being
used in the commercial
applications rather today
many of
the
scientific/engineering application are
also using databases less or
more.
databases
are concern of the
effectively latter form of
applications are more
Commercial
applications
involve The goal of this
course is to present an in-depth introduction
to
databases,
with an emphasis on how to
organize information in the
database and to
maintain
it and retrieve it efficiently, that
is, how to design a database and
use it
effectively.
Databases
and Traditional File Processing
Systems
Traditional
file processing system or simple
file processing system refers to
the first
computer-based
approach of handling the commercial or
business applications. That
is
why it is also
called a replacement of the
manual file system. Before
the use computers,
the
data in the offices or
business was maintained in the
files (well in that
perspective
some
offices may still be considered in
the pre-computer age).
Obviously, it was
laborious,
time consuming, inefficient,
especially in case of large
organizations.
Computers,
initially designed for the
engineering purposes were
though of as blessing,
since
they helped efficient management
but file processing environment
simply
transformed
manual file work to
computers. So processing became very
fast and efficient,
but as
file processing systems were used,
their problems were also
realized and some of
them
were very severe as
discussed later.
It is not
necessary that we understand
the working of the file
processing environment for
the
understanding of the database and
its working. However, a
comparison between
the
characteristics
of the two definitely helps
to understand the advantages of
the databases
12
Database
Management System
(CS403)
VU
and their
working approach. That is why the
characteristics of the traditional
file
processing
system environment have been
discussed briefly
here.
Fig. 1: A
typical file processing
environment
The
diagram presents a typical
traditional file processing environment.
The main point
being
highlighted is the program
and data interdependence, that
is, program and data
depend on
each other, well they
depend too much on each
other. As a result any
change
in one
affects the other as well.
This is something that makes
a change very painful or
problematic
for the designers or
developers of the system.
What do we mean by change
and why do we
need to change the system at
all. These things are
explained in the
following.
The
systems (even the file processing
systems) are created after a
very detailed
analysis
of the
requirements of the organizations.
But it is not possible to develop a
system that
does
not need a change afterwards.
There could be many reasons,
mainly being that
the
users get
the real taste of the
system when it is established. That
is, users tell the
analysts
or
designers their requirements,
the designers design and later
develop the system
based
on those
requirements, but when
system is developed and presented to the
users, it is only
then
they realize the outcome of
the effort. Now it could be
slightly and (unfortunately)
sometimes
very different from what
they expected or wanted it to be. So
the users ask
changes,
minor or major. Another
reason for the change is the
change in the requirements.
For
example, previously the
billing was performed in an organization
on the monthly
13
Database
Management System
(CS403)
VU
basis,
now company has decided to
bill the customers after
every ten days. Since the
bills
are
being generated from the
computer (using file processing
system), this change has
to
be
incorporated in the system.
Yet another example is that,
initially bills did not
contain
the
address of the customer, now
the company wants the
address to be placed on the
bill,
so here is change.
There could be many more
examples, and it is so common that we
can
say that
almost all systems need
changes, so system development is
always an on-going
process.
So we
need changes in the system,
but due to program-data interdependence
these
changes
in the systems were very
hard to make. A change in one will affect
the other
whether
related or not. For example,
suppose data about the
customer bills is stored in
the
file, and
different programs use this
file for different purposes,
like adding data into
the
bills
file, to compute the bill
and to print the bill. Now
the company asks to add
the
customers'
address in the bills, for
this we have to change the
structure of the bill file
and
also the
program that prints the
bill. Well, this was
necessary, but the painful
thing is that
the
other programs that are
using these bills files
but are not concerned
with the printing
of the
bills or the change in the
bill will also have to be changed, well;
this is needless
and
causes extra, unnecessary
effort.
Another
major drawback in the
traditional file system
environment is the non-sharing
of
data. It
means if different systems of an
organization are using some
common data then
rather
than storing it once and sharing
it, each system stores
data in separate files.
This
creates
the problem of redundancy or
wastage of storage and on the
other hand the
problem
on inconsistency. The change in the
data in one system sometimes is
not
reflected
in the same data stored in
other system. So different systems in
organization;
store
different facts about same
thing. This is inconsistency as is
shown in figure
below.
Fig. 2:
Some more problems in File
System Environment
14
Database
Management System
(CS403)
VU
Previous
section highlighted the file
processing system environment and major
problems
found
there. The following section
presents the benefits of the
database systems.
Advantages
of Databases
It will be
helpful to reiterate our database
definition here, that is,
database is a shared
collection
of logically related data,
designed to meet the information
needs of multiple
users in
an organization. A typical database
system environment is shown in
the figure 3
below:
Fig. 3: A
typical Database System
environment
The
figure shows different
subsystem or applications in an
educational institution,
like
library
system, examination system, and
registration system. There
are separate,
different
application
programs for every
application or subsystem. However,
the data for
all
applications
is stored at the same place in the
database and all application
programs,
relevant
data and users are being
managed by the DBMS. This is
a typical database
system
environment and it introduces the
following advantages:
o Data
Sharing
The
data for different
applications or subsystems is placed at the
same place. This
introduces
the major benefit of data
sharing. That is, data
that is common among
different
applications need not to be stored
repeatedly, as was the case in
the file
processing
environment. For example,
all three systems of an educational
institution
shown in
figure 3 need to store the
data about students. The
example data can be
seen
15
Database
Management System
(CS403)
VU
from
figure 2. Now the data
like registration number,
name, address, father name
that
is common
among different applications is
being stored repeatedly in the
file
processing
system environment, where as it is
being stored just once in
database
system
environment and is being shared by
all applications. The
interesting thing is
that
the individual applications do
not know that the
data is being shared and
they do
not
need to. Each application
gets the impression as if
the data is being for stored
for
it.
This brings the advantage of
saving the storage along
with others discussed
later.
o Data
Independence
Data and
programs are independent of
each other, so change is once has no
or
minimum
effect on other. Data and its
structure is stored in the database
where as
application
programs manipulating this
data are stored separately,
the change in one
does
not unnecessarily effect
other.
o Controlled
Redundancy
Means
that we do not need to
duplicate data unnecessarily; we do
duplicate data in
the
databases, however, this
duplication is deliberate and
controlled.
o Better
Data Integrity
Very
important feature; means the
validity of the data being
entered in the database.
Since the
data is being placed at a central place
and being managed by the
DBMS, so
it
provides a very conducive to
check or ensure that the
data being entered into
the
database
is actually valid. Integrity of
data is very important, since
all the processing
and the
information produced in return
are based on the data.
Now if the data
entered
is not
valid, how can we be sure that
the processing in the database is
correct and the
results
or the information produced is
valid? The businesses make decisions on
the
basis of
information produced from
the database and the wrong
information leads to
wrong
decisions, and business collapse. In the
database system environment,
DBMS
provides
many features to ensure the
data integrity, hence provides
more reliable data
processing
environment.
Dear
students, that is all for
this lecture. Today we got
the introduction of the
course,
importance
of the databases. Then we saw
different definitions of database and
studied
what is
data processing then studied
different features of the
traditional file processing
environment
and database (DB) system
environment. At the end of lecture we
were
discussing
the advantages of the DB approach.
There some others to be
studied in the
next
lecture. Suggestions are
welcome.
Exercises
o Think
about the data that
you may want to store about
different things around
you
o List
the changes that may
arise during the working of
any system, lets say
Railway
Reservation
System
16
Table of Contents:
|
|||||