|
|||||
Database
Management System
(CS403)
VU
Lecture No.
08
Reading
Material
"Database
Systems Principles, Design and
Implementation"
Section
5.4
written
by Catherine Ricardo, Maxwell
Macmillan.
Overview of
Lecture
o Concept
of Key and its
importance
o Different
types of keys
Attributes
Def
1:
An
attribute is any detail that
serves to identify, qualify,
classify, quantify, or
otherwise
express the state of an
entity occurrence or a
relationship.
Def
2:
Attributes
are data objects that
either identify or describe
entities.
Identifying
entity type and then
assigning attributes or other
way round; it's an "egg
or
hen"
first problem. It works both
ways; differently for
different people. It is possible
that
we first
identify an entity type, and
then we describe it in real
terms, or through its
attributes
keeping in view the
requirements of different users'
groups. Or, it could
be
other
way round; we enlist the
attribute included in different
users' requirements and
then
group
different attributes to establish
entity types. Attributes
are specific pieces
of
information,
which need to be known or
held. An attribute is either
required or optional.
When
it's required, we must have
a value for it, a value
must be known for each
entity
occurrence.
When it's optional, we could
have a value for it, a
value may be known
for
each
entity occurrence.
The
Keys
Attributes
act as differentiating agents among
different entity types, that
is, the
differences
between entity types must be
expressed in terms of attributes. An
entity type
can have
many instances; each
instance has got a certain
value against each
attribute
defined
as part of that particular
entity type. A key is a set
of attributes that can be used
to
80
Database
Management System
(CS403)
VU
identify
or access a particular entity
instance of an entity type
(or out of an entity
set).
The
concept of key is beautiful and
very useful; why and how. An
entity type may
have
many
instances, from a few to several
thousands and even more. Now
out of many
instances,
when and if we want to pick a
particular/single instance, and many
times we do
need
it, then key is the
solution. For example, think
of whole population ,
the
data of
all Pakistanis lying at one place, say
with NADRA people. Now if at
sometime
we need
to identify a particular person out of
all this data, how can we do
that? Can we
use name
for that, well think of
any name, like Mirza Zahir
Iman Afroz, now we
may
find
many people with this name in Pakistan.
Another option is the
combination of name
and
father name, then again,
Amjad Malik s/o Mirza Zahir
Iman Afroz, there could be
so
many
such pairs. There could be
many such examples. However,
if you think about
National
ID Card number, then no matter
whatever is the population ,
you will
always be
able to pick precisely a single person.
That is the key. While
defining an entity
type we
also generally define the
key of that entity type.
How do we select the key,
from
the
study of the real-world
system; key attribute(s)
already exist there, sometimes
they
don't
then the designer has to
define one. A key can be
simple, that is, consisting
of
single
attribute, or it could be composite
which consists of two or more
attributes.
Following
are the major types of
keys:
Super
Key
o
Candidate
Key
o
Primary
Key
o
Alternate
Key
o
Secondary
Key
o
Foreign
Key
o
The
last one will be discussed later,
remaining 5 are discussed in
the following:
o Super
key
A super
key is a set
of one or more attributes which
taken collectively, allow us
to
identify
uniquely an entity instance in
the entity set. This
definition is same as of a
key, it
means that the super key is
the most general type of
key. For example,
consider
the entity type STUDENT
with attributes registration
number, name, father
name,
address, phone, class,
admission date. Now which
attribute can we use that
can
uniquely
identify any instance of
STUDENT entity type. Of course,
none of the name,
father
name, address, phone number,
class, admission date can be
used for this
purpose.
Why? Because if we consider name as super
key, and situation arises
that we
need to
contact the parents of a particular
student. Now if we say to
our registration
department
that give us the phone
number of the student whose
name is Ilyas Hussain,
the
registration department conducts a
search and comes up with 10
different Ilyas
Hussain,
could be anyone. So the
value of the name attribute
cannot be used to pick
a
particular
instance. Same happens with
other attributes. However, if we
use the
registration
number, then it is 100% sure
that with a particular value
of registration
number we
will always find exactly a
single unique entity
instance. Once you
identified
the instance, you have
all its attributes
available, name, father
name,
81
Database
Management System
(CS403)
VU
everything.
The entity type STUDENT and
its attributes are shown
graphically in the
figure 1
below, with its super key
"regNo" underlined.
name
regNo
EMPLOYEE
fName
address
phoneNo
Fig. 1: An
entity type, its defining
attributes
and
super key
(underlined)
Once
specific characteristic with super
key is that, as per its
definition any
combination
of
attributes with the super
key is also a super key. Like, in
the example just
discussed
where we
have identified regNo as super
key, now if we consider any
combination of
regNo
with any other attribute of
STUDENT entity type, the
combination will also be a
super
key. For example, "regNo,
name", "regNo, fName, address",
"name, fName,
regNo"
and many others, all are
super keys.
o Candidate
key
A super
key for which no subset is a
super key is called a candidate key, or
the
minimal
super key is the candidate key. It
means that there are
two conditions for
the
candidate
key, one, it identifies the
entity instances uniquely, as is required
in case of
super
key, second, it should be
minimum, that is, no proper
subset of candidate key is
a key. So
if we have a simple super key,
that is, that consists of
single attribute, it is
definitely
a candidate key, 100%. However, if we
have a composite super key and
if
we take
any attribute out of it and
remaining part is not a super
key anymore then
that
composite
super key is also a candidate key since it is
minimal super key.
For
example,
one of the super keys that we
identified from the entity
type STUDENT of
figure 1
is "regNo, name", this super
key is not a candidate key, since if we
remove
the
regNo attribute from this
combination, name attribute alone is
not able to identify
the
entity instances uniquely, so it does
not satisfy the first
condition of candidate key.
On the
other hand if we remove the
attribute name from this
composite key then
the
regNo
alone is sufficient to identify
the instances uniquely, so "regNo,
name" does
have a
proper subset (regNo) that
can act as a super key; violation of
second
condition.
So the composite key "regNo,
name" is a super key but it is
not a candidate
key.
From here we can also establish a fact
that every candidate key is a super
key but
not
the other way
round.
o Primary
Key
A candidate
key chosen by the database designer to
act as key is the primary
key. An
entity
type may have more
than one candidate keys, in that
case the database designer
has
82
Database
Management System
(CS403)
VU
to
designate one of them as primary
key, since there is always
only a single primary
key
in an
entity type. If there is
just one candidate key then
obviously the same will
be
declared as
primary key. The primary
key can also be defined as the
successful candidate
key.
Figure 2 below contains the
entity type STUDENT of
figure 1 but with an
additional
attribute
nIdNumber (national ID card
Number).
nIdNumber
name
regNo
EMPLOYEE
fName
address
phoneNo
Fig. 2: An
entity type, its defining
attributes
and
two candidate keys
In figure
2, we can identify two different
attributes that can individually
identify the
entity
instances of STUDENT and they are
regNo and nIdNumber, both
are minimal
super
keys so both are candidate
keys. Now in this situation
we have got two
candidate
keys.
The one that we choose will be declared
as primary key, other will be
the alternate
key. Any
of the candidate keys can be selected as
primary key, it mainly
depends on the
database
designer which choice he/she makes. There
are certain things that
are generally
considered
while making this decision,
like the candidate key that
is shorter, easier to
remember,
to type and is more meaningful is
selected as primary key. These
are general
recommendations
in this regard, but finally it is
the decision of the designer and
he/she
may
have his/her own reasons
for a particular selection
that may be entirely
different
from
those mentioned above. The
relation that holds between
super and candidate keys
also
holds between candidate and primary
keys, that is, every
primary key (PK) is a
candidate
key and every candidate key is a super
key.
A certain
value that may be associated
with any attribute is NULL,
that means "not
given" or
"not defined". A major
characteristic of the PK is that it
cannot have the NULL
value. If
PK is a composite, then none of
the attributes included in
the PK can have the
NULL,
for example, if we are using
"name, fName" as PK of entity
type STUDENT,
then
none of the instances may
have NULL value in either of
the name or fName or
both.
o Alternate
Keys
Candidate
keys which are not chosen as
the primary key are
known as alternate
keys.
For
example, we have two candidate
keys of STUDENT in figure 2, regNo
and
nIdNumber, if we select
regNo
as PK
then the nIdNumber
will be
alternate key.
o Secondary
Key
Many
times we need to access
certain instances of an entity type
using the value(s) of
one
or more
attributes other than the
PK. The difference in
accessing instances using
the
83
Database
Management System
(CS403)
VU
value of
a key or non-key attribute is
that the search on the
value of PK will always
return
a single
instance (if it exists),
where as uniqueness is not guaranteed in
case of non-key
attribute.
Such attributes on which we
need to access the instances of an
entity type that
may
not necessarily return
unique instance is called
the secondary key. For
example, we
want to
see how many of our
students belong to Multan, in
that case we will access
those
instances of
the STUDENT entity type
that contain "Multan" in
their address. In this
case
address
will be called secondary key, since we
are accessing instances on the
basis of its
value,
and there is no compulsion that we will
get a single instance. Keep one thing
in
mind
here, that a particular access on
the value of a secondary key
MAY return a single
instance,
but that will be considered as chance or due to
that particular state of
entity set.
There is
not the compulsion or it is
not necessary for secondary
key to return unique
instance,
where as in case of super, candidate,
primary and alternate keys it is
compulsion
that
they will always return
unique instance against a particular
value.
Summary
Keys
are fundamental to the
concept almost any data
model including the E-R
data model
because
they enable the unique
identity of an entity instance.
There are different type
of
keys
that may exist in an entity
type.
Exercises:
· Define
attributes of the entity
types CAR, BOOK, MOVIE;
draw them
graphically
· Identify
different types of keys in
each one of them
84
Table of Contents:
|
|||||