ZeePedia

Attributes, The Keys

<< Entity-Relationship Data Model, Classification of entity types, Attributes
Relationships:Types of Relationships in databases >>
img
Database Management System (CS403)
VU
Lecture No. 08
Reading Material
"Database Systems Principles, Design and Implementation"
Section 5.4
written by Catherine Ricardo, Maxwell Macmillan.
Overview of Lecture
o Concept of Key and its importance
o Different types of keys
Attributes
Def 1:
An attribute is any detail that serves to identify, qualify, classify, quantify, or
otherwise express the state of an entity occurrence or a relationship.
Def 2:
Attributes are data objects that either identify or describe entities.
Identifying entity type and then assigning attributes or other way round; it's an "egg or
hen" first problem. It works both ways; differently for different people. It is possible that
we first identify an entity type, and then we describe it in real terms, or through its
attributes keeping in view the requirements of different users' groups. Or, it could be
other way round; we enlist the attribute included in different users' requirements and then
group different attributes to establish entity types. Attributes are specific pieces of
information, which need to be known or held. An attribute is either required or optional.
When it's required, we must have a value for it, a value must be known for each entity
occurrence. When it's optional, we could have a value for it, a value may be known for
each entity occurrence.
The Keys
Attributes act as differentiating agents among different entity types, that is, the
differences between entity types must be expressed in terms of attributes. An entity type
can have many instances; each instance has got a certain value against each attribute
defined as part of that particular entity type. A key is a set of attributes that can be used to
80
img
Database Management System (CS403)
VU
identify or access a particular entity instance of an entity type (or out of an entity set).
The concept of key is beautiful and very useful; why and how. An entity type may have
many instances, from a few to several thousands and even more. Now out of many
instances, when and if we want to pick a particular/single instance, and many times we do
need it, then key is the solution. For example, think of whole population , the
data of all Pakistanis lying at one place, say with NADRA people. Now if at sometime
we need to identify a particular person out of all this data, how can we do that? Can we
use name for that, well think of any name, like Mirza Zahir Iman Afroz, now we may
find many people with this name in Pakistan. Another option is the combination of name
and father name, then again, Amjad Malik s/o Mirza Zahir Iman Afroz, there could be so
many such pairs. There could be many such examples. However, if you think about
National ID Card number, then no matter whatever is the population , you will
always be able to pick precisely a single person. That is the key. While defining an entity
type we also generally define the key of that entity type. How do we select the key, from
the study of the real-world system; key attribute(s) already exist there, sometimes they
don't then the designer has to define one. A key can be simple, that is, consisting of
single attribute, or it could be composite which consists of two or more attributes.
Following are the major types of keys:
Super Key
o
Candidate Key
o
Primary Key
o
Alternate Key
o
Secondary Key
o
Foreign Key
o
The last one will be discussed later, remaining 5 are discussed in the following:
o Super key
A super key is a set of one or more attributes which taken collectively, allow us to
identify uniquely an entity instance in the entity set. This definition is same as of a
key, it means that the super key is the most general type of key. For example,
consider the entity type STUDENT with attributes registration number, name, father
name, address, phone, class, admission date. Now which attribute can we use that can
uniquely identify any instance of STUDENT entity type. Of course, none of the name,
father name, address, phone number, class, admission date can be used for this
purpose. Why? Because if we consider name as super key, and situation arises that we
need to contact the parents of a particular student. Now if we say to our registration
department that give us the phone number of the student whose name is Ilyas Hussain,
the registration department conducts a search and comes up with 10 different Ilyas
Hussain, could be anyone. So the value of the name attribute cannot be used to pick a
particular instance. Same happens with other attributes. However, if we use the
registration number, then it is 100% sure that with a particular value of registration
number we will always find exactly a single unique entity instance. Once you
identified the instance, you have all its attributes available, name, father name,
81
img
Database Management System (CS403)
VU
everything. The entity type STUDENT and its attributes are shown graphically in the
figure 1 below, with its super key "regNo" underlined.
name
regNo
EMPLOYEE
fName
address
phoneNo
Fig. 1: An entity type, its defining attributes
and super key (underlined)
Once specific characteristic with super key is that, as per its definition any combination
of attributes with the super key is also a super key. Like, in the example just discussed
where we have identified regNo as super key, now if we consider any combination of
regNo with any other attribute of STUDENT entity type, the combination will also be a
super key. For example, "regNo, name", "regNo, fName, address", "name, fName,
regNo" and many others, all are super keys.
o Candidate key
A super key for which no subset is a super key is called a candidate key, or the
minimal super key is the candidate key. It means that there are two conditions for the
candidate key, one, it identifies the entity instances uniquely, as is required in case of
super key, second, it should be minimum, that is, no proper subset of candidate key is
a key. So if we have a simple super key, that is, that consists of single attribute, it is
definitely a candidate key, 100%. However, if we have a composite super key and if
we take any attribute out of it and remaining part is not a super key anymore then that
composite super key is also a candidate key since it is minimal super key. For
example, one of the super keys that we identified from the entity type STUDENT of
figure 1 is "regNo, name", this super key is not a candidate key, since if we remove
the regNo attribute from this combination, name attribute alone is not able to identify
the entity instances uniquely, so it does not satisfy the first condition of candidate key.
On the other hand if we remove the attribute name from this composite key then the
regNo alone is sufficient to identify the instances uniquely, so "regNo, name" does
have a proper subset (regNo) that can act as a super key; violation of second
condition. So the composite key "regNo, name" is a super key but it is not a candidate
key. From here we can also establish a fact that every candidate key is a super key but
not the other way round.
o Primary Key
A candidate key chosen by the database designer to act as key is the primary key. An
entity type may have more than one candidate keys, in that case the database designer has
82
img
Database Management System (CS403)
VU
to designate one of them as primary key, since there is always only a single primary key
in an entity type. If there is just one candidate key then obviously the same will be
declared as primary key. The primary key can also be defined as the successful candidate
key. Figure 2 below contains the entity type STUDENT of figure 1 but with an additional
attribute nIdNumber (national ID card Number).
nIdNumber
name
regNo
EMPLOYEE
fName
address
phoneNo
Fig. 2: An entity type, its defining attributes
and two candidate keys
In figure 2, we can identify two different attributes that can individually identify the
entity instances of STUDENT and they are regNo and nIdNumber, both are minimal
super keys so both are candidate keys. Now in this situation we have got two candidate
keys. The one that we choose will be declared as primary key, other will be the alternate
key. Any of the candidate keys can be selected as primary key, it mainly depends on the
database designer which choice he/she makes. There are certain things that are generally
considered while making this decision, like the candidate key that is shorter, easier to
remember, to type and is more meaningful is selected as primary key. These are general
recommendations in this regard, but finally it is the decision of the designer and he/she
may have his/her own reasons for a particular selection that may be entirely different
from those mentioned above. The relation that holds between super and candidate keys
also holds between candidate and primary keys, that is, every primary key (PK) is a
candidate key and every candidate key is a super key.
A certain value that may be associated with any attribute is NULL, that means "not
given" or "not defined". A major characteristic of the PK is that it cannot have the NULL
value. If PK is a composite, then none of the attributes included in the PK can have the
NULL, for example, if we are using "name, fName" as PK of entity type STUDENT,
then none of the instances may have NULL value in either of the name or fName or both.
o Alternate Keys
Candidate keys which are not chosen as the primary key are known as alternate keys.
For example, we have two candidate keys of STUDENT in figure 2, regNo and
nIdNumber, if we select regNo as PK then the nIdNumber will be alternate key.
o Secondary Key
Many times we need to access certain instances of an entity type using the value(s) of one
or more attributes other than the PK. The difference in accessing instances using the
83
img
Database Management System (CS403)
VU
value of a key or non-key attribute is that the search on the value of PK will always return
a single instance (if it exists), where as uniqueness is not guaranteed in case of non-key
attribute. Such attributes on which we need to access the instances of an entity type that
may not necessarily return unique instance is called the secondary key. For example, we
want to see how many of our students belong to Multan, in that case we will access those
instances of the STUDENT entity type that contain "Multan" in their address. In this case
address will be called secondary key, since we are accessing instances on the basis of its
value, and there is no compulsion that we will get a single instance. Keep one thing in
mind here, that a particular access on the value of a secondary key MAY return a single
instance, but that will be considered as chance or due to that particular state of entity set.
There is not the compulsion or it is not necessary for secondary key to return unique
instance, where as in case of super, candidate, primary and alternate keys it is compulsion
that they will always return unique instance against a particular value.
Summary
Keys are fundamental to the concept almost any data model including the E-R data model
because they enable the unique identity of an entity instance. There are different type of
keys that may exist in an entity type.
Exercises:
·  Define attributes of the entity types CAR, BOOK, MOVIE; draw them
graphically
·  Identify different types of keys in each one of them
84
Table of Contents:
  1. Introduction to Databases and Traditional File Processing Systems
  2. Advantages, Cost, Importance, Levels, Users of Database Systems
  3. Database Architecture: Level, Schema, Model, Conceptual or Logical View:
  4. Internal or Physical View of Schema, Data Independence, Funct ions of DBMS
  5. Database Development Process, Tools, Data Flow Diagrams, Types of DFD
  6. Data Flow Diagram, Data Dictionary, Database Design, Data Model
  7. Entity-Relationship Data Model, Classification of entity types, Attributes
  8. Attributes, The Keys
  9. Relationships:Types of Relationships in databases
  10. Dependencies, Enhancements in E-R Data Model. Super-type and Subtypes
  11. Inheritance Is, Super types and Subtypes, Constraints, Completeness Constraint, Disjointness Constraint, Subtype Discriminator
  12. Steps in the Study of system
  13. Conceptual, Logical Database Design, Relationships and Cardinalities in between Entities
  14. Relational Data Model, Mathematical Relations, Database Relations
  15. Database and Math Relations, Degree of a Relation
  16. Mapping Relationships, Binary, Unary Relationship, Data Manipulation Languages, Relational Algebra
  17. The Project Operator
  18. Types of Joins: Theta Join, Equi–Join, Natural Join, Outer Join, Semi Join
  19. Functional Dependency, Inference Rules, Normal Forms
  20. Second, Third Normal Form, Boyce - Codd Normal Form, Higher Normal Forms
  21. Normalization Summary, Example, Physical Database Design
  22. Physical Database Design: DESIGNING FIELDS, CODING AND COMPRESSION TECHNIQUES
  23. Physical Record and De-normalization, Partitioning
  24. Vertical Partitioning, Replication, MS SQL Server
  25. Rules of SQL Format, Data Types in SQL Server
  26. Categories of SQL Commands,
  27. Alter Table Statement
  28. Select Statement, Attribute Allias
  29. Data Manipulation Language
  30. ORDER BY Clause, Functions in SQL, GROUP BY Clause, HAVING Clause, Cartesian Product
  31. Inner Join, Outer Join, Semi Join, Self Join, Subquery,
  32. Application Programs, User Interface, Forms, Tips for User Friendly Interface
  33. Designing Input Form, Arranging Form, Adding Command Buttons
  34. Data Storage Concepts, Physical Storage Media, Memory Hierarchy
  35. File Organizations: Hashing Algorithm, Collision Handling
  36. Hashing, Hash Functions, Hashed Access Characteristics, Mapping functions, Open addressing
  37. Index Classification
  38. Ordered, Dense, Sparse, Multi-Level Indices, Clustered, Non-clustered Indexes
  39. Views, Data Independence, Security, Vertical and Horizontal Subset of a Table
  40. Materialized View, Simple Views, Complex View, Dynamic Views
  41. Updating Multiple Tables, Transaction Management
  42. Transactions and Schedules, Concurrent Execution, Serializability, Lock-Based Concurrency Control, Deadlocks
  43. Incremental Log with Deferred, Immediate Updates, Concurrency Control
  44. Serial Execution, Serializability, Locking, Inconsistent Analysis
  45. Locking Idea, DeadLock Handling, Deadlock Resolution, Timestamping rules