ZeePedia

DATABASE SOFTWARE: Data Security, Data Integrity, Integrity, Accessibility, DBMS

<< DATA MANAGEMENT
String Manipulations: >>
img
Introduction to Computing ­ CS101
VU
LESSON 37
DATABASE SOFTWARE
Focus of the last Lesson was on Data Management
First of a two-Lesson sequence
We became familiar with the issues and problems related to data-intensive computing
We also found out about flat-file and tabular storage
Data Management
Keeping track of a few dozen data items is straight forward
However, dealing with situations that involve significant number of data items, requires more
attention to the data handling process
Dealing with millions - even billions - of inter-related data items requires even more careful
thought
Issues in Data Management
Data Entry
New titles are added every day
New customers are being added every day
That new data needs to be added accurately
Data Updates
All those actions require updates to existing data
Those changes need to be entered accurately
Data Security
All the data that BholiBooks has in its computer systems is quite critical to its operation
The security of the customers' personal data is of utmost importance. Hackers are always looking
for that type of data, especially for credit card numbers
This problem can be managed by using appropriate security mechanisms that provide access to
authorized persons/computers only
Security can also be improved through:
­  Encryption
­  Private or virtual-private networks
­  Firewalls
­  Intrusion detectors
­  Virus detectors
Data Integrity
Integrity refers to maintaining the correctness and consistency of the data
­
Correctness: Free from errors
­
Consistency: No conflict among related data items
Integrity can be compromised in many ways:
­
Typing errors
­
Transmission errors
­
Hardware malfunctions
­
Program bugs
­
Viruses
250
img
Introduction to Computing ­ CS101
VU
­  Fire, flood, etc.
Ensuring Data Integrity
Type Integrity
Limit Integrity
Referential Integrity
Physical Integrity
Data Accessibility
What is required is that:
­
Data be stored in an organized manner
­
Additional info about the data be stored so that the data access times are minimized
A solution to this concurrency control problem: Lock access to data while someone is using it
DBMS
A DBMS takes care of the storage, retrieval, and management of large data sets on a database
It provides SW tools needed to organize & manipulate that data in a flexible manner
It includes facilities for:
­
Adding, deleting, and modifying data
­
Making queries about the stored data
­
Producing reports summarizing the required contents
Database
A collection of data organized in such a fashion that the computer can quickly search for a desired
data item
OS Independence
It provides an OS-independent view of the data to the user, making data manipulation and
management much more convenient
What can be stored in a database?
As long as it is digital data, it can be stored:
­  Numbers, Booleans, text
­  Sounds
­  Images
­  Video
In the very, very old days ...
Even large amounts of data was stored in text files, known as flat-file databases
All related info was stored in a single long, tab- or comma-delimited text file
Each group of info ­ called a record - in that file was separated by a special character; vertical bar
`|' was a popular option
Each record consisted of a group of fields, each field containing some distinct data item
The Trouble with Flat-File Databases
The text file format makes it hard to search for specific info or to create reports that include only
certain fields from each record
Reason: One has to search sequentially through the entire file to gather desired info, such as `all
books by a certain author'
251
img
Introduction to Computing ­ CS101
VU
However, for small sets of data ­ say, consisting of several tens of kB ­ they can provide
reasonable performance
Tabular Storage: Features & Possibilities
1.Similar items of data form a column
2.Fields placed in a particular row ­ same as a flat-file record ­ are strongly interrelated
3.One can sort the table w.r.t. any column
4.That makes searching ­ e.g., for all the books written by a certain author ­ straight forward
5.Similarly, searching for the 10 cheapest/most expensive books can be easily accomplished through a
sort
6.Effort required for adding a new field to all the records of a flat-file is much greater than adding a new
column to the table
CONCLUSION: Tabular storage is better than flat-file storage
We will continue on with tables' theme today
Today's Lecture:
Database SW
In our 4th & final Lesson on productivity software, we will continue our discussion from last week
on data management
We will find out about relational databases
We will also implement a simple relational database
Let's continue on with the tabular approach. We stored data in a table last time, and liked it. Let's revisit
that table and then put together another one
Table from the Last Lecture
Title
Author
Publisher
Price
InStock
Y
Good Bye Mr.
Altaf Khan
BholiBooks
1000
Bhola
The Terrible
Bhola
BholiBooks
199
Y
Twins
Champion
Calculus &
Smith Sahib
Good Publishers
325
N
Analytical
Geometry
Accounting
Zamin Geoffry
Sung-e-Kilometer
29
Y
Secrets
Publishers
Another table ...
Customer
Title
Shipment
Type
Aadil Ali
Good Bye Mr. Bhola
2002.12.26
Air
Aadil Ali
The Terrible Twins
2002.12.26
Air
Miftah Muslim
Calculus &
2002.12.25
Surface
Analytical Geometry
Karen Kaur
Good Bye Mr. Bhola
2002.12.24
Air
252
img
Introduction to Computing ­ CS101
VU
This & the previous table are related
They share a column, & are related through it
A program can match info from a field in one table with info in a corresponding field of another
table to generate a 3rd table that combines requested data from both tables
That is, a program can use matching values in 2 tables to relate info in one to info in the other
Q: Who is BholiBooks' best customer?
That is, who has spent the most money on the online bookstore?
To answer that question, one can process the inventory and the shipment tables to generate a third
table listing the customer names and the prices of the books that they have ordered
Customer
Price
The generated table
Aadil Ali
1000
Aadil Ali
199
Miftah Muslim
325
Can you now process this table to find the answer to our question
Karen Kaur
1000
Relational Databases
Databases consisting of two or more related tables are called relational databases
A typical relational database may have anywhere from 10 to over a thousand tables
Each column of those tables can contain only a single type of data (contrast this with spreadsheet
columns!)
Table rows are called records; row elements are called fields
A relational database stores all its data inside tables, and nowhere else
All operations on data are done on those tables or those that are generated by table operations
Tables, tables, and nothing but tables!
37.1 RDBMS
Relational DBMS software
Contains facilities for creating, populating, modifying, and querying relational databases
Examples:
­
­Access
DB2
­FileMaker Pro
­
Objectivity/DB
­SQL Server
­
MySQL
­Oracle
­
Postgres
The Trouble with Relational DBs
Much of current SW development is done using the object-oriented methodology
When we want to store the object-oriented data into an RDBMS, it needs to be translated into a
form suitable for RDBMS
The Trouble with Relational DBs
Then when we need to read the data back from the RDBMS, the data needs to be translated back
into an object-oriented form before use
These two processing delays, the associated processing, and time spent in writing and maintaining
the translation code are the key disadvantages of the current RDBMSes
253
img
Introduction to Computing ­ CS101
VU
Solution?
Don't have time to discuss that, but try searching the Web on the following terms:
Object-oriented databases
­  Object-relational databases
Classification of DBMS w.r.t. Size
Personal/Desktop/Single-user (MB-GB)
­
Examples: Tech. papers' list; Methai shop inventory
­
Typical DMBS: Access
Server-based/Multi-user/Enterprise (GB-TB)
­
Examples: HBL; Amazon.com
­
Typical DMBS: Oracle, DB2
Seriously-huge databases (TB-PB-XB)
­
Examples: 2002 ­ BaBar experiment at Stanford (500TB); 2005 ­ LHC database at CERN (1XB)
­
Typical DMBS: Objectivity/DB
37.2 Some Terminology
Primary Key is a field that uniquely identifies each record stored in a table
Queries are used to view, change, and analyze data. They can be used to:
­  Combine data from different tables, efficiently
­  Extract the exact data that is desired
Forms can be used for entering, editing, or viewing data, one record at a time
Reports are an effective, user-friendly way of presenting data. All DBMSes provide tools for
producing custom reports.
Data normalization is the process of efficiently organizing data in a database. There are two goals
of the normalization process:
­  Eliminate redundant data
­  Storing only related data in a table
Before we do a demo, let me just mention my favorite database application:
Data Mining
The process of analyzing large databases to identify patterns
Example: Mining the sales records from a BholiBooks could identify interesting shopping patterns
like "53% of customers who bought book A also bought book B". This pattern can be put to good use!
Dat a mining often utilizes intelligent systems' techniques
Let's now demonstrate the use of a desktop RDBMS
We will create a new relational database
It will consist of two tables
We will populate those tables
We will generate a report after combining the data from the two tables
Access Tutorial
http://www.microsoft.com/education/DOWNLOADS/tutorials/classroom/office2k/acc2000.doc
254
img
Introduction to Computing ­ CS101
VU
Today's Lecture:
In this final Lesson on productivity software, we continued our discussion from last week on data
management
We found out about relational databases
We also implemented a simple relational database
Next Lecture' Goals
(Cyber Crime)
To know the different types of computer crimes that occur over cyber space
To familiarize ourselves with with several methods that can be used to minimize the effect of these
crimes
To get familiar with a few policies and legislation designed to tackle cyber crime
255
Table of Contents:
  1. INTRODUCTION
  2. EVOLUTION OF COMPUTING
  3. World Wide Web, Web’s structure, genesis, its evolution
  4. Types of Computers, Components, Parts of Computers
  5. List of Parts of Computers
  6. Develop your Personal Web Page: HTML
  7. Microprocessor, Bus interface unit, Data & instruction cache memory, ALU
  8. Number systems, binary numbers, NOT, AND, OR and XOR logic operations
  9. structure of HTML tags, types of lists in web development
  10. COMPUTER SOFTWARE: Operating Systems, Device Drivers, Trialware
  11. Operating System: functions, components, types of operating systems
  12. Forms on Web pages, Components of Forms, building interactive Forms
  13. APPLICATION SOFTWARE: Scientific, engineering, graphics, Business, Productivity, Entertainment, Educational Software
  14. WORD PROCESSING: Common functions of word processors, desktop publishing
  15. Interactivity to Forms, JavaScript, server-side scripts
  16. ALGORITHMS
  17. ALGORITHMS: Pseudo code, Flowcharts
  18. JavaScript and client-side scripting, objects in JavaScript
  19. Low, High-Level, interpreted, compiled, structured & object-oriented programming languages
  20. Software Design and Development Methodologies
  21. DATA TYPES & OPERATORS
  22. SPREADSHEETS
  23. FLOW CONTROL & LOOPS
  24. DESIGN HEURISTICS. Rule of thumb learned through trial & error
  25. WEB DESIGN FOR USABILITY
  26. ARRAYS
  27. COMPUTER NETWORKS: types of networks, networking topologies and protocols
  28. THE INTERNET
  29. Variables: Local and Global Variables
  30. Internet Services: FTP, Telnet, Web, eMail, Instant messaging, VoIP
  31. DEVELOPING PRESENTATIONS: Effective Multimedia Presentations
  32. Event Handlers
  33. GRAPHICS & ANIMATION
  34. INTELLIGENT SYSTEMS: techniques for designing Artificial Intelligent Systems
  35. Mathematical Functions in JavaScript
  36. DATA MANAGEMENT
  37. DATABASE SOFTWARE: Data Security, Data Integrity, Integrity, Accessibility, DBMS
  38. String Manipulations:
  39. CYBER CRIME
  40. Social Implications of Computing
  41. IMAGES & ANIMATION
  42. THE COMPUTING PROFESSION
  43. THE FUTURE OF COMPUTING
  44. PROGRAMMING METHODOLOGY
  45. REVIEW & WRAP-UP of Introduction to Computing