ZeePedia


Data Warehousing

Need of Data Warehousing
Why a DWH, Warehousing
The Basic Concept of Data Warehousing
Classical SDLC and DWH SDLC, CLDS, Online Transaction Processing
Types of Data Warehouses: Financial, Telecommunication, Insurance, Human Resource
Normalization: Anomalies, 1NF, 2NF, INSERT, UPDATE, DELETE
De-Normalization: Balance between Normalization and De-Normalization
De-Normalization Techniques: Splitting Tables, Horizontal Splitting, Vertical Splitting, Pre-Joining Tables, Adding Redundant Columns, Derived Attributes
Issues of De-Normalization: Storage, Performance, Maintenance, Ease-of-use
Online Analytical Processing OLAP: DWH and OLAP, OLTP
OLAP Implementations: MOLAP, ROLAP, HOLAP, DOLAP
ROLAP: Relational Database, ROLAP cube, Issues
Dimensional Modeling DM: ER modeling, The Paradox, ER vs. DM
Process of Dimensional Modeling: Four Steps: Choose Business Process, Grain, Facts, Dimensions
Issues of Dimensional Modeling: Additive vs Non-Additive facts, Classification of Aggregation Functions
Extract Transform Load ETL: ETL Cycle, Processing, Data Extraction, Data Transformation
Issues of ETL: Diversity in source systems and platforms
Issues of ETL: legacy data, Web scraping, data quality, ETL vs ELT
ETL Detail: Data Cleansing: data scrubbing, Dirty Data, Lexical Errors, Irregularities, Integrity Constraint Violation, Duplication
Data Duplication Elimination and BSN Method: Record linkage, Merge, purge, Entity reconciliation, List washing and data cleansing
Introduction to Data Quality Management: Intrinsic, Realistic, Orr’s Laws of Data Quality, TQM
DQM: Quantifying Data Quality: Free-of-error, Completeness, Consistency, Ratios
Total DQM: TDQM in a DWH, Data Quality Management Process
Need for Speed: Parallelism: Scalability, Terminology, Parallelization OLTP Vs DSS
Need for Speed: Hardware Techniques: Data Parallelism Concept
Conventional Indexing Techniques: Concept, Goals, Dense Index, Sparse Index
Special Indexing Techniques: Inverted, Bitmap, Cluster, Join indexes
Join Techniques: Nested loop, Sort Merge, Hash based join
Data Mining (DM): Knowledge Discovery in Databases KDD
Data Mining: Classification, Estimation, Prediction, Clustering
Data Structures, types of Data Mining, Min-Max Distance, One-way, K-Means Clustering
DWH Lifecycle: Data-Driven, Goal-Driven, User-Driven Methodologies
DWH Implementation: Goal Driven Approach
DWH Implementation: Goal Driven Approach
DWH Life Cycle: Pitfalls, Mistakes, Tips
Course Project
Contents of Project Reports
Case Study: Agri-Data Warehouse
Web Warehousing: Drawbacks of traditional web searches, web search, Web traffic record: Log files
Web Warehousing: Issues, Time-contiguous Log Entries, Transient Cookies, SSL, session ID Ping-pong, Persistent Cookies
Data Transfer Service (DTS)
Lab Data Set: Multi-Campus University
Extracting Data Using Wizard
Data Profiling
