Data Warehousing |
|
Need of Data Warehousing
|
Why a DWH, Warehousing
|
The Basic Concept of Data Warehousing
|
Classical SDLC and DWH SDLC, CLDS, Online Transaction Processing
|
Types of Data Warehouses: Financial, Telecommunication, Insurance, Human Resource
|
Normalization: Anomalies, 1NF, 2NF, INSERT, UPDATE, DELETE
|
De-Normalization: Balance between Normalization and De-Normalization
|
De-Normalization Techniques: Splitting Tables, Horizontal Splitting, Vertical Splitting, Pre-Joining Tables, Adding Redundant Columns, Derived Attributes
|
Issues of De-Normalization: Storage, Performance, Maintenance, Ease-of-use
|
Online Analytical Processing OLAP: DWH and OLAP, OLTP
|
OLAP Implementations: MOLAP, ROLAP, HOLAP, DOLAP
|
ROLAP: Relational Database, ROLAP cube, Issues
|
Dimensional Modeling (DM): ER Modeling, The Paradox, ER vs. DM
|
Process of Dimensional Modeling: Four Steps: Choose Business Process, Grain, Facts, Dimensions
|
Issues of Dimensional Modeling: Additive vs Non-Additive facts, Classification of Aggregation Functions
|
Extract Transform Load ETL: ETL Cycle, Processing, Data Extraction, Data Transformation
|
Issues of ETL: Diversity in source systems and platforms
|
Issues of ETL: Legacy Data, Web Scraping, Data Quality, ETL vs ELT
|
ETL Detail: Data Cleansing: data scrubbing, Dirty Data, Lexical Errors, Irregularities, Integrity Constraint Violation, Duplication
|
Data Duplication Elimination and BSN Method: Record linkage, Merge, purge, Entity reconciliation, List washing and data cleansing
|
Introduction to Data Quality Management: Intrinsic, Realistic, Orr’s Laws of Data Quality, TQM
|
DQM: Quantifying Data Quality: Free-of-error, Completeness, Consistency, Ratios
|
Total DQM: TDQM in a DWH, Data Quality Management Process
|
Need for Speed: Parallelism: Scalability, Terminology, Parallelization OLTP Vs DSS
|
Need for Speed: Hardware Techniques: Data Parallelism Concept
|
Conventional Indexing Techniques: Concept, Goals, Dense Index, Sparse Index
|
Special Indexing Techniques: Inverted, Bit map, Cluster, Join indexes
|
Join Techniques: Nested loop, Sort Merge, Hash based join
|
Data Mining (DM): Knowledge Discovery in Databases (KDD)
|
Data Mining: Classification, Estimation, Prediction, Clustering
|
Data Structures, types of Data Mining, Min-Max Distance, One-way, K-Means Clustering
|
DWH Lifecycle: Data-Driven, Goal-Driven, User-Driven Methodologies
|
DWH Implementation: Goal Driven Approach
|
DWH Implementation: Goal Driven Approach
|
DWH Life Cycle: Pitfalls, Mistakes, Tips
|
Course Project
|
Contents of Project Reports
|
Case Study: Agri-Data Warehouse
|
Web Warehousing: Drawbacks of traditional web searches, web search, Web traffic record: Log files
|
Web Warehousing: Issues, Time-contiguous Log Entries, Transient Cookies, SSL, session ID Ping-pong, Persistent Cookies
|
Data Transfer Service (DTS)
|
Lab Data Set: Multi-Campus University
|
Extracting Data Using Wizard
|
Data Profiling
|