| COURSE CONTENT: |
Data Warehousing (DW) Terms And Concepts (Optional) If course delegates are conversant with the topics covered in this roughly 2.5 hour section, it may be bypassed. However, since many of the topics have implications for physical DW design (sizing, aggregates, extract Ć Transform Ć load (ETL), constraints, backup and recovery), delegates not comfortable with most of the topics listed below are recommended to consider them: The DW environment What is a data warehouse? What is a data mart? What is Business Intelligence (BI)? How do OLTP & OLAP differ? What is data mining? Operational vs. historical data What is a star schema? What is a snowflake schema? Normalization vs. denormalization What are hierarchies? What is dimensional modelling? What is the Data Warehouse Bus Architecture (DWB)? What are surrogate keys? What is Extract, Transform, Load (ETL)? What are Slowly Changing Dimensions (SCD)? What is Metadata? What Materialized Views (MV)? How does logical design differ from physical design?
Oracle Database Architecture – A Summary Memory structure Logical storage structure Physical storage structure Processes SQL tools and extensions Admin tools Real Application Cluster
Logical to Physical Design Process DW database objects (e.g., partitions, dimensions, materialized views) Hardware and I/O considerations
DW Partitioning Design Parallelism options Partitioning options Compression options Partitioning and join considerations NULL considerations Date data type considerations Composite (multi-key) considerations Local vs. global indexes Backup and recovery considerations
DW Index Design Options B*tree index – how they work Bitmapped index – how they work Bitmapped join indexes – how they work NULL value considerations Partitioning indexes Local vs. global indexes Index-organised tables
Join Options Hash Lossless Nested loop Partition-wise Sort merge Star transformation
DW Constraint Design Constraint types (unique, data cleanliness, optimisation, etc.) Constraint options (ENABLE NO VALIDATE, DISABLE NOVALIDATE) Constraints and partitioning Views and constraints
Materialialized Views (I.E. Aggregate/Summary Tables) When to consider Oracle’s summary managment feature How to create materialized views Physical storage considerations Populating (loading) options Refresh options Query rewrite design Query rewrite and integrity (e.g., ENFORCED, TRUSTED) How to use the Oracle Summary Advisor EXPLAIN MVIEW utility usage
Dimensions Definition and role Hierarchies – types When to create dimensions How to create dimension Dimension with attributes Normalized dimensions Validating dimensions Altering dimensions Dimensions and constraints Deleting dimensions
The ETL Process Overview The ETL Process Extract Extraction Types Data Extraction Techniques Data Cleansing Data Cleansing Techniques Data Transformation Delivery Data Load Options Surrogate Keys ETL Sub-task Summary ETL Vendor Considerations ETL in the Database
Oracle’s ETL Offerings Overview Extraction from operational systems Changed data capture Data transformation (e.g., data integration & cleansing) Generating keys Loading the DW – options
Oracle DW Tools Warehouse Builder Discover Reports OLAP & data mining
Oracle DW erformance Considerations ETL and Table Functions How to optimise bitmap and bitmap join indexes How to optimise bitmap star join transformations Placing Oracle objects (e.g., indexes) into separate buffers, Oracle 9i enhancement of multiple block sizes How to use Oracle 9i’s dynamic SGA feature How to use dynamic sampling Materialized views and FAST REFRESH Materialized view ‘size’ control Indexes Partitioning Don’t forget: dbms_stats, OPTIMIZER_MODE Star vs. snowflake schemas
Parallelism & Performance Introduction Parallelisable operation How parallelism works Parallelising SQL (DDL) Parallelising SQL (DML) Degree of parallelism Prameter setting for parallelism
Oracle 10G DW Changes and Enhancements – A Glance Optimiser hints Multi-CPU exploitation (e.g., table scans, REORGs, etc.) Materialized views and auto-rewrite Materialized views and dbms_advisor Automated Workload Repository (AWR) Multiple blocksizes/buffer management Star join query optimisation Oracle streams (i.e., near real-time data warehousing) Read-only table spaces Automatic Storage Management (ASM) Oracle Warehouse Builder vs. scripts Asynchronous Change Data Capture (CDC) Oracle Data Pump (e.g., high speed bulk data and metadata movement)
MT07/01 |