"You Can Do It" Datawarehouse: Beginner To Advanced in Two Hours
"You Can Do It" Datawarehouse: Beginner To Advanced in Two Hours
by Arup Nanda Manager Database Systems Starwood Hotels & Resorts International White Plains, NY
Objectives
Exploring DW Techniques in Oracle Case Study Oracle 10G Additions
DB1 Cust1
Cust2
Cust9
Cust3 Datawarehouse
Cust8
Cust4
DB3
DB4
Problems
Incoming Data Irregular Summary Tables Need Refreshing Quarters Added Continuously Archival Requirements Vary Across Customers Quick Retrieval of Archival Needed
Problems contd.
Summary on Summary Tables as Materialized Views Need Refresh Whenever New Data Arrives Or When Data is Purged/Reinstated Customers Added and Deleted Frequently
Objective
To Minimize Downtime for Refreshes
Incrementally Refresh Partitioning Techniques
To Add Customers Easily To Add Quarters Easily To Archive Off and Purge Easily and Atomically To Restore Archives Quickly
Objective contd.
To have an ETL Setup for Easy Addition of Objects Such As Tables, Indexes, Mat Views. Use Only Available Oracle and Unix Tools
PL/SQL Unix Shell Scripts SQL*Plus
Design
Varying Dimensions
Customer Quarter
Composite Partitioning
Range (for Quarters) List (for Customers)
Local Indexes
Partitioning
Partitioned on CLAIM_DATE
RANGE Partitioned named YyyQq
Storage Clauses Not Defined
Supartitioned on CUST_NAME
LIST Named YyyQq_CustName, e.g. Y03Q3_CUST1
Indexing
All Indexes Local
CREATE INDEX IN_CLAIM_SUM_01 LOCAL ON SUMTAB1 (COL1, COL2)
Storage
Each Subpartition of Index or Table is kept in separate tablespaces named in the format Y<Year>Q<Qtr>_<CustName>_DATA e.g. Y02Q2_CUST1_DATA Y02Q2_CUST2_DATA Y03Q3_CUST1_DATA
Table
Index
Customers
Cust3 Y03 Q3
Cust3 Y03 Q3
Tablespace
create tablespace y03q3_cust1_data datafile /oradata/y03q3_cust1_data_01.dbf size 500m autoextend on next 500m extent management local segment space management auto
CREATE TABLE TAB1 ( ) PARTITION BY RANGE (CLAIM_DATE) SUBPARTITION BY LIST (CUST_NAME) ( PARTITION Y03Q1 VALUES LESS THAN (TO_DATE(2003/04/01,YYYY/MM/DD)), ( SUBPARTITION Y03Q1_CUST1 VALUES (CUST1) TABLESPACE Y03Q1_CUST1_DATA, SUBPARTITION Y03Q1_CUST2 VALUES (CUST2) TABLESPACE Y03Q1_CUST2_DATA,
Table DDL
)
)
CREATE INDEX IN_TAB1_01 ON TAB1 (COL1) LOCAL NOLOGGING ( PARTITION Y03Q1 ( SUBPARTITION Y03Q1_CUST1 TABLESPACE Y03Q1_CUST1_INDX, SUBPARTITION Y03Q1_CUST2 TABLESPACE Y03Q1_CUST2_INDX, and so on for all subpartitions SUBPARTITION Y03Q1_DEF TABLESPACE USER_DATA ), PARTITION Y03Q2 ( SUBPARTITION Y03Q2_CUST1 TABLESPACE Y03Q2_CUST1_INDX, SUBPARTITION Y03Q2_CUST2 TABLESPACE Y03Q2_CUST2_INDX, and so on for all subpartitions SUBPARTITION Y03Q2_DEF TABLESPACE USER_DATA ), and so on for all the partitions PARTITION DEF ( SUBPARTITION DEF_CUST1 TABLESPACE USER_DATA, SUBPARTITION DEF_CUST2 TABLESPACE USER_DATA, and so on for all subpartitions SUBPARTITION DEF_DEF TABLESPACE USER_DATA ) )
Index DDL
Creating DDLs
Static Part
create table tab1 () DDL to Create Table Variable Part partition y03q1 ( subpartition y03q1_cust1 tablespace )
Constraints
Constraints defined as DISABLE NOVALIDATE RELY
Constraint
VALIDATE/NOVALIDATE
Table TAB1 (Column: STATUS) Current Values A, I, F Check Constraint: STATUS IN (A,I)
ENABLE/DISABLE
New Value F
RELY
RELY
Reasons To Include Relation Information to the Metadata To Enable Query Rewrite
On Source
SELECT CUST1 AS CUST_NAME, CLAIM_DATE, PROVIDER_ID, COUNT(DISTINCT CLAIM_ID) AS NUM_CLAIMS, COUNT(*) AS NUM_LINES FROM . GROUP BY
Casting
SELECT CAST (CUST_NAME AS VARCHAR2(20)) AS CUST_NAME FROM <viewname> CAST (column_name AS datatype (precision))
cust1
View
INDEX
Temporary Table
Massaging Analyzing
DW
cust
Old Sub Partition
View
INDEX
TABLE
DW
Technique
Not Using DBMS_MVIEW.REFRESH MV is always STALE
Temp Table
CREATE TABLE T1_Y03Q1_CUST1 TABLESPACE Y03Q1_CUST1_DATA PARALLEL 8 NOLOGGING AS SELECT FROM CUST1.VIEW1@DB1 WHERE CLAIM_DATE >= add_months(trunc(to_date(03','RR'),'YYYY'), 3*(to_number(1')-1)) and batch_date < last_day(add_months(trunc( to_date(03','RR'),'YYYY'), 3*(to_number(1')) - 1 )) + 1
Script
CREATE TABLE T1_Y&&YY.Q&&Q._&&CUST TABLESPACE Y&&YY.Q&&Q._&&CUST._DATA PARALLEL 8 NOLOGGING AS SELECT FROM &&CUST..VIEW1@&&DBLINK WHERE CLAIM_DATE >= ADD_MONTHS(TRUNC(TO_DATE('&&YY','RR'),'YYYY'), 3*(TO_NUMBER('&&Q')-1)) AND BATCH_DATE < LAST_DAY(ADD_MONTHS(TRUNC( TO_DATE('&&YY','RR'),'YYYY'), 3*(TO_NUMBER('&&Q')) -1 )) + 1
External Table
Reason Source is a non-Oracle DB, e.g. DB2 Source is External, no DB Link Allowed Fixed Format vs- Delimited Fixed Format Faster, Easier More Space Delimited Less Space Slower, Slightly More Complex
Massaging
Removing NOT NULL Constraints Making Datatypes Consistent
The CAST operation converts NUMBER(m,n) to NUMBER cast(col1 as number(10,2)) as col1_m COL1 NUMBER(5,2) COL1_M NUMBER
Analyzing
Using DBMS_STATS.GATHER_TABLE_STATS PARALLEL Degree
dbms_stats.gather_table_stats ( ownname => DWOWNER', tabname => '&&TABNAME', estimate_percent => dbms_stats.auto_sample_size, method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO', degree => dbms_stats.default_degree, cascade => TRUE );
Mat Views
MVs Created as Tables CREATE TABLE MV_SUMMTAB1 Storage clauses just like the underlying table CREATE MATERIALIZED VIEW MV_SUMMTAB1 ON PREBUILT TABLE AS SELECT
http://www.proligence.com/painless_alter.pdf
Query Rewrite
Table SUM_CLAIMS PROVIDER_ID, STATE, TYPE, TOT_AMT Table MV_SUM_CLAIMS PROVIDER_ID, STATE, SUM(TOT_AMT) TOT_AMT GROUP BY PROVIDER_ID, STATE
Query Rewrite
Init.ora Parameters query_rewrite_enabled='TRUE' query_rewrite_integrity='STALE_TOLERATED ENFORCED Rewrite only if guaranteed TRUSTED Uses only if RELY STALE_TOLERATED Even if not RELY
Checking QR
dbms_mview.explain_rewrite ( select cust_name, count(*) from summtab1 group by cust_name );
Design
MV_* subpartitions are on the same tablespace as the parents. Subparts of MV_SUMMTAB1_0? are in the same TS as SUMMTAB1 Subparts of MV_SUMMTAB2_0? in SUMMTAB2
Quarter
MV2
MV1 PARENT
TableSpace1 TableSpace2
MV and Parents
Partition Pruning Partition-wise Joins Partition Independence
Adding Quarters/Customers
Partition
Default Partition VALUES LESS THAN (MAXVALUE)
Subpartition
Default Subpartition VALUES (DEFAULT)
Cust1
Cust2
Cust3
DEF
Cust1
Cust2
Cust3
DEF
Cust1
Cust2
Cust3
Cust4
DEF
Cust1
Cust2
Cust3 DEF
Cust1
Cust2
Cust3 DEF
Backup/Restore
Backup
ALTER TABLESPACE <TSName> READ ONLY Copy the files to tape/CD.
Restore
Copy the file back into the directory ALTER TABLESPACE <TSName> RECOVER
Archival/Purge
Table SP1 SP2 SP3 SP4
Table4
Table SP1
SP2
SP3
SP4
Table4
Table SP1
SP2
SP3
Table4
Archival/Purge
CREATE TABLE S1_Y<yy>Q<q>_<CustName> TABLESPACE Y<yy>Q<q>_<CustName>_<TSType> AS SELECT * FROM SUMMTAB1 WHERE 1=2 / CREATE INDEXES, CONSTRAINTS, etc. / ALTER TABLE SUMMTAB1 EXCHANGE SUBPARTITION Y<yy>Q<q>_<CustName> WITH TABLE Y<yy>Q<q>_<CustName> INCLUDING INDEXES /
Check TTS
ALTER TABLESPACE Y<yy>Q<q>_<CustName>_<TSType> READ ONLY; DBMS_TTS.TRANSPORT_SET_CHECK ( <DataTS>,<IndexTS>) ; SELECT * FROM TRANSPORT_SET_VIOLATIONS;
Transport TS
Export Parameter File
TRANSPORT_TABLESPACE=y TTS_FULLCHECK=Y FILE=<FileLocation>/exp<TS>.dmp TABLESPACES=(<DataTS>, <IndexTS>)
Purge
Drop Subpartition Drop the Tablespace
DROP TABLESPACE <TSName> INCLUDING CONTENTS AND DATAFILES;
Restore
ALTER TABLE SPLIT SUBPARTITION <DefaultSP> Copy Datafiles & Export Dump Files from CD/Tape Import Parameter File
TRANSPORT_TABLESPACES=Y TABLESPACES=(<DataTS>,<IndexTS>) DATAFILES=()
Merge Subpartition
Merging Subpartitions
Index Subpartitions Created in Users Default Tablespace Subpartition Template
ALTER TABLE SUMTAB1 ADD SUBPARTITION TEMPLATE
Resumable Statement
When? Running Large Report Jobs Creating Large Indexes ALTER SESSION ENABLE RESUMABLE NAME Job1; View DBA_RESUMABLE NAME Name specified in ALTER SESSION COORD_SESSION_ID Coord Session in PQ SQL_TEXT The text of the SQL STATUS - RUNNING, SUSPENDED, ABORTED, ABORTING, TIMEOUT ERROR_NUMBER/ERROR_MSG
Objectives Revisited
To Minimize Downtime for Refreshes
Incrementally Refresh Partitioning Techniques
To Add Customers Easily To Add Quarters Easily To Archive Off and Purge Easily and Atomically To Restore Archives Quickly
Oracle 10G
Transportable Tablespaces Can Be Reinstated At a Different Operating System
Can be used for Restoring to a Different OS
Thank You!
Updated Copy Can Be Found In
www.proligence.com
[email protected]