9512.net
甜梦文库
当前位置:首页 >> 数学 >>

Real-World Performance培训


Real-World Performance Training
Loading Data Real-World Performance Team

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Agenda
1

The DW/BI Death Spiral Parallel Execution

2
3 4 5

Loading Data
Exadata and Database In-Memory

Dimensional Queries

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Retail Demo

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Oracle Retail Data Warehouse
The Schema

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Retail Demonstration
Table Sizes
Table Transactions Payments Line Items Total Size of Source Data in GByte 51.8 54.2 940.8 1046.9 Number or Rows in Millions 463.7 463.7 6,980.6 7,908.0

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Retail Demonstration
Table Sizes – Default Compression
Table Transactions Payments Line Items Total Size of Table (GB) 29.1 29.2 257.1 315.5 Compression Ratio 1.78 : 1 1.86 : 1 3.66 : 1 3.32 : 1

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Retail Demonstration
Table Sizes – Hybrid Columnar Compression
Table Transactions Payments Line Items Total Size of Table (GB) 4.8 4.9 55.0 64.7 Compression Ratio 10.82 : 1 10.99 : 1 17.11 : 1 16.18 : 1

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Retail Demo

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Setup

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Setup

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Initial Data Load

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Initial Data Load

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Initial Data Load

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Flash—Scan and Count Rows

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Flash—Scan and Count Rows

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Flash—Scan and Count Rows

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Gather Statistics

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Gather Statistics

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Load Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Load Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Validate Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Validate Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Transform Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Transform Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Transform Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Exchange In Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Exchange In Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Exchange In Daily Data

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Gather Incremental Stats

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Gather Incremental Stats

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Gather Incremental Stats

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Loading a Data Warehouse
? Two broad approaches
– ETL: Extract Transform Load – ELT: Extract Load Transform

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted

33

Loading a Data Warehouse
ETL – Extract Transform Load

? Extract the data from the source system. In many cases, this is the Data Warehouse itself ? Perform Transformation and Validation, usually on some middle tier server ? Load the data into the Data Warehouse.
– Often the data is written to the Data Warehouse using DML operations; inserts, updates and deletes. In turn, this may require indexes in order to perform

? A whole business has been developed around “data integration” products and services, such as
– Informatica – Ab Initio

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted

34

Loading a Data Warehouse
ELT – Extract Load Transform

? Extract the data from the source system ? Load the data as-is into “staging” tables on the Data Warehouse system ? Validation and Transformation performed via SQL and set based processing techniques

? Final data is added to the target fact or dimension table
– Partition Exchange is an effective technique for this step

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted

35

Loading a Data Warehouse
Extract

? Extracting data from a source system is often the most challenging
– What tools are available depends on the data source – For Oracle, there is no “data unload” product
? Home grown tools ? Fastreader from WisdomForce (now Informatica) ? Datapump Export, Transportable Tablespaces

– Compression Benefits
? Reduced time to copy data over the network ? Increased load performance

– Where will the data be staged
? DBFS, NFS, ZFS ? USB Drive !

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted

36

Loading a Data Warehouse
Loading

? Data Loading is a CPU/Memory constrained operation.
– Data loads scale well over multiple CPUs, cores and hosts (assuming no other form of contention) – Memory usage for meta data associated with highly partitioned objects can become significant at high DOP

? Use external tables with a parallel SQL statement (e.g. CTAS or IAS) to minimize on-disk and in-memory meta data. Do NOT use multiple threads of SQL*Loader
– Using external tables is also much simpler than having to manage multiple threads of SQL*Loader
Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 37

Loading a Data Warehouse
Loading ? Direct Path Load
– Enabled using the APPEND hint – Default for CTAS and for Parallel Inserts

? Why Direct Path Load?
– Allows a single parallel insert operation to efficiently load data from multiple parallel server processes
? Significant performance improvement for parallel DML/DDL operations

– Required for basic/default and HCC compression – No redo or undo – Bypasses buffer cache

? Possible Issues
– Only one direct path load into a table/partition at a time – No logging for Data Guard

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Loading a Data Warehouse
Anatomy of an External Table
create table FAST_LOAD ( Reference the column definition list ... Uncompress the data ) Mount Point organization external using a secure wrapper ( type oracle_loader default directory SPEEDY_FILESYSTEM The Character set must preprocessor exec_file_dir:’zcat.sh’ characterset ‘ZHS16GBK’ match the Character set badfile ERROR_DUMP:’FAST_LOAD.bad’ of the Files logfile ERROR_DUMP:’FAST_LOAD.log’ ( Note Compressed Files file column mapping list ... ) location The number of files (file_1.gz, file_2.gz, file_3.gz, file_4.gz ) should match or be a reject limit 1000 parallel 4 multiple of the DoP. /

External Table Definition

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Loading a Data Warehouse
Validation and Transformation

? Elimination of duplicates
– Outer Join back to the table – Window function – Aggregate with HAVING clause

? Foreign Key References
– Outer Joins between tables

? The choice of techniques will be dependent on the following
– Good/Bad validation of the data – The desire to identify and locate bad rows e.g. find ROWIDS – The desire to programmatically eliminate bad rows

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Data Validation SQL
Duplicate Rows
Simply Check the Data Obtain one of the ROWIDs of duplicates to investigate
select pk, count(*), max(rowid) from DIRTY_DATA group by pk having count(*)>1;

Query the rows you wish to keep eliminating duplicates based on the load time
select column_list from ( select a.*,row_number() over ( partition by pk order by load_time desc ) rowno from DIRTY_DATA a ) where rowno=1

select pk,count(*) from DIRTY_DATA group by pk having count(*)>1;

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Data Validation SQL
Orphaned Row Check
Look For Orphans select C.rowid from PARENT P right outer join CHILD C on P.pk = C.fk where P.pk is null; Look for Parents with no Children select P.rowid from PARENT P left outer join CHILD C on P.pk = C.fk where C.fk is null;

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Loading a Data Warehouse
Data Transformation vs Data Modification

? Data Transformation
– Change data by performing transformations into a new table – Consistent and Predictable Performance – Supports Direct Path Loads and Compression

? Data Modification
– Change data in place – Update, Delete, Insert – Overhead and performance impact of changing existing blocks – Does not work well with compression
Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Loading a Data Warehouse
Data Transformation SQL

? Use either
– INSERT /*+APPEND */ INTO … SELECT – CREATE TABLE … AS SELECT

? Using an INSERT
– Constraints such as NOT NULL can be correctly applied and enforced – Data type, column lengths and precision can be defined and preserved

? Using a CTAS
– DDL (not DML) – Some optimizations available, that are currently disabled for DML. This may change over time.

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Rewriting DML as Transformation
Delete
alter session enable parallel dml; delete from tx_log where symbol = ‘JAVA’; commit; alter session enable parallel dml; insert /*+ append */ into tx_log_new select * from tx_log where symbol != ‘JAVA’; alter table tx_log rename to tx_log_old; alter table tx_log_new rename to tx_log;

The predicate is the compliment of the DELETE, it selects the rows to keep

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Rewriting DML as Transformation
Update
alter session enable parallel dml; update sales_ledger set tax_rate = 9.9 where tax_rate = 9.3 and sales_date > ‘01-Jan-09’; alter session enable parallel dml; insert /*+ append */ into sales_ledger_new select <column list>, case sales_date>‘01-Jan-09’ and tax_rate=9.3 then 9.9 else tax_rate end, <column list> from sales_ledger;
Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

commit;

The UPDATE predicates are moved to the SELECT list in a CASE statement to transform the rows

Loading a Data Warehouse
Example Workflow

? An example workflow may be:
– Load data into first staging table
? Basic data integrity, nulls, data types etc.

– Check the data, writing “good” data to a second staging table
? Uniqueness, foreign keys, business rules etc. ? Apply constraints with “RELY DISABLE NOVALIDATE”

– Transform the data into a third staging table
? Tax corrections, time zone corrections, consolidate codes etc. ? Gather statistics on final staging table, including synopsis

– Use Partition Exchange to “swap in” the staging table to the Fact table
? Gather Global statistics, which will be rolled up using partition synopses

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Example Workflow
Load Data into Staging Table
5-25 5-26 5-27 5-28 5-29 5-30 5-31 Daily Partitioned Table Partition Synopses

...

Load data from external table into stage_1

Stage_1

Stage_2

Stage_3

Stage_1_err

Stage_2_err
48

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted

Data Load
Validation
5-25 5-26 5-27 5-28 5-29 5-30 5-31 Daily Partitioned Table Partition Synopses

...

Valid data transformed into stage_2 Invalid data transformed Into stage_1_err

Stage_1

Stage_2

Stage_3

Stage_1_err

Stage_2_err
49

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted

Data Load
Transformation
5-25 5-26 5-27 5-28 5-29 5-30 5-31 Daily Partitioned Table Partition Synopses

...

Data transformation Into stage_3 VAT codes, time zones etc Invalid data transformed Into stage_2_err

Stage_1

Stage_2

Stage_3

Stage_1_err

Stage_2_err
50

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted

Data Load
Gather Statistics
5-25 5-26 5-27 5-28 5-29 5-30 5-31 Daily Partitioned Table Partition Synopses

...

Gather statistics

Stage_1

Stage_2

Stage_3

Stage_1_err

Stage_2_err
51

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted

Data Load
Partition Exchange
5-25 5-26 5-27 5-28 5-29 5-30 5-31 Daily Partitioned Table Partition Synopses

...

Exchange stage_3 with partition

Stage_1

Stage_2

Stage_3

Stage_1_err

Stage_2_err
52

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted

Data Transformation SQL
Transformation vs. Modification
Driver Compression Fragmentation Logging and UNDO Transformation No Impact None Possible to eliminate Modification Compression may be Lost and severely impact performance Fragmentation, row chaining, and holes will almost certainly take place Will take place and may impact performance and administration requirements Indexes will be maintained in place. This may be a performance overhead and Bit Map indexes may become fragments and require rebuilding No impact Overhead of UNDO and Logging Old Code runs with performance challenges

Indexes

Indexes need to be rebuilt if used

Meta Data Space Coding

Grants etc will require redefinition Overhead of maintaining multiple copies of the data New code required writing and new techniques need teaching

3rd Party Issues

May not be supported by Tool Vendors

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |

Loading a Data Warehouse
Summary

? Data validation and modification
– Best executed in the database via SQL – This presents big challenges to users who have committed to classic ETL tools such as Informatica – Changes of data are best made via transformation and redefinition than via classic OLTP DML statements ( delete, update, merge )
? Allows exploitation of hardware and parallelism ? Minimizes fragmentation and maximizes compression ? Minimizes logging and minimizes recovery

– Set based techniques use efficient CPU and IO techniques

Copyright ? 2014, Oracle and/or its affiliates. All rights reserved. |


赞助商链接

更多相关文章:
报名系统设计毕业设计论文
In this paper, based on real-world problems ...performance systems, the development of systems to...我所开发的培训网 9 基于 web 的报名系统设计 站...
压力测试性能测试培训——LoadRunner11
压力测试性能测试培训——LoadRunner11_计算机软件及应用_IT/计算机_专业资料。...Real-World schedule:按照场景设计进行加压,可以实现梯度加压和峰值加压; ? Basic...
英文培训论文1(1)2)
英文培训论文1(1)2)_管理学_高等教育_教育专区。...performance in the exam to get the certificate ...as a real language rather than a tool for test...
change management
performance-related pay,or empowering individuals or...there is real-world evidence to show that it ...新的生产技术,可使裁员或增 加特定的 IT 培训的...
2014-ICDE论文集总结
By using a number of real-world datasets, we demonstrate its appealing performance not only w.r.t. the outlier detection rate but also w.r.t. the ...
《SAFER Real-time系统》培训考核试题
SAFER Real-time 系统培训考核试卷 考试日期: 专业队 班 姓名: 分数: 一、 填空题(每题 5 分,共 60 分) 1、 2、 Safer real-time 软件是我们的固定设施...
2017年6月英语六级第二套及答案解析
C)Mental images often interfere with athletes’ performance. D)Thinking has ...have only begun to demonstrate what they can produce in real-world learning...
performance
performance_英语学习_外语学习_教育专区。英文讲稿2Unit...She becomes world-famous singer. Do the exercise...tells the story of someone looks for real love....
川外语言文学复试资料
Competencies consist of essential skills, knowledge, attitudes, and behaviors required for effective performance of a real-world task or activity. Design: ...
ABB DCS 培训课程 T315-01 References - RevA
Different facets of these real world entities are modeled as aspects. An ...performance and reliable communication with predictable response times in real ...
更多相关标签:

All rights reserved Powered by 甜梦文库 9512.net

copyright ©right 2010-2021。
甜梦文库内容来自网络,如有侵犯请联系客服。zhit325@126.com|网站地图