Oracle OpenWorld 2019
SAN FRANCISCO
Copyright © 2019 Oracle and/or its affiliates.
Oracle Advanced Compression:
Essential Concepts, Tips, and
Tricks for Enterprise Data
Gregg Christman
Product Management
Core Database Product Development
Safe Harbor
The following is intended to outline our general product direction. It is intended for information purposes
only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing decisions. The development,
release, timing, and pricing of any features or functionality described for Oracle’s products may change
and remains at the sole discretion of Oracle Corporation.
Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and
prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed
discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and
Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q
under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website
at http://www.oracle.com/investor. All information in this presentation is current as of September 2019
and Oracle undertakes no duty to update any statement in light of new information or future events.
Session Agenda
• Data Lifecycle Management.
• Data Lifecycle Compression.
• Enabling Compression.
• Best Practices and Insights.
• Tracking Compression.
• More Information.
Data Lifecycle Management
Managing Data Over Its
Lifetime
Data Lifecycle Management Challenges
― Exponential increases in data volumes are putting enterprise IT
infrastructures under severe pressure.
• Including: storage costs, performance, scalability and manageability.
― Regulatory requirements are changing how and why data is being retained.
• Organizations are now required to retain and control much more information for much
longer periods – often for 7-10 years.
― Many data lifecycle solutions have no knowledge of the use, or value, of
the data under Oracle Database management.
• Making these “database-unaware” technologies virtually useless.
The enterprise data architecture benefits organizations in the
areas of performance, storage management, data lifecycle
automation, data transfer, compliance, versioning and more…
Data Lifecycle Compression
Oracle Enterprise Data
Architecture
• Innovative database solutions providing high
performance, greater automation and increased
cost savings throughout the data lifecycle.
• Industry-leading flexibility for the optimal
deployment of applications, whether in the cloud
or on-premises.
Heat Map (Advanced Compression)
Tracking data usage in a database.
Enables organizations to understand…
• Where tables/partitions are in their data lifecycle
(active, inactive or historic).
• How data is accessed (for queries and/or modification).
• How access patterns change over time.
• The granularity of the database object (table vs. partition).
Data Lifecycle Usage
OLTP/Transactional
Data Compression (Advanced Row Compression)
― Compression specifically designed to work with OLTP/DW
applications.
• Everything is faster: table scans, backups, database cloning, etc.
• Buffer cache becomes more efficient by storing more data without having to
add memory.
• Data remains compressed in memory.
― Compression during all types of data manipulation operations,
including conventional DML such as INSERT and UPDATE.
• The compression ratio achieved depends on the data being compressed,
specifically the cardinality of the data.
• Customer Experience: 2x-4x compression ratios.
Data Lifecycle Usage
OLTP/Transactional
Index Compression (Advanced Index Compression)
Low Index Compression.
– The optimal number of prefix columns is computed
automatically to produce the best compression ratio.
• Different index leaf blocks can be compressed with different prefix
column counts, or not compressed at all if there are no repeating prefixes.
• Customer Experience: 2x-3x compression ratios.
High Index Compression.
– Utilizes additional complex compression algorithms on a potentially larger number
of index keys to achieve higher levels of compression.
• Customer Experience: 4x-5x, highly compressible indexes 15x-20x.
Data Lifecycle Usage
OLTP/Transactional
Prefix Compression (Index Key Compression)
‒ Eliminates duplicate copies of a predefined number of index prefix
columns at the index leaf block level.
• Effective way to permanently reduce the index size, both on disk and in
cache.
‒ The number of prefix columns to consider for compression is
specified by the DBA at the index create time (or rebuild time) and
is constant for all index leaf blocks.
• Compression can be very beneficial when the prefix columns of an index
repeat across many rows within a leaf block.
• ANALYZE INDEX advises whether to compress and how many prefix columns to
choose.
• Customer Experience: 2x compression ratios.
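The effect of the prefix column count can be illustrated with a small toy model. The sketch below is not Oracle's on-disk format or the ANALYZE INDEX algorithm; it only assumes that each distinct prefix in a leaf block is stored once and that every row keeps its remaining (suffix) columns. The sample rows and function names are hypothetical.

```python
def prefix_compress_block(rows, prefix_cols):
    """Toy model of one index leaf block: keep each distinct prefix once and
    replace it in every row with a small slot number."""
    prefixes = {}                      # distinct prefix -> slot number
    compressed_rows = []
    for row in rows:
        prefix, suffix = row[:prefix_cols], row[prefix_cols:]
        slot = prefixes.setdefault(prefix, len(prefixes))
        compressed_rows.append((slot, suffix))
    return list(prefixes), compressed_rows


def stored_values(rows, prefix_cols):
    """Count column values physically kept after compression
    (slot references are ignored in this simplified model)."""
    prefixes, compressed = prefix_compress_block(rows, prefix_cols)
    return len(prefixes) * prefix_cols + sum(len(s) for _, s in compressed)


# Hypothetical index keys: (country, city, order_id)
rows = [
    ("US", "NYC", 1), ("US", "NYC", 2), ("US", "NYC", 3),
    ("US", "SFO", 4), ("DE", "BER", 5), ("DE", "BER", 6),
]

for cols in (1, 2, 3):
    print(cols, "prefix column(s):", stored_values(rows, cols), "values stored")
```

In this model the uncompressed block holds 18 values. A prefix count that is too low leaves repetition behind, and one that includes the unique column removes nothing, which is why the chosen count matters.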
Index Compression Tips and Insights
Prefix Compression.
― Requires running ANALYZE INDEX to obtain the prefix column count.
― Specified prefix column count may not be optimal to produce the best
compression ratio for every block in the index.
Advanced Index Compression.
― Cannot compress Index Organized Tables (IOTs); IOTs can be compressed with free
Prefix Compression.
― Cannot compress function-based indexes or bitmap indexes.
Data Lifecycle Usage
OLTP/Transactional
Unstructured Data Compression (Advanced LOB Compression)
– Detects if SecureFiles LOB data is compressible and will compress
using industry standard compression algorithms.
• If the compression does not yield any savings, or if the data is already
compressed, SecureFiles will turn off compression for such LOBs.
– Random access reads and writes to compressed SecureFiles LOBs
are achieved without the need to decompress the entire file.
• Only the sub portion of the compressed file needs to be decompressed
thus saving CPU and I/O.
– Setting table or index compression does not affect SecureFiles
LOB compression or vice versa.
Data Lifecycle Usage
Analytics/Reporting
Data Compression (Hybrid Columnar Compression – Query Level Compression)
― Optimized to increase scan query performance for query-mostly
tables.
• Maximizes storage savings and query performance benefits.
― Tables are organized into Compression Units (CUs); each CU
comprises multiple database blocks.
• Within a Compression Unit, data is organized by column instead of by row.
• Column organization brings similar values close together, enhancing
compression – 10x compression ratio typical.
Data Lifecycle Usage
Archive/Historic
Data Compression (Hybrid Columnar Compression – Archive Level Compression)
― Optimized to maximize storage savings, typically achieving a
compression ratio of 15:1 (15x).
― In contrast to Warehouse Compression (Query Level Compression), Archive
Compression is a pure storage saving technology.
• Intended for tables or partitions that store cold historic/archive data that is
rarely accessed.
• No need to move data to tape - data is always online and always accessible.
― Tables are also organized into Compression Units (CUs); each CU
comprises multiple database blocks.
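The compression ratios quoted on these slides (2x up to 15x) can be converted into storage savings with simple arithmetic: a ratio of r leaves 1/r of the original footprint. The helper below is a minimal sketch of that conversion; the function name is an assumption made for illustration.

```python
def space_saved(ratio: float) -> float:
    """Fraction of storage saved for a given compression ratio (e.g. 4 for 4x)."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 - 1.0 / ratio


# Ratios taken from the slides; the labels are shorthand.
for label, ratio in [("Advanced Row (low end)", 2), ("Advanced Row (high end)", 4),
                     ("HCC Query", 10), ("HCC Archive", 15)]:
    print(f"{label}: {ratio}x -> {space_saved(ratio):.0%} less storage")
```

The savings flatten as the ratio grows: going from 2x to 4x saves another quarter of the original size, while going from 10x to 15x saves only a few more percent.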
Updating HCC Compressed Data Tips and Insights
― A Hybrid Columnar Compressed row that is updated is moved to a
new Compression Unit (CU).
• The ROWID of the row also changes.
• The moved row’s compression level changes from Hybrid Columnar to a less compressed
level (typically a 2x to 4x ratio).
― INSIGHT: Updated rows can “automatically” be returned to HCC
compression levels using an Automatic Data Optimization (ADO) policy.
• The row remains in its new Compression Unit.
• ALTER TABLE MOVE ONLINE can also return rows to HCC compression.
― IMPACT: With many row updates, HCC performance could be impacted,
as will the storage footprint of the rows being updated.
Online Block-Level Compression with Automatic Data Optimization (ADO)
– ADO enables organizations to create policies for
data compression – allows tables to transparently
switch from row to columnar compression.
• Transparently change to the optimal compression level.
– Oracle Database evaluates policies during the
DBA-defined database maintenance window, and
uses the information collected by Heat Map to
determine which policies to execute.
• All operations are executed automatically and in the
background -- no user intervention or application changes
required.
• Compression performed in-place, no data movement required.
Preview | Automatic Index Optimization
Compression and optimization for indexes using existing Automatic
Data Optimization (ADO) framework.
• Existing Heat Map capability collects activity statistics on the index.
Index optimizations include:
• Compress: Compresses portions of the key values in an index segment.
(3x compression ratio typical)
• Coalesce: Merges the contents of index blocks where possible to free
blocks for reuse.
• Rebuild: Rebuilds index to improve space usage and access speed.
Automates movement of indexes to tier 2 storage when tier 1 storage
is under space pressure.
Online Segment-Level Compression
― ALTER TABLE ... MOVE TABLE/PARTITION/SUBPARTITION …
ONLINE allows DML operations to continue to run uninterrupted
on the table/partition/subpartition that is being moved.
• Using the UPDATE INDEXES clause maintains both local and global indexes
during the move -- so a manual index rebuild is not required.
― Using DBMS_REDEFINITION keeps the table online for both
read/write activity during the migration.
• Online redefinition will clone the indexes to the interim table during the
operation.
• All the cloned indexes are incrementally maintained during the sync (refresh)
operation so there is no interruption in the use of the indexes during, or after,
the online redefinition.
• Run DBMS_REDEFINITION in parallel for best performance.
Session Agenda
• Data Lifecycle Management.
• Data Lifecycle Compression.
• Enabling Compression.
• Best Practices and Insights. (Performance Related)
• Tracking Compression.
• More Information.
Bulk Load Performance
Conventional Path Bulk Loads.
― Rows are inserted into existing compressed blocks uncompressed.
• As additional inserts are performed on the same block, and the block begins to fill up,
the internal threshold will be met and the block will be compressed.
• When using conventional-path inserts it is possible that the same block will be
compressed multiple times during the same operation.
• Starting with 12.2, conventional path array inserts go directly into HCC CUs.
Direct Path Bulk Loads.
― Performed above the high-water mark, so blocks are completely filled and
compressed only once, and then written to disk.
• Direct path is optimized for compression (provides better load performance)
• When performing bulk loads, specify insert /*+ append */ for better performance.
• Use direct path bulk loads (CTAS, APPEND hint, SQL*Loader, etc.) when
possible.
PCTFREE
― If PCTFREE is 0 (or too low), and the block is full, then updates will
cause row movement and/or row chaining.
• After compression has taken place, if a row that has been compressed is
updated, then the resulting row can be re-compressed only if another
compression is triggered.
• Subsequent updates will cause row migrations once all space in the block is
exhausted (i.e. the reserved space from the PCTFREE setting).
• Row movement/chaining can result in performance degradations for the
updates, and for subsequent queries which access the affected rows.
• This is true whether or not the rows in the block are compressed.
― Before Oracle Database 12cR2, blocks containing many types of
chained rows could not be compressed.
• This limitation has been removed in Oracle Database 12c Release 2 and
above.
Optimizer
― Statistics for a compressed table are different from the same
table in non-compressed form.
• This means execution plan differences can occur.
― If table is compressed, the size of table is smaller.
• This could make the optimizer prefer a Full-Table-Scan more than it
would on the same uncompressed table.
Encryption
Tablespace Encryption.
― Data and index compression are done before encryption.
• Ensures the maximum space and performance benefits from compression,
while also receiving the security of encryption at rest.
Column Encryption.
― Compression is done after encryption.
• This means that compression will have minimal effectiveness on encrypted
columns.
• There is one notable exception: if the column is a Secure Files LOB, and the
encryption is implemented with Secure Files LOB Encryption, and the
compression (and possibly deduplication) are implemented with Secure
Files LOB Compression & Deduplication, then compression will be done
before encryption.
Miscellaneous Performance Insights
– Compression Overhead.
• Approximately 3% to 5% CPU is typically reported by customers.
• CPU overhead offset partially by reduced IO and IO-related operations.
– Chained Row Compression.
• Before Oracle Database 12cR2, blocks containing many types of chained rows
could not be compressed.
• This limitation has been removed in Oracle Database 12c Release 2 and above.
– Partial Compression.
• Partial compression looks for uncompressed rows and transforms those into
a compressed form - faster than recompressing the whole block again.
– Compression Granularity.
• Tablespace, Table or Partition.
Miscellaneous Performance Insights – even more…
– Compress Smaller Tables/Partitions?
• Reducing IO and IO operations can benefit even the smallest tables/partitions.
– Compress System Tables?
• We do not recommend or support compressing the tables owned by SYS in
the SYSTEM tablespace.
– Basic Table Compression and DMLs.
• With BASIC compression (the free feature), DMLs will progressively un-compress
the updated rows and the blocks will not be re-compressed.
• An ALTER TABLE MOVE needs to be performed at some point to restore the
compression ratio.
– Testing Compression.
• Always best to test compression using your own data, applications and systems.
Session Agenda
• Data Lifecycle Management.
• Data Lifecycle Compression.
• Enabling Compression.
• Best Practices and Insights. (What Not to Compress)
• Tracking Compression.
• More Information.
What Not to Compress
― External tables.
― System tables.
― Tables with LONG datatypes.
― Temp tables.
― Tables with row dependencies.
― Clustered tables.
― Queue/Message Tables. (Tables where rows are inserted into the table, then later most or all of
the rows are deleted, then more rows are inserted and then again deleted)
― Tables with more than 255 columns. (OLTP Table Compression with Oracle Database 11g
only)
Bitmap Indexes
― Bitmap Index Compression is a standard feature of the Oracle
Database.
• Advanced Index Compression and Prefix Compression do not apply to
bitmap indexes.
• Bitmap indexes are always stored in a compressed manner without the
need of any user intervention.
― A bitmap index uses a different key from a B-tree index, but is
stored in a B-tree structure.
Redo Logs
― Advanced Compression does not have a feature that
compresses redo logs directly.
• When a block is recompressed, the redo logs will store enough data to
restore the block to its state prior to recompression.
― Some of the features of Advanced Compression will help
reduce the amount of redo generated over time, by reducing
the size of database objects, including tables, indexes,
SecureFiles LOBs, etc.
• If Data Guard Redo Transport Compression is enabled, then the redo
data will be compressed before it is transmitted from the primary to the
standbys, and then decompressed and applied on the standbys.
Session Agenda
• Data Lifecycle Management.
• Data Lifecycle Compression.
• Enabling Compression.
• Best Practices and Insights. (Improving Compression Ratios)
• Tracking Compression.
• More Information.
Improving Data Compression Ratios
―Space usage reduction gives the best results where the most
duplicate data is stored (low cardinality).
• Sorting data (on the columns with the most duplicates) may increase the
compression ratio.
• Larger block sizes may provide higher compression ratios – always test
first.
―Organizations must consider the cost of both (sorting data and
larger block sizes) in relation to the amount of increase in the
compression ratio.
• A small increase in compression ratio may not be worth the added
effort.
• A larger increase, especially with colder/historic data (few/no
modifications), may be worth the extra effort.
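Why sorting can help is easy to see in a toy model. The sketch below assumes, purely for illustration, that each block keeps one symbol-table entry per distinct value it contains; it is not Oracle's block format, and the column data and block size are hypothetical.

```python
def symbol_table_entries(values, rows_per_block):
    """Toy model: every block stores each distinct value it contains once.
    Fewer entries in total means more duplicates were eliminated."""
    total = 0
    for start in range(0, len(values), rows_per_block):
        block = values[start:start + rows_per_block]
        total += len(set(block))
    return total


# Hypothetical column: 1,000 rows cycling through 50 distinct status codes.
column = [f"STATUS_{i % 50:02d}" for i in range(1000)]

unsorted_entries = symbol_table_entries(column, rows_per_block=100)
sorted_entries = symbol_table_entries(sorted(column), rows_per_block=100)

print("unsorted:", unsorted_entries, "entries; sorted:", sorted_entries, "entries")
```

In this model the unsorted load spreads all 50 values across every block, while the sorted load leaves only 5 distinct values per block. Real gains depend on the data, so the advice to test first still applies.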
Improving LOB Compression Ratios
― Best practice is to use SecureFiles for LOBs over 4 KB
in size, for better compression than storing them inline.
― LOBs, such as documents or XML files, typically experience a
compression ratio of 2x to 3x.
• Bitmap images are already compressed and are unlikely to compress
any further.
• If a LOB was already compressed, it is unlikely that Oracle will get much,
if any, additional compression.
― Starting with Oracle Database 12c, users can estimate
compression ratios for LOBs with Compression Advisor.
• LOB compression can also be estimated using something akin to gzip.
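A rough gzip-style estimate can be sketched outside the database. The example below uses Python's zlib module, which implements the same DEFLATE algorithm as gzip; it says nothing about what Advanced LOB Compression will actually achieve, and the sample payloads are hypothetical stand-ins for real LOB content.

```python
import os
import zlib


def estimated_ratio(data: bytes, level: int = 6) -> float:
    """Return uncompressed size divided by zlib-compressed size."""
    if not data:
        raise ValueError("need at least one byte of sample data")
    return len(data) / len(zlib.compress(data, level))


# Hypothetical samples: repetitive XML-like text, and random bytes standing in
# for content that is already compressed (for example a JPEG or ZIP file).
xml_like = b"<order><status>SHIPPED</status><qty>1</qty></order>\n" * 2000
random_like = os.urandom(64 * 1024)

print(f"XML-like sample:    {estimated_ratio(xml_like):.1f}x")
print(f"random-like sample: {estimated_ratio(random_like):.2f}x")
```

Running the function over a representative sample of exported LOBs gives a quick first impression before a more careful test with Compression Advisor.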
Preview | SecureFiles Shrink
Improved SecureFiles Space Management.
• Previously shrink DDL (i.e. alter table … shrink space cascade)
did not support SecureFiles segments.
SecureFiles now supports shrink DDL for SecureFiles
segments.
• SecureFiles shrink is online -- concurrent queries and DMLs are
allowed, and concurrent DDLs will be serialized.
• Manually terminate shrink, at any time, without losing progress.
• Shrink terminates automatically when no additional storage
savings are possible.
Supports existing SecureFiles features: compression,
encryption and deduplication.
Tracking Advanced Row
Compression w/ AWR
• HSC OLTP positive compression + HSC OLTP negative compression =
Total number of attempted compressions and re-compressions
(non-direct load).
• HSC OLTP Compressed Blocks = Incremented the first time a block is
compressed.
• HSC IDL Compressed Blocks = Block compressions above the HWM
(such as in CTAS or insert append).
• ((HSC OLTP positive compression + HSC OLTP negative compression) -
(HSC IDL Compressed Blocks + HSC OLTP Compressed Blocks)) =
number of attempted compressions and re-compressions
(including direct loads).
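The arithmetic on this slide can be sketched as a small helper. The statistic names are the ones listed above; the numeric values are hypothetical and not taken from a real AWR report, and the function and key names are assumptions made for illustration.

```python
def derive_hsc_figures(stats):
    """Combine the four HSC statistics the way the slide's formulas do."""
    attempts = (stats["HSC OLTP positive compression"]
                + stats["HSC OLTP negative compression"])
    compressed_blocks = (stats["HSC IDL Compressed Blocks"]
                         + stats["HSC OLTP Compressed Blocks"])
    return {
        "attempts": attempts,
        "attempts_minus_compressed_blocks": attempts - compressed_blocks,
    }


# Hypothetical values for illustration only.
sample = {
    "HSC OLTP positive compression": 900,
    "HSC OLTP negative compression": 100,
    "HSC IDL Compressed Blocks": 200,
    "HSC OLTP Compressed Blocks": 300,
}

figures = derive_hsc_figures(sample)
print(figures)
```

Comparing these figures between two AWR snapshots, rather than reading them once, shows how compression activity changes with the workload.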
Tracking Hybrid Columnar
Compression w/ AWR
• EHCC Rows Compressed = Total number
of rows compressed using HCC
compression.
• EHCC Rows Not Compressed = Total
number of rows not compressed with
HCC.
• EHCC Total Rows Decompressed = Total
number of rows decompressed counting
every decompression.
• EHCC CUs Compressed = Total number of
HCC CUs compressed.
Starting with 12.2, the term "EHCC" in all stats has been
universally replaced with "HCC".
Additional Resources
Join the Conversation
https://twitter.com/aco_gregg
https://blogs.oracle.com/DBStorage/
http://www.oracle.com/database/advanced-compression/index.html
https://www.oracle.com/a/tech/docs/advanced-compression-poc-insights.pdf
More Information
Thank You
Gregg Christman
Product Manager
Core Database Product Development