News & Events

28 Mar 2017

Advanced Row Compression Improvements with Oracle Database 12c

//
Comments0

Oracle Database 12c Release 2 includes many new features and our commitment, as always, is to be the best database for any business scenario. Please read more on this here. Oracle Database 12c Release 2 (12.2) is available in the cloud, with Oracle Cloud at Customer, and on-premises.

Today we are going to talk about Advanced Row Compression. Advanced Row Compression is a table compression feature of Oracle Advanced Compression. Other Advanced Compression features include: Advanced LOB Compression, RMAN Backup Compression, Data Guard Redo Network Compression, Heat Map and Automatic Data Optimization (ADO). For a background on Advanced Compression, please see this white paper.

If you are familiar with Advanced Compression technologies, please skip to the Advanced Row Compression Improvements section, otherwise continue reading.

Advanced Row Compression at a Glance

Advanced Row Compression uses a unique compression algorithm specifically designed to work with OLTP applications. The algorithm works by eliminating duplicate values within a database block, even across multiple columns. Compressed blocks contain a structure called a symbol table that maintains compression metadata. When a block is compressed, duplicate values are eliminated by first adding a single copy of the duplicate value to the symbol table. Each duplicate value is then replaced by a short reference to the appropriate entry in the symbol table.

The benefits of using Advanced Row Compression include:

  • Low overhead on DML operations. In fact, in some cases DMLs are faster when compression is used (more on this later).
  • Faster scans*
  • 2X-4X space savings in general**

* Scans go faster because there are fewer blocks to read and Oracle’s compression technology makes it as fast as reading uncompressed data ** Exact space savings will depend on the data in the table and the table schema

Advanced Row Compression Improvements

With Oracle Database 12c Release 2 there are significant enhancements to Advanced Row Compression:

Better Chained Rows Support

Before Oracle Database 12c, blocks containing many types of chained rows could not be compressed. This limitation has been removed in Oracle Database 12c Release 2.

Comparison: Oracle Database 11g Release 2 compared to Oracle Database 12c Release 2
Rows were direct loaded and also conventionally loaded and the compression ratios were compared using the test conditions below:

Number of columns: 330
Number of rows: ~1 million
Schema: Real-world table from an Oracle Customer‘s schema

Results: 2.25X compression ratio in both cases (direct loaded and also conventionally loaded) in Oracle Database 12c Release 2 compared to no compression with Oracle Database 11g Release 2.

 

Partial Compression

With Advanced Row Compression, when the block is full, it is compressed. More rows are then added (since more rows can now fit into the block), and the process of recompression is repeated several times until the rows in the block cannot be compressed further. Blocks are usually compressed and reformatted in their entirety, but in some cases in Oracle Database 12c Release 2 the block can be partially compressed, hence resulting in CPU savings and extra compression.

The partial compression feature is used on already compressed blocks (i.e. compressed with Advanced Row Compression). It looks for uncompressed rows and transforms those into a compressed form, adding or reusing symbols from the block dictionary – this is faster than recompressing the whole block again. Full block recompression also requires that no rows are locked in the block or, that all the rows in the block are locked by the transaction inserting rows into the block. In the first case, full block recompression is done as a recursive transaction, and in the second case, full recompression is done as an inline transaction. Partial compression gets around these requirements by locking and compressing only those rows that are uncompressed and unlocked – hence it can take place in the presence of other uncommitted transactions in the block.

Consider a SQL update operation: When rows cannot be updated in place due to space constraints, the update is broken up into lower level overwrite and insert sub-operations (the initial row piece is overwritten with a stub that points to the new row image inserted into a new block). This is done to avoid ROWID changes to existing rows on update operations. Assuming concurrent transactions, each transaction typically updates a few rows, and the rows inserted into a new block are neither all locked by the same transaction (some rows from previously committed transactions are unlocked) nor are they all unlocked (some transactions are yet to commit). In such a case, full block recompression is not possible – this causes some blocks to be left uncompressed during an update (or, equivalently, during an insert operation). In Oracle Database 12c Release 2, with partial compression, these rows can be compressed. For an in-depth treatment of updates to Advanced Row compressed tables, please see: this post). In simpler terms, partial compression can now compress rows which could not be compressed before. Let us see some examples.

Comparison: Oracle Database 11g Release 2 compared to Oracle Database 12c Release 2 (SCOTT.EMP demo table)

Test case updates 1 column.
Number of columns: 8
Number of rows: ~500,000

Results: 2.13X compression ratio with Oracle Database 12c Release 2 compared to no compression with Oracle Database 11g Release 2.

Comparison: Oracle Database 11g Release 2 compared to Oracle Database 12c Release 2 (Oracle ERP table AP_AE_LINES_ALL)

Test case updates 1 column.
Number of columns: 56
Number of rows: ~1 million

Results: 1.53X compression ratio with Oracle Database 12c Release 2 compared to no compression with Oracle Database 11g Release 2.

Background Compression

In Oracle Database 11g Release 2, block compression was only triggered using direct load operations or by DML operations:

  • Direct Load: via import, create table as select, move (alter table, partition move [online]).
  • DML: via inserts or updates which then trigger compression.

With Database 12c Release 2, background space tasks can compress blocks if these find them as candidates. A block can be marked as a candidate to be compressed based on previous DML operations or by other space tasks evaluating the space usage of a block. Compression done in the background is the same as Advanced Row Compression, but triggered by background processes instead of an active SQL session. Another example of background triggered compression is with ADO ADO (Automatic Data Optimization) that is defined based on policies declaratively set by the user.

Array Inserts – Quick Multi Inserts Enhancements

Insert operations such as INSERT … AS SELECT and PLSQL/OCI insert loops with bind variables, in many cases, go through a special array insert code path. Typically, these operations generate one redo record containing hundreds of rows to be inserted instead of hundreds of redo records each with a single row. This code path is used when the rows being inserted conform to certain restrictions, such as not requiring to be chained.

Before Database 12c Release 2, array inserts caused multiple recompressions per block: A batch of rows is inserted into a block and the block is compressed, the next batch is inserted and the block is recompressed and so on.

In Database 12c Release 2, we estimate the number of rows that would fit into a compressed block. All these rows are then buffered, compressed, and a full block image is generated. Note that compression occurs only once or twice (one compression to estimate compression ratio) per block, as opposed to occurring potentially many times as in pre-Database 12c Release 2. This gives us a big benefit in elapsed time. We also obtain a much better compression ratio because we compress many more rows together (hence being able to extract common symbols more effectively). The algorithms also adaptively vary the number of rows to be buffered and increase the number of rows buffered depending on running estimates of how many compressed rows would fit into a block.

Comparison: Oracle Database 11g Release 2 compared to Oracle Database 12c Release 2 (Oracle ERP table AP_AE_LINES_ALL)

Test case: Quick Multi Insert with Advanced Row Compression
Number of columns: 56
Number of rows: ~1 million

Results: (Oracle Database 12c Release 2 compared to Oracle Database 11g Release 2)

  •  2.7X redo size reduction
  •  5.2X faster data ingestion speed than uncompressed Quick Multi Insert
  •  3.5X faster IO (reduction in logical block reads)

With this enhancement, tables with Advanced Row Compression not only enable much faster scans, but also faster inserts than uncompressed tables, because of reduced logical and physical block gets.

My thanks to Jesus and Ajit for the great insights regarding Advanced Row Compression improvements with Oracle Database 12c Release 2. Please join me in April, as the database storage optimization adventure continues with my next blog, in which we will discuss rolling compression, with partitioning, in Oracle Database 11g.

 

Credit: oracle.com