This is probably my favorite new feature in Exadata storage software 11.2.3.3, because it allows Exadata to do smart scans of segments straight from flash. The examples below were run on a quarter rack X3-2 High Capacity with database version 11.2.0.4, comparing storage software versions 11.2.3.2.1 and 12.1.1.1.
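As a quick aside: you can check which storage software version the cells are running from the database side via V$CELL_CONFIG, which exposes the cell attributes as XML in the confval column. A minimal sketch; the XML path below is what I'd expect for conftype 'CELL' and the releaseVersion attribute, so verify it against your own output:

select cellname,
       extractvalue(xmltype(confval), '/cli-output/cell/releaseVersion') as release_version
from   v$cell_config
where  conftype = 'CELL';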
Here I have a simple query that hash joins a small table of 19MB to a larger table of ~150GB, with some filtering and a couple of aggregates.
SQL> explain plan for select /*+ monitoring*/ sum(extended_amt), count(extended_amt) from v_demo
where sku_item_name like 'Pampers%' and (bsns_unit_key > 2012057 and bsns_unit_key < 2021070);
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 518373028
---------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 42 | 69534 (1)| 00:11:36 | | | | | |
| 1 | SORT AGGREGATE | | 1 | 42 | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10001 | 1 | 42 | | | | | Q1,01 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 42 | | | | | Q1,01 | PCWP | |
|* 5 | HASH JOIN | | 4668 | 191K| 69534 (1)| 00:11:36 | | | Q1,01 | PCWP | |
| 6 | PX RECEIVE | | 1 | 26 | 6 (0)| 00:00:01 | | | Q1,01 | PCWP | |
| 7 | PX SEND BROADCAST | :TQ10000 | 1 | 26 | 6 (0)| 00:00:01 | | | Q1,00 | P->P | BROADCAST |
| 8 | PX BLOCK ITERATOR | | 1 | 26 | 6 (0)| 00:00:01 | | | Q1,00 | PCWC | |
|* 9 | TABLE ACCESS STORAGE FULL| DWR_SKU_ITEM | 1 | 26 | 6 (0)| 00:00:01 | | | Q1,00 | PCWP | |
| 10 | PX BLOCK ITERATOR | | 207M| 3172M| 69515 (1)| 00:11:36 | 1 | 128 | Q1,01 | PCWC | |
|* 11 | TABLE ACCESS STORAGE FULL | R_DWB_RTL_SLS_RETRN_LINE_ITEM | 207M| 3172M| 69515 (1)| 00:11:36 | 1 | 128 | Q1,01 | PCWP | |
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("A"."SKU_ITEM_KEY"="B"."SKU_ITEM_KEY")
9 - storage("SKU_ITEM_NAME" LIKE 'Pampers%')
filter("SKU_ITEM_NAME" LIKE 'Pampers%')
11 - storage("BSNS_UNIT_KEY"<2021070 AND "BSNS_UNIT_KEY">2012057 AND SYS_OP_BLOOM_FILTER(:BF0000,"A"."SKU_ITEM_KEY"))
filter("BSNS_UNIT_KEY"<2021070 AND "BSNS_UNIT_KEY">2012057 AND SYS_OP_BLOOM_FILTER(:BF0000,"A"."SKU_ITEM_KEY"))
First, I'll show an execution on Exadata storage software version 11.2.3.2.1:
SQL> @snap "select /*+ monitoring*/ sum(extended_amt), count(extended_amt) from v_demo
where sku_item_name like 'Pampers%' and (bsns_unit_key > 2012057 and bsns_unit_key < 2021070)"
SUM(EXTENDED_AMT) COUNT(EXTENDED_AMT)
----------------- -------------------
47905.43 31471
Elapsed: 00:00:06.27
NAME
---------------------------------------------------------------------------------------------------------------------------------------------------------
-- ExaSnapper v0.7 BETA by Tanel Poder @ Enkitec - The Exadata Experts ( http://www.enkitec.com )
---------------------------------------------------------------------------------------------------------------------------------------------------------
DB_LAYER_IO DB_PHYSIO_BYTES |##################################################| 145469 MB 20708 MB/sec
DB_LAYER_IO DB_PHYSRD_BYTES |##################################################| 145469 MB 20708 MB/sec
DB_LAYER_IO DB_PHYSWR_BYTES | | 0 MB 0 MB/sec
AVOID_DISK_IO PHYRD_FLASH_RD_BYTES | | 0 MB 0 MB/sec
AVOID_DISK_IO PHYRD_STORIDX_SAVED_BYTES |############################################# | 129787 MB 18476 MB/sec
REAL_DISK_IO SPIN_DISK_IO_BYTES |##### | 15682 MB 2232 MB/sec
REAL_DISK_IO SPIN_DISK_RD_BYTES |##### | 15682 MB 2232 MB/sec
REAL_DISK_IO SPIN_DISK_WR_BYTES | | 0 MB 0 MB/sec
REDUCE_INTERCONNECT PRED_OFFLOADABLE_BYTES |##################################################| 145469 MB 20708 MB/sec
REDUCE_INTERCONNECT TOTAL_IC_BYTES | | 577 MB 82 MB/sec
REDUCE_INTERCONNECT SMART_SCAN_RET_BYTES | | 577 MB 82 MB/sec
REDUCE_INTERCONNECT NON_SMART_SCAN_BYTES | | 0 MB 0 MB/sec
CELL_PROC_DEPTH CELL_PROC_DATA_BYTES |##### | 15685 MB 2233 MB/sec
CELL_PROC_DEPTH CELL_PROC_INDEX_BYTES | | 0 MB 0 MB/sec
CLIENT_COMMUNICATION NET_TO_CLIENT_BYTES | | 0 MB 0 MB/sec
CLIENT_COMMUNICATION NET_FROM_CLIENT_BYTES | | 0 MB 0 MB/sec
Here I'm using Tanel Poder's ExaSnapper to show a detailed view of where the I/O is done. In total the database would have to read ~145GB: ~129GB of that is saved by the storage indexes, and ~15GB is read from the traditional spinning disks in the Exadata. The last column with MB/sec is fairly accurate for the HDDs here; we are reading at 2232 MB/sec. A quarter rack can achieve higher throughput from disk; the reason it is not higher here is probably that the I/O is not completely sequential, due to the storage indexes.
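If you want to test that theory, storage indexes can be switched off for a session with a hidden parameter. A sketch; note this is unsupported and only for experiments on a test system:

-- Unsupported hidden parameter: experiment on a test system only.
alter session set "_kcfis_storageidx_disabled" = true;
-- re-run the query and compare PHYRD_STORIDX_SAVED_BYTES and the disk MB/sec
alter session set "_kcfis_storageidx_disabled" = false;

With storage indexes disabled the scan has to read the full ~145GB from disk, but the I/O pattern should be closer to purely sequential.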
Next, I'm running the same query on storage software version 12.1.1.1:
SQL> @snap "select /*+ monitoring*/ sum(extended_amt), count(extended_amt) from v_demo
where sku_item_name like 'Pampers%' and (bsns_unit_key > 2012057 and bsns_unit_key < 2021070)";
SUM(EXTENDED_AMT) COUNT(EXTENDED_AMT)
----------------- -------------------
47905.43 31471
Elapsed: 00:00:01.69
NAME
---------------------------------------------------------------------------------------------------------------------------------------------------------
-- ExaSnapper v0.7 BETA by Tanel Poder @ Enkitec - The Exadata Experts ( http://www.enkitec.com )
---------------------------------------------------------------------------------------------------------------------------------------------------------
DB_LAYER_IO DB_PHYSIO_BYTES |##################################################| 145469 MB 67812 MB/sec
DB_LAYER_IO DB_PHYSRD_BYTES |##################################################| 145469 MB 67812 MB/sec
DB_LAYER_IO DB_PHYSWR_BYTES | | 0 MB 0 MB/sec
AVOID_DISK_IO PHYRD_FLASH_RD_BYTES |##### | 15212 MB 7091 MB/sec
AVOID_DISK_IO PHYRD_STORIDX_SAVED_BYTES |############################################# | 129790 MB 60503 MB/sec
REAL_DISK_IO SPIN_DISK_IO_BYTES | | 467 MB 218 MB/sec
REAL_DISK_IO SPIN_DISK_RD_BYTES | | 467 MB 218 MB/sec
REAL_DISK_IO SPIN_DISK_WR_BYTES | | 0 MB 0 MB/sec
REDUCE_INTERCONNECT PRED_OFFLOADABLE_BYTES |##################################################| 145469 MB 67812 MB/sec
REDUCE_INTERCONNECT TOTAL_IC_BYTES |# | 1742 MB 812 MB/sec
REDUCE_INTERCONNECT SMART_SCAN_RET_BYTES |# | 1742 MB 812 MB/sec
REDUCE_INTERCONNECT NON_SMART_SCAN_BYTES | | 0 MB 0 MB/sec
CELL_PROC_DEPTH CELL_PROC_DATA_BYTES |##### | 14523 MB 6770 MB/sec
CELL_PROC_DEPTH CELL_PROC_INDEX_BYTES | | 0 MB 0 MB/sec
CLIENT_COMMUNICATION NET_TO_CLIENT_BYTES | | 0 MB 0 MB/sec
CLIENT_COMMUNICATION NET_FROM_CLIENT_BYTES | | 0 MB 0 MB/sec
Elapsed time dropped from 6.27 seconds to 1.69 seconds, and now virtually all the physical I/O is served from flash, as we can see from PHYRD_FLASH_RD_BYTES (15212 MB from flash versus only 467 MB from spinning disk).
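You can also see this without ExaSnapper in the database statistics. A minimal check, assuming an otherwise quiet system (the counters are cumulative and system-wide, so take a snapshot before and after the query and compare the deltas):

select name, value
from   v$sysstat
where  name in ('cell flash cache read hits',
                'physical read total bytes optimized');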
Earlier I had to use ALTER TABLE ... STORAGE (CELL_FLASH_CACHE KEEP) to get smart scans to run from flash; now it just works! I have no experience with this feature in a real production system yet, so if anyone has real-world experience with it, please leave a comment!
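For reference, the old workaround looked like this (using the fact table from the plan above as an example):

-- Pre-11.2.3.3: pin the segment in the flash cache so smart scans could read it from flash
alter table r_dwb_rtl_sls_retrn_line_item storage (cell_flash_cache keep);
-- and to revert to the default caching policy:
alter table r_dwb_rtl_sls_retrn_line_item storage (cell_flash_cache default);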