Monday, February 17, 2014
Here's the schedule for this year's OUGN conference. There's lots of interesting stuff here; I'm looking forward to hearing Cary Millsap, Kellyn Pot'Vin and the ever-present Doug Burns. There will also be some great Norwegian presenters, like Henry Hopkinson from Norgesgruppen on Enterprise Manager. I could go on listing names, but I can't cover all five tracks here.
I'm also presenting again this year, with the title "Live demo: Multitenant with RAC and Data Guard". I'm co-presenting with my good friend and ex-colleague Håkon Onsager, and we're on right after Cary Millsap on Friday morning. We will look at Multitenant from an operations point of view, and at what it will look like in a Data Guard environment with or without RAC. We are also looking into how to get a flexible and stable platform for your future pluggable databases. It will be fun!
Wednesday, February 12, 2014
Exadata Smart Flash Table Caching
This feature is probably my favorite new feature in 11.2.3.3, because it allows Exadata to do smart scans of segments in flash. The examples below are run on a quarter rack X3-2 HC with storage sw version 12.1.1 and db version 11.2.0.4.
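If you're not sure which storage software version your cells are running, one quick way is to ask the cells themselves with CellCLI. A minimal check, assuming the standard cell_group file and root ssh equivalence to the cells (adjust to your environment):

[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "list cell attributes name, releaseVersion"

releaseVersion shows the active storage software image on each cell.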
Here I have a simple query joining one small table of 19MB via a hash join with a larger table of ~150GB, with some filtering and a few aggregations.
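The query goes against a view, v_demo, which just joins the two tables. The actual view definition isn't shown here, so this is only a hypothetical sketch of what it could look like, based on the tables and join key visible in the plan below:

create or replace view v_demo as
select b.extended_amt, b.bsns_unit_key, a.sku_item_name
from   dwr_sku_item a
join   r_dwb_rtl_sls_retrn_line_item b
  on   a.sku_item_key = b.sku_item_key;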
SQL> explain plan for select /*+ monitoring*/ sum(extended_amt), count(extended_amt) from v_demo where sku_item_name like 'Pampers%' and (bsns_unit_key > 2012057 and bsns_unit_key < 2021070)

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 518373028

---------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name                          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                               |     1 |    42 | 69534   (1)| 00:11:36 |       |       |        |      |            |
|   1 |  SORT AGGREGATE                    |                               |     1 |    42 |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR                   |                               |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)             | :TQ10001                      |     1 |    42 |            |          |       |       |  Q1,01 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE                 |                               |     1 |    42 |            |          |       |       |  Q1,01 | PCWP |            |
|*  5 |      HASH JOIN                     |                               |  4668 |   191K| 69534   (1)| 00:11:36 |       |       |  Q1,01 | PCWP |            |
|   6 |       PX RECEIVE                   |                               |     1 |    26 |     6   (0)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   7 |        PX SEND BROADCAST           | :TQ10000                      |     1 |    26 |     6   (0)| 00:00:01 |       |       |  Q1,00 | P->P | BROADCAST  |
|   8 |         PX BLOCK ITERATOR          |                               |     1 |    26 |     6   (0)| 00:00:01 |       |       |  Q1,00 | PCWC |            |
|*  9 |          TABLE ACCESS STORAGE FULL | DWR_SKU_ITEM                  |     1 |    26 |     6   (0)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
|  10 |       PX BLOCK ITERATOR            |                               |   207M|  3172M| 69515   (1)| 00:11:36 |     1 |   128 |  Q1,01 | PCWC |            |
|* 11 |        TABLE ACCESS STORAGE FULL   | R_DWB_RTL_SLS_RETRN_LINE_ITEM |   207M|  3172M| 69515   (1)| 00:11:36 |     1 |   128 |  Q1,01 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("A"."SKU_ITEM_KEY"="B"."SKU_ITEM_KEY")
   9 - storage("SKU_ITEM_NAME" LIKE 'Pampers%')
       filter("SKU_ITEM_NAME" LIKE 'Pampers%')
  11 - storage("BSNS_UNIT_KEY"<2021070 AND "BSNS_UNIT_KEY">2012057 AND
               SYS_OP_BLOOM_FILTER(:BF0000,"A"."SKU_ITEM_KEY"))
       filter("BSNS_UNIT_KEY"<2021070 AND "BSNS_UNIT_KEY">2012057 AND
              SYS_OP_BLOOM_FILTER(:BF0000,"A"."SKU_ITEM_KEY"))
First I'll show an execution from Exadata storage sw version 11.2.3.2.1
SQL> @snap "select /*+ monitoring*/ sum(extended_amt), count(extended_amt) from v_demo where sku_item_name like 'Pampers%' and (bsns_unit_key > 2012057 and bsns_unit_key < 2021070)"

SUM(EXTENDED_AMT) COUNT(EXTENDED_AMT)
----------------- -------------------
         47905.43               31471

Elapsed: 00:00:06.27

NAME
---------------------------------------------------------------------------------------------------------------------------------------------------------
-- ExaSnapper v0.7 BETA by Tanel Poder @ Enkitec - The Exadata Experts ( http://www.enkitec.com )
---------------------------------------------------------------------------------------------------------------------------------------------------------
DB_LAYER_IO           DB_PHYSIO_BYTES            |##################################################| 145469 MB  20708 MB/sec
DB_LAYER_IO           DB_PHYSRD_BYTES            |##################################################| 145469 MB  20708 MB/sec
DB_LAYER_IO           DB_PHYSWR_BYTES            |                                                  |      0 MB      0 MB/sec
AVOID_DISK_IO         PHYRD_FLASH_RD_BYTES       |                                                  |      0 MB      0 MB/sec
AVOID_DISK_IO         PHYRD_STORIDX_SAVED_BYTES  |#############################################     | 129787 MB  18476 MB/sec
REAL_DISK_IO          SPIN_DISK_IO_BYTES         |#####                                             |  15682 MB   2232 MB/sec
REAL_DISK_IO          SPIN_DISK_RD_BYTES         |#####                                             |  15682 MB   2232 MB/sec
REAL_DISK_IO          SPIN_DISK_WR_BYTES         |                                                  |      0 MB      0 MB/sec
REDUCE_INTERCONNECT   PRED_OFFLOADABLE_BYTES     |##################################################| 145469 MB  20708 MB/sec
REDUCE_INTERCONNECT   TOTAL_IC_BYTES             |                                                  |    577 MB     82 MB/sec
REDUCE_INTERCONNECT   SMART_SCAN_RET_BYTES       |                                                  |    577 MB     82 MB/sec
REDUCE_INTERCONNECT   NON_SMART_SCAN_BYTES       |                                                  |      0 MB      0 MB/sec
CELL_PROC_DEPTH       CELL_PROC_DATA_BYTES       |#####                                             |  15685 MB   2233 MB/sec
CELL_PROC_DEPTH       CELL_PROC_INDEX_BYTES      |                                                  |      0 MB      0 MB/sec
CLIENT_COMMUNICATION  NET_TO_CLIENT_BYTES        |                                                  |      0 MB      0 MB/sec
CLIENT_COMMUNICATION  NET_FROM_CLIENT_BYTES      |                                                  |      0 MB      0 MB/sec
Here I'm using Tanel Poder's ExaSnapper to show a detailed view of where the I/O is done. In total the database should read ~145GB; 129GB of that is saved by the storage indexes, and 15GB is read from the traditional hard disks in the Exadata. The last column with MB/s is pretty accurate for the HDDs here: we are reading 2232MB/s. A quarter rack can achieve higher throughput from disk; the reason it is not higher here is probably that the I/O is not completely sequential because of the storage indexes.
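If you don't have ExaSnapper at hand, you can get a rough version of the same picture from the standard smart scan statistics in v$sysstat (take deltas before and after the query rather than the absolute values returned by something like this):

select name, round(value/1024/1024) mb
from   v$sysstat
where  name in ('cell physical IO bytes eligible for predicate offload',
                'cell physical IO bytes saved by storage index',
                'cell physical IO interconnect bytes returned by smart scan',
                'physical read total bytes optimized');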
Next I'm running the same query on storage sw version 12.1.1.1
SQL> @snap "select /*+ monitoring*/ sum(extended_amt), count(extended_amt) from v_demo where sku_item_name like 'Pampers%' and (bsns_unit_key > 2012057 and bsns_unit_key < 2021070)";

SUM(EXTENDED_AMT) COUNT(EXTENDED_AMT)
----------------- -------------------
         47905.43               31471

Elapsed: 00:00:01.69

NAME
---------------------------------------------------------------------------------------------------------------------------------------------------------
-- ExaSnapper v0.7 BETA by Tanel Poder @ Enkitec - The Exadata Experts ( http://www.enkitec.com )
---------------------------------------------------------------------------------------------------------------------------------------------------------
DB_LAYER_IO           DB_PHYSIO_BYTES            |##################################################| 145469 MB  67812 MB/sec
DB_LAYER_IO           DB_PHYSRD_BYTES            |##################################################| 145469 MB  67812 MB/sec
DB_LAYER_IO           DB_PHYSWR_BYTES            |                                                  |      0 MB      0 MB/sec
AVOID_DISK_IO         PHYRD_FLASH_RD_BYTES       |#####                                             |  15212 MB   7091 MB/sec
AVOID_DISK_IO         PHYRD_STORIDX_SAVED_BYTES  |#############################################     | 129790 MB  60503 MB/sec
REAL_DISK_IO          SPIN_DISK_IO_BYTES         |                                                  |    467 MB    218 MB/sec
REAL_DISK_IO          SPIN_DISK_RD_BYTES         |                                                  |    467 MB    218 MB/sec
REAL_DISK_IO          SPIN_DISK_WR_BYTES         |                                                  |      0 MB      0 MB/sec
REDUCE_INTERCONNECT   PRED_OFFLOADABLE_BYTES     |##################################################| 145469 MB  67812 MB/sec
REDUCE_INTERCONNECT   TOTAL_IC_BYTES             |#                                                 |   1742 MB    812 MB/sec
REDUCE_INTERCONNECT   SMART_SCAN_RET_BYTES       |#                                                 |   1742 MB    812 MB/sec
REDUCE_INTERCONNECT   NON_SMART_SCAN_BYTES       |                                                  |      0 MB      0 MB/sec
CELL_PROC_DEPTH       CELL_PROC_DATA_BYTES       |#####                                             |  14523 MB   6770 MB/sec
CELL_PROC_DEPTH       CELL_PROC_INDEX_BYTES      |                                                  |      0 MB      0 MB/sec
CLIENT_COMMUNICATION  NET_TO_CLIENT_BYTES        |                                                  |      0 MB      0 MB/sec
CLIENT_COMMUNICATION  NET_FROM_CLIENT_BYTES      |                                                  |      0 MB      0 MB/sec
Elapsed time dropped from 6.27 seconds to 1.69 seconds, and now all the I/O is done from flash, as we can see from PHYRD_FLASH_RD_BYTES.
Earlier I had to use ALTER TABLE ... STORAGE (CELL_FLASH_CACHE KEEP) to get smart scans to run from flash; now it just works! I have no experience with this in a real production system yet, so if anyone has any real-world experience with it, please leave a comment!
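For reference, this is the kind of statement I needed before, to pin a segment in flash so that smart scans could be served from there (table name taken from the example above):

SQL> alter table r_dwb_rtl_sls_retrn_line_item storage (cell_flash_cache keep);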
Sunday, February 9, 2014
Enabling flash cache compression on Exadata X3-2
I plan to write a few posts about the new features in the storage software that was released in December 2013; first out is this one on enabling flash cache compression.
What you need to enable this feature
- An Oracle Exadata of generation X3 or X4
- Storage software 11.2.3.3 or 12.1.
- Licenses for the Advanced Compression Option
Flash cache compression allows the storage cells to compress and decompress data in the flash cards on the fly. The compression is done in the flash cards themselves, so it does not use CPU on the cell hosts, and it is completely transparent to the database.
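Before starting, it's worth confirming the hardware generation and the current compression setting on the cells. Something like this should do it (assuming the usual cell_group file and root ssh equivalence; makeModel is the cell attribute that identifies the hardware):

[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "list cell attributes name, makeModel, flashCacheCompress"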
This procedure is explained in the release notes for the storage cell software; for 11.2.3.3 that is MOS doc ID 1487339.1. I'm doing this with downtime, so all databases are down and the GI is down. See MOS if you want to do this rolling, with the databases and GI up.
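Stopping everything first is done in the usual way from one of the database nodes; a minimal sketch, where the database name and grid home path are just examples:

[oracle@dm01db01 ~]$ srvctl stop database -d mydb                          # repeat for each database
[root@dm01db01 ~]# /u01/app/11.2.0.4/grid/bin/crsctl stop cluster -all     # stop the GI stack on all db nodes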
To enable the compression we need to flush and drop the flash cache, drop the area reserved for smart flash logging, and then drop all flash celldisks. This is done as root on the first storage cell:
[root@dm01cel01 ~]# cellcli -e alter flashcache all flush
Flash cache dm01cel01_FLASHCACHE altered successfully

[root@dm01cel01 ~]# cellcli -e drop flashcache all
Flash cache dm01cel01_FLASHCACHE successfully dropped

[root@dm01cel01 ~]# cellcli -e drop flashlog all
Flash log dm01cel01_FLASHLOG successfully dropped

[root@dm01cel01 ~]# cellcli -e drop celldisk all flashdisk
CellDisk FD_00_dm01cel01 successfully dropped
CellDisk FD_01_dm01cel01 successfully dropped
...
Because this is an X3-2 system, we must first enable X3 support for compression and then enable the compression itself:
[root@dm01cel01 ~]# cellcli -e alter cell "flashCacheCompX3Support=TRUE"
Cell dm01cel01 successfully altered

[root@dm01cel01 ~]# cellcli -e "alter cell flashCacheCompress=TRUE"
Cell dm01cel01 successfully altered
Then we recreate the celldisks, the cache for smart flash logging, and finally the flash cache itself:
[root@dm01cel01 ~]# cellcli -e "create celldisk all flashdisk" CellDisk FD_00_dm01cel01 successfully created CellDisk FD_01_dm01cel01 successfully created ... [root@dm01cel01 ~]# cellcli -e "create flashlog all size=512M" Flash log dm01cel01_FLASHLOG successfully created [root@dm01cel01 ~]# cellcli -e create flashcache all Flash cache dm01cel01_FLASHCACHE successfully created
This needs to be repeated on all storage cells. That's it! On a quarter rack this was done within half an hour.
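Since everything is down anyway, the same steps can also be pushed out to all cells at once from a database node with dcli, instead of logging on to each cell in turn. A sketch of how that could look, using the same commands as above:

[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "alter flashcache all flush"
[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "drop flashcache all"
[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "drop flashlog all"
[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "drop celldisk all flashdisk"
[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "alter cell flashCacheCompX3Support=TRUE"
[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "alter cell flashCacheCompress=TRUE"
[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "create celldisk all flashdisk"
[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "create flashlog all size=512M"
[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "create flashcache all"

Afterwards you can verify the setting across all cells: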
[root@dm01db01 ~]# dcli -g ~/cell_group -l root cellcli -e "list cell detail"|grep flash
dm01cel01: flashCacheMode:        WriteBack
dm01cel01: flashCacheCompress:    TRUE
dm01cel02: flashCacheMode:        writeback
dm01cel02: flashCacheCompress:    TRUE
dm01cel03: flashCacheMode:        writeback
dm01cel03: flashCacheCompress:    TRUE

Update:
@db_magnus Works fabulous on the X4-2 too. Did you notice that the flash cache doubled in size after enabling compression?
— Carol Dacko (@cdacko) February 10, 2014
Doubling of cache size? Sure thing! I didn't check this before the change, so I had to double-check the X3 datasheet, but it should have 1.6TB of flash per storage cell:
[root@dm01cel01 ~]# cellcli -e list flashcache detail|grep -i Size
         effectiveCacheSize:     2978.75G
         size:                   2978.75G
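To compare that with the raw flash actually installed in the cell, you can list the flash cards themselves; something like this should show the physical sizes (syntax from memory, so double-check against the CellCLI documentation):

[root@dm01cel01 ~]# cellcli -e "list physicaldisk where diskType=FlashDisk attributes name, physicalSize"

The physical flash is still 1.6TB per cell; the ~3TB above is the effective cache size after compression.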
My Oracle database notes...
Here's my blog, focusing on Oracle databases. There will probably be some pure database stuff, some Exadata, and other things I stumble across relating to databases. Just having to structure what I do well enough to post it on a blog will be good for me... I have a habit of solving the same problem several times because I haven't documented how I did it. Maybe blogging will help?
Who am I? I'm Magnus, a former DBA, now working pre-sales at Oracle on Engineered Systems. Everything I write on this blog is my own, and my views do not necessarily reflect those of Oracle.
I'd love to hear from YOU if you find anything I write useful. You can reach me on Twitter @db_magnus, in the comments below, or through my Google+ icon on the right.