A close encounter with real world 
performance issues 
Riyaj Shamsudeen 
©OraInternals Riyaj Shamsudeen
Who am I? 
 17 years using Oracle products/DBA 
 OakTable member 
 Certified DBA versions 7.0,7.3,8,8i 9i 
 Specializes in RAC, performance tuning, 
Internals and E-business suite 
 Chief DBA with OraInternals 
 Co-author of “Expert Oracle Practices” ‘2010 
 Email: rshamsud at 
 Blog :
These slides and materials represent the work and opinions of the author and do not 
constitute official positions of my current or past employer or any other organization. 
This material has been peer reviewed, but the author assumes no responsibility whatsoever for 
the test cases. 
If you corrupt your databases by running my scripts, you are solely responsible for that. 
This material should not be reproduced or used without the author's written 
 Issue: High Kernel mode CPU usage 
 Issue: Hung RAC cluster 
 Issue: High Kernel mode CPU usage – 2 
 Issue: ASMM (Skipped due to time constraints)
Architecture overview 
[Diagram: application instance 1 and further application instances, each on its own server, all connected to the central database] 
Note, only partial architecture shown due to client confidentiality.
[Diagram: application instances 1, 2, and 3, each connected to the central database] 
There are many application instances each servicing a disjoint 
group of users. As per the design, any application instance can be 
shutdown with no user impact and traffic will be automatically 
rerouted to surviving instances (almost like RAC). 
Unfortunately, shutdown of any application instance shuts 
down every application instance, leading to a site-wide outage. 
Database version : 
Operating system: Solaris 10 
 After an application restart database link connections 
are released. 
This terminates connections in the central database. 
 CPU usage spikes up to 90% 
Select c1, c2 from t1@central; 
 Very high CPU usage in the kernel mode in the central database. 
 ASM instance in that central database server times out and 
 Why is there such high CPU usage?
Who is using my CPU? 
 In Solaris (and on many UNIX platforms), the mpstat utility can be 
used to view per-processor statistics. 
 The mpstat output shows that almost all the CPUs are busy in kernel mode. 
Tue Sep 9 17:46:34 2008 CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl 
Tue Sep 9 17:46:34 2008 0 561 0 9 554 237 651 219 87 491 0 4349 9 91 0 0 
Tue Sep 9 17:46:34 2008 1 1197 1 34 911 0 2412 591 353 630 0 15210 30 63 0 7 
Tue Sep 9 17:46:34 2008 2 58 0 9 313 0 613 106 190 105 0 3562 8 90 0 2 
Tue Sep 9 17:46:34 2008 3 161 0 26 255 0 492 92 161 530 0 2914 6 92 0 2 
Tue Sep 9 17:46:34 2008 4 0 0 0 86 1 2 3 1 63 0 8 0 100 0 0 
Tue Sep 9 17:46:34 2008 5 283 0 34 662 0 1269 153 326 211 0 6753 13 77 0 10 
Tue Sep 9 17:46:34 2008 6 434 0 43 349 0 589 54 170 1534 0 3002 7 88 0 5 
Tue Sep 9 17:46:34 2008 12 30 0 0 195 0 279 110 31 80 0 1590 3 97 0 0 
Tue Sep 9 17:46:34 2008 13 288 0 9 449 0 844 117 158 127 0 4486 7 85 0 8 
Tue Sep 9 17:46:34 2008 14 155 0 0 430 0 744 102 160 83 0 3875 7 80 0 13 
Tue Sep 9 17:46:34 2008 15 16 0 0 237 0 359 115 31 124 0 2074 3 91 0 6 
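The reading that almost every CPU is busy in kernel mode can be checked with a number. The following is a minimal Python sketch, assuming the column order of the mpstat header shown above; the helper name `sys_by_cpu` and the variable names are illustrative and not part of mpstat.

```python
# Per-CPU rows from the mpstat sample above (timestamp prefix removed).
# Assumed column order, taken from the mpstat header line:
# CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
MPSTAT_ROWS = """\
0 561 0 9 554 237 651 219 87 491 0 4349 9 91 0 0
1 1197 1 34 911 0 2412 591 353 630 0 15210 30 63 0 7
2 58 0 9 313 0 613 106 190 105 0 3562 8 90 0 2
3 161 0 26 255 0 492 92 161 530 0 2914 6 92 0 2
4 0 0 0 86 1 2 3 1 63 0 8 0 100 0 0
5 283 0 34 662 0 1269 153 326 211 0 6753 13 77 0 10
6 434 0 43 349 0 589 54 170 1534 0 3002 7 88 0 5
12 30 0 0 195 0 279 110 31 80 0 1590 3 97 0 0
13 288 0 9 449 0 844 117 158 127 0 4486 7 85 0 8
14 155 0 0 430 0 744 102 160 83 0 3875 7 80 0 13
15 16 0 0 237 0 359 115 31 124 0 2074 3 91 0 6
"""

SYS_COLUMN = 13  # zero-based position of "sys" in the header above


def sys_by_cpu(text):
    """Map CPU id -> %sys for every complete 16-field row."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 16:
            result[int(fields[0])] = int(fields[SYS_COLUMN])
    return result


sys_pct = sys_by_cpu(MPSTAT_ROWS)
average_sys = sum(sys_pct.values()) / len(sys_pct)
busiest_cpu = max(sys_pct, key=sys_pct.get)

print(f"CPUs sampled: {len(sys_pct)}")
print(f"average %sys: {average_sys:.1f}")
print(f"highest %sys: CPU {busiest_cpu} at {sys_pct[busiest_cpu]}%")
```

Only the CPUs that appear in the sample are counted, so the average describes this snapshot rather than the whole server.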
Out of about 633 seconds of DB time (10.55 minutes), 
455 seconds were spent on latch free waits. 
Almost all latch waits are 
for enqueues latches. 
Statspack report 
            Snap Id      Snap Time      Sessions Curs/Sess 
          --------- ------------------- -------- --------- 
Begin Snap:    3131 09-Sep-08 17:46:17     5,030       1.9 
  End Snap:    3132 09-Sep-08 17:47:16     4,995       2.0 
   Elapsed:        0.98 (mins) 
   DB Time:       10.55 (mins) 
Event Waits Time (s) (ms) Time Wait Class 
------------------------------ ------------ ----------- ------ ------ ---------- 
latch free 868 455 525 71.9 Other 
latch: row cache objects 103 189 1833 29.8 Concurrenc 
log file sync 885 92 103 14.5 Commit 
CPU time 85 13.4 
db file parallel write 3,868 10 3 1.6 System I/O 
Latch contention is for enqueues latches: 
Pct Avg Wait Pct 
Get Get Slps Time NoWait NoWait 
Latch Name Requests Miss /Miss (s) Requests Miss 
------------------------ -------------- ------ ------ ------ ------------ ------ 
enqueues 1,355,722 3.3 0.0 452 0 N/A
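The percentages in the report can be re-derived from the raw figures. This is a small Python sketch using only numbers printed in the Statspack excerpt above; the variable names are illustrative.

```python
# Figures copied from the Statspack excerpt above.
elapsed_min = 0.98          # Elapsed: 0.98 (mins)
db_time_min = 10.55         # DB Time: 10.55 (mins)
latch_free_s = 455          # "latch free" wait time in seconds
enqueues_latch_wait_s = 452  # wait time on the "enqueues" latch

db_time_s = db_time_min * 60

# Share of DB time spent on latch free waits.
latch_free_pct = 100 * latch_free_s / db_time_s

# Share of the latch free time attributable to the enqueues latch.
enqueues_pct_of_latch = 100 * enqueues_latch_wait_s / latch_free_s

# Average number of sessions active in the database during the snapshot.
avg_active_sessions = db_time_min / elapsed_min

print(f"latch free share of DB time: {latch_free_pct:.1f}%")
print(f"enqueues latch share of latch free: {enqueues_pct_of_latch:.1f}%")
print(f"average active sessions: {avg_active_sessions:.1f}")
```

The first figure reproduces the 71.9 shown in the report, which suggests that column is a percentage of DB time rather than of elapsed time.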
Chicken or Egg? 
[Diagram: latch contention and high CPU usage, each a possible cause of the other] 
 But the usual symptom of latch contention is high CPU 
usage in user mode, not kernel mode. 
 In this case, we will ignore latch contention for now.
Let’s reproduce the issue 
 To reproduce kernel mode CPU issue, we will create a database 
link from an application schema, connecting to a test schema in 
the central database. 
 Then, we will execute a select over the database link creating a 
new connection in the central database: 
select * from dual@central; 
Truss – trace system calls 
 Identified the connection from test schema @ client database to 
the test schema @ central database. 
select sid, serial#, logon_time, last_call_et from v$session where 
logon_time > sysdate-(1/24)*(1/60); 
       SID    SERIAL# LOGON_TIME           LAST_CALL_ET 
---------- ---------- -------------------- ------------ 
      1371      35028 12-SEP-2008 20:47:30            0 
      4306      51273 12-SEP-2008 20:47:29            1 
 Starting truss on that process in central database 
truss -p pid -d -o /tmp/truss.log 
Better yet: truss -d -D -E -p pid -o /tmp/truss.log 
 Logged off from content database and this should trigger a log-off 
from remote central database. 
Truss output 
 Reading truss output in central database connection, we can see 
ten shmdt calls are consuming time. 
18.4630 close(10) = 0 
18.4807 shmdt(0x380000000) = 0 
18.5053 shmdt(0x440000000) = 0 
18.5295 shmdt(0x640000000) = 0 
18.5541 shmdt(0x840000000) = 0 
18.5784 shmdt(0xA40000000) = 0 
18.6026 shmdt(0xC40000000) = 0 
18.6273 shmdt(0xE40000000) = 0 
18.6512 shmdt(0x1040000000) = 0 
18.6752 shmdt(0x1240000000) = 0 
18.6753 shmdt(0x1440000000) = 0 
 Each shmdt call consumed approximately 0.024 seconds or 
24ms. 18.5295-18.5053=0.0242
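The per-call cost can be read off for every shmdt call rather than a single pair. Here is a minimal Python sketch, assuming the first field of each truss line is the relative timestamp in seconds (as produced by truss -d); the function name `shmdt_gaps` is illustrative.

```python
import statistics

# truss output lines copied from the slide above.
TRUSS_LINES = """\
18.4630 close(10) = 0
18.4807 shmdt(0x380000000) = 0
18.5053 shmdt(0x440000000) = 0
18.5295 shmdt(0x640000000) = 0
18.5541 shmdt(0x840000000) = 0
18.5784 shmdt(0xA40000000) = 0
18.6026 shmdt(0xC40000000) = 0
18.6273 shmdt(0xE40000000) = 0
18.6512 shmdt(0x1040000000) = 0
18.6752 shmdt(0x1240000000) = 0
18.6753 shmdt(0x1440000000) = 0
"""


def shmdt_gaps(text):
    """Return the time between consecutive shmdt lines, in seconds."""
    stamps = [
        float(line.split()[0])
        for line in text.splitlines()
        if "shmdt(" in line
    ]
    return [round(later - earlier, 4) for earlier, later in zip(stamps, stamps[1:])]


gaps = shmdt_gaps(TRUSS_LINES)
typical_gap = statistics.median(gaps)

print("gaps between shmdt calls:", gaps)
print(f"typical gap: {typical_gap * 1000:.1f} ms")
```

The median is used instead of the mean because the last call returns almost immediately and would pull an average down.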
Call: shmdt 
shmdt calls are used to detach from shared memory segments. 
Every database disconnect must detach from shared memory segments. 
Shmdt calls 
 There are 10 shared memory segments for this SGA. So, there 
are 10 shmdt calls. 
ipcs -ma|grep 14382 
m 1241514024 0x97e45100 --rw-r----- oracle orainvtr … 
m 1241514023 0 --rw-r----- oracle orainvtr … 
m 1241514022 0 --rw-r----- oracle orainvtr … 
m 1241514016 0 --rw-r----- oracle orainvtr … 
m 889192479 0 --rw-r----- oracle orainvtr … 
Fun with numbers 
 Each shmdt call consumes 0.024 seconds 
 For one session ( 10 calls) = 0.24 seconds 
 For 300 connections = 300 * 0.24 = 72 seconds. 
 In the best case, with 12 concurrent processes, this would last for 6 
seconds or so. 
 This matches our observation of 6 seconds of high kernel 
mode CPU usage. 
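The arithmetic above can be written out so that the inputs are easy to vary. This is a back-of-the-envelope Python sketch using the figures from the slide; the function name `logoff_storm_seconds` is illustrative, and the model assumes the work is spread evenly over the concurrent processes.

```python
def logoff_storm_seconds(shmdt_seconds, segments, connections, concurrency):
    """Estimate how long a burst of disconnects keeps the CPUs in kernel mode.

    Assumes every disconnect issues one shmdt call per shared memory
    segment and that the work is spread evenly over `concurrency` CPUs.
    """
    per_session = shmdt_seconds * segments
    total_cpu_seconds = per_session * connections
    return total_cpu_seconds / concurrency


# Figures from the slide: 24 ms per call, 10 segments, 300 connections,
# 12 concurrent processes.
observed = logoff_storm_seconds(0.024, 10, 300, 12)

# Same load if the SGA were a single shared memory segment.
single_segment = logoff_storm_seconds(0.024, 1, 300, 12)

print(f"estimated duration with 10 segments: {observed:.1f} s")
print(f"estimated duration with 1 segment:   {single_segment:.1f} s")
```

The second call shows why reducing the number of segments is attractive: under the same assumptions, the estimate drops by the same factor as the segment count.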
Reducing shmdt calls 
 To reduce shmdt calls, we need to reduce the number of shared memory segments. 
 Database engine tries to create biggest segment possible at initial 
startup and slowly reduces segment size, until segments can be 
created successfully. 
 SHMMAX kernel parameter was set lower, and so, we decided to 
increase that parameter first and reboot the server. 
Well, that was embarrassing.. 
 After increasing SHMMAX size we expected to have just one 
shared memory segment. 
 This will reduce the impact 10 times. 
 Surprise! There were still 10 shared memory segments. 
After SHMMAX change 
 Started to truss on database startup to see why the instance is 
creating multiple shared memory segments. 
17252: 4.5957 munmap(0xFFFFFD7FFDAE0000, 32768) = 0 
17252: 4.5958 lgrp_version(1, ) = 1 
17252: 4.5958 _lgrpsys(1, 0, ) = 42 
17252: 4.5958 _lgrpsys(3, 0x00000000, 0x00000000) = 19108 
17252: 4.5959 _lgrpsys(3, 0x00004AA4, 0x06399D60) = 19108 
17252: 4.5959 _lgrpsys(1, 0, ) = 42 
17252: 4.5960 pset_bind(PS_QUERY, P_LWPID, 4294967295, 0xFFFFFD7FFFDFB11C) = 0 
17252: 4.5960 pset_info(PS_MYID, 0x00000000, 0xFFFFFD7FFFDFB0D4, 0x00000000) = 0 
17252: 4.5961 pset_info(PS_MYID, 0x00000000, 0xFFFFFD7FFFDFB0D4, 0x061AA2B0) = 0
pset_bind and _lgrpsys 
 The _lgrpsys and pset_bind calls were new, and googling these 
function calls showed that they may be related to Non-Uniform 
Memory Access (NUMA) architecture. 
NUMA architecture (overview) 
[Diagram: four memory boards (Memory #1 to Memory #4) and eight CPUs (CPU 0 to CPU 7), with each memory board local to a pair of CPUs] 
 For cpu0 and cpu1, memory board 1 is local. Other 
memory areas are remote for cpu0 and cpu1. 
 Access to local memory is faster compared to 
remote memory access.
NUMA architecture (Overview ) 
[Diagram: the same four memory boards and eight CPUs, with one shared memory segment (Shm 1 to Shm 4) and one DBWR process (dbwr 0 to dbwr 3) per memory board] 
 To make use of NUMA technology, Oracle spreads 
SGA across NUMA nodes.
NUMA optimization 
 Binds DBWR to a CPU set. That DBWR handles all writes from 
that shared memory segment. 
 User processes also try to use free buffers from the working set 
of buffers in the NUMA node that the process is running on. 
(Update: This turned out to be an Oracle database bug 5173642.) 
 LGWR also seems to have some code optimization to better use 
NUMA technology, but my test cases are not conclusive enough.
Locality groups 
 In Solaris, NUMA technology is implemented as locality groups. 
 _lgrpsys and pset_bind calls are to get current locality group 
information and bind processes to a processor set. 
 Now, we can understand why the SGA was split into multiple segments. 
 But do we really have that many locality groups in this server?
Locality groups 
 Lgrpinfo tool can provide NUMA node details 
lgroup 0 (root): 
Children: 10 12 14 15 17 19 21 23 
CPUs: 0-15 
Memory: installed 65024 Mb, allocated 2548 Mb, free 62476 Mb 
Lgroup resources: 1-8 (CPU); 1-8 (memory) 
Latency: 146 
lgroup 1 (leaf): 
Children: none, Parent: 9 
CPUs: 0 1 
Memory: installed 7680 Mb, allocated 1964 Mb, free 5716 Mb 
Lgroup resources: 1 (CPU); 1 (memory) 
Load: 0.105 
Latency: 51 
There were many locality groups 
defined and seven of them were 
leaf node locality groups in this server.
Local access to memory from CPU 1 to memory in the same 
NUMA node as the CPU has a reference number of 51. 
Remote access to memory in a remote node has a reference 
number of 146. 
| 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 
0 | 146 146 113 113 113 113 113 113 146 146 146 146 146 146 146 146 113 146 113 146 146 146 146 146 146 
1 | 146 51 81 81 113 113 113 113 146 81 113 113 146 113 146 146 113 146 113 146 146 146 146 146 146 
2 | 113 81 51 113 81 113 113 81 113 113 113 81 113 113 113 113 113 113 113 113 113 113 113 113 113 
3 | 113 81 113 51 113 81 81 113 113 113 113 113 113 81 113 113 113 113 113 113 113 113 113 113 113 
4 | 113 113 81 113 51 81 81 113 113 113 113 113 113 113 113 113 81 113 113 113 113 113 113 113 113 
15 | 146 146 113 113 113 113 113 113 146 146 146 146 146 146 146 113 113 146 113 146 146 146 146 146 146 
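The latency values are easier to compare as ratios against local access. A minimal Python sketch follows, assuming the values 51, 81, 113 and 146 from the lgrpinfo output above are relative figures on one scale; the labels `local`, `near`, `far` and `root` are illustrative names, not lgrpinfo terminology.

```python
# Latency values that appear in the lgrpinfo output above.
# The dictionary keys are illustrative labels only.
LATENCIES = {"local": 51, "near": 81, "far": 113, "root": 146}

# Cost of each kind of access relative to local memory access.
relative_cost = {
    label: value / LATENCIES["local"] for label, value in LATENCIES.items()
}

for label, ratio in relative_cost.items():
    print(f"{label:5s} latency {LATENCIES[label]:3d} -> {ratio:.2f}x local")
```

On these numbers the most distant access costs a little under three times a local one, which is the gap the NUMA-aware SGA layout is trying to avoid.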
Is NUMA bad? 
 Indeed, 10 shared memory segments were created, roughly one per 
locality group. 
 We disabled NUMA to resolve this problem temporarily. 
 NUMA is a great technology. Sequent Dynix/ptx has 
implemented NUMA technology successfully a decade ago. 
 It’s just that we are encountering an unfortunate side effect of 
 We can disable NUMA or reduce number of NUMA nodes. 
*._db_block_numa = 1 
 Use the patch for bug 819953 to disable NUMA instead of underscore 
parameters (only if NUMA needs to be disabled). 
 Notes 399261.1 and 759565.1 describe these issues. 
 It looks like, there is one shared memory segment per locality 
group and one segment encompassing all locality groups. One 
small bootstrap segment is also created.
Solution – contd. 
 Another option is to control logout rate. 
 Jonathan Lewis mentioned these parameters to control logout 
storm rate later. 
Parameter                Meaning                                             Value 
_logout_storm_retrycnt   maximum retry count for logouts                     600 
(name not shown)         timeout in centi-seconds for wait between retries   5 
(name not shown)         number of processes that can logout in a            0 
 Issue: High Kernel mode CPU usage 
 Issue: Hung RAC cluster 
 Issue: High Kernel mode CPU usage - 2 
 Issue: ASMM ( Skipped due to time constraints )
 All RAC instances are stuck for 10-15 minutes intermittently. 
Application is not responsive during that time period. 
 This happens randomly and no specific correlation with 
time of day.
AWR analysis 
AWR analysis indicates gc buffer busy waits; many 
sessions were waiting for GC events. 
Top User Events 
Event                     Event Class   % Event   Avg Active Sessions 
gc buffer busy acquire    Cluster         46.61    6.42 
CPU + Wait for CPU        CPU             21.91    3.02 
gc cr block busy          Cluster          9.14    1.26 
enq: CF - contention      Other            4.44    0.61 
gc current block busy     Cluster          2.50    0.35
ASH analysis 
 ASH report confirms the issue too. 
Top User Events 
Event                       Event Class   % Event   Avg Active Sessions 
gc buffer busy acquire      Cluster         43.68   11.48 
gc cr block busy            Cluster         10.92    2.87 
CPU + Wait for CPU          CPU              9.84    2.59 
row cache lock              Concurrency      5.12    1.35 
gc cr multi block request   Cluster          4.90    1.29
Why gc buffer busy? 
 GC buffer busy waits indicate that the buffer is busy, waiting 
for some sort of global event. 
 Another session is working on that buffer, and that 
session is waiting for a global cache event. 
 We need to understand why that second session is waiting 
for a global cache event. 
Another node… 
 In RAC, it is essential to review performance statistics 
from all nodes. 
 Poor performance on one node can bring down the performance of the 
entire cluster or even lead to a hung cluster. 
Top User Events 
Event                 Event Class     % Event   Avg Active Sessions 
buffer busy waits     Concurrency       30.92   12.00 
log file sync         Commit            23.61    9.16 
log buffer space      Configuration     16.88    6.55 
CPU + Wait for CPU    CPU                7.48    2.90 
row cache lock        Concurrency        3.31    1.29
Review of waits 
Node #1 Node #2 Node #3 Node #4 
User sessions are waiting for 
global buffer busy waits. 
User sessions are waiting for buffer 
busy waits and log file sync waits: 
Event                  % Event 
buffer busy waits        30.92 
log file sync            23.61 
log buffer space         16.88 
Background sessions are 
waiting for CF locks: 
Event                  % Activity 
enq: CF - contention     5.52 
CPU + Wait for CPU       2.62 
enq: TM - contention     1.28 
gc cr block busy         1.27
Buffer busy waits 
 Buffer busy waits indicate that buffers are not available 
because they are busy undergoing a short change. 
 Buffer busy waits can also be caused by DBWR trying to write 
buffers. 
Log file sync waits 
 Log file sync waits indicate that the log file write mechanism 
is not fast enough. 
 This could be due to a problem with LGWR, a log file I/O 
performance issue, or even OS CPU scheduling issues. 
Background waits 
 Further review of ASH report indicates that there were 
waits for background processes too. 
 Few enq: CF contention waits. %Activity is 5.5, but that 
can be misleading. 
Top Background Events 
Event                   Event Class   % Activity   Avg Active Sessions 
enq: CF - contention    Other               5.52   0.63 
CPU + Wait for CPU      CPU                 2.62   0.30 
enq: TM - contention    Application         1.28   0.15 
gc cr block busy        Cluster             1.27   0.15
 User processes in nodes 3 and 4 are suffering from global 
buffer busy waits. 
 User processes in node 2 are suffering from buffer busy 
waits and log file sync waits. 
 Background processes in node 2 are suffering from CF 
enqueue waits and buffer busy waits. 
 If background processes are waiting on lock 
contention, then that must be resolved first. Everything 
else could be just a symptom.
INST_ID ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK 
----- ---------------- ---------------- ---- -- ----- ----- ------ ----- -------- ----- 
4 0000001022E12398 0000001022E123F0 4368 CF 0 0 0 4 8 0 
4 0000001022E12FE0 0000001022E13038 4369 CF 0 0 0 4 39 0 
4 0000001022E15588 0000001022E155E0 4374 CF 0 0 0 4 34 0 
4 0000001022E12BD0 0000001022E12C28 4375 CF 0 0 0 4 39 0 
4 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 120592 2 
4 0000001022E13E98 0000001022E13EF0 4388 CF 0 0 0 5 49 0 
1 0000001022E12058 0000001022E120B0 4372 CF 0 0 0 4 41 0 
1 0000001022E121F8 0000001022E12250 4373 CF 0 0 0 4 41 0 
1 0000001022E12E40 0000001022E12E98 4374 CF 0 0 0 4 41 0 
1 0000001022E133F0 0000001022E13448 4376 CF 0 0 0 4 41 0 
1 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121783 2 
3 0000001022E09BC8 0000001022E09C20 4134 CF 0 0 0 4 99 0 
3 0000001022E15A68 0000001022E15AC0 4368 CF 0 0 0 4 39 0 
3 0000001022E15658 0000001022E156B0 4369 CF 0 0 0 4 39 0 
3 0000001022E15C08 0000001022E15C60 4370 CF 0 0 0 4 39 0 
3 0000001022E154B8 0000001022E15510 4376 CF 0 0 0 4 39 0 
3 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 120855 2 
2 0000001022E15318 0000001022E15370 4368 CF 0 0 0 4 40 0 
2 0000001022E14EF0 0000001022E14F48 4373 CF 0 0 0 5 81 0 
2 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121231 2 
38 rows selected. 
Notice that no process is holding the CF lock in an 
incompatible mode.
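The same reading of the lock output can be done mechanically: separate holders from waiters and rank the waiters by wait time. Below is a minimal Python sketch over a few rows from the output above. It assumes the columns follow the usual gv$lock layout (instance, address columns, SID, type, ID1, ID2, held mode, requested mode, ctime, block), which the slide does not show; the `LockRow` type is illustrative.

```python
from collections import namedtuple

# Assumed column meaning (gv$lock layout, not shown on the slide):
# held mode = lmode, requested mode = request, ctime in seconds.
LockRow = namedtuple("LockRow", "inst_id sid lock_type lmode request ctime block")

# A subset of the rows listed above, address columns omitted.
ROWS = [
    LockRow(3, 4134, "CF", 0, 4, 99, 0),
    LockRow(3, 4368, "CF", 0, 4, 39, 0),
    LockRow(3, 4387, "CF", 2, 0, 120855, 2),
    LockRow(2, 4368, "CF", 0, 4, 40, 0),
    LockRow(2, 4373, "CF", 0, 5, 81, 0),
    LockRow(2, 4387, "CF", 2, 0, 121231, 2),
]

# Holders have a non-zero held mode; waiters have a non-zero request.
holders = [row for row in ROWS if row.lmode > 0]
waiters = sorted(
    (row for row in ROWS if row.request > 0),
    key=lambda row: row.ctime,
    reverse=True,
)

longest_waiter = waiters[0]
held_modes = sorted({row.lmode for row in holders})

print("held modes:", held_modes)
print(
    f"longest waiter: SID {longest_waiter.sid} on instance "
    f"{longest_waiter.inst_id}, waiting {longest_waiter.ctime} s"
)
```

Ranking by ctime is what singles out 4134 as the session to investigate first.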
Locking scenario 
[Diagram: the CF resource with a holders’ queue and a waiters’ queue; lock 4387/4] 
CF lock is held only in 
compatible mode. 
 pid 4134 has the highest ctime, 99 
seconds, and its state is WAITING. So, it 
is first in the waiters’ queue. 
 But why is it waiting?
Problem continued.. 
Process is waiting for 283 seconds. 
---------- ---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ---------- 
4 0000001022E12398 0000001022E123F0 4368 CF 0 0 0 4 193 0 
4 0000001022E12FE0 0000001022E13038 4369 CF 0 0 0 4 224 0 
4 0000001022E13CF8 0000001022E13D50 4370 CF 0 0 0 5 266 0 
4 0000001022E0FD20 0000001022E0FD78 4371 CF 0 0 0 4 224 0 
4 0000001022E12E40 0000001022E12E98 4372 CF 0 0 0 4 224 0 
4 0000001022E126D8 0000001022E12730 4373 CF 0 0 0 4 224 0 
4 0000001022E15588 0000001022E155E0 4374 CF 0 0 0 4 219 0 
4 0000001022E12BD0 0000001022E12C28 4375 CF 0 0 0 4 224 0 
4 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 120777 2 
4 0000001022E13E98 0000001022E13EF0 4388 CF 0 0 0 5 234 0 
2 0000001022E15318 0000001022E15370 4368 CF 0 0 0 4 224 0 
2 0000001022E15CD8 0000001022E15D30 4369 CF 0 0 0 4 224 0 
2 0000001022E14108 0000001022E14160 4370 CF 0 0 0 4 224 0 
2 0000001022E15E90 0000001022E15EE8 4371 CF 0 0 0 4 223 0 
2 0000001022E15B38 0000001022E15B90 4372 CF 0 0 0 4 224 0 
2 0000001022E14EF0 0000001022E14F48 4373 CF 0 0 0 5 265 0 
2 0000001022E154B8 0000001022E15510 4374 CF 0 0 0 4 224 0 
2 0000001022E15DA8 0000001022E15E00 4375 CF 0 0 0 4 224 0 
2 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121415 2 
1 0000001022E13660 0000001022E136B8 4368 CF 0 0 0 4 225 0 
1 0000001022E12128 0000001022E12180 4369 CF 0 0 0 4 225 0 
1 0000001022E13250 0000001022E132A8 4370 CF 0 0 0 4 225 0 
1 0000001022E10CA8 0000001022E10D00 4371 CF 0 0 0 4 249 0 
1 0000001022E12058 0000001022E120B0 4372 CF 0 0 0 4 225 0 
1 0000001022E121F8 0000001022E12250 4373 CF 0 0 0 4 225 0 
1 0000001022E12E40 0000001022E12E98 4374 CF 0 0 0 4 225 0 
1 0000001022E133F0 0000001022E13448 4376 CF 0 0 0 4 225 0 
1 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121967 2 
3 0000001022E09BC8 0000001022E09C20 4134 CF 0 0 0 4 283 0 
3 0000001022E0EE68 0000001022E0EEC0 4190 CF 0 0 0 4 18 0 
3 0000001022E15A68 0000001022E15AC0 4368 CF 0 0 0 4 223 0 
3 0000001022E15658 0000001022E156B0 4369 CF 0 0 0 4 223 0 
3 0000001022E15C08 0000001022E15C60 4370 CF 0 0 0 4 223 0 
3 0000001022E13590 0000001022E135E8 4371 CF 0 0 0 4 238 0 
3 0000001022E13F68 0000001022E13FC0 4372 CF 0 0 0 4 225 0 
3 0000001022E15998 0000001022E159F0 4373 CF 0 0 0 4 223 0 
3 0000001022E15318 0000001022E15370 4374 CF 0 0 0 4 223 0 
3 0000001022E154B8 0000001022E15510 4376 CF 0 0 0 4 223 0 
3 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121039 2 
Pstack 4134 
#0 0x00000030364cb053 in __select_nocancel () from /lib64/ 
#1 0x0000000001d92111 in skgpnap () 
#2 0x000000000752d9b6 in ksliwat () 
#3 0x000000000752b668 in kslwait () 
#9 0x000000000753ed1b in ksqgtlctx () 
#10 0x000000000753db0b in ksqgelctx () 
#11 0x00000000076f5bb8 in kcc_get_enqueue () 
#12 0x00000000076f329b in kccocx () 
#13 0x00000000076f3140 in kccbcx () 
#14 0x0000000005563b0c in kcra_scan_redo () 
#15 0x000000000556334d in kcra_dump_redo () 
#16 0x0000000005561fcc in kcra_dump_redo_internal () 
kcra_dump_redo is usually called when a process is dumping 
due to errors or exceptions. 
Is there a process dumping errors? 
Alert log 
At the same time, the alert log had entries for that PID 4134. 
 At the end of the trace file, it was hung in ‘PINNED BUFFER HISTORY’. 
*** 2009-05-21 10:46:04.109 
*** SESSION ID:(4134.4598) 2009-05-21 10:46:04.109 
*** CLIENT ID:() 2009-05-21 10:46:04.109 
*** SERVICE NAME:(PROD) 2009-05-21 10:46:04.109 
*** MODULE NAME:() 2009-05-21 10:46:04.109 
*** ACTION NAME:() 2009-05-21 10:46:04.109 
Dump continued from file: 
ORA-07445: exception encountered: core dump [ksxpmprp()+42] [SIGSEGV] [ADDR:0x14] 
[PC:0x33BB5BE] [Address not mapped to object] []
That’s a bug! 
 Of course, that’s a bug we were encountering. 
 As per the bug, a process requests CF locks, but hangs until it is cleaned up by pmon. 
 Of course, the easy fix is to kill the processes encountering ORA-7445 errors 
immediately, and the long-term fix is to fix these bugs (both the ORA-7445 errors and 
bug 8318486). 
CF enqueue weirdness 
---------- ---- -- ---------- ---------- ---------- ---------- ---------- ---------- 
2 4387 CF 0 0 2 0 122115 2 
1 4113 CF 0 4 4 0 77 0 
1 4113 CF 0 0 4 0 77 2 
1 4387 CF 0 0 2 0 122667 2 
4 4387 CF 0 0 2 0 121476 2 
3 4387 CF 0 0 2 0 121739 2 
 Issue: High Kernel mode CPU usage 
 Issue: Hung RAC cluster 
 Issue: High Kernel mode CPU usage - 2 
 Issue: ASMM ( Skipped due to time constraints )
 High kernel-mode CPU usage on a high-end server (a different 
client from discussion 1). 
 Database was upgraded from 9i to 10gR2 recently. 
 But the node had been running fine for a few weeks with no 
Mpstat – per processor stats 
Mpstat indicates many processes using CPU in 
%sys mode. 
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl 
0 25 0 107 294 6 1711 232 661 265 0 3702 33 40 0 26 
1 21 0 454 205 9 1399 161 547 227 0 3422 44 33 0 23 
2 23 0 91 146 5 1301 123 502 255 0 3503 44 34 0 22 
3 123 0 1772 161 5 1383 127 544 264 0 3438 44 33 0 23 
4 163 0 1651 144 5 1308 109 448 208 0 2984 45 31 0 23 
37 94 0 1295 4652 4600 450 44 174 376 0 1554 51 43 0 5 
38 418 0 217 128 24 1039 79 376 127 1 3307 49 32 0 20 
39 41 0 1310 4904 4863 495 35 174 428 0 2218 51 40 0 9 
64 4 0 45 171 18 887 121 340 299 0 5802 31 49 0 20 
65 73 0 1188 148 9 1219 116 453 231 0 5418 38 42 0 21 
66 171 0 5809 133 27 1247 78 452 220 0 4524 41 38 0 22 
67 5 0 278 204 57 1583 110 567 254 0 4898 35 40 0 25 
68 0 0 5 41 27 8 9 5 5 0 7 99 1 0 0 
69 1 0 79 128 5 1465 87 495 279 0 6097 29 45 0 26 
70 6 0 1173 4342 4277 789 63 275 653 0 5596 23 64 0 13 
AWR report 
AWR report for a 30-minute period showed nothing obvious. 
Top 5 Timed Events Avg %Total 
~~~~~~~~~~~~~~~~~~ wait Call 
Event Waits Time (s) (ms) Time Wait Class 
------------------------------ ------------ ----------- ------ ------ ---------- 
CPU time 67,946 48.1 
db file sequential read 7,007,846 38,738 6 27.4 User I/O 
gc buffer busy 1,705,205 12,142 7 8.6 Cluster 
gc cr grant 2-way 2,804,825 6,538 2 4.6 Cluster 
db file scattered read 761,202 6,330 8 4.5 User I/O 
 I/O wait times are not too abnormal. 
 We can theorize that if there is high I/O, it can result in high kernel mode 
CPU usage. 
 But, this workload is quite normal for this node.
Dtrace and Solaris 10 
 Dtrace is a great tool to do root cause analysis and available in 
Solaris 10. 
 Dtrace can be used to see what calls are executed by peeking at 
 For example, following dtrace one-liner can break down the 
system calls executing in the server. 
dtrace -n 'syscall:::entry { @Calls[probefunc] = count(); }' 
Dtrace output 
Dtrace output for kernel mode CPU usage shows that CPUs 
are spending time in mutexes. Not much help here. 
unix`lock_set_spl_spin 1845 0.6% 
unix`utl0 1873 0.6% 
unix`atomic_add_32 2036 0.7% 
unix`page_exists 2105 0.7% 
unix`lock_set 2165 0.7% 
SUNW,UltraSPARC-IV`send_mondo_set 2716 0.9% 
genunix`fsflush 2765 0.9% 
genunix`avl_walk 3622 1.2% 
unix`disp_lowpri_cpu 3722 1.2% 
unix`default_lock_delay 6790 2.2% 
unix`kphysm_del_span_query 6790 2.2% 
genunix`rm_assize 6830 2.2% 
unix`_resume_from_idle 11758 3.8% 
unix`mutex_enter 16469 5.4% 
unix`disp_getwork 18809 6.1% 
unix`mutex_delay_default 69892 22.7%
Mpstat again 
One interesting thing to notice in the mpstat output is that 
CPUs with high kernel-mode usage also have a high number 
of xcalls. 
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl 
453 33 1 615 76 6 1052 54 290 265 0 3243 38 23 0 39 
454 0 0 35 58 6 807 37 227 197 0 3090 51 16 0 33 
455 37 0 39 74 6 993 37 244 220 0 3812 36 18 0 46 
480 0 0 105694 107 6 926 80 292 299 0 4319 32 35 0 32 
481 51 0 214 81 6 842 59 255 233 0 3217 52 19 0 28 
482 41 0 43 92 6 1105 64 325 318 0 3377 41 22 0 37 
483 68 1 95373 104 6 1060 80 313 355 0 4238 33 28 0 39 
484 0 0 23 56 6 746 36 231 156 0 2131 46 16 0 38 
485 10 0 1931 64 6 703 43 193 151 0 3659 53 14 0 33 
486 0 0 52 39 6 564 17 145 137 0 1513 17 16 0 67 
487 1 1 420 30 6 225 14 68 44 0 778 85 5 0 10
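The correlation between cross calls and kernel-mode time can be checked by ranking the CPUs on both columns. This is a minimal Python sketch over the mpstat rows above, assuming the column order of the mpstat header (xcal is the fourth field, sys the fourteenth); the helper name `top_cpus` is illustrative.

```python
# Rows copied from the second mpstat sample above.
# Assumed column order, taken from the mpstat header line:
# CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
MPSTAT_ROWS = """\
453 33 1 615 76 6 1052 54 290 265 0 3243 38 23 0 39
454 0 0 35 58 6 807 37 227 197 0 3090 51 16 0 33
455 37 0 39 74 6 993 37 244 220 0 3812 36 18 0 46
480 0 0 105694 107 6 926 80 292 299 0 4319 32 35 0 32
481 51 0 214 81 6 842 59 255 233 0 3217 52 19 0 28
482 41 0 43 92 6 1105 64 325 318 0 3377 41 22 0 37
483 68 1 95373 104 6 1060 80 313 355 0 4238 33 28 0 39
484 0 0 23 56 6 746 36 231 156 0 2131 46 16 0 38
485 10 0 1931 64 6 703 43 193 151 0 3659 53 14 0 33
486 0 0 52 39 6 564 17 145 137 0 1513 17 16 0 67
487 1 1 420 30 6 225 14 68 44 0 778 85 5 0 10
"""

XCAL_COLUMN = 3   # zero-based position of "xcal"
SYS_COLUMN = 13   # zero-based position of "sys"

rows = [[int(field) for field in line.split()] for line in MPSTAT_ROWS.splitlines()]


def top_cpus(table, column, count):
    """Return the CPU ids with the largest values in the given column."""
    ranked = sorted(table, key=lambda row: row[column], reverse=True)
    return [row[0] for row in ranked[:count]]


top_by_xcal = top_cpus(rows, XCAL_COLUMN, 2)
top_by_sys = top_cpus(rows, SYS_COLUMN, 2)

print("highest xcal:", top_by_xcal)
print("highest %sys:", top_by_sys)
```

In this sample the two CPUs with by far the most cross calls are also the two with the highest %sys, which is the pattern the slide points out.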
 Cross calls are CPU-to-CPU interrupts used for Memory 
 In a server with many memory boards and a huge SGA, cross 
calls are a necessary evil. 
 But, excessive and continuous cross calls are not optimal. 
 If many processes are accessing SGA buffers aggressively 
then that can lead to increased cross calls. 
 Increased cross calls = increased %sys mode CPU usage. 
AWR report again 
 So, excessive database activity can lead to higher xcalls and 
can lead to a symptom of high Kernel mode CPU usage. 
 Interestingly, there is just one SQL with very high elapsed 
and cpu time. One session can’t cause this issue! 
Elapsed CPU Elap per % Total 
Time (s) Time (s) Executions Exec (s) DB Time SQL Id 
---------- ---------- ------------ ---------- ------- ------------- 
20,813 20,784 1 20812.8 14.7 d3zaxrb127axc 
 AWR reports do not show SQL statements if the 
statements are still executing! 
 So, we decided to review a report spanning a few hours instead of a 
30-minute AWR report. 
 32.1% of DB time spent on one SQL? That can cause 
Elapsed CPU Elap per % Total 
Time (s) Time (s) Executions Exec (s) DB Time SQL Id 
---------- ---------- ------------ ---------- ------- ------------- 
If you don’t succeed first, try again? 
 Apparently, a report was not completing in time. 
 The front-end program timed out, and users submitted more 
reports out of frustration. 
 The client has an alert that fires if many sessions are executing the 
same SQL statement for a prolonged period. 
 Unfortunately, these sessions had no sql_id populated in 
v$session, and so the alert failed. 
Opened cursors? 
But, v$open_cursor shows that there are many sessions with 
opened cursors on that sql_id. 
select sid, serial#, module, osuser from v$session where sid in ( 
select sid from v$open_cursor where sql_id='d3zaxrb127axc'); 
---------- ---------- ------------------------------------------------ ----------- 
10365 65389 prod 
10162 31712 prod 
9979 13788 prod 
10763 1819 prod 
10007 46806 prod 
9576 33605 prod 
40 rows selected.
Execution plan 
The merge join cartesian at step 2 caused the issue. The cardinality estimate at 
step 3 is 1, and so the CBO chose a cartesian join at step 2. 
| Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem | 
| 1 | NESTED LOOPS | | 1 | | | | 
| 2 | MERGE JOIN CARTESIAN | | 1 | | | | 
| 3 | NESTED LOOPS | | 1 | | | | 
| 4 | NESTED LOOPS | | 1 | | | | 
| 5 | NESTED LOOPS | | 1 | | | | 
| 6 | NESTED LOOPS | | 1 | | | | 
| 7 | NESTED LOOPS | | 1 | | | | 
| 8 | NESTED LOOPS | | 1 | | | | 
| 9 | NESTED LOOPS | | 1 | | | | 
|* 10 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | | 
|* 11 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | | 
|* 12 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | | 
|* 13 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | | 
|* 15 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | | 
|* 16 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | | 
|* 18 | INDEX RANGE SCAN | ACCT_HIER_N1 | 1 | | | | 
|* 19 | INDEX RANGE SCAN | FLEX_VALUE_NM_HIER_U1 | 1 | | | | 
| 20 | BUFFER SORT | | 97323 | 67M| 2842K| 59M (0)| 
|* 21 | TABLE ACCESS FULL | FLEX_VALUES_TL | 97323 | | | | 
|* 23 | INDEX UNIQUE SCAN | FLEX_VALUES_U1 | 1 | | | | 
 Cardinality estimates on table step 18 and 17 were totally 
 An index was added to the table. 
 From 10g onwards, compute statistics is default. This threw 
away histograms on those columns in the index leading to 
incorrect cardinality calculations. 
 Recollecting stats with proper histograms completely 
resolved the issue.
 Issue: High Kernel mode CPU usage 
 Issue: Hung RAC cluster 
 Issue: High Kernel mode CPU usage - 2 
 Issue: ASMM ( Skipped due to time constraints )
 Problem: Application was very slow. Database was recently 
upgraded to 10g. Management was quite unhappy and 
blamed it on 10g. 
 Even rollback to 9i was on the table as an option. Following 
report is for 3 minutes duration. 
Top 5 Timed Events                                           Avg  %Total 
~~~~~~~~~~~~~~~~~~                                          wait    Call 
Event                                       Waits  Time (s)  (ms)    Time 
---------------------------------------- -------- --------- ----- ------- 
db file sequential read                    96,535       704     7    30.1 
SGA: allocation forcing component growth   50,809       498    10    21.3 
library cache pin                             180       219  1218     9.4 
latch: shared pool                          2,767       217    78     9.3 
log file switch completion                    225       216   958     9.2 
Statspack report 
 The top 5 timed events show a high amount of SGA activity, waits 
for library cache pin, latch contention, etc. 
 Application opens new connections if it “detects” that SQL 
is hung. 
SGA resize ? 
 'SGA: allocation forcing component growth' gives a clue 
that there is SGA re-alignment occurring. 
 Looking at SGA area section of statspack: 
Buffer cache decreased by 32 MB and
Shared pool increased by 32 MB
                         Prior        New   Difference
Snap Id Cache        Size (MB)  Size (MB)         (MB)
------- ------------ --------- ---------- ------------
    181 Buffer Cache     1,376      1,344          -32
        Shared Pool        288        320           32
Plotting the buffer_cache size from the statspack table shows that
the buffer cache underwent constant reorganization.
select * from perfstat.stats$sgastat where name='buffer_cache' 
order by snap_id; 
[Chart: buffer_cache size plotted by snap_id]
Plotting heaps with KGH:NO ACCESS tag shows turbulent 
select * from perfstat.stats$sgastat where name='KGH: NO ACCESS' 
order by snap_id 
[Chart: size of 'KGH: NO ACCESS' heaps plotted by snap_id]
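The two statspack queries above return raw sizes per snapshot. As a sketch of how the same data could be prepared for plotting, the query below reports the change between consecutive snapshots. It assumes that stats$sgastat carries the usual statspack key columns dbid and instance_number plus a bytes column; verify the column names against your statspack schema before relying on it.

```sql
-- Size and snapshot-to-snapshot change for the two SGA areas discussed above.
-- Assumption: stats$sgastat has dbid, instance_number and bytes columns.
select snap_id,
       name,
       round(bytes / 1024 / 1024) as size_mb,
       round((bytes - lag(bytes) over (partition by dbid, instance_number, name
                                       order by snap_id)) / 1024 / 1024) as change_mb
from   perfstat.stats$sgastat
where  name in ('buffer_cache', 'KGH: NO ACCESS')
order  by name, snap_id;
```

A change_mb column that keeps flipping sign from one snapshot to the next is the tabular equivalent of the turbulence seen in the charts.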
 ASMM algorithms detected a need for more buffer
cache memory.
 This forced de-allocation of memory from the shared pool to the buffer cache.
 This created more library cache latching and parsing issues,
so the algorithm detected a need for more shared pool memory.
 The algorithm then de-allocated memory from the buffer cache and
allocated it to the shared pool, inducing artificial disk reads.
 Of course, this vicious cycle was continuous and caused
performance issues.
 In summary, constant reorganization of SGA areas caused
this issue.
 The DBA had set sga_target and commented out all other
memory parameters.
 Memory was constantly allocated to and deallocated from the
shared pool. v$sga_resize_ops showed constant
reorganization of memory areas.
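The resize history can be read directly from v$sga_resize_ops. The query below is a minimal sketch of such a check; the MB conversion and the date format are presentation choices, not part of the original analysis.

```sql
-- List SGA resize operations in chronological order.
select component,
       oper_type,                                   -- e.g. GROW or SHRINK
       oper_mode,
       round(initial_size / 1024 / 1024) as initial_mb,
       round(final_size   / 1024 / 1024) as final_mb,
       status,
       to_char(start_time, 'DD-MON-YYYY HH24:MI:SS') as started
from   v$sga_resize_ops
order  by start_time;
```

Alternating GROW and SHRINK rows for the shared pool and the default buffer cache, only minutes apart, point to the kind of back-and-forth described above.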
 There are many ways this can be resolved.
 Disabling ASMM completely.
 Providing a minimum size for all SGA components
and leaving a little for ASMM to resize automatically.
 Increasing the underscore parameter
_memory_broker_stat_interval to a higher value, such as 3600
or more.
 Or applying a few patches.
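The first three options might look like the statements below. This is a sketch only: the sizes are placeholders, not the values used at the client site, and the three options are alternatives rather than a script to run top to bottom. Underscore parameters are normally changed only in agreement with Oracle Support.

```sql
-- Option 1: disable ASMM; the SGA components are then sized manually.
alter system set sga_target = 0 scope=both;

-- Option 2: keep ASMM, but give the main components a floor.
-- With sga_target set, these values act as minimum sizes (placeholder values).
alter system set db_cache_size    = 1200M scope=both;
alter system set shared_pool_size = 320M  scope=both;

-- Option 3: make the memory broker react less often (value from the slide).
-- Assumes an spfile is in use; takes effect at the next instance restart.
alter system set "_memory_broker_stat_interval" = 3600 scope=spfile;
```

With option 2 the floors should add up to somewhat less than sga_target, so that ASMM still has a small amount of memory to move around.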
 Oracle support site: various documents
 Internals guru Steve Adams' website
 Jonathan Lewis' website
 Julian Dyke's website
 ‘Oracle8i Internal Services for Waits, Latches, Locks, and Memory’
by Steve Adams
 Tom Kyte's website

A close encounter_with_real_world_and_odd_perf_issues

  • 1. A close encounter with real world performance issues By Riyaj Shamsudeen ©OraInternals Riyaj Shamsudeen
  • 2. Who am I?
      17 years using Oracle products/DBA
      OakTable member
      Certified DBA versions 7.0, 7.3, 8, 8i, 9i
      Specializes in RAC, performance tuning, Internals and E-business suite
      Chief DBA with OraInternals
      Co-author of “Expert Oracle Practices”, 2010
      Email: rshamsud at
      Blog :
  • 3. Disclaimer
      These slides and materials represent the work and opinions of the author and do not constitute official positions of my current or past employer or any other organization.
      This material has been peer reviewed, but the author assumes no responsibility whatsoever for the test cases.
      If you corrupt your databases by running my scripts, you are solely responsible for that.
      This material should not be reproduced or used without the author's written permission.
  • 4. Agenda
      Issue: High Kernel mode CPU usage
      Issue: Hung RAC cluster
      Issue: High Kernel mode CPU usage - 2
      Issue: ASMM (Skipped due to time constraints)
  • 5. Architecture overview
      [Diagram labels: users; Application instance 1; servers 1, 2, ... 300; FrontEnd database; Database links; Central database; ASM Instance]
      Note, only partial architecture shown due to client confidentiality.
  • 6. Modular
      [Diagram labels: users; FrontEnd database (three times); Appl. Inst 1, Appl. Inst 2, Appl. Inst 3, each with servers 1, 2, ... 300; Central database; ASM Instance]
      Note, only partial architecture shown due to client confidentiality.
  • 7. Issue
      There are many application instances, each servicing a disjoint group of users. As per the design, any application instance can be shut down with no user impact, and traffic will be automatically rerouted to surviving instances (almost like RAC).
      Unfortunately, shutdown of any application instance shuts down every application instance, leading to site-wide downtime.
      Database version :
      Operating system: Solaris 10
  • 8. Details
      After an application restart, database link connections are released. This terminates connections in the central database.
      CPU usage spikes up to 90%.
      Select c1, c2 from t1@central;
      [Diagram labels: users; Application instance 1; servers 1, 2, ... 300; FrontEnd database; Database links; Central database; ASM Instance]
  • 9. Symptoms
      Very high CPU usage in kernel mode in the central database.
      The ASM instance on that central database server times out and crashes.
      Why is there such high CPU usage?
  • 10. Who is using my CPU?
      In Solaris (and in many UNIX platforms), the mpstat utility can be used to understand the per-processor statistics.
      mpstat output shows that almost all the CPUs are used in kernel mode.

      Sample time: Tue Sep 9 17:46:34 2008
      CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
        0  561   0    9  554  237  651  219   87  491   0  4349   9  91  0   0
        1 1197   1   34  911    0 2412  591  353  630   0 15210  30  63  0   7
        2   58   0    9  313    0  613  106  190  105   0  3562   8  90  0   2
        3  161   0   26  255    0  492   92  161  530   0  2914   6  92  0   2
        4    0   0    0   86    1    2    3    1   63   0     8   0 100  0   0
        5  283   0   34  662    0 1269  153  326  211   0  6753  13  77  0  10
        6  434   0   43  349    0  589   54  170 1534   0  3002   7  88  0   5
      ...
       12   30   0    0  195    0  279  110   31   80   0  1590   3  97  0   0
       13  288   0    9  449    0  844  117  158  127   0  4486   7  85  0   8
       14  155   0    0  430    0  744  102  160   83   0  3875   7  80  0  13
       15   16   0    0  237    0  359  115   31  124   0  2074   3  91  0   6
  • 11. Statspack report
      Out of 655 seconds total elapsed time, 455 seconds were spent on latch free waits. Almost all latch waits are for the enqueues latch.

                    Snap Id Snap Time           Sessions Curs/Sess
                  --------- ------------------- -------- ---------
      Begin Snap:      3131 09-Sep-08 17:46:17     5,030       1.9
        End Snap:      3132 09-Sep-08 17:47:16     4,995       2.0
         Elapsed:      0.98 (mins)
         DB Time:     10.55 (mins)

                                                          Avg wait  % Total
      Event                          Waits      Time (s)      (ms) Call Time Wait Class
      ------------------------------ ---------- --------- -------- --------- ----------
      latch free                            868       455      525      71.9 Other
      latch: row cache objects              103       189     1833      29.8 Concurrenc
      log file sync                         885        92      103      14.5 Commit
      CPU time                                         85               13.4
      db file parallel write              3,868        10        3       1.6 System I/O

      Latch contention is for the enqueues latch:

                                              Pct    Avg   Wait                Pct
                               Get            Get   Slps   Time     NoWait  NoWait
      Latch Name               Requests      Miss  /Miss    (s)   Requests    Miss
      ------------------------ ------------ ------ ------ ------ ---------- ------
      enqueues                    1,355,722    3.3    0.0    452          0    N/A
  • 12. Chicken or Egg?
      [Diagram labels: Latch contention; High CPU usage]
      But the usual symptom of latch contention is high CPU usage in user mode.
      In this case, we will ignore latch contention for now.
  • 13. Let’s reproduce the issue
      To reproduce the kernel mode CPU issue, we will create a database link from an application schema, connecting to a test schema in the central database.
      Then, we will execute a select over the database link, creating a new connection in the central database:
      select * from dual@central;
  • 14. Truss - trace system calls
      Identified the connection from the test schema @ client database to the test schema @ central database.

      select sid, serial#, LOGON_TIME, LAST_CALL_ET from v$session
      where logon_time > sysdate-(1/24)*(1/60)

             SID    SERIAL# LOGON_TIME           LAST_CALL_ET
      ---------- ---------- -------------------- ------------
            1371      35028 12-SEP-2008 20:47:30            0
            4306      51273 12-SEP-2008 20:47:29            1

      Starting truss on that process in the central database:
      truss -p pid -d -o /tmp/truss.log
      Better yet:
      truss -d -D -E -p pid -o /tmp/truss.log

      Logged off from content database and this should trigger a log-off from the remote central database.
  • 15. Truss output
      Reading the truss output for the central database connection, we can see that ten shmdt calls are consuming time.

      18.4630 close(10) = 0
      18.4807 shmdt(0x380000000) = 0
      18.5053 shmdt(0x440000000) = 0
      18.5295 shmdt(0x640000000) = 0
      18.5541 shmdt(0x840000000) = 0
      18.5784 shmdt(0xA40000000) = 0
      18.6026 shmdt(0xC40000000) = 0
      18.6273 shmdt(0xE40000000) = 0
      18.6512 shmdt(0x1040000000) = 0
      18.6752 shmdt(0x1240000000) = 0
      18.6753 shmdt(0x1440000000) = 0

      Each shmdt call consumed approximately 0.024 seconds or 24 ms: 18.5295 - 18.5053 = 0.0242
  • 16. Call: shmdt
      shmdt calls are used to detach from shared memory segments.
      Every database disconnect must detach from shared memory segments.
  • 17. Shmdt calls
      There are 10 shared memory segments for this SGA. So, there are 10 shmdt calls.

      ipcs -ma|grep 14382
      m 1241514024 0x97e45100 --rw-r----- oracle orainvtr …
      m 1241514023 0          --rw-r----- oracle orainvtr …
      m 1241514022 0          --rw-r----- oracle orainvtr …
      ……
      m 1241514016 0          --rw-r----- oracle orainvtr …
      m 889192479  0          --rw-r----- oracle orainvtr …
      8GB
  • 18. Fun with numbers
      Each shmdt call consumes 0.024 seconds.
      For one session (10 calls) = 0.24 seconds.
      For 300 connections = 300 * 0.24 = 72 seconds.
      In the best case, with 12 concurrent processes, this would last for 6 seconds or so.
      This matches our observation of 6 seconds of high kernel mode CPU usage.
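The arithmetic on this slide can be reproduced with a single query against dual. This is only a worked restatement of the numbers above (0.024 seconds per call, 10 segments, 300 connections, 12 concurrent processes); the column aliases are made up for readability.

```sql
-- Worked arithmetic for the shmdt cost estimate.
select 0.024                  as sec_per_shmdt_call,
       10 * 0.024             as sec_per_session,         -- 10 segments, 10 shmdt calls
       300 * 10 * 0.024       as sec_for_300_sessions,    -- all disconnects, run serially
       300 * 10 * 0.024 / 12  as sec_with_12_concurrent   -- best case, 12 processes in parallel
from   dual;
```

The last column comes out at 6 seconds, which is the duration of the kernel mode CPU spike that was observed.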
  • 19. Reducing shmdt calls
To reduce shmdt calls, we need to reduce the number of shared memory segments.
At startup, the database engine tries to create the biggest segment possible and then progressively reduces the segment size until the segments can be created successfully.
The SHMMAX kernel parameter was set too low, so we decided to increase that parameter first and reboot the server.
  • 20. Well, that was embarrassing..
After increasing SHMMAX, we expected to have just one shared memory segment. This should have reduced the impact tenfold.
Surprise! There were still 10 shared memory segments.
  • 21. After SHMMAX change
Started truss on database startup to see why the instance was creating multiple shared memory segments.
17252: 4.5957 munmap(0xFFFFFD7FFDAE0000, 32768) = 0
17252: 4.5958 lgrp_version(1, ) = 1
17252: 4.5958 _lgrpsys(1, 0, ) = 42
17252: 4.5958 _lgrpsys(3, 0x00000000, 0x00000000) = 19108
17252: 4.5959 _lgrpsys(3, 0x00004AA4, 0x06399D60) = 19108
17252: 4.5959 _lgrpsys(1, 0, ) = 42
17252: 4.5960 pset_bind(PS_QUERY, P_LWPID, 4294967295, 0xFFFFFD7FFFDFB11C) = 0
17252: 4.5960 pset_info(PS_MYID, 0x00000000, 0xFFFFFD7FFFDFB0D4, 0x00000000) = 0
17252: 4.5961 pset_info(PS_MYID, 0x00000000, 0xFFFFFD7FFFDFB0D4, 0x061AA2B0) = 0
  • 22. pset_bind and _lgrpsys
The _lgrpsys and pset_bind calls are new, and googling these function calls showed that they may be related to Non-Uniform Memory Access (NUMA) architecture.
17252: 4.5957 munmap (0xFFFFFD7FFDAE0000, 32768) = 0
17252: 4.5958 lgrp_version (1, ) = 1
17252: 4.5958 _lgrpsys (1, 0, ) = 42
17252: 4.5958 _lgrpsys (3, 0x00000000, 0x00000000) = 19108
17252: 4.5959 _lgrpsys (3, 0x00004AA4, 0x06399D60) = 19108
17252: 4.5959 _lgrpsys (1, 0, ) = 42
17252: 4.5960 pset_bind (PS_QUERY, P_LWPID, 4294967295, 0xFFFFFD7FFFDFB11C) = 0
17252: 4.5960 pset_info (PS_MYID, 0x00000000, 0xFFFFFD7FFFDFB0D4, 0x00000000) = 0
17252: 4.5961 pset_info (PS_MYID, 0x00000000, 0xFFFFFD7FFFDFB0D4, 0x061AA2B0) = 0
  • 23. NUMA architecture (overview)
[Diagram: memory boards Memory #1 to Memory #4; CPU 0 to CPU 7]
For cpu0 and cpu1, memory board 1 is local. Other memory areas are remote for cpu0 and cpu1.
Access to local memory is faster compared to remote memory access.
  • 24. NUMA architecture (overview)
[Diagram: shared memory segments Shm 1 to Shm 4 on Memory #1 to Memory #4; CPU 0 to CPU 7; DBWR 0 to DBWR 3]
To make use of NUMA technology, Oracle spreads the SGA across NUMA nodes.
  • 25. NUMA optimization
Binds DBWR to a CPU set. That DBWR handles all writes from that shared memory segment.
User processes also try to use free buffers from the working set of buffers belonging to the NUMA node on which the process is running. (Update: This turned out to be an Oracle database bug 5173642.)
LGWR also seems to have some code optimization to make better use of NUMA technology, but my test cases are not conclusive enough.
  • 26. Locality groups
In Solaris, NUMA technology is implemented as locality groups.
The _lgrpsys and pset_bind calls get the current locality group information and bind processes to a processor set.
Now we can understand why the SGA was split into multiple segments.
But do we really have that many locality groups in this server?
  • 27. Locality groups
The lgrpinfo tool can provide NUMA node details.
:~#/usr/local/bin/lgrpinfo
lgroup 0 (root):
Children: 10 12 14 15 17 19 21 23
CPUs: 0-15
Memory: installed 65024 Mb, allocated 2548 Mb, free 62476 Mb
Lgroup resources: 1-8 (CPU); 1-8 (memory)
Latency: 146
lgroup 1 (leaf):
Children: none, Parent: 9
CPUs: 0 1
Memory: installed 7680 Mb, allocated 1964 Mb, free 5716 Mb
Lgroup resources: 1 (CPU); 1 (memory)
Load: 0.105
Latency: 51
...
There were many locality groups defined, and seven of them were leaf node locality groups in this server.
  • 28. Latency?
Local access from CPU 1 to memory in the same NUMA node as the CPU has a reference number of 51.
Remote access to memory in a remote node has a reference number of 146.
-----------------------------------------------------------------------------------------------------
| 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
-----------------------------------------------------------------------------------------------------
0 | 146 146 113 113 113 113 113 113 146 146 146 146 146 146 146 146 113 146 113 146 146 146 146 146 146
1 | 146 51 81 81 113 113 113 113 146 81 113 113 146 113 146 146 113 146 113 146 146 146 146 146 146
2 | 113 81 51 113 81 113 113 81 113 113 113 81 113 113 113 113 113 113 113 113 113 113 113 113 113
3 | 113 81 113 51 113 81 81 113 113 113 113 113 113 81 113 113 113 113 113 113 113 113 113 113 113
4 | 113 113 81 113 51 81 81 113 113 113 113 113 113 113 113 113 81 113 113 113 113 113 113 113 113
…..
15 | 146 146 113 113 113 113 113 113 146 146 146 146 146 146 146 113 113 146 113 146 146 146 146 146 146
...
  • 29. Is NUMA bad?
Indeed, 10 shared memory segments were created, one for each locality group.
We disabled NUMA to resolve this problem temporarily.
NUMA is a great technology. Sequent Dynix/ptx implemented NUMA technology successfully a decade ago.
It's just that we are encountering an unfortunate side effect of NUMA.
  • 30. Solution
We can disable NUMA or reduce the number of NUMA nodes.
*._enable_NUMA_optimization=FALSE
*._db_block_numa = 1
Use the patch for bug 819953 to disable NUMA instead of underscore parameters (only if NUMA needs to be disabled).
Notes 399261.1 and 759565.1 describe these issues.
It looks like there is one shared memory segment per locality group and one segment encompassing all locality groups. One small bootstrap segment is also created.
  • 31. Solution – contd.
Another option is to control the logout rate.
Jonathan Lewis later mentioned these parameters, which control the logout storm rate.
Parameter                 Meaning                                              Value
_logout_storm_rate        number of processes that can logout in a second     0
_logout_storm_retrycnt    maximum retry count for logouts                      600
_logout_storm_timeout     timeout in centi-seconds for wait between retries    5
  • 32. Agenda
Issue: High Kernel mode CPU usage
Issue: Hung RAC cluster
Issue: High Kernel mode CPU usage - 2
Issue: ASMM (Skipped due to time constraints)
  • 33. Problem
All RAC instances are stuck for 10-15 minutes intermittently.
The application is not responsive during that time period.
This happens randomly, with no specific correlation to the time of day.
  • 34. AWR analysis
AWR analysis indicates gc buffer busy waits; many sessions were waiting for GC events.
Top User Events
Event                     Event Class   % Event   Avg Active Sessions
gc buffer busy acquire    Cluster       46.61     6.42
CPU + Wait for CPU        CPU           21.91     3.02
gc cr block busy          Cluster       9.14      1.26
enq: CF - contention      Other         4.44      0.61
gc current block busy     Cluster       2.50      0.35
  • 35. ASH analysis
The ASH report confirms the issue too.
Top User Events
Event                       Event Class   % Event   Avg Active Sessions
gc buffer busy acquire      Cluster       43.68     11.48
gc cr block busy            Cluster       10.92     2.87
CPU + Wait for CPU          CPU           9.84      2.59
row cache lock              Concurrency   5.12      1.35
gc cr multi block request   Cluster       4.90      1.29
  • 36. Why gc buffer busy?
GC buffer busy waits indicate that the buffer is busy, waiting for some sort of global event.
Another session is working on that buffer, and that session is itself waiting for a global cache event.
We need to understand why that second session is waiting for a global cache event.
  • 37. Another node…
In RAC, it is essential to review performance statistics from all nodes.
The performance of one node can bring down the performance of the entire cluster or even lead to a hung cluster.
Top User Events
Event                 Event Class     % Event   Avg Active Sessions
buffer busy waits     Concurrency     30.92     12.00
log file sync         Commit          23.61     9.16
log buffer space      Configuration   16.88     6.55
CPU + Wait for CPU    CPU             7.48      2.90
row cache lock        Concurrency     3.31      1.29
  • 38. Review of waits
[Diagram: Node #1, Node #2, Node #3, Node #4]
User sessions are waiting for global buffer busy waits.
User sessions are waiting for buffer busy waits and log file sync waits. Background sessions are waiting for CF locks.
Event                 % Event
buffer busy waits     30.92
log file sync         23.61
log buffer space      16.88
Event                 % Activity
enq: CF - contention  5.52
CPU + Wait for CPU    2.62
enq: TM - contention  1.28
gc cr block busy      1.27
  • 39. Buffer busy waits
Buffer busy waits indicate that buffers are not available because they are busy undergoing a short-lived change.
Buffer busy waits can also be caused by DBWR trying to write buffers.
Top User Events
Event                 Event Class     % Event   Avg Active Sessions
buffer busy waits     Concurrency     30.92     12.00
log file sync         Commit          23.61     9.16
log buffer space      Configuration   16.88     6.55
CPU + Wait for CPU    CPU             7.48      2.90
row cache lock        Concurrency     3.31      1.29
  • 40. Log file sync waits
Log file sync waits indicate that the log file write mechanism is not fast enough.
This could be due to a problem with LGWR, a log file I/O performance issue, or even OS CPU scheduling issues.
Top User Events
Event                 Event Class     % Event   Avg Active Sessions
buffer busy waits     Concurrency     30.92     12.00
log file sync         Commit          23.61     9.16
log buffer space      Configuration   16.88     6.55
CPU + Wait for CPU    CPU             7.48      2.90
row cache lock        Concurrency     3.31      1.29
  • 41. Background waits
Further review of the ASH report indicates that there were waits for background processes too.
There are a few enq: CF - contention waits. %Activity is 5.5, but that can be misleading.
Top Background Events
Event                 Event Class   % Activity   Avg Active Sessions
enq: CF - contention  Other         5.52         0.63
CPU + Wait for CPU    CPU           2.62         0.30
enq: TM - contention  Application   1.28         0.15
gc cr block busy      Cluster       1.27         0.15
  • 42. Review
User processes in nodes 3 and 4 are suffering from global buffer busy waits.
User processes in node 2 are suffering from buffer busy waits and log file sync waits.
Background processes in node 2 are suffering from CF enqueue waits and buffer busy waits.
If there are background processes waiting on lock contention, then that must be resolved first. Everything else could be just a symptom.
  • 43. gv$lock
INST ADDR KADDR SID TY ID1 ID2 LMODE REQ CTIME BLOCK
----- ---------------- ---------------- ---- -- ----- ----- ------ ----- -------- -----
4 0000001022E12398 0000001022E123F0 4368 CF 0 0 0 4 8 0
4 0000001022E12FE0 0000001022E13038 4369 CF 0 0 0 4 39 0
...
4 0000001022E15588 0000001022E155E0 4374 CF 0 0 0 4 34 0
4 0000001022E12BD0 0000001022E12C28 4375 CF 0 0 0 4 39 0
4 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 120592 2
4 0000001022E13E98 0000001022E13EF0 4388 CF 0 0 0 5 49 0
...
1 0000001022E12058 0000001022E120B0 4372 CF 0 0 0 4 41 0
1 0000001022E121F8 0000001022E12250 4373 CF 0 0 0 4 41 0
1 0000001022E12E40 0000001022E12E98 4374 CF 0 0 0 4 41 0
1 0000001022E133F0 0000001022E13448 4376 CF 0 0 0 4 41 0
1 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121783 2
3 0000001022E09BC8 0000001022E09C20 4134 CF 0 0 0 4 99 0
3 0000001022E15A68 0000001022E15AC0 4368 CF 0 0 0 4 39 0
3 0000001022E15658 0000001022E156B0 4369 CF 0 0 0 4 39 0
3 0000001022E15C08 0000001022E15C60 4370 CF 0 0 0 4 39 0
...
3 0000001022E154B8 0000001022E15510 4376 CF 0 0 0 4 39 0
3 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 120855 2
2 0000001022E15318 0000001022E15370 4368 CF 0 0 0 4 40 0
...
2 0000001022E14EF0 0000001022E14F48 4373 CF 0 0 0 5 81 0
2 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121231 2
38 rows selected.
Notice that no process is holding the CF lock in an incompatible mode.
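The observation that nothing holds the CF lock in an incompatible mode can be checked against the usual Oracle lock-mode compatibility rules. This is a minimal Python sketch, assuming the six-mode matrix documented for table locks also applies to the CF enqueue; the function name is_compatible is hypothetical, and the sample tuples are a subset of the gv$lock rows above.

```python
# Lock modes as reported in the LMODE and REQUEST columns of gv$lock:
# 1 = null, 2 = row share (SS), 3 = row exclusive (SX),
# 4 = share (S), 5 = share row exclusive (SSX), 6 = exclusive (X)
# COMPATIBLE[held] is the set of modes that can be granted alongside `held`.
# Assumption: the matrix documented for table locks applies to CF as well.
COMPATIBLE = {
    1: {1, 2, 3, 4, 5, 6},
    2: {1, 2, 3, 4, 5},
    3: {1, 2, 3},
    4: {1, 2, 4},
    5: {1, 2},
    6: {1},
}


def is_compatible(held, requested):
    """True if `requested` can be granted while `held` is in place."""
    return requested in COMPATIBLE[held]


# (inst, sid, lmode, request), a subset of the rows on the slide
ROWS = [
    (4, 4368, 0, 4),
    (4, 4387, 2, 0),
    (4, 4388, 0, 5),
    (1, 4387, 2, 0),
    (3, 4134, 0, 4),
    (3, 4387, 2, 0),
    (2, 4373, 0, 5),
    (2, 4387, 2, 0),
]

holders = [row for row in ROWS if row[2] > 0]
waiters = [row for row in ROWS if row[3] > 0]
conflicts = [
    (holder, waiter)
    for holder in holders
    for waiter in waiters
    if not is_compatible(holder[2], waiter[3])
]

print(f"{len(holders)} holders, {len(waiters)} waiters, {len(conflicts)} holder/waiter conflicts")
```

Under that assumption the held mode 2 conflicts with neither the mode 4 nor the mode 5 requests, so the holders alone do not explain the waits.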
  • 44. Locking scenario
[Diagram: hash bucket pointing to resource CF-0-0, with a holders' queue (locks 4387/1, 4387/2, 4387/3, 4387/4) and a waiters' queue (locks … 4134/3)]
The CF lock is held only in a compatible mode.
SID 4134 has the highest ctime, 99 seconds, and its state is WAITING. So, it is first in the waiters' queue. But why is it waiting?
  • 45. Problem continued..
Process is waiting for 283 seconds.
INST_ID ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK
---------- ---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
4 0000001022E12398 0000001022E123F0 4368 CF 0 0 0 4 193 0
4 0000001022E12FE0 0000001022E13038 4369 CF 0 0 0 4 224 0
4 0000001022E13CF8 0000001022E13D50 4370 CF 0 0 0 5 266 0
4 0000001022E0FD20 0000001022E0FD78 4371 CF 0 0 0 4 224 0
4 0000001022E12E40 0000001022E12E98 4372 CF 0 0 0 4 224 0
4 0000001022E126D8 0000001022E12730 4373 CF 0 0 0 4 224 0
4 0000001022E15588 0000001022E155E0 4374 CF 0 0 0 4 219 0
4 0000001022E12BD0 0000001022E12C28 4375 CF 0 0 0 4 224 0
4 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 120777 2
4 0000001022E13E98 0000001022E13EF0 4388 CF 0 0 0 5 234 0
2 0000001022E15318 0000001022E15370 4368 CF 0 0 0 4 224 0
2 0000001022E15CD8 0000001022E15D30 4369 CF 0 0 0 4 224 0
2 0000001022E14108 0000001022E14160 4370 CF 0 0 0 4 224 0
2 0000001022E15E90 0000001022E15EE8 4371 CF 0 0 0 4 223 0
2 0000001022E15B38 0000001022E15B90 4372 CF 0 0 0 4 224 0
2 0000001022E14EF0 0000001022E14F48 4373 CF 0 0 0 5 265 0
2 0000001022E154B8 0000001022E15510 4374 CF 0 0 0 4 224 0
2 0000001022E15DA8 0000001022E15E00 4375 CF 0 0 0 4 224 0
2 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121415 2
1 0000001022E13660 0000001022E136B8 4368 CF 0 0 0 4 225 0
1 0000001022E12128 0000001022E12180 4369 CF 0 0 0 4 225 0
1 0000001022E13250 0000001022E132A8 4370 CF 0 0 0 4 225 0
1 0000001022E10CA8 0000001022E10D00 4371 CF 0 0 0 4 249 0
1 0000001022E12058 0000001022E120B0 4372 CF 0 0 0 4 225 0
1 0000001022E121F8 0000001022E12250 4373 CF 0 0 0 4 225 0
1 0000001022E12E40 0000001022E12E98 4374 CF 0 0 0 4 225 0
1 0000001022E133F0 0000001022E13448 4376 CF 0 0 0 4 225 0
1 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121967 2
3 0000001022E09BC8 0000001022E09C20 4134 CF 0 0 0 4 283 0
3 0000001022E0EE68 0000001022E0EEC0 4190 CF 0 0 0 4 18 0
3 0000001022E15A68 0000001022E15AC0 4368 CF 0 0 0 4 223 0
3 0000001022E15658 0000001022E156B0 4369 CF 0 0 0 4 223 0
3 0000001022E15C08 0000001022E15C60 4370 CF 0 0 0 4 223 0
3 0000001022E13590 0000001022E135E8 4371 CF 0 0 0 4 238 0
3 0000001022E13F68 0000001022E13FC0 4372 CF 0 0 0 4 225 0
3 0000001022E15998 0000001022E159F0 4373 CF 0 0 0 4 223 0
3 0000001022E15318 0000001022E15370 4374 CF 0 0 0 4 223 0
3 0000001022E154B8 0000001022E15510 4376 CF 0 0 0 4 223 0
3 0000001022DFBD30 0000001022DFBD88 4387 CF 0 0 2 0 121039 2
  • 46. Pstack
Pstack 4134
#0 0x00000030364cb053 in __select_nocancel () from /lib64/
#1 0x0000000001d92111 in skgpnap ()
#2 0x000000000752d9b6 in ksliwat ()
#3 0x000000000752b668 in kslwait ()
....
#9 0x000000000753ed1b in ksqgtlctx ()
#10 0x000000000753db0b in ksqgelctx ()
#11 0x00000000076f5bb8 in kcc_get_enqueue ()
#12 0x00000000076f329b in kccocx ()
#13 0x00000000076f3140 in kccbcx ()
#14 0x0000000005563b0c in kcra_scan_redo ()
#15 0x000000000556334d in kcra_dump_redo ()
#16 0x0000000005561fcc in kcra_dump_redo_internal ()
The kcra_dump_redo calls are usually made when a process is dumping because of errors or exceptions.
Is there a process dumping errors?
  • 47. Alert log
At the same time, the alert log had entries for that session, 4134.
*** 2009-05-21 10:46:04.109
*** SESSION ID:(4134.4598) 2009-05-21 10:46:04.109
*** CLIENT ID:() 2009-05-21 10:46:04.109
*** SERVICE NAME:(PROD) 2009-05-21 10:46:04.109
*** MODULE NAME:() 2009-05-21 10:46:04.109
*** ACTION NAME:() 2009-05-21 10:46:04.109
Dump continued from file: /plogs/PROD/dump/diag/rdbms/prod/PROD3/trace/PROD3_ora_1004.trc
ORA-07445: exception encountered: core dump [ksxpmprp()+42] [SIGSEGV] [ADDR:0x14] [PC:0x33BB5BE] [Address not mapped to object] []
At the end of the trace file, the process was hung in 'PINNED BUFFER HISTORY'.
  • 48. That’s a bug!
Of course, that’s a bug we were encountering.
Bug 8318486: CF ENQUEUE CONTENTION WHILE DUMPING REDO RECORDS IN PINNED BUFFER HISTORY
As per the bug, the process requests CF locks but hangs until it is cleaned up by pmon.
The easy fix is to immediately kill the processes encountering ORA-7445 errors; the long term fix is to fix these bugs (both the ORA-7445 errors and bug 8318486).
  • 49. CF enqueue weirdness
INST_ID SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK
---------- ---- -- ---------- ---------- ---------- ---------- ---------- ----------
2 4387 CF 0 0 2 0 122115 2
1 4113 CF 0 4 4 0 77 0
1 4113 CF 0 0 4 0 77 2
1 4387 CF 0 0 2 0 122667 2
4 4387 CF 0 0 2 0 121476 2
3 4387 CF 0 0 2 0 121739 2
  • 50. Agenda
Issue: High Kernel mode CPU usage
Issue: Hung RAC cluster
Issue: High Kernel mode CPU usage - 2
Issue: ASMM (Skipped due to time constraints)
  • 51. Problem
High kernel mode CPU usage on a high-end server (a different client from discussion 1).
The database was upgraded from 9i to 10gR2 recently.
But the node had been running fine for a few weeks with no issues!
  • 52. Mpstat – per processor stats
Mpstat indicates that many CPUs are spending significant time in %sys mode.
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 25 0 107 294 6 1711 232 661 265 0 3702 33 40 0 26
1 21 0 454 205 9 1399 161 547 227 0 3422 44 33 0 23
2 23 0 91 146 5 1301 123 502 255 0 3503 44 34 0 22
3 123 0 1772 161 5 1383 127 544 264 0 3438 44 33 0 23
4 163 0 1651 144 5 1308 109 448 208 0 2984 45 31 0 23
...
37 94 0 1295 4652 4600 450 44 174 376 0 1554 51 43 0 5
38 418 0 217 128 24 1039 79 376 127 1 3307 49 32 0 20
39 41 0 1310 4904 4863 495 35 174 428 0 2218 51 40 0 9
64 4 0 45 171 18 887 121 340 299 0 5802 31 49 0 20
65 73 0 1188 148 9 1219 116 453 231 0 5418 38 42 0 21
66 171 0 5809 133 27 1247 78 452 220 0 4524 41 38 0 22
67 5 0 278 204 57 1583 110 567 254 0 4898 35 40 0 25
68 0 0 5 41 27 8 9 5 5 0 7 99 1 0 0
69 1 0 79 128 5 1465 87 495 279 0 6097 29 45 0 26
70 6 0 1173 4342 4277 789 63 275 653 0 5596 23 64 0 13
…
  • 53. AWR report
An AWR report for a 30-minute period showed nothing obvious.
Top 5 Timed Events
Event                      Waits       Time (s)   Avg wait (ms)   %Total Call Time   Wait Class
CPU time                               67,946                     48.1
db file sequential read    7,007,846   38,738     6               27.4               User I/O
gc buffer busy             1,705,205   12,142     7               8.6                Cluster
gc cr grant 2-way          2,804,825   6,538      2               4.6                Cluster
db file scattered read     761,202     6,330      8               4.5                User I/O
I/O wait times are not too abnormal.
We can theorize that high I/O can result in high kernel mode CPU usage. But this workload is quite normal for this node.
  • 54. Dtrace and Solaris 10
Dtrace is a great tool for root cause analysis and is available in Solaris 10.
Dtrace can be used to see which calls are executed by peeking at the CPUs.
For example, the following dtrace one-liner breaks down the system calls executing in the server.
dtrace -n 'syscall:::entry { @Calls[probefunc] = count(); }'
  • 55. Dtrace output
Dtrace output for kernel mode CPU usage shows that CPUs are spending time in mutexes. Not much help here.
unix`lock_set_spl_spin 1845 0.6%
unix`utl0 1873 0.6%
unix`atomic_add_32 2036 0.7%
unix`page_exists 2105 0.7%
unix`lock_set 2165 0.7%
SUNW,UltraSPARC-IV`send_mondo_set 2716 0.9%
genunix`fsflush 2765 0.9%
genunix`avl_walk 3622 1.2%
unix`disp_lowpri_cpu 3722 1.2%
unix`default_lock_delay 6790 2.2%
unix`kphysm_del_span_query 6790 2.2%
genunix`rm_assize 6830 2.2%
unix`_resume_from_idle 11758 3.8%
unix`mutex_enter 16469 5.4%
unix`disp_getwork 18809 6.1%
unix`mutex_delay_default 69892 22.7%
  • 56. Mpstat again
One interesting thing to notice in the mpstat output is that CPUs with high kernel mode usage also have a high number of xcalls.
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
...
453 33 1 615 76 6 1052 54 290 265 0 3243 38 23 0 39
454 0 0 35 58 6 807 37 227 197 0 3090 51 16 0 33
455 37 0 39 74 6 993 37 244 220 0 3812 36 18 0 46
480 0 0 105694 107 6 926 80 292 299 0 4319 32 35 0 32
481 51 0 214 81 6 842 59 255 233 0 3217 52 19 0 28
482 41 0 43 92 6 1105 64 325 318 0 3377 41 22 0 37
483 68 1 95373 104 6 1060 80 313 355 0 4238 33 28 0 39
484 0 0 23 56 6 746 36 231 156 0 2131 46 16 0 38
485 10 0 1931 64 6 703 43 193 151 0 3659 53 14 0 33
486 0 0 52 39 6 564 17 145 137 0 1513 17 16 0 67
487 1 1 420 30 6 225 14 68 44 0 778 85 5 0 10
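Spotting the xcall outliers by eye gets harder with hundreds of CPUs. A small Python sketch that parses the mpstat rows above and flags CPUs whose xcal count exceeds a cutoff; the 10,000 threshold and the helper name parse_mpstat are hypothetical choices for illustration, not values from the original analysis.

```python
MPSTAT_HEADER = "CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl"

# Rows copied from the mpstat output on the slide.
MPSTAT_ROWS = """\
453 33 1 615 76 6 1052 54 290 265 0 3243 38 23 0 39
454 0 0 35 58 6 807 37 227 197 0 3090 51 16 0 33
455 37 0 39 74 6 993 37 244 220 0 3812 36 18 0 46
480 0 0 105694 107 6 926 80 292 299 0 4319 32 35 0 32
481 51 0 214 81 6 842 59 255 233 0 3217 52 19 0 28
482 41 0 43 92 6 1105 64 325 318 0 3377 41 22 0 37
483 68 1 95373 104 6 1060 80 313 355 0 4238 33 28 0 39
484 0 0 23 56 6 746 36 231 156 0 2131 46 16 0 38
485 10 0 1931 64 6 703 43 193 151 0 3659 53 14 0 33
486 0 0 52 39 6 564 17 145 137 0 1513 17 16 0 67
487 1 1 420 30 6 225 14 68 44 0 778 85 5 0 10
"""

XCAL_THRESHOLD = 10_000  # hypothetical cutoff for "excessive" cross calls


def parse_mpstat(header, rows):
    """Turn whitespace-separated mpstat rows into one dict per CPU."""
    columns = header.split()
    return [
        dict(zip(columns, map(int, line.split())))
        for line in rows.splitlines()
        if line.strip()
    ]


stats = parse_mpstat(MPSTAT_HEADER, MPSTAT_ROWS)
busy = [s for s in stats if s["xcal"] > XCAL_THRESHOLD]
quiet = [s for s in stats if s["xcal"] <= XCAL_THRESHOLD]

for s in busy:
    print(f"CPU {s['CPU']}: xcal={s['xcal']} sys={s['sys']}%")
print(f"highest %sys among the other CPUs: {max(s['sys'] for s in quiet)}%")
```

In this sample the two flagged CPUs also show the highest %sys values, which is the correlation the slide points out.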
  • 57. xcalls
Cross calls are CPU-to-CPU interrupts used for memory consistency.
In a server with many memory boards and a huge SGA, cross calls are a necessary evil. But excessive and continuous cross calls are not optimal.
If many processes are accessing SGA buffers aggressively, that can lead to increased cross calls.
Increased cross calls = increased %sys mode CPU usage.
  • 58. AWR report again
So, excessive database activity can lead to higher xcalls and can show up as high kernel mode CPU usage.
Interestingly, there is just one SQL statement with very high elapsed and CPU time. One session can’t cause this issue!
Elapsed Time (s)   CPU Time (s)   Executions   Elap per Exec (s)   % Total DB Time   SQL Id
20,813             20,784         1            20812.8             14.7              d3zaxrb127axc
INSERT INTO ACCOUNTS_HISTORY ( SID, ACT, PARENT, LINEAGE_STRING, DESCRIPTION, LEVEL_NUMBER, PARENT_CHILD_FLAG) SELECT SESSIONID, FLEX_VALUE, ACT, PARENT, DESCR, :B1, RANGE_ATTRIBUTE FROM ACCOUNT…
  • 59. But..
AWR reports do not show SQL statements if the statements are still executing!
So, we decided to review a report covering a few hours instead of the 30-minute AWR report.
32.1% of DB time spent on one SQL statement? That can cause problems.
Elapsed Time (s)   CPU Time (s)   Executions   Elap per Exec (s)   % Total DB Time   SQL Id
60,400             61,156         22           2745.5              32.1              d3zaxrb127axc
INSERT INTO ACCOUNTS_HISTORY ( SID, ACT, PARENT, LINEAGE_STRING, DESCRIPTION, LEVEL_NUMBER, PARENT_CHILD_FLAG) SELECT SESSIONID, FLEX_VALUE, ACT, PARENT, DESCR, :B1, RANGE_ATTRIBUTE FROM ACCOUNT…
  • 60. If you don’t succeed first, try again?
Apparently, a report was not completing in time. The front end program timed out, and users submitted more reports out of frustration.
The client has an alert that fires if many sessions are executing the same SQL statement for a prolonged period.
Unfortunately, these sessions had no sql_id populated in v$session, and so the alert failed.
  • 61. Opened cursors?
But v$open_cursor shows that there are many sessions with opened cursors on that sql_id.
select sid, serial#, module, osuser from v$session where sid in ( select sid from v$open_cursor where sql_id='d3zaxrb127axc' );
SID SERIAL# MODULE OSUSER
---------- ---------- ------------------------------------------------ -----------
10365 65389 prod
10162 31712 prod
9979 13788 prod
10763 1819 prod
10007 46806 prod
9576 33605 prod
...
40 rows selected.
  • 62. Execution plan
The merge join cartesian at step 2 caused the issue.
The cardinality estimate at step 3 is 1, and so the CBO chose a cartesian join at step 2.
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------
| 1 | NESTED LOOPS | | 1 | | | |
| 2 | MERGE JOIN CARTESIAN | | 1 | | | |
| 3 | NESTED LOOPS | | 1 | | | |
| 4 | NESTED LOOPS | | 1 | | | |
| 5 | NESTED LOOPS | | 1 | | | |
| 6 | NESTED LOOPS | | 1 | | | |
| 7 | NESTED LOOPS | | 1 | | | |
| 8 | NESTED LOOPS | | 1 | | | |
| 9 | NESTED LOOPS | | 1 | | | |
|* 10 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | |
|* 11 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | |
|* 12 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | |
|* 13 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | |
| 14 | TABLE ACCESS BY INDEX ROWID| VALUE_SETS | 1 | | | |
|* 15 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | |
|* 16 | INDEX UNIQUE SCAN | VALUE_SETS_U2 | 1 | | | |
| 17 | TABLE ACCESS BY INDEX ROWID | ACCT_HIER_ALL | 1 | | | |
|* 18 | INDEX RANGE SCAN | ACCT_HIER_N1 | 1 | | | |
|* 19 | INDEX RANGE SCAN | FLEX_VALUE_NM_HIER_U1 | 1 | | | |
| 20 | BUFFER SORT | | 97323 | 67M| 2842K| 59M (0)|
|* 21 | TABLE ACCESS FULL | FLEX_VALUES_TL | 97323 | | | |
|* 22 | TABLE ACCESS BY INDEX ROWID | FLEX_VALUES | 1 | | | |
|* 23 | INDEX UNIQUE SCAN | FLEX_VALUES_U1 | 1 | | | |
-------------------------------------------------------------------------------------------------
  • 63. Solution
The cardinality estimates at steps 17 and 18 were totally incorrect.
An index had been added to the table. From 10g onwards, compute statistics is the default when an index is created. This threw away the histograms on the indexed columns, leading to incorrect cardinality calculations.
Recollecting stats with proper histograms completely resolved the issue.
  • 64. Agenda
Issue: High Kernel mode CPU usage
Issue: Hung RAC cluster
Issue: High Kernel mode CPU usage - 2
Issue: ASMM (Skipped due to time constraints)
  • 65. Problem
Problem: The application was very slow. The database was recently upgraded to 10g. Management was quite unhappy and blamed it on 10g. Even a rollback to 9i was on the table as an option.
The following report covers a 3-minute period.
Top 5 Timed Events
Event                                       Waits    Time (s)   Avg wait (ms)   %Total Call Time
db file sequential read                     96,535   704        7               30.1
SGA: allocation forcing component growth    50,809   498        10              21.3
library cache pin                           180      219        1218            9.4
latch: shared pool                          2,767    217        78              9.3
log file switch completion                  225      216        958             9.2
  • 66. Statspack report
The top 5 timed events show a high amount of SGA activity, waits for library cache pin, latch contention and so on.
The application opens new connections if it “detects” that SQL is hung.
Top 5 Timed Events
Event                                       Waits    Time (s)   Avg wait (ms)   %Total Call Time
db file sequential read                     96,535   704        7               30.1
SGA: allocation forcing component growth    50,809   498        10              21.3
library cache pin                           180      219        1218            9.4
latch: shared pool                          2,767    217        78              9.3
log file switch completion                  225      216        958             9.2
  • 67. SGA resize ?
'SGA: allocation forcing component growth' gives a clue that SGA re-alignment is occurring.
Looking at the SGA area section of statspack: in this snapshot, the buffer cache shrank by 32 MB and the shared pool grew by 32 MB.
Snap Id   Cache          Prior Size (MB)   New Size (MB)   Difference (MB)
181       Buffer Cache   1,376             1,344           -32
          Shared Pool    288               320             32
  • 68. Buffer_cache
Plotting the buffer_cache size from the statspack table shows that the buffer cache underwent constant reorganization.
select * from perfstat.stats$sgastat where name='buffer_cache' order by snap_id;
[Chart: Buffer_cache size per snapshot (Row 1 to Row 21), y-axis from 1340000000 to 1480000000]
  • 69. Shared_pool
Plotting heaps with the KGH: NO ACCESS tag shows turbulent reorganization.
select * from perfstat.stats$sgastat where name='KGH: NO ACCESS' order by snap_id
[Chart: KGH: NO ACCESS size per snapshot (Row 3 to Row 22), y-axis from 0 to 700000000]
  • 70. Review
ASMM algorithms were detecting a need for more buffer cache memory. This forced de-allocation from the shared pool to the buffer cache.
This created more library cache latching and parsing issues. So, the algorithm detected a need for more shared pool memory.
The algorithm then de-allocated memory from the buffer cache and allocated it to the shared pool, inducing artificial disk reads.
This vicious cycle was continuous and caused the performance issues.
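The back-and-forth described above can be quantified by counting how often a component's size changes direction between snapshots. This is a minimal Python sketch; the function name count_reversals is hypothetical, and the size series are made-up illustration values (only the 1376 to 1344 step comes from the statspack excerpt).

```python
def count_reversals(sizes):
    """Count how many times a size series switches between growing and shrinking."""
    # Ignore snapshots where the size did not change.
    deltas = [b - a for a, b in zip(sizes, sizes[1:]) if b != a]
    return sum(1 for d1, d2 in zip(deltas, deltas[1:]) if (d1 > 0) != (d2 > 0))


# Hypothetical buffer cache sizes in MB, one value per snapshot.
# Only the first step (1376 -> 1344) is taken from the statspack report.
thrashing_mb = [1376, 1344, 1376, 1408, 1376, 1344, 1376]
steady_mb = [1376, 1376, 1408, 1408]

print("thrashing series reversals:", count_reversals(thrashing_mb))
print("steady series reversals:", count_reversals(steady_mb))
```

A series that grows once and stays put yields zero reversals, while a component that is repeatedly grown and shrunk yields a count that climbs with every cycle.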
  • 71. ASMM
In summary, constant reorganization of SGA areas caused this issue.
The DBA had set up sga_target and commented out all other memory parameters.
Memory was allocated to and deallocated from the shared pool constantly.
V$sga_resize_ops showed constant reorganization of memory areas.
  • 72. Solution
There are many ways this can be resolved:
Disabling ASMM completely.
Providing a minimum size for all SGA components and leaving a little bit for ASMM to resize automatically.
Increasing the underscore parameter _memory_broker_stat_interval to a higher value, such as 3600 or more.
Or applying a few patches.
  • 73. References
Oracle support site. Various documents
Internals guru Steve Adams’ website
Jonathan Lewis’ website
Julian Dyke’s website
‘Oracle8i Internal Services for Waits, Latches, Locks, and Memory’ by Steve Adams
Tom Kyte’s website
Blog: