Unexpected High CPU Usage
with Weblogic Server
On Solaris
On Solaris
- Run the
prstat
UNIX command on the Java process.
$ prstat -L -p 9499 1 1
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/LWPID 9499 usera 153M 100M sleep 58 0 0:00.22 0.6% java/8 9499 usera 153M 100M sleep 58 0 0:00.10 0.2% java/10 9499 usera 153M 100M sleep 58 0 0:00.11 0.1% java/9 9499 usera 153M 100M sleep 58 0 0:00.03 0.0% java/5 9499 usera 153M 100M sleep 58 0 0:01.01 0.0% java/1 9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/12 9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/11 9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/14 9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/13 9499 usera 153M 100M sleep 59 0 0:00.07 0.0% java/7 9499 usera 153M 100M sleep 59 0 0:00.00 0.0% java/6 9499 usera 153M 100M sleep 59 0 0:00.00 0.0% java/4 9499 usera 153M 100M sleep 58 0 0:00.11 0.0% java/3 9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/2 |
- Run the
pstack
command.
Example:
pstack 9499
and redirect the output to a file.
If you use regular thread libraries
in Solaris (i.e., you do not have
/usr/lib/lwp
in the LD_LIBRARY_PATH
),
a LWP is not mapped to OS thread directly, so you have to take pstack
from process (so check to see if you are using the alternate thread library).
The example above shows the
"java/8" process at the top of
prstat
.
- Examine the
pstack
output for "lwp# 8".
You would find that this maps to
"thread# 76" from the
pstack
output, as below.
----------------- lwp# 8 /
thread# 76 --------------------
ff29d190 poll (e2e81548, 0, bb8) ff24d154 select (0, 0, 0, e2e81548, ff2bf1b4, e2e81548) + 348 ff36b134 select (0, bb8, 7fffffff, fe4c8000, 0, bb8) + 34 fe0f62e4 __1cCosFsleep6FpnGThread_xl_i_ (0, bb8, fe4c8000, 1, 0, 1e2fd8) + 234 fe23f050 JVM_Sleep (2, 0, bb8, fe4de978, fe4c8000, 1e2fd8) + 22c 0008f7ac ???????? (e2e818d4, bb8, 1e2fd8, 984a4, 0, 109a0) 0008c914 ???????? (e2e8194c, 1, fe4d6a80, 98564, 8, e2e81868) fe5324e8 __1cMStubRoutinesG_code1_ (e2e819d8, e2e81c10, a, f6cb5000, 4, e2e818f0) + 3ec fe0cbe94 __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (e2e81c08,fe4c8000, e2e81b54, 1e2fd8, 8e764, e2e81c10) +308 fe1f6dbc __1cJJavaCallsMcall_virtual6FpnJJavaValue_nLKlassHandle_nMsymbolHandlee81c08, e2e81b54) + 150pnGThread__v_(f6cb64b8, e2e81b40, e2e81b44, fe4c8000, e2d8) + 60e_5pnGThread__v_ (e2e81c08, e2e81c04, e2e81c00,e2e81bf4, e2e81bec, 1e2f8000, e2e81d10, 1e, e) + 120FpnKJavaThread_pnGThread__v_ (f6817ff8, 1e2fd8, fe4c 7fd70) + 3d8cKJavaThreadDrun6M_v_ (e2e02000, fe4d3e34, fe4c8000, 7fd70, 1e2fd8, fe213ec8 _start (fe4c8000, fe625d10, 0, 5, 1, fe401000) + 20 ff36b728 _thread_start (1e2fd8, 0, 0, 0, 0, 0) + 40 |
- Get a thread dump of the server by doing:
kill -3 <PID>
on the Java Process. - Since lwp# 8 mapped to thread #76, you can calculate the hex value of 76 to be 4c.
This maps to nid=0x4c in the JVM
thread dump:
"Thread-6" prio=5 tid=0x1e2fd8
nid=0x4c waiting on monitor [0xe2e81000..0xe2e819d8]
at java.lang.Thread.sleep(Native Method) at weblogic.management.deploy.GenericAppPoller.run(GenericAppPoller.java:139) |
In this example, the thread using
the most CPU is actually sleeping. The application poller runs when the
server is started in
Development Mode
.
It runs every 30 seconds, so it is apparent that the thread dump was not
captured in time to see activity in this thread.
Ideally, all three steps should be
done quickly and in a row to capture the data as close together in time as
possible. This can be done via a simple shell script like below:
#
# Takes an argument (PID of the WLS process) and loops three times. This will append the prstat information to a file called dump_high_cpu.txt. The thread dump information will either be in file where stdout was redirected or printed on the screen. # for loopnum in 1 2 3 do prstat -L -p $1 1 1 >> dump_high_cpu.txt pstack $1 >> dump_high_cpu.txt kill -3 $1 echo "prstat, pstack, and thread dump done. #" $loopnum sleep 1 echo "Done sleeping." done |
- Then you can then examine that thread to determine what it is doing and if there is a problem.
No comments:
Post a Comment