Red Hat Bugzilla – Bug 208419
SPECsfs NFS V3 workload on RHEL4 (optimized and instrumented with lockmeter) using TCP/IP, EXT3 shows major lock contention in sunrpc code
Last modified: 2008-02-05 10:05:51 EST
+++ This bug was initially created as a clone of Bug #208327 +++
This bug almost certainly exists in all of RHEL4. Having seen the same system
time and OProfile data in runs on RHEL4 (U3 and U4), with virtually the same
hot functions, and having looked briefly at the RHEL4 sunrpc sources, I'm
reasonably comfortable saying it's here too.
Description of problem:
When running SPECsfs on a fully optimized RHEL5 beta1 based kernel
(RHEL5-2.6.17-1.2519.4.21.el5.f2.opt, created by Don Zickus) instrumented
with lockmeter (latest patches from HP), profiling shows excessive CPU cycles
in the sunrpc routines. Lockmeter shows excessive missing and spinning. Below
is a snippet of the lockstat tool output at its highest contention.
System: Linux bigi.hpperf.rdu.redhat.com 2.6.17-1.2519.4.21el5.f2.opt.lockmeter2 #1 SMP Fri Sep 22 02:32:12 EDT 2006 x86_64
All (8) CPUs
Start time: Fri Sep 22 06:59:26 2006
End time: Fri Sep 22 07:00:26 2006
Symbols from: /proc/kallsyms
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
SPINLOCKS           HOLD               WAIT
  UTIL    CON    MEAN(  MAX )   MEAN(  MAX )(% CPU)      TOTAL NOWAIT   SPIN  RJECT  NAME
         7.5%   1.3us( 198ms)   11us(  60ms)(23.9%)  136775128  92.5%   7.5%  0.00%
 45.2%  45.5%   1.4us(  83us)   12us( 322us)(21.9%)   19017532  54.5%  45.5%     0%
  8.1%  48.2%   1.8us(  59us)   12us( 236us)( 3.2%)    2732401  51.8%  48.2%     0%
  3.9%  62.0%   0.9us(  67us)   12us( 278us)( 4.1%)    2644539  38.0%  62.0%     0%
  1.7%  44.3%   0.7us(  46us)   11us( 232us)( 1.5%)    1433024  55.7%  44.3%     0%
  1.5%  61.3%   0.2us(  52us)   12us( 279us)( 6.6%)    4262578  38.7%  61.3%     0%
 29.1%  40.1%   3.4us(  83us)   13us( 322us)( 5.5%)    5212589  59.9%  40.1%     0%
 0.75%  13.4%   0.2us(  40us)   13us( 262us)(0.96%)    2732401  86.6%  13.4%     0%
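As a quick sanity check on the numbers above, here is a minimal Python sketch
that parses the rows and cross-checks them. It is not part of lockmeter or
lockstat; the regular expression and the helper names (parse_row, contended)
are made up for this illustration, and the row layout is inferred from the
column headers. It treats the 45.2% UTIL row as one lock's aggregate and the
six rows beneath it as a breakdown of that lock. That reading is an inference
from the totals, which add up exactly; the NAME column that would confirm it
is not present in the snippet.

```python
import re

# Rows copied verbatim from the lockstat snippet (the NAME column is absent).
LOCK_ROW = "45.2% 45.5% 1.4us( 83us) 12us( 322us)(21.9%) 19017532 54.5% 45.5% 0%"
CALLER_ROWS = [
    "8.1% 48.2% 1.8us( 59us) 12us( 236us)( 3.2%) 2732401 51.8% 48.2% 0%",
    "3.9% 62.0% 0.9us( 67us) 12us( 278us)( 4.1%) 2644539 38.0% 62.0% 0%",
    "1.7% 44.3% 0.7us( 46us) 11us( 232us)( 1.5%) 1433024 55.7% 44.3% 0%",
    "1.5% 61.3% 0.2us( 52us) 12us( 279us)( 6.6%) 4262578 38.7% 61.3% 0%",
    "29.1% 40.1% 3.4us( 83us) 13us( 322us)( 5.5%) 5212589 59.9% 40.1% 0%",
    "0.75% 13.4% 0.2us( 40us) 13us( 262us)(0.96%) 2732401 86.6% 13.4% 0%",
]

# Assumed layout: UTIL CON HOLD-MEAN(MAX) WAIT-MEAN(MAX)(%CPU) TOTAL NOWAIT SPIN RJECT
ROW_RE = re.compile(
    r"^(?P<util>[\d.]+)%\s+(?P<con>[\d.]+)%\s+"
    r"(?P<hold_mean>[\d.]+)us\(\s*(?P<hold_max>[\d.]+)us\)\s+"
    r"(?P<wait_mean>[\d.]+)us\(\s*(?P<wait_max>[\d.]+)us\)"
    r"\(\s*(?P<wait_cpu>[\d.]+)%\)\s+"
    r"(?P<total>\d+)\s+(?P<nowait>[\d.]+)%\s+(?P<spin>[\d.]+)%\s+(?P<rject>[\d.]+)%$"
)


def parse_row(line):
    """Turn one lockstat row into a dict of numbers (times in microseconds)."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized lockstat row: %r" % line)
    fields = {key: float(value) for key, value in match.groupdict().items()}
    fields["total"] = int(match.group("total"))  # acquisition count is an integer
    return fields


def contended(row):
    """Approximate number of acquisitions that had to spin (TOTAL * SPIN%)."""
    return round(row["total"] * row["spin"] / 100)


lock = parse_row(LOCK_ROW)
callers = [parse_row(row) for row in CALLER_ROWS]
caller_total = sum(caller["total"] for caller in callers)

if __name__ == "__main__":
    print("aggregate total:     ", lock["total"])
    print("sum of rows beneath: ", caller_total)
    print("contended (approx.): ", contended(lock))
    for caller in callers:
        # NOWAIT + SPIN + RJECT should account for every acquisition.
        print(caller["total"], caller["nowait"] + caller["spin"] + caller["rject"])
```

Read this way, nearly half of all acquisitions of this one lock had to spin,
each waiting about 12us on average.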
Looking at spinlock calls in the hot functions above, there appears to be a
single lock in the struct svc_serv (sv_lock).
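To illustrate what a single shared lock means for the NOWAIT and SPIN columns,
here is a small user-space analogy in Python. It is not the kernel's sunrpc
code: CountingLock, run and the "queued" counter are hypothetical names, and
Python's threading.Lock blocks rather than spins. The only point is that every
worker funnels through one lock, and each acquisition either succeeds on the
first try (NOWAIT) or has to wait for another holder.

```python
import threading


class CountingLock:
    """A mutex that records whether each acquire succeeded immediately."""

    def __init__(self):
        self._lock = threading.Lock()
        self.nowait = 0  # acquisitions that succeeded on the first try
        self.waited = 0  # acquisitions that found the lock already held

    def __enter__(self):
        first_try = self._lock.acquire(blocking=False)
        if not first_try:
            self._lock.acquire()  # fall back to a blocking acquire
        # The lock is held here, so updating the counters is race-free.
        if first_try:
            self.nowait += 1
        else:
            self.waited += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def run(num_threads=8, iterations=2000):
    """Have every worker update shared state under the same single lock."""
    lock = CountingLock()
    shared = {"queued": 0}

    def worker():
        for _ in range(iterations):
            with lock:  # all workers serialize here, as on one server-wide lock
                shared["queued"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return lock, shared["queued"]


if __name__ == "__main__":
    lock, queued = run()
    total = lock.nowait + lock.waited
    print("acquisitions:", total, "nowait:", lock.nowait, "waited:", lock.waited)
```

The split between nowait and waited varies from run to run and depends on the
scheduler, so only the total is predictable.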
The server system is an HP DL580 4 socket Xeon Extreme with HT disabled,
yielding 8 logical CPUs. It contains 16GB of RAM and 4 NICs, all doing jumbo
frames to 4 clients on private VLANs. There are 4 HP MSA1000s directly
connected to two dual-port QLogic FC adapters.
The benchmark has 64 processes per client communicating evenly to 16 EXT3
filesystems presented by the server. Filesystem options include -J size=4.
Extensive profiling data has been gathered and can be uploaded. Note that
there is a LARGE LOG FILE for each run.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Run the benchmark on an optimized kernel patched for lockmeter
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update release.
Development Management has reviewed and declined this request. You may appeal
this decision by reopening this request.