Bug 693838

Summary: Severe GFS2 performance regression between 2.6.18-164.2.1 and 2.6.18-194.17.1
Product: Red Hat Enterprise Linux 5
Reporter: Justin I. Nevill <jnevill>
Component: kernel
Assignee: Robert Peterson <rpeterso>
Status: CLOSED DUPLICATE
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Priority: unspecified
Version: 5.5
CC: mdimaio, rpeterso, rwheeler
Target Milestone: rc
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-04-06 17:17:46 UTC

Description Justin I. Nevill 2011-04-05 17:28:57 UTC
Description of problem:
Customer workload (IBM Information Server application suite) suffered severe performance loss when upgrading from kernel 2.6.18-164.2.1 to 2.6.18-194.17.1.

Upgrading to 2.6.18-238.1.1 results in the same performance issue as in 2.6.18-194.17.1.

Version-Release number of selected component (if applicable):
2.6.18-194.17.1

How reproducible:
Without fail.

Steps to Reproduce:
1. Launch multiple instances of the IBM IS application (single instances result in no performance difference between kernel versions).
2. Measure application job startup times and throughput (a generic timing-harness sketch follows).
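
For illustration only, a minimal harness of the kind that could time this, assuming a job launchable from the command line (the figures below come from the IBM IS application's own job reporting, not from this hypothetical script):

#!/usr/bin/env python
# Hypothetical timing harness: run N concurrent copies of a
# command and report the total wall-clock time, to compare
# single-instance vs. concurrent behavior across kernels.
import subprocess
import sys
import time

def run_jobs(cmd, count):
    start = time.time()
    # Launch all copies first so they actually run concurrently.
    procs = [subprocess.Popen(cmd, shell=True,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
             for _ in range(count)]
    for p in procs:
        p.wait()
    return time.time() - start

if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: concurrency_timer.py '<job command>'")
    for n in (1, 4, 8):  # single instance vs. concurrent runs
        print("%d concurrent: %.2fs total" % (n, run_jobs(sys.argv[1], n)))
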
Actual results:
* 2.6.18-164.2.1
job_name: startup_time / production_time / total_time / transactions
e_V_P: 0:02 / 0:00 / 0:16 / 1844
s_G_A_f_ST: 0:03 / 2:46 / 3:09 / 922664
s_G_AT_f_ST: 0:02 / 0:00 / 0:20 / 0
E_R_T_C: 0:02 / 0:01 / 0:12 / 27

* 2.6.18-238.1.1
job_name: startup_time / production_time / total_time / transactions
e_V_P: 0:16 / 0:11 / 0:02:37 / 1844
s_G_A_f_ST: 0:22 / 2:53 / 0:05:59 / 922664
s_G_AT_f_ST: 0:40 / 0:09 / 0:03:04 / 0
E_R_T_C: 0:26 / 0:03 / 0:02:07 / 27

You can see total_time went up dramatically, with startup_time (setting up the job) being a significant factor.

Expected results:
Same performance between kernel versions.

Additional info:
When analyzing GFS2 lockdumps, the "llcheckpriv" process stood out as causing other processes to hang waiting for its locks. "llcheckpriv" is part of IBM Tivoli Workload Scheduler LoadLeveler (see http://www.scc.acad.bg/documentation/a2278827.pdf). It creates and opens the job output files under /opt/IBM/InformationServer/, and the worker threads then write to them. This seems like a useful area of focus (what specific workload does llcheckpriv generate that is affected by patches between these kernel versions?), since job startup time is one of the main areas of degradation in the reproducer.
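
For anyone repeating the lockdump analysis, here is a minimal sketch of the kind of script used to spot blocking glocks. It assumes the debugfs-style dump format ("G:" lines for glocks, "H:" lines for holders, with f:W flagging a queued/waiting request); the exact fields vary by kernel version, so adjust the regex to match your dump:

#!/usr/bin/env python
# Hypothetical helper (not Red Hat tooling): scan a saved GFS2
# glock dump and report glocks that have both a granted holder
# and queued waiters, printing the owning pid/command for each.
import re
import sys

# Matches holder lines such as:
#   H: s:EX f:W e:0 p:4467 [llcheckpriv] gfs2_open+0x... [gfs2]
HOLDER_RE = re.compile(r"^\s*H: s:(\S+) f:(\S+) e:\d+ p:(\d+) \[([^\]]+)\]")

def report(glock, holders, waiters):
    print(glock)
    for pid, comm, state in holders:
        print("  held by pid %s (%s), state %s" % (pid, comm, state))
    for pid, comm, state in waiters:
        print("  pid %s (%s) waiting for %s" % (pid, comm, state))

def scan(path):
    glock, holders, waiters = None, [], []
    with open(path) as dump:
        for line in dump:
            if line.startswith("G:"):
                if glock and holders and waiters:
                    report(glock, holders, waiters)
                glock, holders, waiters = line.strip(), [], []
                continue
            m = HOLDER_RE.match(line)
            if m:
                state, flags, pid, comm = m.groups()
                # "W" in the holder flags means the request is still
                # queued, i.e. waiting behind the granted holder(s).
                (waiters if "W" in flags else holders).append((pid, comm, state))
    if glock and holders and waiters:
        report(glock, holders, waiters)

if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: glock_waiters.py <glock-dump-file>")
    scan(sys.argv[1])

A glock repeatedly reported with llcheckpriv as the holder and a long waiter list would corroborate the analysis above.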

This is urgent because:
1) The customer cannot use 2.6.18-164.2.1 due to BZ539240 causing random reboots of the cluster nodes. Upgrading to 2.6.18-194.17.1 resolved the reboots but introduced the performance issue.
2) This is a major project with high-level management visibility, affecting five 4-node clusters.

Comment 1 Robert Peterson 2011-04-05 18:00:05 UTC
I recommend the customer try the kernel modules located
on my people page:

http://people.redhat.com/rpeterso/Experimental/RHEL5.x/gfs2/kernel-2.6.18-248.el5.bz656032.x86_64.rpm

This kernel contains a number of fixes that may improve
their performance, the most notable of which is the patch
to DLM that sets the TCP_NODELAY socket option by default.
Let me know how this goes.  There is still one outstanding
issue with this kernel, tracked in bug #690555, but it
occurs rarely.  I'll set the NEEDINFO bit until I hear
whether this kernel helps.
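
For context: TCP_NODELAY turns off Nagle's algorithm so
that small messages, such as DLM lock requests, are sent
immediately instead of being held back for coalescing.
The real fix is in the kernel DLM code; in userspace
terms it amounts to the standard setsockopt call shown
in this purely illustrative Python sketch:

import socket

def make_nodelay_socket():
    """Return a TCP socket with Nagle's algorithm disabled."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 disables Nagle: small writes go out at
    # once rather than waiting to be coalesced with later
    # data, trading a little bandwidth for lower latency.
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s

if __name__ == "__main__":
    s = make_nodelay_socket()
    print("TCP_NODELAY = %d"
          % s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()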

Comment 2 Justin I. Nevill 2011-04-06 01:40:52 UTC
Robert,

I really appreciate your quick response on this! We got the necessary approvals done at the customer site this afternoon, and the test kernel resolved the issue entirely! It's even faster than in the original (-164) testing:

* 2.6.18-248.bz656032
job_name: startup_time / production_time / total_time / transactions
e_V_P: 0:02 / 0:00 / 0:00:12 / 1844
s_G_A_f_ST: 0:03 / 2:35 / 0:02:50 / 922664
s_G_AT_f_ST: 0:02 / 0:00 / 0:00:12 / 0
E_R_T_C: 0:02 / 0:00 / 0:00:07 / 27

The customer wants a hotfix ASAP and a backport to 5.5. I'll read through bug #690555 in the morning to see whether there's any concern with not having it included, and I'll get those requests in tomorrow as well. Should I file them against this BZ, or is there another bug that this one duplicates and that produced the test kernel RPM you linked?

Thanks,
Justin

Comment 3 Robert Peterson 2011-04-06 17:17:46 UTC
Since that kernel fixed their performance issue, and since
that kernel contains patches for a number of different bugs,
the problem they reported could stem from any of them:

1. bug #690239 - gfs2: creating large files suddenly slow to a crawl
2. bug #604139 - flock performance with DLM in RHEL 5.5
3. bug #650494 - Bouncing locks in a cluster is slow in GFS2
4. bug #656032 - GFS2 filesystem hang caused by incorrect lock order

It's hard to say which case is causing their specific problem.
I'm guessing it's really #2, bug #604139, but I have no proof.
The problem goes well beyond flocks.  We could use the process
of elimination to figure out which problem is their primary one,
but it seems hardly worth the time it would take, since they're
going to want the fixes for all four bugs anyway.  I'm going to
mark this as a duplicate of bug #604139, since it best fits
that symptom.

*** This bug has been marked as a duplicate of bug 604139 ***