Bug 856532 - Huge CPU load caused by httpd workers
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: mod_auth_kerb
Version: 5.8
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Web Stack Team
QA Contact: BaseOS QE Security Team
 
Reported: 2012-09-12 05:04 EDT by Vojtech Juranek
Modified: 2013-03-13 13:02 EDT
Doc Type: Bug Fix
Last Closed: 2013-03-13 13:02:53 EDT
Type: Bug


Description Vojtech Juranek 2012-09-12 05:04:41 EDT
Description of problem:
After switching to Kerberos authentication (via mod_auth_kerb), we observed huge CPU load caused by httpd workers (example output from top is below). It occurs at random but quite often, and httpd has to be restarted to clear it. There is probably a deadlock, as strace shows the process waiting on a futex:

futex(0x43a339d0, FUTEX_WAIT, 6054, NULL

Not sure whether it's related, but we found the following error messages in the log:

[Wed Sep 12 03:26:23 2012] [error] (120006)APR does not understand this error code: proxy: read response failed from 127.0.0.1:8009 (localhost)
[Wed Sep 12 03:27:45 2012] [error] ajp_read_header: ajp_ilink_receive failed
[Wed Sep 12 03:27:45 2012] [error] (120006)APR does not understand this error code: proxy: read response failed from 127.0.0.1:8009 (localhost)
[Wed Sep 12 03:36:43 2012] [error] [client 10.34.3.225] krb5_get_init_creds_password() failed: KDC reply did not match expectations, referer: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/soa-6.0/30/

The problem appeared once we started to use mod_auth_kerb, so it is very likely a bug in mod_auth_kerb (and may be related to the bugfix for BZ #734098).

Version-Release number of selected component (if applicable):
httpd-2.2.3-65.el5_8
mod_auth_kerb-5.1-3.el5_7.1

How reproducible:
Appears often, but randomly

Steps to Reproduce:
Cannot reproduce reliably
  
Actual results:
Huge CPU load caused by httpd workers

Expected results:
httpd workers don't consume a lot of CPU

Additional info:
Example of CPU load from top:
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
 6050 apache    15   0  730m  33m 4116 S 506.5  0.1   4817:30 httpd.worker
16956 apache    18   0  730m  32m 4104 S 383.9  0.1   3513:04 httpd.worker
23840 apache    16   0  730m  35m 4100 S 327.6  0.1   3514:37 httpd.worker
 8926 apache    15   0  646m  32m 4100 S 248.1  0.1   2618:09 httpd.worker
 6052 apache    15   0  690m  33m 4116 S  84.1  0.1 874:47.38 httpd.worker
Comment 1 Joe Orton 2012-09-27 09:45:29 EDT
mod_auth_kerb-5.1-3.el5_7.1 should include the fix for the known threading problem.

Can you get a backtrace from the thread which is hung, or the strace output from one of the workers consuming lots of CPU time?
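
One possible way to gather both pieces of information is sketched below in Python. It only builds and runs standard `gdb` and `strace` command lines. The helper names and the output path are hypothetical, and the sketch assumes `gdb` and `strace` are installed and that it runs with enough privileges to attach to the worker. Whether `strace -f -p` attaches to every thread of an already running process depends on the strace version, so treat that part as an assumption to verify on RHEL 5.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Command line that prints a backtrace of every thread in `pid` and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def strace_command(pid, outfile):
    """Command line that traces `pid`, following threads, with timestamps.

    Assumption: the installed strace attaches to all threads of the process
    when -f is combined with -p; older versions may trace only one thread.
    """
    return ["strace", "-f", "-tt", "-p", str(pid), "-o", outfile]


def run_and_capture(cmd):
    """Run `cmd` and return its combined stdout/stderr as text."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return out.decode("utf-8", "replace")
```

For example, `print(run_and_capture(gdb_backtrace_command(6050)))` would dump the stacks of the worker with PID 6050 from the top listing. Installing the matching debuginfo packages first should make the frames easier to read.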
Comment 2 Joe Orton 2012-09-27 09:47:16 EDT
Also, it would be useful to know whether switching httpd to prefork solves the problem.
Comment 3 Vojtech Juranek 2012-10-04 05:02:55 EDT
Hi,
this is what I got from eng-ops:

[root@jenkins ~]# strace -p 22409 
Process 22409 attached - interrupt to quit

futex(0x47b3a9d0, FUTEX_WAIT, 22419, NULL
22409.pid (END)

Any idea how to get more detailed information about what is happening there?
Thanks
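
A note on the trace above: `strace -p` on the worker's main PID follows only that one thread, and that thread may simply be waiting on the others, so the futex wait is not necessarily where the CPU time goes. A rough sketch for finding the busy threads is to sample the per-thread CPU counters in `/proc/<pid>/task/<tid>/stat` twice and compare them. The function names and the `proc_root` parameter are hypothetical, and the field positions assume the Linux proc(5) stat layout (utime and stime are the 14th and 15th fields).

```python
import os
import time


def thread_cpu_ticks(pid, proc_root="/proc"):
    """Return {tid: utime + stime in clock ticks} for every thread of `pid`."""
    ticks = {}
    task_dir = os.path.join(proc_root, str(pid), "task")
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                data = f.read()
        except (IOError, OSError):
            continue  # thread exited between listdir() and open()
        # The command name is wrapped in parentheses and may contain spaces,
        # so split after the last ')' and count fields from "state" onwards.
        fields = data.rsplit(")", 1)[1].split()
        ticks[int(tid)] = int(fields[11]) + int(fields[12])
    return ticks


def busiest_threads(pid, interval=1.0, proc_root="/proc"):
    """Return [(tid, ticks used during `interval`)], busiest thread first."""
    before = thread_cpu_ticks(pid, proc_root)
    time.sleep(interval)
    after = thread_cpu_ticks(pid, proc_root)
    deltas = [(after[tid] - before.get(tid, 0), tid) for tid in after]
    return [(tid, delta) for delta, tid in sorted(deltas, reverse=True)]
```

Calling `busiest_threads(22409)` returns `(tid, ticks)` pairs with the busiest thread first; each reported tid can then be passed to `strace -p <tid>` or inspected in gdb.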
