Bug 137306 - Apache child infinite loop
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: httpd
Version: 3.0
Hardware: x86_64 Linux
Priority: medium Severity: medium
Assigned To: Joe Orton
URL: http://issues.apache.org/bugzilla/sho...
Depends On:
Blocks: 156320
Reported: 2004-10-27 09:39 EDT by Cott Lang
Modified: 2007-11-30 17:07 EST

Fixed In Version: RHBA-2005-621
Doc Type: Bug Fix
Last Closed: 2005-09-28 13:04:57 EDT


Description Cott Lang 2004-10-27 09:39:43 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20040914
Firefox/0.10.1

Description of problem:
An httpd child process had been consuming 100% CPU for hours.

strace showed it looping indefinitely on this:

lseek(14, 12964, SEEK_SET)              = 12964
read(14, "", 49)                        = 0
lseek(14, 12964, SEEK_SET)              = 12964
read(14, "", 49)                        = 0
lseek(14, 12964, SEEK_SET)              = 12964
read(14, "", 49)                        = 0
lseek(14, 12964, SEEK_SET)              = 12964
read(14, "", 49)                        = 0

lsof showed file descriptor 14 to be a 12964-byte .jpg file that is
served thousands of times per day.

gdb of the process showed this:

(gdb) where
#0  0x0000002a96dd0229 in lseek () from /lib64/tls/libpthread.so.0
#1  0x0000002a966c3c2b in apr_file_seek () from /usr/lib64/libapr-0.so.0
#2  0x0000002a962b44e0 in apr_bucket_pool_create () from /usr/lib64/libaprutil-0.so.0
#3  0x0000002a9774f101 in ssl_init_ModuleKill () from /etc/httpd/modules/mod_ssl.so
#4  0x00000000004319d2 in ap_pass_brigade ()
#5  0x000000000042139f in ap_http_header_filter ()
#6  0x00000000004319d2 in ap_pass_brigade ()
#7  0x0000000000433ed4 in ap_content_length_filter ()
#8  0x00000000004319d2 in ap_pass_brigade ()
#9  0x0000000000422f32 in ap_byterange_filter ()
#10 0x00000000004319d2 in ap_pass_brigade ()
#11 0x0000002a975361fe in ?? () from /etc/httpd/modules/mod_headers.so
#12 0x00000000004319d2 in ap_pass_brigade ()
#13 0x0000002a97c86894 in ?? () from /etc/httpd/modules/mod_expires.so
#14 0x00000000004319d2 in ap_pass_brigade ()
#15 0x0000000000439378 in ap_core_translate ()
#16 0x000000000042654e in ap_run_handler ()
#17 0x0000000000426b99 in ap_invoke_handler ()
#18 0x00000000004236a2 in ap_process_request ()
#19 0x000000000041f36c in _start ()
#20 0x000000000042f8a7 in ap_run_process_connection ()
#21 0x0000000000424cd8 in ap_graceful_stop_signalled ()
#22 0x0000000000424e08 in ap_graceful_stop_signalled ()
#23 0x0000000000425029 in ap_graceful_stop_signalled ()
#24 0x00000000004255e1 in ap_mpm_run ()
#25 0x000000000042b379 in main ()

Thanks for looking at this.


Version-Release number of selected component (if applicable):
httpd-2.0.46-40.ent

How reproducible:
Couldn't Reproduce

Steps to Reproduce:
1. Unable to reproduce.

Additional info:
Comment 1 Joe Orton 2004-10-27 11:59:52 EDT
Thanks for the report and backtrace.

Did you segfault the child and get a core dump, or just attach gdb to
the running pid?  Can you attach any relevant changes to the default
configuration?
Comment 2 Cott Lang 2004-10-27 12:08:41 EDT
I did not get a core dump; I just attached gdb to the pid. It was 2am
and I was rather sleepy. :)

I'm not sure whether any of my changes are relevant, but my httpd.conf
has departed radically from the default. If you really want a copy of
it, I can email it to you personally, but I do not want it to be
publicly available on Bugzilla for security reasons. :)

thanks!
Comment 3 Joe Orton 2004-10-27 14:37:23 EDT
If you could attach it to bugzilla and hit the "Private" button on the
attachment form, that would make it visible to me but not to the rest
of the world.
Comment 4 Joe Orton 2004-11-16 09:58:59 EST
A code review did not reveal any places where an lseek() loop like
this could occur; furthermore, the backtrace:

#1  0x0000002a966c3c2b in apr_file_seek () from /usr/lib64/libapr-0.so.0
#2  0x0000002a962b44e0 in apr_bucket_pool_create () from /usr/lib64/libaprutil-0.so.0

seems to be corrupted since there is no call between those functions.

Have you seen this problem recur at all?  If you could get us that
httpd.conf it might be helpful; otherwise, if it happens again, a core
dump from the spinning process would be useful. (To generate one, put
"CoreDumpDirectory /var/tmp" somewhere in the configuration and then
send the process a SIGSEGV.)
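
As a rough illustration of the signal step, the snippet below sends SIGSEGV to a process from Python, which has the same effect as running `kill -SEGV <pid>` by hand. The sleeping child is only a stand-in for a spinning httpd process, and `force_core_dump` and `demo` are hypothetical names. Whether a core file is actually written also depends on the core-size limit (`ulimit -c`) and on the CoreDumpDirectory setting mentioned above.

```python
import os
import signal
import subprocess
import sys


def force_core_dump(pid):
    """Send SIGSEGV to `pid`.

    The default action for SIGSEGV terminates the process and writes a core
    file if the system's core-size limit and dump directory permit it.
    """
    os.kill(pid, signal.SIGSEGV)


def demo():
    # Stand-in for the spinning child: a throwaway process that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    force_core_dump(child.pid)
    # wait() returns the negative signal number when a signal killed the child.
    return child.wait()


if __name__ == "__main__":
    print("child exit status:", demo())
```

For a real server, the pid would be that of the httpd child shown at 100% CPU rather than a freshly spawned process.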
Comment 5 Joe Orton 2005-05-31 06:41:38 EDT
This got tracked down upstream just recently.

It's a race between sending the file and the file being truncated by some other
process on the system.
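
A minimal sketch of how such a race can produce the lseek()/read() pattern from the strace output, assuming the sender records the file length up front and the file is then shortened. This is not the httpd or APR code; `read_range` and `demo` are hypothetical names, and the 13013-byte starting size (12964 + 49) is only inferred from the strace arguments.

```python
import os
import tempfile


def read_range(fd, offset, length):
    """Read up to `length` bytes from `fd`, starting at `offset`.

    Returns (data, missing), where `missing` counts the bytes that could not
    be read because the file ended earlier than the caller expected.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        os.lseek(fd, offset, os.SEEK_SET)
        data = os.read(fd, remaining)
        if not data:
            # read() returned 0 bytes: end of file was reached before
            # `length` bytes arrived, so the file is shorter than expected.
            # A loop that retried here without this check would never exit.
            break
        chunks.append(data)
        offset += len(data)
        remaining -= len(data)
    return b"".join(chunks), remaining


def demo():
    # Sizes mirror the strace output; the starting size is an assumption.
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * 13013)
        f.flush()
        fd = f.fileno()
        expected = os.fstat(fd).st_size  # length recorded before sending
        os.ftruncate(fd, 12964)          # stands in for the other process
        return read_range(fd, 0, expected)


if __name__ == "__main__":
    data, missing = demo()
    print(len(data), "bytes read,", missing, "bytes missing")
```

The point of the sketch is the zero-length read check: once the file has shrunk, seeking to the old offset and reading again can only ever return 0 bytes, so the loop needs an explicit exit on end of file.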
Comment 13 Red Hat Bugzilla 2005-09-28 13:04:57 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-621.html
