From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20040914 Firefox/0.10.1

Description of problem:
An httpd process was consuming 100% CPU for hours. strace showed an infinite loop of this:

lseek(14, 12964, SEEK_SET) = 12964
read(14, "", 49) = 0
lseek(14, 12964, SEEK_SET) = 12964
read(14, "", 49) = 0
lseek(14, 12964, SEEK_SET) = 12964
read(14, "", 49) = 0
lseek(14, 12964, SEEK_SET) = 12964
read(14, "", 49) = 0

lsof showed file #14 to be a 12964 byte .jpg file that is served thousands of times per day.

gdb of the process showed this:

(gdb) where
#0 0x0000002a96dd0229 in lseek () from /lib64/tls/libpthread.so.0
#1 0x0000002a966c3c2b in apr_file_seek () from /usr/lib64/libapr-0.so.0
#2 0x0000002a962b44e0 in apr_bucket_pool_create () from /usr/lib64/libaprutil-0.so.0
#3 0x0000002a9774f101 in ssl_init_ModuleKill () from /etc/httpd/modules/mod_ssl.so
#4 0x00000000004319d2 in ap_pass_brigade ()
#5 0x000000000042139f in ap_http_header_filter ()
#6 0x00000000004319d2 in ap_pass_brigade ()
#7 0x0000000000433ed4 in ap_content_length_filter ()
#8 0x00000000004319d2 in ap_pass_brigade ()
#9 0x0000000000422f32 in ap_byterange_filter ()
#10 0x00000000004319d2 in ap_pass_brigade ()
#11 0x0000002a975361fe in ?? () from /etc/httpd/modules/mod_headers.so
#12 0x00000000004319d2 in ap_pass_brigade ()
#13 0x0000002a97c86894 in ?? () from /etc/httpd/modules/mod_expires.so
#14 0x00000000004319d2 in ap_pass_brigade ()
#15 0x0000000000439378 in ap_core_translate ()
#16 0x000000000042654e in ap_run_handler ()
#17 0x0000000000426b99 in ap_invoke_handler ()
#18 0x00000000004236a2 in ap_process_request ()
#19 0x000000000041f36c in _start ()
#20 0x000000000042f8a7 in ap_run_process_connection ()
#21 0x0000000000424cd8 in ap_graceful_stop_signalled ()
#22 0x0000000000424e08 in ap_graceful_stop_signalled ()
#23 0x0000000000425029 in ap_graceful_stop_signalled ()
#24 0x00000000004255e1 in ap_mpm_run ()
#25 0x000000000042b379 in main ()

Thanks for looking at this.
Version-Release number of selected component (if applicable):
httpd-2.0.46-40.ent

How reproducible:
Couldn't Reproduce

Steps to Reproduce:
1. Unable to
2.
3.

Additional info:
Thanks for the report and backtrace. Did you segfault the child and get a core dump, or just attach gdb to the running pid? Can you attach any relevant changes to the default configuration?
I did not get a core dump, I just attached gdb to the pid. It was 2am and I was rather sleepy. :) I'm not sure if I have any relevant changes, but my httpd.conf has radically departed from the default. If you really want a copy of it, I can email it to you personally, but I do not wish it to be publicly available on bugzilla for security reasons. :) Thanks!
If you could attach it to bugzilla and hit the "Private" button on the attachment form, that would make it visible to me but not the rest of the world.
A code review did not reveal any places where an lseek() loop like this could occur; furthermore, the backtrace:

#1 0x0000002a966c3c2b in apr_file_seek () from /usr/lib64/libapr-0.so.0
#2 0x0000002a962b44e0 in apr_bucket_pool_create () from /usr/lib64/libaprutil-0.so.0

seems to be corrupted since there is no call between those functions.

Have you seen this problem recur at all? If you could get us that httpd.conf it might be helpful; otherwise, in the future, if you could generate a core dump from a spinning process that would be useful. (Generate core dumps by putting "CoreDumpDirectory /var/tmp" in the configuration somewhere, and then sending the process a SIGSEGV.)
This got tracked down upstream just recently. It's a race between sending the file and the file being truncated by some other process on the system.
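
For illustration only, here is a minimal Python sketch of how such a race could produce the lseek()/read() pattern in the strace output. It is not the httpd or APR code. The function names send_file and demo, and the pre-truncation size of 13013 bytes (12964 + 49, inferred from the strace arguments), are assumptions made for this example. The sender records the file size up front, another party truncates the file, and the final read of the "remaining" 49 bytes returns 0. A loop that does not treat a zero-length read as end-of-file would repeat that seek and read indefinitely.

```python
import os
import tempfile


def send_file(fd, offset, length, chunk=8192):
    """Read `length` bytes from `fd` starting at `offset`.

    `length` is assumed to have been computed earlier (e.g. from fstat),
    so it may be stale if the file was truncated in the meantime.
    Returns (data, truncated).
    """
    out = bytearray()
    remaining = length
    while remaining > 0:
        os.lseek(fd, offset + len(out), os.SEEK_SET)
        data = os.read(fd, min(chunk, remaining))
        if not data:
            # A 0-byte read means EOF: the file is now shorter than the
            # recorded length. Without this check the loop would repeat
            # the same lseek()/read() pair forever.
            return bytes(out), True
        out += data
        remaining -= len(data)
    return bytes(out), False


def demo():
    # Hypothetical original size: 12964 bytes plus the 49 bytes the
    # strace shows the process still trying to read.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 13013)
        path = f.name
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            expected = os.fstat(fd).st_size  # size recorded before sending
            os.truncate(path, 12964)         # stand-in for the other process
            data, truncated = send_file(fd, 0, expected)
        finally:
            os.close(fd)
    finally:
        os.unlink(path)
    return len(data), truncated
```

Calling demo() returns the number of bytes actually read and a flag saying whether the sender hit EOF early; the last read it issues is for 49 bytes at offset 12964, matching the strace.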
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2005-621.html