From Bugzilla Helper: User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113 Description of problem: I am using the worker MPM on Apache. I start Apache with 4 children, each maintaining a minimum of 50 threads. After running for a few hours, apache (e.g. one of the threads) starts to consume 100% CPU. Given a few more hours, there will be enough threads to take up all four CPU in the webserver. When the problem happens, I stop the network load balancer from sending any more HTTP requests to the webserver. With no requests coming in, apache continue to consuming 100% CPU. The problem happens on both the Apache 2.0.46 (/usr/sbin/httpd.worker) that comes with RHEL 3.0, and Apache 2.0.48 that I compile from source. Other webservers on Apache 2.0.48 and the same configuration, but running RHEL 2.1, do not have this problem. When the problem happens, I could use "top", "ps -fmHC httpd.worker" or "ps -fmHC httpd" to find out the thread which is taking up the CPU. Then I use gdb and attach the offending thread, which seems to be spending its time in malloc_consolidate () from /lib/tls/libc.so.6. Other threads of the same parent process are in in _dl_sysinfo_int80 () from /lib/ld-linux.so.2. The parent process itself is in pthread_join () from /lib/tls/libpthread.so.0. I think the parent process and other threads are doing fine. It is just the malloc_consolidate but I cannot tell for sure. Version-Release number of selected component (if applicable): httpd-2.0.46-26.ent How reproducible: Always Steps to Reproduce: 1. Start apache 2. Wait for an hour or two Actual Results: One of the apache threads starts to consume 100% CPU. Given enough time, there will be enough threads consuming all CPU on the system. They continue to consume CPU even when there is no http traffic to the system. Expected Results: All HTTP requests require little or no server processing. Apache with the same configuration on other RHEL 2.1 webservers, and serving the same HTTP requests, take up less than 10% CPU time. Additional info: # uname -a Linux paprika 2.4.21-9.0.1.ELsmp #1 SMP Mon Feb 9 22:26:51 EST 2004 i686 i686 i386 GNU/Linux RHEL 3.0 was installed with all package collections. I have these in /etc/sysctl.conf. net.ipv4.ip_forward = 0 net.ipv4.conf.default.rp_filter = 1 kernel.sysrq = 0 kernel.core_uses_pid = 1 fs.file-max = 65536 net.ipv4.tcp_fin_timeout=30 net.ipv4.tcp_keepalive_time=600 net.ipv4.tcp_syncookies=1 net.ipv4.icmp_ignore_bogus_error_responses=1 I use "ulimit -H -n 8192" in the script that starts apache. /lib/tls/libc.so.6 is a sym link to libc-2.3.2.so, also in /lib/tls. /lib/tls/libpthread.so.0 is a sym link to libpthread-0.60.so, also in /lib/tls.
Hello, thanks for the report. Can you give an idea of what modules are in use here: is the server handling mostly static content, or is mod_include being exercised etc? Can you get a backtrace from gdb out of a thread hung in malloc_consolidate?
It would also be useful to know whether SSL is in use.
SSL is used, mod_include is not. Also ColdFusion MX 6.1 is used and the module is compiled from source. The webserver is handling static content, and sending dynamic content (.cfm files) to the local JRun server via the ColdFusion jrun module. LoadModule access_module modules/mod_access.so LoadModule auth_module modules/mod_auth.so LoadModule log_config_module modules/mod_log_config.so LoadModule setenvif_module modules/mod_setenvif.so LoadModule ssl_module modules/mod_ssl.so LoadModule mime_module modules/mod_mime.so LoadModule status_module modules/mod_status.so LoadModule autoindex_module modules/mod_autoindex.so LoadModule info_module modules/mod_info.so LoadModule dir_module modules/mod_dir.so LoadModule alias_module modules/mod_alias.so LoadModule rewrite_module modules/mod_rewrite.so LoadModule jrun_module /raid/coldfusion/runtime/lib/wsconfig/1/mod_jrun20.so The webserver was restarted this morning and I guess it has not "accumulated" enough time / traffic for any of the httpd processes to consume the CPU yet. I will send in the backtrace when that happens. The same problem happens on Xvnc on the same system (bug 117885), and apache 1.3.27 on another RHEL 3 system that has the same configuration (bug 117886).
# ps -fmHC httpd UID PID PPID C STIME TTY TIME CMD root 25041 1 0 Mar08 ? 00:00:04 /raid/apache/bin/httpd -k start apache 30029 25041 0 10:20 ? 00:00:00 /raid/apache/bin/httpd -k start apache 30031 30029 0 10:20 ? 00:00:00 /raid/apache/bin/httpd -k start apache 30033 30029 0 10:20 ? 00:00:00 /raid/apache/bin/httpd -k start apache 30038 30029 91 10:20 ? 00:44:05 /raid/apache/bin/httpd -k start apache 30053 30029 0 10:20 ? 00:00:00 /raid/apache/bin/httpd -k start apache 30055 30029 0 10:20 ? 00:00:00 /raid/apache/bin/httpd -k start apache 30063 30029 0 10:20 ? 00:00:00 /raid/apache/bin/httpd -k start apache 30064 30029 0 10:20 ? 00:00:00 /raid/apache/bin/httpd -k start apache 30068 30029 0 10:20 ? 00:00:00 /raid/apache/bin/httpd -k start apache 30073 30029 0 10:20 ? 00:00:00 /raid/apache/bin/httpd -k start ... # gdb GNU gdb Red Hat Linux (6.0post-0.20031117.6rh) Copyright 2003 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-redhat-linux-gnu". (gdb) attach 30029 Attaching to process 30029 Reading symbols from /raid/apache/bin/httpd...(no debugging symbols found)...done. Using host libthread_db library "/lib/tls/libthread_db.so.1". Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libssl.so.4...(no debugging symbols found)...done. Loaded symbols for /lib/libssl.so.4 Reading symbols from /lib/libcrypto.so.4...(no debugging symbols found)...done. Loaded symbols for /lib/libcrypto.so.4 Reading symbols from /raid/apache/lib/libaprutil-0.so.0...(no debugging symbols found)...done. Loaded symbols for /raid/apache/lib/libaprutil-0.so.0 Reading symbols from /usr/lib/libgdbm.so.2...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libgdbm.so.2 Reading symbols from /lib/libdb-4.1.so...(no debugging symbols found)...done. Loaded symbols for /lib/libdb-4.1.so Reading symbols from /lib/tls/libpthread.so.0...(no debugging symbols found)...done. [Thread debugging using libthread_db enabled] [New Thread -1223506176 (LWP 30029)] [New Thread -1675289680 (LWP 30073)] [New Thread -1622840400 (LWP 30068)] [New Thread -1580880976 (LWP 30064)] [New Thread -1570391120 (LWP 30063)] [New Thread -1486472272 (LWP 30055)] [New Thread -1465492560 (LWP 30053)] [New Thread -1308144720 (LWP 30038)] [New Thread -1255695440 (LWP 30033)] [New Thread -1234715728 (LWP 30031)] Loaded symbols for /lib/tls/libpthread.so.0 Reading symbols from /usr/lib/libexpat.so.0...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libexpat.so.0 Reading symbols from /raid/apache/lib/libapr-0.so.0...(no debugging symbols found)...done. Loaded symbols for /raid/apache/lib/libapr-0.so.0 Reading symbols from /lib/tls/librt.so.1...(no debugging symbols found)...done. Loaded symbols for /lib/tls/librt.so.1 Reading symbols from /lib/tls/libm.so.6...(no debugging symbols found)...done. Loaded symbols for /lib/tls/libm.so.6 Reading symbols from /lib/libcrypt.so.1...(no debugging symbols found)...done. Loaded symbols for /lib/libcrypt.so.1 Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/tls/libc.so.6...(no debugging symbols found)...done. Loaded symbols for /lib/tls/libc.so.6 Reading symbols from /usr/kerberos/lib/libgssapi_krb5.so.2...(no debugging symbols found)...done. Loaded symbols for /usr/kerberos/lib/libgssapi_krb5.so.2 Reading symbols from /usr/kerberos/lib/libkrb5.so.3...(no debugging symbols found)...done. Loaded symbols for /usr/kerberos/lib/libkrb5.so.3 Reading symbols from /usr/kerberos/lib/libcom_err.so.3...(no debugging symbols found)...done. Loaded symbols for /usr/kerberos/lib/libcom_err.so.3 Reading symbols from /usr/kerberos/lib/libk5crypto.so.3...(no debugging symbols found)...done. Loaded symbols for /usr/kerberos/lib/libk5crypto.so.3 Reading symbols from /lib/libresolv.so.2...(no debugging symbols found)...done. Loaded symbols for /lib/libresolv.so.2 Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done. Loaded symbols for /lib/ld-linux.so.2 Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /raid/apache/modules/mod_access.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_access.so Reading symbols from /raid/apache/modules/mod_auth.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_auth.so Reading symbols from /raid/apache/modules/mod_log_config.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_log_config.so Reading symbols from /raid/apache/modules/mod_setenvif.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_setenvif.so Reading symbols from /raid/apache/modules/mod_ssl.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_ssl.so Reading symbols from /raid/apache/modules/mod_mime.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_mime.so Reading symbols from /raid/apache/modules/mod_status.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_status.so Reading symbols from /raid/apache/modules/mod_autoindex.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_autoindex.so Reading symbols from /raid/apache/modules/mod_info.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_info.so Reading symbols from /raid/apache/modules/mod_dir.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_dir.so Reading symbols from /raid/apache/modules/mod_alias.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_alias.so Reading symbols from /raid/apache/modules/mod_rewrite.so...(no debugging symbols found)...done. Loaded symbols for /raid/apache/modules/mod_rewrite.so Reading symbols from /raid/coldfusion/runtime/lib/wsconfig/1/mod_jrun20.so...(no debugging symbols found)...done. Loaded symbols for /raid/coldfusion/runtime/lib/wsconfig/1/mod_jrun20.so Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done. Loaded symbols for /lib/libgcc_s.so.1 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 (gdb) info threads 10 Thread -1234715728 (LWP 30031) 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 9 Thread -1255695440 (LWP 30033) 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 8 Thread -1308144720 (LWP 30038) 0xb7235123 in malloc_consolidate () from /lib/tls/libc.so.6 7 Thread -1465492560 (LWP 30053) 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 6 Thread -1486472272 (LWP 30055) 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 5 Thread -1570391120 (LWP 30063) 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 4 Thread -1580880976 (LWP 30064) 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 3 Thread -1622840400 (LWP 30068) 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 2 Thread -1675289680 (LWP 30073) 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 1 Thread -1223506176 (LWP 30029) 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 (gdb) thread 8 [Switching to thread 8 (Thread -1308144720 (LWP 30038))]#0 0xb7235123 in malloc_consolidate () from /lib/tls/libc.so.6 (gdb) bt #0 0xb7235123 in malloc_consolidate () from /lib/tls/libc.so.6 #1 0xb7234769 in _int_malloc () from /lib/tls/libc.so.6 #2 0xb7233b0d in malloc () from /lib/tls/libc.so.6 #3 0xb70d42dd in tvToMillis () from /raid/coldfusion/runtime/lib/wsconfig/1/mod_jrun20.so #4 0xb70d13cf in ?? () from /raid/coldfusion/runtime/lib/wsconfig/1/mod_jrun20.so #5 0xb2074810 in ?? () #6 0x000002df in ?? () #7 0x00000004 in ?? () #8 0x00000026 in ?? () #9 0x08203490 in ?? () #10 0x00000018 in ?? () #11 0xb2074268 in ?? () #12 0x000002de in ?? () #13 0xb2074810 in ?? () #14 0x082016f0 in ?? () #15 0xb2074268 in ?? () #16 0xb70d2d54 in longerThan () from /raid/coldfusion/runtime/lib/wsconfig/1/mod_jrun20.so Previous frame identical to this frame (corrupt stack?) (gdb) detach Detaching from program: /raid/apache/bin/httpd, process 30029
We can only deal with bug reports for bugs in software which is included in Red Hat Enterprise Linux; it's not clear what "/raid/apache/bin/httpd" is in your backtraces above. Backtraces into malloc are usually some kind of memory corruption bugs (such as a double-free); since the module calling malloc here seems to be the third-party ColdFusion software, that would be the first suspect of the cause of the bug; you should report this issue to the module vendor in the first place. #0 0xb7235123 in malloc_consolidate () from /lib/tls/libc.so.6 #1 0xb7234769 in _int_malloc () from /lib/tls/libc.so.6 #2 0xb7233b0d in malloc () from /lib/tls/libc.so.6 #3 0xb70d42dd in tvToMillis () from /raid/coldfusion/runtime/lib/wsconfig/1/mod_jrun20.so #4 0xb70d13cf in ?? () from /raid/coldfusion/runtime/lib/wsconfig/1/mod_jrun20.so To track this down further, you could use debugging tricks such as running httpd.worker under ElectricFence: # ef /usr/sbin/httpd.worker or enabling extra malloc checking in glibc: # MALLOC_CHECK_=2 /usr/sbin/httpd.worker There is one known obscure memory handling bug in httpd.worker but it looks unlikely that it is the cause this bug; you can try the test update packages at: http://people.redhat.com/jorton/Taroon-httpd/ to eliminate that possibility. Please re-open this report if a likely issue in the httpd package is identified.