Red Hat Bugzilla – Bug 160034
httpd segfaults in mod_auth_ldap /util_ldap after several requests
Last modified: 2007-11-30 17:07:18 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050513 Fedora/1.0.4-1.3.1 Firefox/1.0.4
Description of problem:
httpd+mod_ssl+mod_auth_ldap on RHEL 4 2.6.9-5.ELsmp i686 SMP.
Configuration uses a single AuthLDAPURL for a single directory. The system had been running smoothly for several weeks with fewer than 10 distinct users. Two days ago we opened it up to the entire enterprise (potentially 3000+ users) and have since had to restart Apache four times.
We believe that our LDAP cache becomes corrupted after a purge, as described in the following Apache Bugzilla report: http://issues.apache.org/bugzilla/show_bug.cgi?id=24801
The problem is that util_ald_cache_purge() fails to keep the linked list of the cache intact as it removes nodes.
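To illustrate the kind of re-linking a purge has to do, here is a minimal sketch in C. It is not the code from util_ldap or from the patch in #24801; the `node` type, the `add_time` field, and `purge_chain()` are hypothetical names, and it assumes each cache bucket is a singly linked chain.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cache node; not the real util_ldap structure. */
typedef struct node {
    long add_time;       /* when the entry was cached */
    struct node *next;   /* next entry in the same bucket chain */
} node;

/* Remove every node older than `cutoff` from one bucket chain.
 * Nodes are assumed to have been allocated with malloc().
 * Returns the number of nodes removed. */
static unsigned long purge_chain(node **head, long cutoff)
{
    unsigned long removed = 0;
    node **link;

    assert(head != NULL);
    link = head;                 /* the pointer that leads to the current node */

    while (*link != NULL) {
        node *cur = *link;
        if (cur->add_time < cutoff) {
            *link = cur->next;   /* re-link the predecessor before freeing */
            free(cur);
            removed++;
        } else {
            link = &cur->next;   /* advance only past nodes that are kept */
        }
    }
    return removed;
}
```

Tracking the pointer that leads to the current node, rather than the node itself, means the head of the chain and interior nodes are unlinked the same way, so no kept node is ever left pointing at freed memory.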
#24801 contains a small patch to keep the purge from corrupting the cache:
The patch was incorporated into 2.0.53 and noted in the changelog as follows:
*) Fix the re-linking issue when purging elements from the LDAP cache
PR 24801. [Jess Holle <jessh ptc.com>]
Please consider applying this patch in the next release of httpd-2.0.52 or migrating to 2.0.53+.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Restart Apache.
2. Wait 2-15 hours.
Actual Results: Log messages like this:
[Thu Jun 09 11:39:35 2005] [notice] child pid 17099 exit signal Segmentation fault (11)
Expected Results: No segfaults
Thanks for the report, John.
Hi Joe -
I applied the patch mentioned above and it did not solve my problem, or at least
did not solve it completely.
On the bright side, however, further digging led me to a simple but very damaging
bug in apr-util-0.9.4 in find_block_of_size(): if it finds a block in the free
list that is bigger than desired, it will split it into two blocks -- the
desired size block, and the remainder of the old block. However, when it
inserts the new block into the free list, it creates a loop with the prev
pointer, causing any subsequent move on the new block to corrupt the list.
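For reference, a minimal sketch in C of a split that keeps both directions of a doubly linked free list consistent. This is not the apr-util code or its fix: the `block` type, `split_block()`, and `list_is_consistent()` are hypothetical names, and it uses raw pointers where the real allocator may use a different representation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list block header; not the apr-util structure. */
typedef struct block {
    size_t size;          /* total bytes in this block, header included */
    struct block *prev;
    struct block *next;
} block;

/* Shrink `blk` to `want` bytes and link the remainder in right after it.
 * `want` must keep the remainder's header aligned.
 * Returns the remainder, or NULL if `blk` is too small to split. */
static block *split_block(block *blk, size_t want)
{
    block *rest;

    assert(blk != NULL && want >= sizeof(block));
    assert(want % _Alignof(block) == 0);
    if (blk->size < want + sizeof(block))
        return NULL;

    rest = (block *)((char *)blk + want);
    rest->size = blk->size - want;
    rest->prev = blk;             /* back pointer goes to the block we split */
    rest->next = blk->next;
    if (rest->next != NULL)
        rest->next->prev = rest;  /* old successor now points back at rest */
    blk->next = rest;
    blk->size = want;
    return rest;
}

/* Check that every next link is mirrored by the matching prev link. */
static int list_is_consistent(const block *head)
{
    const block *b;

    for (b = head; b != NULL; b = b->next) {
        if (b->next != NULL && b->next->prev != b)
            return 0;
    }
    return 1;
}
```

A check like `list_is_consistent()` after each split or removal is one cheap way to catch a bad prev pointer at the point where it is introduced, rather than hours later when a subsequent operation walks the list.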
Here is the relevant code change in SVN for apr-util:
Prior to this patch, I would crash several times a day. Since the patch, the
server has worked perfectly.
Thanks for the further analysis, John.
During testing, we found a couple of bugs in addition to those you reference
above. Experimental test update packages, queued for inclusion in U2, are
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.