Bug 504963 - heap corruptions in slapd under heavy load
Summary: heap corruptions in slapd under heavy load
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: 11
Hardware: x86_64
OS: Linux
low
high
Target Milestone: ---
Assignee: Andreas Schwab
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 516995
TreeView+ depends on / blocked
 
Reported: 2009-06-10 08:41 UTC by Howard Chu
Modified: 2010-01-30 06:44 UTC (History)
9 users (show)

Fixed In Version: 2.11.1-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-01-12 23:43:02 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Howard Chu 2009-06-10 08:41:27 UTC
Description of problem:
glibc frequently asserts with "free(): invalid pointer" inside OpenLDAP slapd when under heavy load. The same slapd code runs fine on the same system when statically linked to glibc 2.9-3.

Version-Release number of selected component (if applicable):
2.10.1-2

How reproducible:
The problem was detected while running the Samba4 torture test with Samba being backed by OpenLDAP, and this is the most reliable mechanism found so far to reproduce the issue.


Steps to Reproduce:
1. configure samba4 with --enable-developer and build it
2. ulimit -c unlimited
3. "TEST_LDAP=yes OPENLDAP_ROOT=/usr/local make test"

  
Actual results:
At some point within the series of tests, slapd will abort and leave a core file. All of the subsequent tests will fail with multiple error messages. The backtrace from the core file will indicate an assert failure while freeing a pointer which was known to be valid.


Expected results:
The tests should run to completion.


Additional info:
The slapd code is known to be good. The tests run to completion using valgrind, ElectricFence, tcmalloc, and libumem. The tests also run cleanly using glibc-2.9-3. The failure also occurred with 2.10.1-1.

Searching bugzilla turned up bug #494631 but that appears to be unrelated; slapd explicitly requests 8MB thread stacks so this is not the same issue MySQL was having, nor does it have anything to do with whatever emacs was doing.

Comment 1 Andrew Bartlett 2009-06-10 11:27:25 UTC
Howard, perhaps you can point at the OpenLDAP version (or a snapshot tarball) and GIT revision of Samba4 you last used in your tests?

We should probably work on a script to build and install exactly the right stuff, to help whoever has to look at this reproduce.

Comment 2 Howard Chu 2009-06-10 16:54:35 UTC
An OpenLDAP HEAD checkout as of -D 2009/06/09 will do. Likewise a Samba git pull from the HEAD as of commit ad0d8032068fc9.

Comment 3 Ulrich Drepper 2009-06-10 18:54:01 UTC
(In reply to comment #0)
> Additional info:
> The slapd code is known to be good.

And you can proof this how?


> The tests run to completion using valgrind,
> ElectricFence, tcmalloc, and libumem. The tests also run cleanly using
> glibc-2.9-3.

None of this means anything.  The newer malloc allows a lot more concurrency.  All the testing mechanisms basically serialize execution.  Who knows what races are exposed.

Check the objects which are freed and determine where the pointers are coming from.

Comment 4 Howard Chu 2009-06-10 19:32:56 UTC
(In reply to comment #3)
> (In reply to comment #0)
> > Additional info:
> > The slapd code is known to be good.
> 
> And you can proof this how?
> 
> 
> > The tests run to completion using valgrind,
> > ElectricFence, tcmalloc, and libumem. The tests also run cleanly using
> > glibc-2.9-3.
> 
> None of this means anything.  The newer malloc allows a lot more concurrency. 
> All the testing mechanisms basically serialize execution.  Who knows what races
> are exposed.
> 
> Check the objects which are freed and determine where the pointers are coming
> from.  

All mallocs in OpenLDAP code are wrapped by ber_memalloc in our liblber. We can also enable a debug wrapper here that allocates space for head and tail signatures, and poisons the memory upon free. With the debug wrapper enabled we see that the head and tail signatures are valid, meaning the memory was definitely a valid malloc'd block. Otherwise our own assert() would trigger first. We take no additional locks here so the debug wrapper will have no effect on the concurrency of the underlying malloc library. Also, tcmalloc and libumem are already very high concurrency allocators, and we've been using them for years. Any races would have been deteced long ago.

Comment 5 Howard Chu 2009-06-20 06:05:33 UTC
Not sure if this is useful... I grabbed the malloc.c from git 
http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=malloc/malloc.c;hb=f6887a0d9a55f5c80c567d9cb153c1c6582410f9 and dropped it on top of the 2.10.1-2 srpm and recompiled with -g and no optimization. I then attached to slapd with gdb while running the test, and set a breakpoint at free_check() hooks.c:281. During this run, instead of tripping the breakpoint, I got a SEGV in mem2chunk_check() hooks.c:166. That in itself was odd; usually we hit the malloc_printerr assert, and never see a SEGV.

#0  0x00007fbc207e1d85 in mem2chunk_check (mem=0x189d680, magic_p=0x0)          
    at hooks.c:166                                                              
#1  0x00007fbc207e226f in free_check (mem=0x189d680, caller=0x5558cf)           
    at hooks.c:279                                                              
#2  0x00007fbc207e47a1 in *__GI___libc_free (mem=0x189d680) at malloc.c:3677    
#3  0x00000000005558cf in ber_memfree_x (p=0x189d690, ctx=0x0) at memory.c:155  
#4  0x0000000000553a6b in ber_free_buf (ber=0x7fbc198a2250) at io.c:192         
#5  0x00000000004aa9bf in send_paged_response (op=0x10b9fb0,                    
    rs=0x7fbc19a23c60, lastid=0x0, tentries=<value optimized out>)              
    at search.c:1323                                                            
#6  0x00000000004ab6b7 in hdb_search (op=0x10b9fb0, rs=0x7fbc19a23c60)          
    at search.c:1025                                                            
#7  0x0000000000498b67 in overlay_op_walk (op=0x10b9fb0, rs=0x7fbc19a23c60,     
    which=<value optimized out>, oi=0xe43c20, on=0x0) at backover.c:669         
#8  0x0000000000499663 in over_op_func (op=0x10b9fb0, rs=0xfe8f352113b8d670,    
    which=330880624) at backover.c:721                                          
#9  0x00000000004354b9 in fe_op_search (op=0x10b9fb0, rs=0x7fbc19a23c60)        
    at search.c:366                                                             
#10 0x0000000000498b67 in overlay_op_walk (op=0x10b9fb0, rs=0x7fbc19a23c60,     
    which=<value optimized out>, oi=0xe33400, on=0x0) at backover.c:669         
#11 0x0000000000499663 in over_op_func (op=0x10b9fb0, rs=0xfe8f352113b8d670,    
    which=330880624) at backover.c:721                                          
#12 0x0000000000435ccc in do_search (op=0x10b9fb0, rs=0x7fbc19a23c60)           
    at search.c:217
#13 0x000000000043360e in connection_operation (ctx=0x7fbc19a23dc0,             
    arg_v=<value optimized out>) at connection.c:1115                           
#14 0x0000000000433d65 in connection_read_thread (ctx=<value optimized out>,    
    argv=<value optimized out>) at connection.c:1248                            
#15 0x0000000000520db0 in ldap_int_thread_pool_wrapper (                        
    xpool=<value optimized out>) at tpool.c:685                                 
#16 0x00007fbc20ae186a in start_thread (arg=<value optimized out>)              
    at pthread_create.c:297                                                     
#17 0x00007fbc2084bf8d in clone ()                                              
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112                            
#18 0x0000000000000000 in ?? () 
(gdb) l 166                                                                     
161             ((char*)p<mp_.sbrk_base ||                                      
162              ((char*)p + sz)>=(mp_.sbrk_base+main_arena.system_mem) )) ||   
163            sz<MINSIZE || sz&MALLOC_ALIGN_MASK || !inuse(p) ||               
164            ( !prev_inuse(p) && (p->prev_size&MALLOC_ALIGN_MASK ||           
165                                 (contig && (char*)prev_chunk(p)<mp_.sbrk_bas
e) ||                                                                           
166                                 next_chunk(prev_chunk(p))!=p) ))            
167           return NULL;                                                      
168         magic = MAGICBYTE(p);                                               
169         for(sz += SIZE_SZ-1; (c = ((unsigned char*)p)[sz]) != magic; sz -= c
) {                                                                             
170           if(c<=0 || sz<(c+2*SIZE_SZ)) return NULL; 
(gdb) frame 0                                                             
#0  0x00007fbc207e1d85 in mem2chunk_check (mem=0x189d680, magic_p=0x0)  
    at hooks.c:166                                                   
166                                 next_chunk(prev_chunk(p))!=p) ))   
(gdb) p p                                                                  
$11 = (mchunkptr) 0x189d670                                                
(gdb) p *p                                                              
$12 = {prev_size = 2240, size = 4096, fd = 0xfdcffffffff,                 
  bk = 0xffffffffffffffff, fd_nextsize = 0x40001020530,                    
  bk_nextsize = 0x1000400}                                                 
(gdb) p *(mchunkptr)(0x189d670-2240)                                        
$13 = {prev_size = 218314183196732881, size = 2241, fd = 0x7fbc20ad73d8, 
  bk = 0x7fbc20ad73d8, fd_nextsize = 0x189cdb0, bk_nextsize = 0x189cdb0}   
(gdb) 

So, it's pretty puzzling that there's a SEGV here, looks like prev_chunk(p) is perfectly valid, and next_chunk(prev_chunk(p)) ought to be == p.

Worth noting that none of the additional diagnostics in the new malloc.c tripped.

Comment 6 Tom Lane 2009-08-25 17:51:34 UTC
I am seeing similar misbehavior in mysql.  Maybe it's mysql's fault, but ...

Comment 7 Tom Lane 2009-08-25 22:39:43 UTC
After poking around in malloc.c and hooks.c, I've got a fairly basic question: it looks like free() will execute mem2chunk_check() without any sort of lock, and mem2chunk_check() supposes that it can look at the previous chunk (specifically its length word).  Isn't this completely unsafe in a threaded environment?
In particular, there's no obvious reason that some other thread couldn't be in the middle of splitting the previous chunk, leading to momentary inconsistency in the prev/next links.  This would explain the SEGV noted above, because it's happening exactly on  the next_chunk(prev_chunk(p))!=p check.

If this analysis is correct, then MALLOC_CHECK_ simply doesn't work in a threaded program.  Is it supposed to?

Comment 8 Ulrich Drepper 2009-11-23 01:29:37 UTC
I added some changes to free_check.

BTW: until the last comment there was no mention at all that MALLOC_CHECK_ is used.  That's obviously critical information.  We almost never use that envvar anywhere else.

Anyway, it should be fixed upstream.  Not sure whether Andreas pulled that patch into the F12 build already.

Comment 9 Howard Chu 2009-11-23 02:32:55 UTC
(In reply to comment #8)
> I added some changes to free_check.
> 
> BTW: until the last comment there was no mention at all that MALLOC_CHECK_ is
> used.  That's obviously critical information.  We almost never use that envvar
> anywhere else.
> 
> Anyway, it should be fixed upstream.  Not sure whether Andreas pulled that
> patch into the F12 build already.  

Thanks for the update, I'll give it a whirl.

Just to clarify, we only looked into using MALLOC_CHECK_ to aid in diagnostics when the problems first started happening. The original problem was detected without any setting of MALLOC_CHECK_.

Comment 10 Andreas Schwab 2009-11-23 10:15:31 UTC
What info are you looking for?

Comment 11 Ulrich Drepper 2009-11-23 15:01:18 UTC
(In reply to comment #10)
> What info are you looking for?  

See comment #8:

Did you add the malloc patch

cc49a5a Restore locking in free_check.

to F12 glibc (and backport to F11)?

Comment 12 Andreas Schwab 2009-11-23 15:08:09 UTC
No.

Comment 13 Fedora Update System 2009-11-30 16:35:33 UTC
glibc-2.11-4 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/glibc-2.11-4

Comment 14 Fedora Update System 2009-12-02 04:25:49 UTC
glibc-2.11-4 has been pushed to the Fedora 12 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update glibc'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2009-12470

Comment 15 Fedora Update System 2010-01-05 22:45:13 UTC
glibc-2.11.1-1 has been pushed to the Fedora 12 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update glibc'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2009-12470

Comment 16 Fedora Update System 2010-01-12 23:42:36 UTC
glibc-2.11.1-1 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 17 Kevin Kofler 2010-01-30 06:44:20 UTC
Is this the same as http://sourceware.org/bugzilla/show_bug.cgi?id=10282 which upstream KDE developers have been complaining about? (They enable _MALLOC_CHECK for prereleases so they can catch some memory bugs.)

What about F11 (which this was originally filed against)?


Note You need to log in before you can comment on or make changes to this bug.