Bug 510349 - Perftest crashes with glibc errors
Perftest crashes with glibc errors
Status: NEW
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: distribution (Show other bugs)
Development
x86_64 Linux
medium Severity medium
: ---
: ---
Assigned To: mrg-program-list
MRG Quality Engineering
:
Depends On:
Blocks: 512268
  Show dependency treegraph
 
Reported: 2009-07-08 14:46 EDT by Tom Tracy
Modified: 2014-09-11 07:22 EDT (History)
7 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 512268 (view as bug list)
Environment:
Last Closed:
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
full dump of perftest crash (7.44 KB, text/plain)
2009-07-08 14:47 EDT, Tom Tracy
no flags Details
Coredump with bt (20.22 KB, text/plain)
2009-07-09 15:24 EDT, Tom Tracy
no flags Details
diff to try for theory in comments (1.28 KB, patch)
2009-07-15 22:03 EDT, Carl Trieloff
no flags Details | Diff
Patch to make corruption test thread-safe (1.01 KB, patch)
2009-07-16 12:49 EDT, Ulrich Drepper
no flags Details | Diff

  None (edit)
Description Tom Tracy 2009-07-08 14:46:42 EDT
Description of problem: Running perftest, getting the following error after running for at least 5 minutes

*** glibc detected *** /usr/bin/perftest: free(): invalid next size (fast): 0x00002aaac0000ab0 ***


Version-Release number of selected component (if applicable):qpidc-perftest-0.5.752581-22.el5


How reproducible: Yes,


Steps to Reproduce:
1. run perftest with high number of byte size (greater than 64 for pubs/subs)
2. let perftest run in loop greater than 50
3. get *** glibc detected *** /usr/bin/perftest: free(): invalid next size (fast): 0x00002aaac0000ab0 ***
  
Actual results: see attachement


Expected results: test to complete without failure


Additional info: kernel version 2.6.18-156.el5 (happens also using 155 kernel. Have also seen this on other machines testing rhel54 beta
Comment 1 Tom Tracy 2009-07-08 14:47:59 EDT
Created attachment 350980 [details]
full dump of perftest crash
Comment 2 Gordon Sim 2009-07-09 05:40:07 EDT
I've been unable to reproduce this so far on 2.6.18-128.1.16.el5; could you state the exact options to perftest with which you observed the crash?
Comment 3 Tom Tracy 2009-07-09 09:53:13 EDT
Gordon
        It is reproducible on rhel54 kernels 2.6.18-155.el5 and 2.6.18-156.el5. Actually use your scripts and happens when running perftest in a loop. Anything over 4k message size produces this error consistently. If you want to check it out on the actual machines, let me know and will give you the keys. Both systems have been updates with rhel54 beta1 packages.
Comment 4 Carl Trieloff 2009-07-09 13:10:20 EDT
I am guessing we may need to rebuild specifically for RHEL 5.4. If this is the case then we should raise this a binary compatibility issue for RHEL 5.4

i.e. if that is the case, it would be a major issue for RHEL5.4
Comment 5 Tom Tracy 2009-07-09 15:24:59 EDT
Created attachment 351126 [details]
Coredump with bt
Comment 6 Tom Tracy 2009-07-13 10:05:54 EDT
Rebuilt systems with a fresh install dated 07082009. Running perftest afterwards produced the same crash though it took longer for the crash to happen. So upgrading from 5.3 is not the root cause to this issue
Comment 7 Jakub Jelinek 2009-07-13 12:54:10 EDT
Why do you think this has to do anything with binary compatibility?
Most likely there is just a memory handling bug in the test (whatever that is).

Start with running it under valgrind resp. ElectricFence.
Comment 8 Carl Trieloff 2009-07-13 13:00:34 EDT
could be but the same test is fine under REHL 5.1, 5.2, 5.3. why would that change in RHEL 5.4?
Comment 9 Jakub Jelinek 2009-07-13 13:09:03 EDT
Because of #494758, I believe the malloc change backport has been even requested by MRG.
Of course there is a possibility it could be a glibc bug, but much more likely is just that the changes change timing of free/realloc sufficiently to unhide some previously hidden bugs in programs.
Comment 10 Tim Burke 2009-07-14 08:05:22 EDT
In order to determine if this is caused by the new malloc implementation, bug #494758, can't the same test be run with and without enabling that feature by using the MALLOC_PER_THREAD=1 environment variable?  By default this should be off.  Does the error occur when this is disabled?
Comment 11 Jakub Jelinek 2009-07-14 08:51:44 EDT
Repeating some comments from IRC:

interesting would be also if all the failures have very similar backtrace or not (e.g. if all the failures are from free of std::string)
the error that got reported can be triggered e.g. by 
int main (void)
{
  char *p = malloc (18);
  memcpy (p, "0123456789ABCDEFGHIJKLMNOPQR", 28);
  free (p);
  return 0;
}
it is the first error reported if the 16 bytes after the allocated buffer are overwritten and free is called on it
if you have the box where you got the core on, it would be interesting if you could look at the corefile there
and print 16 bytes before the printed address (0x00002aaac0000ab0 in the dump) and then say 128 bytes after it
also, if you are able to reproduce it while running under gdb, putting a breakpoint on __backtrace so that malloc isn't called (which might further clobber the area after the buffer)
and when it hits, printing the buffer from the address printed - 16 onwards would be very useful
Comment 12 Carl Trieloff 2009-07-14 10:24:14 EDT
I believe the error is regardless of the MALLOC_PER_THREAD feature, from discussions with Tom. i.e. happens with or without it enabled.
Comment 13 Tom Tracy 2009-07-14 10:27:20 EDT
Yes, the error occurs with/without the MALLOC_PER_THREAD is enabled or not.
Comment 14 Tom Tracy 2009-07-15 15:25:26 EDT
Jackub
        The following cores were produce for these libraries

/root/glibc-test-1/  core 20537 located in /amqp
/root/glibc-test-4  core.24824 located in /amqp


Tom
Comment 15 Jakub Jelinek 2009-07-15 18:09:17 EDT
Some info for Uli/Andreas, the glibc-test-1 build is stock RHEL-5 glibc with:
--- sysdeps/unix/sysv/linux/libc_fatal.c        2009-07-15 11:06:56.000000000 -0400
+++ sysdeps/unix/sysv/linux/libc_fatal.c        2009-07-14 12:42:03.000000000 -0400
@@ -143,7 +143,7 @@ __libc_message (int do_abort, const char
 
   if (do_abort)
     {
-      if (do_abort > 1 && written)
+      if (0 && do_abort > 1 && written)
         {
           void *addrs[64];
 #define naddrs (sizeof (addrs) / sizeof (addrs[0]))
--- malloc/malloc.c        2009-07-15 11:05:33.000000000 -0400
+++ malloc/malloc.c        2009-07-14 12:44:27.000000000 -0400
@@ -4675,6 +4675,7 @@ _int_malloc(mstate av, size_t bytes)
   }
 }
 
+void *mxxp[10];
 /*
   ------------------------------ free ------------------------------
 */
@@ -4747,6 +4748,12 @@ _int_free(mstate av, mchunkptr p)
         || __builtin_expect (chunksize (chunk_at_offset (p, size))
                              >= av->system_mem, 0))
       {
+        mxxp[0] = p;
+        mxxp[1] = size;
+        mxxp[2] = chunk_at_offset (p, size)->size;
+        mxxp[3] = av;
+        mxxp[4] = av->system_mem;
+        mxxp[5] = chunksize (chunk_at_offset (p, size));
         errstr = "free(): invalid next size (fast)";
         goto errout;
       }
patch on top of it.  The crash is that
    if (__builtin_expect (chunk_at_offset (p, size)->size <= 2 * SIZE_SZ, 0)
        || __builtin_expect (chunksize (chunk_at_offset (p, size))
                             >= av->system_mem, 0))
      {
        errstr = "free(): invalid next size (fast)";
        goto errout;
      }
while freeing 0x2aaabc0048c0 address.  The memory in core dump contains:
0x2aaabc0048b0: 0x0000000000000020      0x0000000000000034
0x2aaabc0048c0: 0x0000000000000009      0x0000000000000009
0x2aaabc0048d0: 0x00000000ffffffff      0x7473657466726570      "perftest1\0b"
0x2aaabc0048e0: 0x0000000000620031      0x0000000000020721
0x2aaabc0048f0: 0x00002aaabc004890      0x00002aaabc003f20
0x2aaabc004900: 0x0000000000000000      0x0000000000000000
0x2aaabc004910: 0x5858585858585858      0x5858585858585858
0x2aaabc004920: 0x5858585858585858      0x5858585858585858
and mxxp contains:
0x2ae6da216ca0 <mxxp>:	0x00002aaabc0048b0	0x0000000000000030
0x2ae6da216cb0 <mxxp+16>:	0x0000000000036721	0x00002aaabc000020
0x2ae6da216cc0 <mxxp+32>:	0x0000000000025000	0x0000000000020720
and
p *((heap_info *)($rbx & 0xfffffffffc000000))->ar_ptr
shows:
...
next = 0x2aaab8000020, next_free = 0x0, system_mem = 151552, max_system_mem = 245760}
So, av->system_mem in the core file matches that saved in mxxp[4], mxxp[0] also makes sense (pointer being freed - 16), size (mxxp[1]) matches what the core contains, mxxp[3] is correct too and mxxp[5] also.  But mxxp[2] shows clearly that while mxxp array was being filled in some other thread must have been modifying the memory, 0x36721 is clearly what caused this failure, but as you see, it wasn't there a few instructions afterwards.
The disassembly of the interesting portions of _int_free is:
.L819:
leaq    (%rsi,%rbp), %rcx       #, temp.1468
movq    8(%rcx), %rdx   # <variable>.size, D.11114
cmpq    $16, %rdx       #, D.11114
jbe     .L667   #,
movq    %rdx, %rax      # D.11114, tmp159
andq    $-8, %rax       #, tmp159
cmpq    2168(%rdi), %rax        # <variable>.system_mem, tmp159
jae     .L667   #,
...
.L667:
movq    mxxp@GOTPCREL(%rip), %rax       #, tmp160
leaq    .LC25(%rip), %r13       #, errstr
movq    %rdx, 16(%rax)  # D.11114, mxxp[2]
movq    2168(%r15), %rdx        # <variable>.system_mem, <variable>.system_mem
movq    %rbx, (%rax)    # p, mxxp[0]
movq    %rbp, 8(%rax)   # size, mxxp[1]
movq    %r15, 24(%rax)  # av, mxxp[3]
movq    %rdx, 32(%rax)  # <variable>.system_mem, mxxp[4]
movq    8(%rcx), %rdx   # <variable>.size, tmp168
andq    $-8, %rdx       #, tmp168
movq    %rdx, 40(%rax)  # tmp168, mxxp[5]
jmp     .L647   #
At .L819 is the
    if (__builtin_expect (chunk_at_offset (p, size)->size <= 2 * SIZE_SZ, 0)
        || __builtin_expect (chunksize (chunk_at_offset (p, size))
                             >= av->system_mem, 0))
test and at .L667 the body:
      {
        mxxp[0] = p;
        mxxp[1] = size;
        mxxp[2] = chunk_at_offset (p, size)->size;
        mxxp[3] = av;
        mxxp[4] = av->system_mem;
        mxxp[5] = chunksize (chunk_at_offset (p, size));
        errstr = "free(): invalid next size (fast)";
        goto errout;
      }
Note that mxxp[2] & -8 clearly is different from mxxp[5], but if nothing was changing this memory at the same time it would have to be equal.
mxxp[2] contains the cached value in $rdx (see movq 8(%rcx), %rdx earlier which reads the chunk_at_offset (p, size)->size into %rdx), and the if test used %rdx and %rdx & -8.  But mxxp[5] reads from 8(%rcx) again, and there has been a change.
Comment 16 Carl Trieloff 2009-07-15 21:28:38 EDT
Here is the code from the app.

setup, nothing here in the test

            if (opts.uniqueData) {
                offset = 5;
                data += "data:";//marker (requested for latency testing tool scripts)
                data += string(sizeof(size_t), 'X');//space for seq no
                data += session.getId().str();
                if (opts.size > data.size()) {
                    data += string(opts.size - data.size(), 'X');
                } else if(opts.size < data.size()) {
                    cout << "WARNING: Increased --size to " << data.size()
                         << " to honour --unique-data" << endl;
                }
            } else {
                size_t msgSize=max(opts.size, sizeof(size_t));
                data = string(msgSize, 'X');                
            }


then later, this is where the in-place memory replacement happens:

  for (size_t i=0; i<opts.count; i++) {
    const_cast<std::string&>(msg.getData()).replace(offset, sizeof(size_t), 
           reinterpret_cast<const char*>(&i), sizeof(size_t));


So I could maybe see that we may have two threads in IO on the string (I can go look for that.) but from the code I don't see that we are doing is in the BZ higher up as all the allocation is bring done by string in the test.

{
  char *p = malloc (18);
  memcpy (p, "0123456789ABCDEFGHIJKLMNOPQR", 28);
  free (p);
  return 0;
}

i.e. Are looking for 
1.)  mem being freed then accessed, this I can go try prove if it exists in the code
2.)  resize to larger than externally allocated mem (this we don't do) see code above.
Comment 17 Carl Trieloff 2009-07-15 22:03:12 EDT
Created attachment 353938 [details]
diff to try for theory in comments


Gordon,

The one theory I can come up with is that Message holds the message body and hands it to the IO layer, and we then replace the message body while the IO layer is reading the message body. I didn't find this in the code but don't know that layer well. You know this part better than me -- could this be possible?

if my theory has any merit the following patch will correct the issue...
Comment 18 Gordon Sim 2009-07-16 05:16:59 EDT
The message is converted into frames; the content frame that contains the data initialises its own string object from the message data string. It is these frames that are then accessed by the IO thread. I.e. once the messageTransfer() command retursn the Message is no longer referenced by any lower layers, neither is the actual data string object.

I don't know enough about the internal details of the std::string implementation, to describe what happens there. However if the memory for the string is not copied when initialising the copy, then it would have to be copied 'on write' at the point where the original is modified (i.e. the next replace()).
Comment 19 Carl Trieloff 2009-07-16 08:05:25 EDT
that blows my theory.

I think we need to understand what std::string is doing in the replace() method.
Comment 20 Jakub Jelinek 2009-07-16 11:52:19 EDT
For Uli/Andreas, perhaps important point in the #c15 data is that av->system_mem is smaller than av->max_system_mem and the bad, too large, ->size, is smaller than av->max_system_mem.  So my current theory is that this _int_free call that crashed was for a fastbin chunk below the topmost one in the arena, and was concurrent with another thread doing sYSTRIm or heap_trim.
E.g. heap_trim does:
    ar_ptr->system_mem -= heap->size;
    arena_mem -= heap->size;
    delete_heap(heap);
(where delete_heap is expensive syscall)
...
    top(ar_ptr) = top_chunk = p;
    set_head(top_chunk, new_size | PREV_INUSE);
So it first decreases system_mem, then only after some time changes the top chunk's size to a smaller one.  Now, is there anything that would prevent this from happening concurrently with the _int_free of the chunk below it?
Is the free(): invalid next size (fast) test the only one which would suffer from this concurrency?  If yes, I guess for ATOMIC_FASTBINS we could just compare with av->max_system_mem instead of av->system_mem.
Comment 21 Andreas Schwab 2009-07-16 12:05:22 EDT
I don't think using av->max_system_mem would be thread-safe either.  It is only updated at last after the heap is grown.
Comment 22 Jakub Jelinek 2009-07-16 12:16:55 EDT
Perhaps.  Maybe we could keep the
if (__builtin_expect (chunk_at_offset (p, size)->size <= 2 * SIZE_SZ, 0)
        || __builtin_expect (chunksize (chunk_at_offset (p, size))
                             >= av->system_mem, 0))
test, but if it fails, grab the mutex if we don't hold it yet and retry the test.

Also, I'm wondering about whether ATOMIC_FASTBINS shouldn't imply TRIM_FASTBINS so that the topmost chunk is never in a fastbin.  May be it is unnecessary, but certainly something to analyze.
Comment 23 Ulrich Drepper 2009-07-16 12:38:32 EDT
I agree with the system_mem change.  Let's take the lock and redo the check.  That's the best way forward.

diff --git a/malloc/malloc.c b/malloc/malloc.c
index 0c0182e..5167d23 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -4799,8 +4799,27 @@ _int_free(mstate av, mchunkptr p)
 	|| __builtin_expect (chunksize (chunk_at_offset (p, size))
 			     >= av->system_mem, 0))
       {
-	errstr = "free(): invalid next size (fast)";
-	goto errout;
+#ifdef ATOMIC_FASTBINS
+	/* We might not have a lock at this point and concurrent modifications
+	   of system_mem might have let to a false positive.  Redo the test
+	   after getting the lock.  */
+	if (! have_lock
+	    && (mutex_lock(&av->mutex),
+		locked = 1,
+		chunk_at_offset (p, size)->size <= 2 * SIZE_SZ
+		|| chunksize (chunk_at_offset (p, size)) >= av->system_mem))
+#endif
+	  {
+	    errstr = "free(): invalid next size (fast)";
+	    goto errout;
+	  }
+#ifdef ATOMIC_FASTBINS
+	if (! have_lock)
+	  {
+	    (void)mutex_unlock(&av->mutex);
+	    locked = 0;
+	  }
+#endif
       }
 
     if (__builtin_expect (perturb_byte, 0))
Comment 24 Ulrich Drepper 2009-07-16 12:41:50 EDT
Incorrect patch.  The condition is wrong.  Better:

diff --git a/malloc/malloc.c b/malloc/malloc.c
index 0c0182e..a459a2b 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -4799,8 +4799,29 @@ _int_free(mstate av, mchunkptr p)
 	|| __builtin_expect (chunksize (chunk_at_offset (p, size))
 			     >= av->system_mem, 0))
       {
-	errstr = "free(): invalid next size (fast)";
-	goto errout;
+#ifdef ATOMIC_FASTBINS
+	/* We might not have a lock at this point and concurrent modifications
+	   of system_mem might have let to a false positive.  Redo the test
+	   after getting the lock.  */
+	if (have_lock
+	    || ({ assert (locked == 0);
+		  mutex_lock(&av->mutex);
+		  locked = 1;
+		  chunk_at_offset (p, size)->size <= 2 * SIZE_SZ
+		    || chunksize (chunk_at_offset (p, size)) >= av->system_mem;
+	      }))
+#endif
+	  {
+	    errstr = "free(): invalid next size (fast)";
+	    goto errout;
+	  }
+#ifdef ATOMIC_FASTBINS
+	if (! have_lock)
+	  {
+	    (void)mutex_unlock(&av->mutex);
+	    locked = 0;
+	  }
+#endif
       }
 
     if (__builtin_expect (perturb_byte, 0))
Comment 25 Ulrich Drepper 2009-07-16 12:49:35 EDT
Created attachment 354013 [details]
Patch to make corruption test thread-safe

Note You need to log in before you can comment on or make changes to this bug.