Bug 615102 - yum segfault inside malloc_consolidate within mock
Summary: yum segfault inside malloc_consolidate within mock
Keywords:
Status: CLOSED DUPLICATE of bug 607650
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rpm
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Panu Matilainen
QA Contact: BaseOS QE Security Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-07-15 23:32 UTC by Dave Malcolm
Modified: 2011-03-15 13:56 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-08-09 06:37:28 UTC


Attachments
Backtrace from crash (7.32 KB, text/plain)
2010-07-15 23:32 UTC, Dave Malcolm
Output from "rpm -qa | sort" on the machine in question (22.96 KB, text/plain)
2010-07-15 23:34 UTC, Dave Malcolm
DSOs mapped in the process: output from: gdb -c core.2928 --eval-command="python print '\n'.join([dso.filename for dso in gdb.objfiles()])" --batch|sort > dsos.txt (5.75 KB, text/plain)
2010-07-15 23:40 UTC, Dave Malcolm
RPMS that contribute one of the DSOs in the process (710 bytes, text/plain)
2010-07-15 23:43 UTC, Dave Malcolm

Description Dave Malcolm 2010-07-15 23:32:17 UTC
Created attachment 432252
Backtrace from crash

Description of problem:
(Similar to bug 612853, but a segfault inside malloc_consolidate, rather than an abort)

nirik set up a test RHEL6Beta2 box for EPEL (iirc), and saw python crashing while running yum inside mock.

We managed to get a coredump from the crash; attaching backtrace.

Not sure how to reproduce at this stage.

Comment 1 Dave Malcolm 2010-07-15 23:34:27 UTC
Created attachment 432253
Output from "rpm -qa | sort" on the machine in question

Comment 2 Dave Malcolm 2010-07-15 23:40:10 UTC
Created attachment 432255
DSOs mapped in the process: output from: gdb -c core.2928 --eval-command="python print '\n'.join([dso.filename for dso in gdb.objfiles()])" --batch|sort > dsos.txt

Comment 4 Dave Malcolm 2010-07-15 23:43:40 UTC
Created attachment 432256
RPMS that contribute one of the DSOs in the process

Output from:
(for f in $(cat dsos.txt|grep "^/"); do  rpm  -qf $f ; done) | sort|uniq > rpms-with-dsos.txt
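
For readers who prefer to script this step, the shell pipeline above can be sketched in Python 3 roughly as follows. This is not part of the original report: the function names `rpm_owner` and `packages_for_dsos` are made up for illustration, and the only external command assumed is `rpm -qf`, as in the one-liner. Python's `sorted` may order entries differently from `sort` under some locales.

```python
import subprocess


def rpm_owner(path):
    """Run `rpm -qf PATH` and return its output lines (one per owning package)."""
    result = subprocess.run(["rpm", "-qf", path], capture_output=True, text=True)
    return result.stdout.splitlines()


def packages_for_dsos(lines, owner=rpm_owner):
    """Mimic: grep "^/" | rpm -qf each path | sort | uniq.

    `owner` is injectable so the filtering and de-duplication can be
    exercised without an rpm database.
    """
    # Keep only absolute paths, as grep "^/" does in the shell version.
    paths = [line.strip() for line in lines if line.startswith("/")]
    packages = set()
    for path in paths:
        packages.update(owner(path))
    return sorted(packages)
```

Typical use would be to pass the open `dsos.txt` file object to `packages_for_dsos` and write one result per line to `rpms-with-dsos.txt`.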

Comment 5 Dave Malcolm 2010-07-15 23:49:27 UTC
(gdb) frame 0
#0  malloc_consolidate (av=0x308a97ae80) at malloc.c:5153
(gdb) print p
$3 = (struct malloc_chunk *) 0x62732f3d48544150

0x62732f3d48544150 isn't a valid pointer in this process, but interpreted as little-endian ASCII it reads:
  PATH=/sb
(perhaps "PATH=/sbin", or a fragment of an environ var?)
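
The decoding above can be double-checked with a short Python 3 sketch. It is not part of the original report, and the helper name `pointer_as_ascii` is made up for illustration. It lays the 64-bit value out as eight little-endian bytes, as they would sit in memory on x86_64, and shows the printable ones as text.

```python
def pointer_as_ascii(value, width=8):
    """Reinterpret an integer as `width` little-endian bytes, shown as ASCII."""
    raw = value.to_bytes(width, byteorder="little")
    # Non-printable bytes are shown as '.' so stray binary data stays visible.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


print(pointer_as_ascii(0x62732f3d48544150))  # PATH=/sb
```

A chunk pointer that decodes cleanly to text like this suggests that string data was written over heap metadata.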

Comment 6 Dave Malcolm 2010-07-15 23:56:08 UTC
IIRC, this was on a KVM guest (see https://bugzilla.redhat.com/show_bug.cgi?id=612853#c7).  Don't know if that's relevant yet.

Comment 7 RHEL Product and Program Management 2010-07-15 23:57:37 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 8 Kevin Fenzi 2010-07-16 01:18:09 UTC
Happy to provide any info on the host/kvm/libvirt env if required.

Comment 9 Dave Malcolm 2010-07-19 21:22:29 UTC
I'm not sure where this bug is; it could be in any of the DSOs in the process.

Reassigning to rpm, given that frames 3 through 11 are all in librpm (and frame 12 is in rpm-python).  Any ideas?  Feel free to reassign.

I've tried running yum under "valgrind"; have seen this error:
  Updating       : rpm-debuginfo-4.8.0-12.el6.x86_64                                                                                                     1/6 
==25726== Syscall param pwrite64(buf) points to uninitialised byte(s)
==25726==    at 0x3DCE20EE83: ??? (syscall-template.S:82)
==25726==    by 0x3E5D92D200: __os_io (os_rw.c:92)
==25726==    by 0x3E5D91B883: __memp_pgwrite (mp_bh.c:399)
==25726==    by 0x3E5D91BAE4: __memp_bhwrite (mp_bh.c:168)
==25726==    by 0x3E5D929A6B: __memp_sync_int (mp_sync.c:551)
==25726==    by 0x3E5D8C9669: __db_sync (db_am.c:577)
==25726==    by 0x3E5D8DF003: __db_sync_pp (db_iface.c:1866)
==25726==    by 0x3E5E015AA8: ??? (in /usr/lib64/librpm.so.1.0.0)
==25726==    by 0x3E5E02148F: rpmdbAdd (in /usr/lib64/librpm.so.1.0.0)
==25726==    by 0x3E5E0351EE: ??? (in /usr/lib64/librpm.so.1.0.0)
==25726==    by 0x3E5E035A96: ??? (in /usr/lib64/librpm.so.1.0.0)
==25726==    by 0x3E5E03572A: ??? (in /usr/lib64/librpm.so.1.0.0)
==25726==  Address 0x18c04f1e is not stack'd, malloc'd or (recently) free'd

Comment 10 Dave Malcolm 2010-07-19 22:12:47 UTC
(In reply to comment #9)
> ==25726== Syscall param pwrite64(buf) points to uninitialised byte(s)
> ==25726==    at 0x3DCE20EE83: ??? (syscall-template.S:82)
> ==25726==    by 0x3E5D92D200: __os_io (os_rw.c:92)
> ==25726==    by 0x3E5D91B883: __memp_pgwrite (mp_bh.c:399)
> ==25726==    by 0x3E5D91BAE4: __memp_bhwrite (mp_bh.c:168)
> ==25726==    by 0x3E5D929A6B: __memp_sync_int (mp_sync.c:551)
> ==25726==    by 0x3E5D8C9669: __db_sync (db_am.c:577)
> ==25726==    by 0x3E5D8DF003: __db_sync_pp (db_iface.c:1866)
> ==25726==    by 0x3E5E015AA8: ??? (in /usr/lib64/librpm.so.1.0.0)
> ==25726==    by 0x3E5E02148F: rpmdbAdd (in /usr/lib64/librpm.so.1.0.0)
> ==25726==    by 0x3E5E0351EE: ??? (in /usr/lib64/librpm.so.1.0.0)
> ==25726==    by 0x3E5E035A96: ??? (in /usr/lib64/librpm.so.1.0.0)
> ==25726==    by 0x3E5E03572A: ??? (in /usr/lib64/librpm.so.1.0.0)
> ==25726==  Address 0x18c04f1e is not stack'd, malloc'd or (recently) free'd    
This valgrind report seems to be in bdb:
==25776== Syscall param pwrite64(buf) points to uninitialised byte(s)
==25776==    at 0x3DCE20EE63: __pwrite_nocancel (syscall-template.S:82)
==25776==    by 0x3E5D92D200: __os_io (os_rw.c:92)
==25776==    by 0x3E5D91B883: __memp_pgwrite (mp_bh.c:399)
==25776==    by 0x3E5D91BAE4: __memp_bhwrite (mp_bh.c:168)
==25776==    by 0x3E5D929A6B: __memp_sync_int (mp_sync.c:551)
==25776==    by 0x3E5D8C9669: __db_sync (db_am.c:577)
==25776==    by 0x3E5D8DF003: __db_sync_pp (db_iface.c:1866)
==25776==    by 0x3E5E015AA8: db3sync (db3.c:213)
==25776==    by 0x3E5E02069F: rpmdbRemove (rpmdb_internal.h:465)
==25776==    by 0x3E5E034F69: rpmpsmStage (psm.c:1598)
==25776==    by 0x3E5E0357F5: rpmpsmStage (psm.c:1459)
==25776==    by 0x3E5E03572A: rpmpsmStage (psm.c:1504)
==25776==  Address 0x1390a425 is not stack'd, malloc'd or (recently) free'd

Comment 11 Panu Matilainen 2010-08-09 06:37:28 UTC

*** This bug has been marked as a duplicate of bug 607650 ***

