Bug 612627 - /usr/bin/python : malloc(): memory corruption running yum.
Status: CLOSED DUPLICATE of bug 607650
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: python
Version: 6.0
Hardware: All
OS: Linux
Target Milestone: rc
Assignee: Dave Malcolm
QA Contact: BaseOS QE - Apps
Depends On:
Reported: 2010-07-08 16:01 UTC by Michael De La Rue
Modified: 2014-01-21 06:18 UTC

Clone Of:
Last Closed: 2010-09-17 20:44:26 UTC

Attachments
logfiles from machine where yum is misbehaving (49.37 KB, application/x-bzip-compressed-tar)
2010-07-09 16:29 UTC, Michael De La Rue

Description Michael De La Rue 2010-07-08 16:01:40 UTC
Description of problem:
During a package install, yum prints a memory corruption error and locks up.

Version-Release number of selected component (if applicable):

How reproducible:
Happened once; the next run of the command worked fine.

Steps to Reproduce:
0. text mode install of RHEL; see also bug #612611
1. yum install nss-pam-ldapd libpst
2. crash happens
Actual results:
*** glibc detected *** /usr/bin/python : malloc(): memory corruption: 0x000000000251ed30

Expected results:
packages should be installed

Additional info:
The following install worked fine;
nothing was written to the logs around the time of the failure;
the system otherwise seems stable, with nothing out of the ordinary.

Comment 2 James Antill 2010-07-08 16:18:43 UTC
doubt it's python, but david has all the magic scripts to work out whose fault it is :).

Comment 3 Dave Malcolm 2010-07-08 17:25:21 UTC
Thanks for filing this bug.

Do you have a core dump of the python process (yum)?

Do you have a list of packages that were installed?  (ideally from when the yum command began)  Otherwise, what's the output of "rpm -qa" ?

How far did "yum" get before the error?  Was there any more output beforehand?

I'm guessing this is an x86_64 machine (from the 64-bit address cited in comment #0, and from bug 612611) - is that correct?


(In reply to comment #2)
> doubt it's python, but david has all the magic scripts to work out whose fault
> it is :).
I have some magic scripts to help analyze gdb backtraces, but sadly they can't help us here.

Comment 4 Michael De La Rue 2010-07-09 16:25:07 UTC
I've not managed to get a core dump because a ulimit was set (though I don't think the process tried to dump; it just locked up). Unfortunately I ended up killing it, so I can't force a core dump now either.
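
For anyone trying to reproduce this, here is a minimal sketch of checking the core-size limit before rerunning. It uses Python's standard `resource` module to read RLIMIT_CORE and raise the soft limit to the hard limit, which affects only the current process and anything it launches. The function names are hypothetical, and whether a hung process would have dumped core at all is a separate question.

```python
import resource


def describe_limit(limit):
    """Render an rlimit value in a readable form."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return "%d bytes" % limit


def enable_core_dumps():
    """Raise the soft core-size limit to the hard limit.

    Raising the soft limit up to the hard limit needs no extra privileges.
    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("before: soft=%s hard=%s" % (describe_limit(soft), describe_limit(hard)))
    soft, hard = enable_core_dumps()
    print("after:  soft=%s hard=%s" % (describe_limit(soft), describe_limit(hard)))
```

A soft limit of 0 means no core file is written. In a shell, `ulimit -c` reports the same soft limit, and it would have to be raised in the shell that launches yum for a later crash to leave a core file.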

I got more output from a rerun, but the crash is different and unfortunately not reproducible any more.  

[root@localhost ~]# yum install nss-pam-ldapd
Loaded plugins: refresh-packagekit, rhnplugin
This system is not registered with RHN.
RHN support will be disabled.
Setting up Install Process
Resolving Dependencies
There are unfinished transactions remaining. You might consider running yum-complete-transaction first to finish them.
--> Running transaction check
---> Package nss-pam-ldapd.x86_64 0:0.7.5-3.el6 set to be updated
--> Processing Dependency: /lib64/security/pam_ldap.so for package: nss-pam-ldapd-0.7.5-3.el6.x86_64
--> Processing Dependency: nscd for package: nss-pam-ldapd-0.7.5-3.el6.x86_64
--> Running transaction check
---> Package nscd.x86_64 0:2.12-1.2.el6 set to be updated
---> Package pam_ldap.x86_64 0:185-5.el6 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

 Package              Arch          Version             Repository         Size
 nss-pam-ldapd        x86_64        0.7.5-3.el6         rhel6-beta        147 k
Installing for dependencies:
 nscd                 x86_64        2.12-1.2.el6        rhel6-beta        196 k
 pam_ldap             x86_64        185-5.el6           rhel6-beta         87 k

Transaction Summary
Install       3 Package(s)
Upgrade       0 Package(s)

Total download size: 430 k
Installed size: 771 k
Is this ok [y/N]: y
Downloading Packages:
(1/3): nscd-2.12-1.2.el6.x86_64.rpm                      | 196 kB     00:00     
(2/3): nss-pam-ldapd-0.7.5-3.el6.x86_64.rpm              | 147 kB     00:00     
(3/3): pam_ldap-185-5.el6.x86_64.rpm                     |  87 kB     00:00     
Total                                            18 kB/s | 430 kB     00:24     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
*** glibc detected *** /usr/bin/python: malloc(): smallbin double linked list corrupted: 0x0000000003af2770 ***
Write failed: Broken pipe

Comment 5 Michael De La Rue 2010-07-09 16:29:17 UTC
Created attachment 430714
logfiles from machine where yum is misbehaving

Here is the entire /var/log directory; it includes the anaconda.yum.log and yum.log files, which should answer your question about which packages were installed. I don't think there's much more apart from package databases etc.

Comment 6 Dave Malcolm 2010-07-09 17:27:08 UTC
Thanks for providing the log files.

Unfortunately, without a coredump it's going to be very hard to track this down.

Based on the error message, there's been some kind of corruption of the heap.

The error in comment #0: "malloc(): memory corruption" is emitted by line 4396 of malloc.c within the implementation of "malloc" whilst scanning through recently freed chunks of memory: one of the chunks has a corrupt-looking value for its size field.

The error in comment #4: "malloc(): smallbin double linked list corrupted:" is emitted by line 4341 of malloc.c, again within the implementation of "malloc": again, a bookkeeping field within a free chunk of memory has become corrupt.

So in both cases the bookkeeping fields that lurk between chunks of memory are getting corrupted. This could indicate a dynamically-allocated block of memory being used after being freed, or a write beyond the bounds of a block of memory, or various other things going wrong. It may be a reference counting issue: perhaps something is missing a Py_INCREF on a PyObject*, leading to the object being freed too early (however, if that's the case, the object is likely to have been allocated by the arena allocator, and that may contradict the size suggested by the smallbin message).
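
To make the two messages more concrete, here is a toy Python model of the kind of sanity checks involved. It is a sketch paraphrased from memory of how glibc's malloc behaves, not a copy of malloc.c: the `Chunk` class, the `SIZE_SZ = 8` constant (assuming x86_64), and all function names are assumptions made for illustration.

```python
SIZE_SZ = 8  # assumed width of a chunk size field on x86_64


class Chunk:
    """Toy stand-in for a free chunk's bookkeeping fields."""

    def __init__(self, size):
        self.size = size
        self.fd = None  # forward pointer in the free list
        self.bk = None  # backward pointer in the free list


def make_bin(sizes):
    """Build a circular doubly linked free list with a sentinel head."""
    head = Chunk(0)
    head.fd = head.bk = head
    for size in sizes:
        chunk = Chunk(size)
        chunk.bk = head.bk   # old last element
        chunk.fd = head
        head.bk.fd = chunk
        head.bk = chunk
    return head


def check_chunk_size(chunk, system_mem):
    """Size sanity check applied while scanning recently freed chunks."""
    if chunk.size <= 2 * SIZE_SZ or chunk.size > system_mem:
        return "malloc(): memory corruption"
    return None


def check_smallbin_links(victim):
    """Before unlinking victim, its predecessor must point back at it."""
    bck = victim.bk
    if bck is None or bck.fd is not victim:
        return "malloc(): smallbin double linked list corrupted"
    return None


if __name__ == "__main__":
    head = make_bin([64, 64])
    victim = head.bk
    print(check_smallbin_links(victim))  # intact list
    victim.bk.fd = Chunk(0)              # simulate a stray write
    print(check_smallbin_links(victim))  # corrupted list
```

In this model, a single stray write that changes a size field or a list pointer is enough to trip a check later, on an unrelated call to malloc, which is part of why the crash site says little about where the bad write happened.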

So there's likely a bug _somewhere_ in one of the DSOs within the python process that was running yum.  Figuring out where the bug is is likely to be very difficult.  The anaconda log you provided may give some clues as to the scope of the search.

Comment 7 RHEL Product and Program Management 2010-07-15 14:50:59 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 8 Dave Malcolm 2010-09-17 19:56:15 UTC
I noticed in the dmesg that this is a guest machine running on top of KVM:
e.g. this line:
  Booting paravirtualized kernel on KVM

This makes me suspect that this is another duplicate of bug 607650 (a hypervisor bug).
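
For triaging similar reports, a small sketch of scanning saved dmesg output for that boot line might look like the following. The function name and the handling of the "bare hardware" case are assumptions for illustration; only the KVM line itself comes from the attached logs.

```python
import re

# Matches the kernel boot message quoted above and captures the platform name.
PARAVIRT_LINE = re.compile(r"Booting paravirtualized kernel on (.+?)\s*$")


def detect_hypervisor(dmesg_text):
    """Return the platform named in the paravirt boot line, or None.

    Returns None when the line is absent or when it reports bare hardware.
    """
    for line in dmesg_text.splitlines():
        match = PARAVIRT_LINE.search(line)
        if match:
            platform = match.group(1)
            if platform.lower() == "bare hardware":
                return None
            return platform
    return None


if __name__ == "__main__":
    sample = "some earlier line\nBooting paravirtualized kernel on KVM\n"
    print(detect_hypervisor(sample))
```

This only shows that the guest kernel detected a hypervisor at boot; it says nothing about the hypervisor version, which would have to come from the host.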

Do you remember which version of the hypervisor you were running on?

Comment 9 Dave Malcolm 2010-09-17 20:44:26 UTC

*** This bug has been marked as a duplicate of bug 607650 ***
