Bug 466474 - [RHEL5.3] *** glibc detected *** /usr/bin/python: double free or corruption (!prev): 0x000 0000080d55e90 ***
[RHEL5.3] *** glibc detected *** /usr/bin/python: double free or corruption (...
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: anaconda (Show other bugs)
5.3
s390x Linux
medium Severity medium
: beta
: ---
Assigned To: Anaconda Maintenance Team
Release Test Team
http://rhts.redhat.com/cgi-bin/rhts/t...
: TestBlocker
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2008-10-10 09:04 EDT by Jeff Burke
Modified: 2009-01-20 16:35 EST (History)
10 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-01-20 16:35:44 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Jeff Burke 2008-10-10 09:04:52 EDT
Description of problem:
Description of problem:
* RHEL5.3-Server-20081008.nightly
* Error appears through automate RHTS testing
* Behavior appears s390 dependent

  The install of the latest RHTEL5.3 tree fails. It fails during, or
immediately after formatting the DASDs.

Version-Release number of selected component (if applicable):
RHEL5.3-Server-20081008.nightly

How reproducible:
always

Steps to Reproduce:
1. Try a nfs install from pxe with a ks.cfg file.

  
Actual results:
console-zseries: Logging into the hosting VM system to restart the VM guest
console-zseries: logging in ... sending username
LOGON AF
ENTER PASSWORD  (IT WILL NOT APPEAR WHEN TYPED):
console-zseries: pasword prompted ...
console-zseries: sending password ...
console-zseries: Password authorized

00: z/VM Version 5 Release 2.0, Service Level 0501 (64-bit),
00: built on IBM Virtualization Technology
00: There is no logmsg data
00: FILES: 0003 RDR, 0001 PRT,   NO PUN
00: RECONNECTED AT 16:01:06 EDT THURSDAY 10/09/08
console-zseries: Connected to VM
226 File send OK.
9266960 bytes transferred in 2.773 seconds. Transfer rate 3341.85 Kbytes/sec.
00: 0000003 FILES PURGED
00: RDR FILE 0069 SENT FROM AF       PUN WAS 0069 RECS 048K CPY  001 A NOHOLD NO
KEEP
00: RDR FILE 0070 SENT FROM AF       PUN WAS 0070 RECS 0004 CPY  001 A NOHOLD NO
KEEP
00: RDR FILE 0071 SENT FROM AF       PUN WAS 0071 RECS 116K CPY  001 A NOHOLD NO
KEEP
00: 0000003 FILES CHANGED
audit: initializing netlink socket (disabled)
type=2000 audit(1223582480.627:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
Initializing Cryptographic API
alg: No test for crc32c (crc32c-generic)
ksign: Installing public key data
Loading keyring
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
RAMDISK driver initialized: 16 RAM disks of 40000K size 4096 blocksize
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
Channel measurement facility using extended format (autodetected)
TCP bic registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
Freeing unused kernel memory: 128k freed
Starting the zSeries initrd to configure networking. Version is 1.1
dasd(eckd): 0.0.0191: 3390/0A(CU:3990/01) Cyl:25 Head:15 Sec:224
dasd(eckd): 0.0.0191: (4kB blks): 18000kB at 48kB/trk linux disk layout
 dasda:CMS1/  LI0191: dasda1
qdio: loading QDIO base support version 2
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
qeth: loading qeth S/390 OSA-Express driver
qeth: Device 0.0.0600/0.0.0601/0.0.0602 is a OSD Express card (level: 0757)
with link type OSD_100 (portname: FOOBAR)
qeth: Hardware IP fragmentation not supported on eth0
qeth: VLAN enabled
qeth: Multicast enabled
qeth: IPV6 enabled
qeth: Broadcast enabled
qeth: Using SW checksumming on eth0.
qeth: Outbound TSO not supported on eth0
eth0      Link encap:Ethernet  HWaddr 00:09:6B:1A:47:D7
          inet addr:192.168.5.215  Bcast:192.168.5.255  Mask:255.255.255.0
          inet6 addr: fe80::9:6b03:491a:47d7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:4096  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:84 (84.0 B)  TX bytes:174 (174.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

sit0      Link encap:IPv6-in-IPv4
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
20:01:27 INFO    : inserted /tmp/cdrom.ko
20:01:27 INFO    : load module set done
20:01:27 INFO    : modules to insert scsi_mod sd_mod sr_mod
20:01:28 INFO    : loaded scsi_mod from /modules/modules.cgz
20:01:28 INFO    : loaded sd_mod from /modules/modules.cgz
20:01:28 INFO    : loaded sr_mod from /modules/modules.cgz
SCSI subsystem initialized
20:01:28 INFO    : inserted /tmp/scsi_mod.ko
20:01:28 INFO    : inserted /tmp/sd_mod.ko
20:01:28 INFO    : inserted /tmp/sr_mod.ko
20:01:28 INFO    : load module set done
20:01:28 INFO    : modules to insert scsi_transport_fc zfcp
20:01:28 INFO    : loaded scsi_transport_fc from /modules/modules.cgz
20:01:28 INFO    : loaded zfcp from /modules/modules.cgz
20:01:28 INFO    : inserted /tmp/scsi_transport_fc.ko
20:01:28 INFO    : inserted /tmp/zfcp.ko
20:01:28 INFO    : load module set done
20:01:28 INFO    : modules to insert scsi_transport_iscsi libiscsi iscsi_tcp
20:01:28 INFO    : loaded scsi_transport_iscsi from /modules/modules.cgz
20:01:28 INFO    : loaded libiscsi from /modules/modules.cgz
20:01:28 INFO    : loaded iscsi_tcp from /modules/modules.cgz
Loading iSCSI transport class v2.0-724.
20:01:28 INFO    : inserted /tmp/scsi_transport_iscsi.ko
20:01:28 INFO    : inserted /tmp/libiscsi.ko
iscsi: registered transport (tcp)
20:01:28 INFO    : inserted /tmp/iscsi_tcp.ko
20:01:28 INFO    : load module set done
20:01:28 INFO    : modules to insert iscsi_ibft
20:01:28 INFO    : load module set done
20:01:28 INFO    : modules to insert dasd_mod
20:01:28 INFO    : loaded dasd_mod from /modules/modules.cgz
20:01:28 INFO    : inserted /tmp/dasd_mod.ko
20:01:28 INFO    : load module set done
20:01:28 INFO    : modules to insert dasd_diag_mod dasd_fba_mod dasd_eckd_mod
20:01:28 INFO    : loaded dasd_diag_mod from /modules/modules.cgz
20:01:28 INFO    : loaded dasd_fba_mod from /modules/modules.cgz
20:01:28 INFO    : loaded dasd_eckd_mod from /modules/modules.cgz
20:01:28 INFO    : inserted /tmp/dasd_diag_mod.ko
20:01:28 INFO    : inserted /tmp/dasd_fba_mod.ko
dasd(eckd): 0.0.303f: 3390/0A(CU:3990/01) Cyl:3339 Head:15 Sec:224
dasd(eckd): 0.0.303f: (4kB blks): 2404080kB at 48kB/trk compatible disk layout
 dasda:VOL1/  0X303F: dasda1 dasda2
dasd(eckd): 0.0.313f: 3390/0A(CU:3990/01) Cyl:3339 Head:15 Sec:224
dasd(eckd): 0.0.313f: (4kB blks): 2404080kB at 48kB/trk compatible disk layout
 dasdb:VOL1/  0X313F: dasdb1
20:01:29 INFO    : inserted /tmp/dasd_eckd_mod.ko
20:01:29 INFO    : load module set done
20:01:29 INFO    : looking for usb controllers
20:01:29 INFO    : no firewire controller found
20:01:29 INFO    : Trying to detect vendor driver discs
20:01:29 INFO    : found nothing
20:01:29 INFO    : getting kickstart file
20:01:29 INFO    : only have one network device: eth0
20:01:29 INFO    : dnsservers is 172.16.52.28
20:01:29 INFO    : setting specified hostname of z215.z900.redhat.com
20:01:29 INFO    : doing kickstart... setting it up
rtnl_link_change() failure in nl_set_device_mtu(): Operation not supported
20:01:29 ERROR   : unable to set eth0 mtu to 4096 (code -5)
20:01:33 INFO    : file location: http://192.168.33.237/ks/hosts/z215.z900.redha
t.com/ks.cfg
20:01:33 INFO    : transferring http://192.168.33.237//ks/hosts/z215.z900.redhat
.com/ks.cfg to a fd
20:01:33 INFO    : setting up kickstart
20:01:33 INFO    : kickstartFromNfs
20:01:33 INFO    : results of nfs, host is curly.devel.redhat.com, dir is /vol/e
ngineering/devarchive/redhat/nightly/RHEL5.3-Server-20081008.nightly/5/s390x/os,
 opts are '(null)'
20:01:33 INFO    : setting language to en_US.UTF-8
20:01:33 INFO    : need to set up networking
20:01:33 INFO    : going to pick interface
20:01:33 INFO    : going to do getNetConfig
20:01:33 INFO    : dnsservers is 172.16.52.28
20:01:33 INFO    : setting specified hostname of z215.z900.redhat.com
20:01:33 INFO    : doing kickstart... setting it up
rtnl_link_change() failure in nl_set_device_mtu(): Operation not supported
20:01:34 ERROR   : unable to set eth0 mtu to 4096 (code -5)
20:01:38 INFO    : starting to STEP_URL
20:01:38 INFO    : going to do nfsGetSetup
20:01:38 INFO    : host is curly.devel.redhat.com, dir is /vol/engineering/devar
chive/redhat/nightly/RHEL5.3-Server-20081008.nightly/5/s390x/os, opts are '(null
)'
20:01:38 INFO    : mounting nfs path curly.devel.redhat.com:/vol/engineering/dev
archive/redhat/nightly/RHEL5.3-Server-20081008.nightly/5/s390x/os
20:01:39 INFO    : can access /mnt/source/images/stage2.img
20:01:39 INFO    : mntloop loop0 on /mnt/runtime as /mnt/source/images/stage2.im
g fd is 8
Unable to identify CD-ROM format.
VFS: Can't find an ext2 filesystem on dev loop0.
20:01:40 INFO    : got url nfs://mnt/source/.
20:01:40 INFO    : Loading SELinux policy
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
20:01:42 INFO    : inserted /tmp/raid456.ko
20:01:42 INFO    : inserted /tmp/msdos.ko
20:01:42 INFO    : inserted /tmp/jbd2.ko
20:01:42 INFO    : inserted /tmp/crc16.ko
20:01:42 INFO    : inserted /tmp/ext4dev.ko
GFS2 (built Oct  4 2008 00:26:01) installed
20:01:42 INFO    : inserted /tmp/gfs2.ko
Lock_Nolock (built Oct  4 2008 00:26:10) installed
20:01:42 INFO    : inserted /tmp/lock_nolock.ko
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.11.5-ioctl (2007-12-12) initialised: dm-devel@redhat.com
20:01:42 INFO    : inserted /tmp/dm-mod.ko
20:01:42 INFO    : inserted /tmp/dm-zero.ko
20:01:42 INFO    : inserted /tmp/dm-log.ko
20:01:42 INFO    : inserted /tmp/dm-mirror.ko
20:01:42 INFO    : inserted /tmp/dm-snapshot.ko
20:01:42 INFO    : inserted /tmp/scsi_dh.ko
device-mapper: multipath: version 1.0.5 loaded
20:01:42 INFO    : inserted /tmp/dm-multipath.ko
device-mapper: multipath round-robin: version 1.0.0 loaded
20:01:42 INFO    : inserted /tmp/dm-round-robin.ko
device-mapper: multipath emc: version 0.0.3 loaded
20:01:42 INFO    : inserted /tmp/dm-emc.ko
20:01:42 INFO    : inserted /tmp/dm-crypt.ko
20:01:43 INFO    : inserted /tmp/aes_generic.ko
20:01:43 INFO    : inserted /tmp/sha256.ko
20:01:43 INFO    : load module set done
20:01:43 INFO    : looking for USB mouse...
20:01:43 INFO    : NFS install method detected, will use RHupdates/
20:01:43 INFO    : Running anaconda script /usr/bin/anaconda
Running anaconda, the Red Hat Enterprise Linux Server system installer - please
wait...
eth0: no IPv6 routers present
Running pre-install scripts
Please wait while formatting drive dasda...
[-- MARK -- Thu Oct  9 16:05:00 2008]
dasd(eckd): 0.0.303f: (4kB blks): 2404080kB at 48kB/trk compatible disk layout
dasda: detected capacity change from 0 to 2461777920
 dasda:(nonl)/        : dasda1
 dasda:VOL1/  0X303F:
In progress...
Please wait while formatting drive dasdb...
dasd(eckd): 0.0.313f: (4kB blks): 2404080kB at 48kB/trk compatible disk layout
dasdb: detected capacity change from 0 to 2461777920
 dasdb:(nonl)/        : dasdb1
 dasdb:VOL1/  0X313F:
In progress...
*** glibc detected *** /usr/bin/python: double free or corruption (!prev): 0x000
0000080d55e90 ***

Expected results:
This should install

Additional info:
I may have put this BZ in the wrong bucket (anaconda) But hopefully they will know who to pull in.
Comment 4 Bill Peck 2008-10-10 15:32:38 EDT
I just did an install of RHEL5.3-Server-20081008.nightly on z213 and it worked fine.  So maybe this is an intermittent problem?
Comment 5 David Cantrell 2008-10-10 18:50:41 EDT
Are we calling this test infrastructure or a real bug?  I don't think it's a bug in pyparted.
Comment 6 Jeff Burke 2008-10-10 23:18:18 EDT
I think it is still to early to say. We still have conflicting data. We have see this on several systems now. 
 z215.z900.redhat.com 
 z209.z900.redhat.com
 z206.z900.redhat.com
 z202.z900.redhat.com 

Unless it is unique to certain s/390 systems, or it is a timing issue. 
The system that Bill used for his test was
 z213.z900.redhat.com

The test infrastructure is pretty static. It may deviate a little based on the tests that are selected to run. Some packages may be added to the install but that is about it.

Running a query in RHTS. Bills reserve workflow is the only one that passed.
 http://tinyurl.com/4rq4og
Comment 8 Jeff Burke 2008-10-13 14:01:43 EDT
RHEL5.3-Server-20081012.nightly - Failed

Full log:
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=4634043

<snip>
Running anaconda, the Red Hat Enterprise Linux Server system installer - please
wait...
eth0: no IPv6 routers present
Running pre-install scripts
Please wait while formatting drive dasda...
dasd(eckd): 0.0.303d: (4kB blks): 2404080kB at 48kB/trk compatible disk layout
dasda: detected capacity change from 0 to 2461777920
 dasda:(nonl)/        : dasda1
 dasda:VOL1/  0X303D:
In progress...
.
.
.
.
dasd(eckd): 0.0.363d: (4kB blks): 2404080kB at 48kB/trk compatible disk layout
dasdg: detected capacity change from 0 to 2461777920
 dasdg:(nonl)/        : dasdg1
 dasdg:VOL1/  0X363D:
In progress...
Please wait while formatting drive dasdh...
dasd(eckd): 0.0.373d: (4kB blks): 2404080kB at 48kB/trk compatible disk layout
dasdh: detected capacity change from 0 to 2461777920
 dasdh:(nonl)/        : dasdh1
 dasdh:VOL1/  0X373D:
In progress...
*** glibc detected *** /usr/bin/python: double free or corruption (!prev): 0x000
0000080da5100 ***
</snip>
Comment 10 David Cantrell 2008-10-13 20:26:32 EDT
I'm not sure this is a pyparted bug.  Here is the only instance of free() in pyparted:

void
py_ped_exception_string_clear (void)
{
        if (py_exception_string != NULL) {
                free (py_exception_string);
                py_exception_string = NULL;
        }
}

I think this is a libparted bug.  I seem to remember hitting glibc double free errors under 5.1 or 5.2 (or both) and it being a strange libparted problem.  I'm going to reassign to Joel because I think this is a libparted issue at this point.

Joel this could be pyparted or libparted or both.  I'm not sure yet, but I really don't think it's pyparted after looking through the code.  I think libparted is segfaulting for some other reason.
Comment 11 Joel Andres Granados 2008-10-14 05:26:21 EDT
I really doubt this is a regression related with parted because I reverted all the s390 changes that went into parted.
Could this be an issue with python or glibc?

Brok: Think we can setup a s390 machine to test parted and see how it partitions a dasd drive.  If its a parted bug I would expect to see it while parted makes the partition.

Im going to check the frees in libparted and see which one is not checked.
Comment 12 Bill Peck 2008-10-14 09:04:33 EDT
I watched some s390 installs last night and it doesn't happen everytime.  When it does happen it happens right after all the low level formatting

Please wait while formatting drive dasdg...
dasd(eckd): 0.0.363d: (4kB blks): 2404080kB at 48kB/trk compatible disk layout
 dasdg: unknown partition table
 dasdg:VOL1/  0X363D:

which is where it would be partitioning the drives it just low level formatted.
Comment 13 Bill Peck 2008-10-14 11:22:20 EDT
After talking with Brock we think the biggest difference between RHTS and Kate is that RHTS does an initlabel on all drives before partitioning.  I've scheduled a job on RHTS with 4 recipes doing no initlabel and 4 recipes doing initlabel.

http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=32485

We'll wait for these results and see if there is anything.
Comment 14 Joel Andres Granados 2008-10-14 11:37:17 EDT
Bill:
comment 12, Seem very logical from my point of view.  It has yet to reproduce
in parted though.  I still think patching up the dangling frees is the best
thing I can do at this point.

Dcantrell:
commit 5241e33a5193f5d0a73d36853645ca0ac10e8530 for anaconda.  If I'm reading
the diff right, you changed the order in which we call dasdFmt and create the
pedDevice and pedDisk.  Any particular reason for this change?  Do you think we
would gain something changing this order back?

Seeing the reported behavior it probably a race condition between
anaconda/pyparted and parted.  I'll still make the parted changes but I would
still like to know the answer to my quesiton to dcantrell.
Comment 15 Joel Andres Granados 2008-10-14 11:40:28 EDT
(In reply to comment #13)
> After talking with Brock we think the biggest difference between RHTS and Kate
> is that RHTS does an initlabel on all drives before partitioning.  I've

How exactly is this done, with dasdFmt?  What is the exact line?  Its relevant so I can test it with parted.
Comment 16 Bill Peck 2008-10-14 11:47:45 EDT
clearpart --all --initlabel --drives=dasda,dasdb,dasdc,dasdd,dasde,dasdf,dasdg,dasdh

The above is all on  one line.  

clearpart --all

Thats the difference between the two recipes so far.

without the --initlabel arg I've successfully installed 3 out of 4.  Just waiting on the last one now.  Then it will do the installs with the initlabel.  I'm guessing I will have failures when it gets to those recipes.
Comment 17 Bill Peck 2008-10-14 14:49:27 EDT
The RHTS job finished:

4 recipes with no initlabel in them all passed.
2 out of 4 recipes with initlabel showed the double free error.

I've asked Brock to confirm my findings.
Comment 18 David Cantrell 2008-10-14 15:53:10 EDT
(In reply to comment #14)
> Dcantrell:
> commit 5241e33a5193f5d0a73d36853645ca0ac10e8530 for anaconda.  If I'm reading
> the diff right, you changed the order in which we call dasdFmt and create the
> pedDevice and pedDisk.  Any particular reason for this change?  Do you think we
> would gain something changing this order back?

This was a patch to support picking up DASDs with the FBA discipline ("VDSK" devices).  I didn't mean to move around the order of dasdFmt and pedDevice/pedDisk, I think that was by accident.  Here's what I think it should be:


diff --git a/partedUtils.py b/partedUtils.py
index 77b8511..2ed84c6 100644
--- a/partedUtils.py
+++ b/partedUtils.py
@@ -1160,11 +1160,11 @@ class DiskSet:
                 # FIXME: need the right fix for z/VM formatted dasd
                 if rhpl.getArch() == "s390" and drive[:4] == "dasd" and \
                    not self.isDisciplineFBA(drive):
-                    dev = parted.PedDevice.get(deviceFile)
-                    disk = parted.PedDisk.new(dev)
-
                     if self.dasdFmt(drive):
                         raise LabelError, drive
+
+                    dev = parted.PedDevice.get(deviceFile)
+                    disk = parted.PedDisk.new(dev)
                 else:
                     disk = labelDisk(deviceFile)
             except parted.error, msg:


Like you said, make sure we get the dasdFmt() done before we call parted.
Comment 20 Brock Organ 2008-10-14 16:26:49 EDT
I've verified the behavior using the --initlabel line as bill specified in kate ... looks like we've identified the root cause ...
Comment 21 Joel Andres Granados 2008-10-15 04:32:43 EDT
Can you please test with http://jgranado.fedorapeople.org/temp/initlabel.img to see if the problem persists.  This has the change expressed in comment 18.  This plus the changes to the parted free statements should do the trick.
Be sure that the install tree that you are working with has anaconda version 11.1.2.142.  If you have a nightly you should be ok.
Comment 22 Joel Andres Granados 2008-10-15 04:44:31 EDT
Forgot to specify that initlabel.img is an update image and should be used with the update=http....
Comment 23 Joel Andres Granados 2008-10-15 08:01:18 EDT
I have a patch for a couple of places where this *might* be happening.  But I would like to test the anaconda change before building parted with the new changes.

Bpeck:
remeber that its updates=http://..... (dont forget the 's')
do you think we can get a test today?
Comment 24 Bill Peck 2008-10-15 08:40:58 EDT
Job is scheduled now.  I'll let you know in an hour or two..
Comment 25 Bill Peck 2008-10-15 15:23:00 EDT
The updates image you provided worked perfect with --initlabel.
Comment 26 Joel Andres Granados 2008-10-15 15:30:32 EDT
moving this to modified... thx bpeck
Comment 27 Joel Andres Granados 2008-10-15 15:31:09 EDT
oh, and it was an anaconda bug.
Comment 35 errata-xmlrpc 2009-01-20 16:35:44 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0164.html

Note You need to log in before you can comment on or make changes to this bug.