Bug 617554

Summary:	anaconda traceback when installing onto a system where all disks have a whole disk format
Product:	Red Hat Enterprise Linux 6	Reporter:	Hans de Goede <hdegoede>
Component:	anaconda	Assignee:	David Cantrell <dcantrell>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Release Test Team <release-test-team-automation>
Severity:	medium	Docs Contact:
Priority:	low
Version:	6.0	CC:	atodorov, borgan, hdegoede, jchadima, mganisin, syeghiay
Target Milestone:	rc
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	anaconda-13.21.62-1	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:	617438	Environment:
Last Closed:	2010-11-10 19:52:05 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Hans de Goede 2010-07-23 12:18:25 UTC

+++ This bug was initially created as a clone of Bug #617438 +++

Description of problem:
Trying to upgrade an existing system running CentOS 5 to RHEL 6.

Version-Release number of selected component (if applicable):
RHEL6.0-20100715.2-Workstation-x86_64-DVD1.iso

How reproducible:
Always

Steps to Reproduce:
1. boot install dvd
2. try to partition the disks

Actual results:
Various kinds of errors occur; there is no way to partition the disks.

Expected results:
Installed system :)

Additional info:
The system is x86_64 with a 2-core CPU, 8 GB of RAM, and two 1 TB HDDs with the guid tables.

--- Additional comment from jchadima on 2010-07-22 23:14:26 EDT ---

The option "use entire hard disks" causes a Python exception.
The option "use linux partitions" causes a "there is no room on the discs" message.
Manual partitioning shows the disks and lets me select among them, but they are of unknown type and cannot be modified.

Currently the disks are in a RAID 1 array running CentOS 5.

--- Additional comment from pm-rhel on 2010-07-22 23:22:31 EDT ---

Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from pm-rhel on 2010-07-22 23:38:26 EDT ---

This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

--- Additional comment from clumens on 2010-07-23 00:20:31 EDT ---

Please attach the complete /tmp/anaconda-tb-* file you are seeing to this bug report.  In addition, titles like "anaconda still unusable" are highly subjective and downright incorrect.  Plenty of people are using anaconda every day to do installs of RHEL6.

Also please note that upgrades from RHEL5 to RHEL6 are not supported, let alone from Centos 5 to RHEL6.  But without the exact details of what you are trying to do and how it's failing, it's impossible for me to be any more specific than this.

--- Additional comment from jchadima on 2010-07-23 02:30:56 EDT ---

I tried to install RHEL 6 on a box known to be running a RHEL 5-like system.

--- Additional comment from jchadima on 2010-07-23 03:25:02 EDT ---

Created an attachment (id=433886)
traceback in the case of "use entire disk"

--- Additional comment from jchadima on 2010-07-23 03:27:38 EDT ---

Created an attachment (id=433888)
attached discs

--- Additional comment from jchadima on 2010-07-23 03:29:58 EDT ---

Created an attachment (id=433889)
automatic partitioning

--- Additional comment from jchadima on 2010-07-23 03:32:15 EDT ---

Created an attachment (id=433890)
manual partitioning

--- Additional comment from jchadima on 2010-07-23 03:33:15 EDT ---

fdisk -l from the running system


Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          32      257008+  83  Linux
/dev/sda2              33      121601   976502992+  fd  Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          32      257008+  83  Linux
/dev/sdb2              33      121601   976502992+  fd  Linux raid autodetect

Disk /dev/md0: 999.9 GB, 999938981888 bytes
2 heads, 4 sectors/track, 244125728 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table

--- Additional comment from hdegoede on 2010-07-23 05:17:36 EDT ---

Hi Jan,

When anaconda backtraces it writes a file with all the logs we need to the ramdisk from which anaconda is running. This file can be found under the name of
/tmp/anaconda-tb-*

Please try another install and when anaconda backtraces switch to the shell at tty2 (ctrl + alt + F2) and collect the logfile (you can for example scp it out of the installer environment), then attach the logfile here.

The screenshots are of limited use as they contain only a very small subset of all the information we need.

--- Additional comment from jchadima on 2010-07-23 07:28:47 EDT ---

Created an attachment (id=433922)
anaconda log

This is only one case of failure; there are still 2 others with no backtrace...

--- Additional comment from hdegoede on 2010-07-23 07:44:36 EDT ---

Jan,

Thanks for the logs!

I see the following in the logs:
11:17:00,959 DEBUG   : type detected on 'sda' is 'hpt45x_raid_member'
11:17:00,959 DEBUG   : type detected on 'sdb' is 'hpt45x_raid_member'

So these 2 disks were once part of a BIOS RAID / firmware RAID set. From there on things start falling apart. Part of the problem is that anaconda does not properly recognize 'hpt45x_raid_member' as meaning the disks are (or were) part of a BIOS RAID set.

Let me try to explain:

1) When choosing "use entire disk", anaconda will replace the hpt45x_raid_member "format" with a normal partition table, so this option should work. Instead you get the attached backtrace, which is a bug. I'll clone this bug for this part of the issues you are seeing.

2) When choosing "remove linux partitions", no linux partitions to remove are found, hence the "no free space" message. This is expected.

3) When doing manual partitioning, you cannot edit the disks as they have what we call a whole disk format (the entire disks are recognized as raid set members), thus you cannot edit them.

So summarizing I see 2 issues here:

1) anaconda does not recognize hpt45x_raid_member as meaning BIOS RAID member
2) anaconda backtraces when choosing "use entire disk" and all disks in the system have a whole disk format

I'll post an updates.img for you to test to fix 1, and clone this bug for 2.
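
To see at a glance which format type was detected on each disk, the DEBUG lines quoted above can be filtered out of the log. This is a minimal sketch, assuming the log lines look exactly like the two quoted in this comment; the function name `detected_types` is made up for illustration and is not part of anaconda.

```python
import re

# Matches lines such as:
# 11:17:00,959 DEBUG   : type detected on 'sda' is 'hpt45x_raid_member'
TYPE_LINE = re.compile(r"type detected on '([^']+)' is '([^']+)'")


def detected_types(log_text):
    """Return {device name: detected format type} found in the log text.

    Hypothetical helper for reading a log, not anaconda code.
    """
    result = {}
    for line in log_text.splitlines():
        match = TYPE_LINE.search(line)
        if match:
            result[match.group(1)] = match.group(2)
    return result


sample = """\
11:17:00,959 DEBUG   : type detected on 'sda' is 'hpt45x_raid_member'
11:17:00,959 DEBUG   : type detected on 'sdb' is 'hpt45x_raid_member'
"""

for name, fmt in sorted(detected_types(sample).items()):
    print(name, fmt)
```

If every disk in the result carries a RAID member or filesystem type rather than a partition table, the system matches the "all disks have a whole disk format" condition from the summary.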

Comment 1 Hans de Goede 2010-07-23 12:22:39 UTC

Dave,

I think this happens whenever we encounter a system where all disks have a whole disk format, and thus storage.anaconda.id.bootloader.drivelist is empty.

Here is the traceback from:
https://bugzilla.redhat.com/attachment.cgi?id=433922

anaconda 13.21.56 exception report
Traceback (most recent call first):
  File "/usr/lib/anaconda/storage/partitioning.py", line 962, in allocatePartitions
    if disk.name == storage.anaconda.id.bootloader.drivelist[0]:
  File "/usr/lib/anaconda/storage/partitioning.py", line 856, in doPartitioning
    allocatePartitions(storage, disks, partitions, free)
  File "/usr/lib/anaconda/storage/partitioning.py", line 223, in doAutoPartition
    exclusiveDisks=anaconda.id.storage.clearPartDisks)
  File "/usr/lib/anaconda/dispatch.py", line 208, in moveStep
    rc = stepFunc(self.anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 126, in gotoNext
    self.moveStep()
  File "/usr/lib/anaconda/gui.py", line 1338, in nextClicked
    self.anaconda.dispatch.gotoNext()
IndexError: list index out of range

Regards,

Hans
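
The failure mode in the traceback can be shown outside of anaconda. The following is a minimal sketch, assuming only what the traceback shows: a loop over disks that compares each name against `drivelist[0]`. The `Disk` class and the `find_boot_index` function are stand-ins invented for illustration, not the real anaconda objects.

```python
class Disk:
    """Stand-in for a storage device; only the name attribute matters here."""

    def __init__(self, name):
        self.name = name


def find_boot_index(req_disks, drivelist):
    """Mimic the unguarded comparison from the traceback."""
    boot_index = None
    for disk in req_disks:
        # drivelist[0] raises IndexError as soon as drivelist is empty
        if disk.name == drivelist[0]:
            boot_index = req_disks.index(disk)
    return boot_index


disks = [Disk("sda"), Disk("sdb")]

# Normal case: the first bootloader drive is found among the disks.
print(find_boot_index(disks, ["sda", "sdb"]))

# All disks have a whole disk format, so the drive list is empty.
try:
    find_boot_index(disks, [])
except IndexError as exc:
    print("IndexError:", exc)
```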

Comment 2 Hans de Goede 2010-07-23 12:28:17 UTC

Oh, I forgot to mention: this traceback happens when choosing automatic partitioning with the "use entire disk" option.

Comment 3 David Cantrell 2010-07-27 01:03:10 UTC

Could this just be:

diff --git a/storage/partitioning.py b/storage/partitioning.py
index e173909..4522008 100644
--- a/storage/partitioning.py
+++ b/storage/partitioning.py
@@ -967,7 +967,8 @@ def allocatePartitions(storage, disks, partitions, freespace):
         req_disks.sort(key=lambda d: d.name, cmp=storage.compareDisks)
         boot_index = None
         for disk in req_disks:
-            if disk.name == storage.anaconda.id.bootloader.drivelist[0]:
+            if disk.name in storage.anaconda.id.bootloader.drivelist and \
+               disk.name == storage.anaconda.id.bootloader.drivelist[0]:
                 boot_index = req_disks.index(disk)
 
         if boot_index is not None and len(req_disks) > 1:


?

Comment 4 Hans de Goede 2010-07-27 07:50:10 UTC

(In reply to comment #3)
> Could this just be:
> 
> diff --git a/storage/partitioning.py b/storage/partitioning.py
> index e173909..4522008 100644
> --- a/storage/partitioning.py
> +++ b/storage/partitioning.py
> @@ -967,7 +967,8 @@ def allocatePartitions(storage, disks, partitions,
> freespace):
>          req_disks.sort(key=lambda d: d.name, cmp=storage.compareDisks)
>          boot_index = None
>          for disk in req_disks:
> -            if disk.name == storage.anaconda.id.bootloader.drivelist[0]:
> +            if disk.name in storage.anaconda.id.bootloader.drivelist and \
> +               disk.name == storage.anaconda.id.bootloader.drivelist[0]:
>                  boot_index = req_disks.index(disk)
> 
>          if boot_index is not None and len(req_disks) > 1:
> 
> 
> ?    

Hi,

I'm not sure that this is the entire solution. Yes this will stop the traceback, but I wonder if the code will eventually still find a disk to put /boot on when drivelist is empty.

I think that besides this patch we may also need to update the bootloader drivelist after clearpartitioning has run in the case where whole disk formats have been converted to disklabels (while keeping / honouring the user selected boot disk).

Regards,

Hans
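
The concern in this comment can be illustrated with a small sketch. It assumes plain lists of disk names instead of the real anaconda objects, and `refresh_drivelist` is a hypothetical helper invented here to show the idea of rebuilding the drive list; it is not the change that was committed.

```python
def find_boot_index(req_disk_names, drivelist):
    """Guarded lookup: index of the first bootloader drive, or None."""
    boot_index = None
    for index, name in enumerate(req_disk_names):
        # An empty drivelist is falsy, so drivelist[0] is never evaluated.
        if drivelist and name == drivelist[0]:
            boot_index = index
    return boot_index


def refresh_drivelist(drivelist, partitionable_disks, preferred=None):
    """Hypothetical: rebuild the drive list once whole disk formats have
    been replaced by disklabels, keeping a user-selected boot disk first."""
    names = list(drivelist)
    for name in partitionable_disks:
        if name not in names:
            names.append(name)
    if preferred in names:
        names.remove(preferred)
        names.insert(0, preferred)
    return names


disks = ["sda", "sdb"]

# The guard stops the traceback, but no boot disk is identified.
print(find_boot_index(disks, []))

# After a refresh the lookup has something to match against.
drivelist = refresh_drivelist([], disks, preferred="sdb")
print(drivelist, find_boot_index(disks, drivelist))
```

With the guard alone the lookup silently returns None for an empty drive list, which is why the traceback disappears while the question of where /boot ends up stays open.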

Comment 6 Alexander Todorov 2010-08-03 12:13:20 UTC

(In reply to comment #1)
> Dave,
> 
> I think this happens whenever we encounter a system where all disks have a
> whole disk format, and thus storage.anaconda.id.bootloader.drivelist is empty.
> 

Hi Dave,
how can I cause a disk to have whole disk format so I can test this?

Comment 7 David Cantrell 2010-08-03 20:38:52 UTC

(In reply to comment #6)
> (In reply to comment #1)
> > Dave,
> > 
> > I think this happens whenever we encounter a system where all disks have a
> > whole disk format, and thus storage.anaconda.id.bootloader.drivelist is empty.
> > 
> 
> Hi Dave,
> how can I cause a disk to have whole disk format so I can test this?    

The original problem was reported for RAID devices where the RAID members were whole disk devices, such as /dev/sda and /dev/sdb, rather than having partitions on those devices as the raidset members.

This bug could probably also be triggered if you take one disk in the system and format it as ext3.  Such as:

    dd if=/dev/zero of=/dev/sda bs=1k count=20k
    mke2fs -j -v /dev/sda

Blank out the beginning of the disk and then just make the entire disk a single filesystem.  I am only guessing this second method works, but if it does, it could save some time by not having to create a RAID device before running the installer.

NOTE:  The whole disk format has to exist before you start up the installer.  The bug happens based on what anaconda sees on the target disk(s) during installation.
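
Before booting the installer it can help to confirm that a disk really has no partition table and carries a filesystem on the whole device. The sketch below is a simplification and not anaconda's own detection logic: it only checks for the 0x55 0xAA boot signature at the end of the first sector and for the ext2/3/4 superblock magic number (0xEF53, stored little-endian at byte offset 1080). The name `probe_start` is made up for illustration.

```python
import struct

MBR_SIGNATURE_OFFSET = 510      # last two bytes of the first 512-byte sector
EXT_MAGIC_OFFSET = 1024 + 56    # s_magic field inside the ext2/3/4 superblock
EXT_MAGIC = 0xEF53


def probe_start(data):
    """Return (has_mbr_signature, has_ext_magic) for the first bytes of a device."""
    signature = data[MBR_SIGNATURE_OFFSET:MBR_SIGNATURE_OFFSET + 2]
    has_mbr = bytes(signature) == b"\x55\xaa"
    magic = data[EXT_MAGIC_OFFSET:EXT_MAGIC_OFFSET + 2]
    has_ext = len(magic) == 2 and struct.unpack("<H", magic)[0] == EXT_MAGIC
    return has_mbr, has_ext


# Synthetic example: a zeroed start (as after the dd step) plus an ext magic number.
image = bytearray(2048)
struct.pack_into("<H", image, EXT_MAGIC_OFFSET, EXT_MAGIC)
print(probe_start(bytes(image)))
```

To inspect a real device, read its first 2048 bytes in binary mode (this normally requires root) and pass them to `probe_start`.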

Comment 8 Alexander Todorov 2010-08-10 13:01:08 UTC

I wasn't able to reproduce with a whole vdb disk formatted with ext4 and anaconda 13.21.50 :(

Comment 9 Hans de Goede 2010-08-10 13:26:50 UTC

(In reply to comment #8)
> I wasn't able to reproduce with a whole vdb disk formatted with ext4 and
> anaconda 13.21.50 :(

I think the bug was introduced pretty recently, try 13.21.60.

Comment 10 Alexander Todorov 2010-08-11 11:41:03 UTC

Couldn't reproduce with 13.21.60 either.

Comment 11 Hans de Goede 2010-08-11 11:52:49 UTC

(In reply to comment #10)
> Couldn't reproduce with 13.21.60 either.

Bummer, I hit this when doing some other testing. AFAIK the direct cause for me hitting this backtrace was bug 620359, which caused my Intel firmware RAID set to not be recognized. So all anaconda saw were two RAID set members, which resulted in the bootloader drive list being empty, but the same should happen with whole disk formats.

Oh wait, you say you are making vdb a whole disk format, so vda is still partitioned?

Note that to reproduce *all* disks must have a whole disk format, see the summary.

Comment 13 Alexander Todorov 2010-08-11 13:30:07 UTC

After giving both disks a whole disk format I managed to reproduce, and then tested with the latest snapshot. No traceback. Installation completed. Moving to VERIFIED.

Comment 14 releng-rhel@redhat.com 2010-11-10 19:52:05 UTC

Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.