Bug 583906

Summary: Backtrace in clearpart_gui when only uninitialized disks are found
Product: [Fedora] Fedora
Reporter: Jens Tingleff <jensting>
Component: anaconda
Assignee: Hans de Goede <hdegoede>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: rawhide
CC: awilliam, hdegoede, jlaska, jonathan, sandro, sunspot007, vanmeeuwen+fedora, vleolml, vsharapo
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard: anaconda_trace_hash:eddb067acb4d83d2237cc7fdc97d30f6f54feffd21aec33c3d2518dd0a0dd049 AcceptedBlocker
Fixed In Version: anaconda-14.21-1.fc14
Doc Type: Bug Fix
Clone Of:
: 585105 (view as bug list)
Last Closed: 2010-10-19 22:24:07 UTC
Bug Blocks: 538277    
Attachments:
Description | Flags
Attached traceback automatically from anaconda. | none
Attached traceback automatically from anaconda. | none
Attached traceback automatically from anaconda. | none
Proposed Patch | none
lsmod from livecd while installer was running | none
lsmod from dvd while anaconda was running | none
Attached traceback automatically from anaconda. | none
Attached traceback automatically from anaconda. | none
Attached traceback automatically from anaconda. | none

Description Jens Tingleff 2010-04-20 07:39:58 UTC
The following was filed automatically by anaconda:
anaconda 13.37.2 exception report
Traceback (most recent call first):
  File "/usr/lib/anaconda/iw/cleardisks_gui.py", line 146, in getScreen
    self.bootDisk = sorted(names, self.anaconda.storage.compareDisks)[0]
  File "/usr/lib/anaconda/gui.py", line 1393, in setScreen
    new_screen = self.currentWindow.getScreen(anaconda)
  File "/usr/lib/anaconda/gui.py", line 1314, in nextClicked
    self.setScreen ()
IndexError: list index out of range
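
Editorial note: the traceback above comes down to taking element [0] of a sorted list that turned out to be empty. The following is only a minimal sketch of that failure mode and of one possible guard; it is not the patch that went into anaconda. The names compare_disks and pick_boot_disk are hypothetical, and the sketch is written for Python 3 (using functools.cmp_to_key), whereas the anaconda code in the traceback passes the comparison function positionally in the Python 2 style.

```python
from functools import cmp_to_key


def compare_disks(a, b):
    """Hypothetical stand-in for storage.compareDisks: order disks by name."""
    return (a > b) - (a < b)


def pick_boot_disk(names):
    """Return the first disk in sort order, or None when no disks are left.

    Indexing sorted(names)[0] directly raises IndexError on an empty list,
    which is the condition reported in this bug.
    """
    ordered = sorted(names, key=cmp_to_key(compare_disks))
    if not ordered:
        return None  # caller can show a "no usable disks" message instead
    return ordered[0]
```

With this guard, a caller can check for None and tell the user that no disks are available, rather than letting the exception escape to the exception handler.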

Comment 1 Jens Tingleff 2010-04-20 07:40:03 UTC
Created attachment 407748 [details]
Attached traceback automatically from anaconda.

Comment 2 Jens Tingleff 2010-04-20 07:46:16 UTC
This is for an install onto a single Intel Matrix RAID10 array. There are three Windows partitions and a slightly confused picture regarding the rest of the partitions (i.e. another distribution failed to install on the free space and I haven't inspected the partition table since).

Basically, after going into non-trivial disk layout, I'm offered the RAID drive at which point it all goes wrong.

Best Regards

Jens

Comment 3 Hans de Goede 2010-04-20 09:50:03 UTC
Jens,

Hi, looking closely at the logs it seems you were told the disk had an invalid disklabel and were asked if you wanted to re-initialize and you then clicked no.

Is this correct, in other words did you get a question to re-initialize the raidset and
did you click no?

I believe you did and this resulted in 0 disks being available to install to, which resulted in the traceback you posted.

So it seems we have 2 issues here:
1) your partition table is damaged
2) when there is only 1 "disk", and its disklabel is not recognized, and
   the user chooses to ignore it, we traceback.

2) is something which we can fix in anaconda. 1) is something you'll have
to fix yourself. Try switching to tty2 (Ctrl+Alt+F2) when you get the initialize disk question and then run "fdisk /dev/md127". fdisk may auto-fix some things, and maybe you can see something like a clearly wrong partition which you can remove. Then choose "w" to write the fixed table (danger, back up first!).

Once you've hopefully fixed your disk, verify this by doing
parted /dev/md127 p

If that now reads the table successfully, things are OK. Reboot, and on the next
install attempt you should not get the initialize disk question.

Thanks,

Hans

Comment 4 Jens Tingleff 2010-04-20 20:32:01 UTC
Indeed, that is correct. It's all coming back to me...

Seriously, I had been experimenting with three different distributions, and not started from scratch every time.

I did indeed get a question to re-initialise and thought that it was "merely" asking me if I wanted to start from scratch so I selected "no."

I tried starting from scratch with Fedora (initialised the RAID array, installed Windows, started Fedora installer and defined a mount point for a primary partition) and it worked, i.e. I got installed and can now boot.

So, after fixing the problem with my partition table, the install worked.

Thanks for the prompt help

Best Regards

Jens

Comment 5 Hans de Goede 2010-04-21 08:14:48 UTC
Jens,

Thanks for the prompt answer, that means that the only anaconda issue we have is:

2) when there is only 1 "disk", and its disklabel is not recognized, and
   the user chooses to ignore it, we traceback.

I'll write a fix for that.

Comment 6 Hans de Goede 2010-04-23 08:23:17 UTC
(In reply to comment #5)
> Jens,
> 
> Thanks for the prompt answer, that means that the only anaconda issue we have
> is:
> 
> 2) when there is only 1 "disk", and its disklabel is not recognized, and
>    the user chooses to ignore it, we traceback.
> 
> I'll write a fix for that.    

Fixed in master.

Comment 7 Vassili Leonov 2010-10-06 06:29:21 UTC
Created attachment 451814 [details]
Attached traceback automatically from anaconda.

Comment 8 Sandro Mathys 2010-10-14 18:07:08 UTC
Created attachment 453523 [details]
Attached traceback automatically from anaconda.

Comment 9 Sandro Mathys 2010-10-14 18:24:54 UTC
I seem to just have reproduced this with the F14 Final TC 1.1 KDE i686 livecd.

I think I also have an Intel Matrix RAID controller, and a RAID10 array is active. But I didn't get any disklabel popup that I could choose yes/no from.

Instead, after I chose the specialized storage device, I was told:

"Disks sda, sdb, sdc, sdd contain BIOS RAID metadata, but are not part of any recognized BIOS RAID sets. Ignoring disks sda, sdb, sdc, sdd."

...which is really weird as I just chose that very bios raid in the specialized storage device part. Also, the set was recognized in F14 Alpha/Beta.

So I clicked okay on that popup (only choice) and went ahead. When trying to review the partitioning layout I got the above traceback.

Weird: out of 4 tries I could only reproduce this 3 times. I think I did exactly the same thing all 4 times, though.

Re-assigning/-opening this bug and adding blocker according to this (beta) test case:
https://fedoraproject.org/wiki/QA:Testcase_Install_to_BIOS_RAID

Comment 10 Brian Lane 2010-10-15 16:18:59 UTC
Note that comment 9 is different from the original bug. With anaconda 14.19 and an uninitialized disk it does not backtrace if you choose not to initialize the disk. It shows the dialog that hansg added, so the problem in comment 9 must be taking a different path to hit the traceback.

Comment 11 Adam Williamson 2010-10-15 16:32:40 UTC
Reviewed at the blocker meeting of 2010/10/15. Accepted as a blocker under (Beta) criterion "The rescue mode of the installer must be able to detect and mount (read-write and read-only) LVM, encrypted, and RAID (BIOS, hardware, and software) installations". Anaconda team to investigate. The patch for the initial issue is in the F14 tree, so the current reporter must be hitting it via a different path, as Brian says.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 12 Brian Lane 2010-10-15 17:28:02 UTC
I think I have a patch for the traceback, but I don't have any experience with BIOS RAID so I do not know what potential side-effects the fix could have. Sandro reports that the patch does stop the traceback but that the install won't proceed (because there is nothing to install to). Maybe we need to open a separate bug for the BIOS RAID issue.

I'd like to get hansg's input on this.

Comment 13 Brian Lane 2010-10-15 17:28:49 UTC
Created attachment 453762 [details]
Proposed Patch

Comment 14 Hans de Goede 2010-10-15 17:43:02 UTC
Quoting my reply to the patch on anaconda-devel list (-:

This looks correct to me. Hidden formats are either raid or multipath
members. If those are present, a raid-set (which in this case counts
as a disk, as we're talking firmware raid here, not regular mdraid) or
a multipath disk (rather than a member) should also be present. If not,
then indeed we have no usable disks.

Ack.

Regards,

Hans
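
Editorial note: the reasoning in this comment can be sketched as a simple filter. This is an illustration under stated assumptions, not the proposed patch and not anaconda's storage API: the Device class, its fields, and usable_disks are all hypothetical names. The assumption is that member disks of a firmware RAID set or multipath device carry a "hidden" format, while the assembled set itself does not.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical stand-in for a storage device; not anaconda's class."""
    name: str
    is_disk: bool
    format_hidden: bool  # True for bare firmware-RAID or multipath members


def usable_disks(devices):
    """Keep only disks whose format is not hidden.

    If only member disks were detected and the assembled set is missing,
    the result is empty, which means there is nothing to install to.
    """
    return [d for d in devices if d.is_disk and not d.format_hidden]
```

For example, four member disks sda to sdd without a recognized set would yield an empty list, while the same members plus an assembled set such as md127 would yield just the set.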

Comment 15 Sandro Mathys 2010-10-15 18:20:05 UTC
Created attachment 453773 [details]
lsmod from livecd while installer was running

Comment 16 Sandro Mathys 2010-10-15 18:20:40 UTC
Created attachment 453774 [details]
lsmod from dvd while anaconda was running

Comment 17 Sandro Mathys 2010-10-15 18:34:46 UTC
Okay, time for an update from my side after having poked a bit deeper into it and after talking about it with Brian on IRC.

- I can still reproduce this in 3/4 of all tries with the LiveCD
- I can NOT reproduce this at all with the DVD

Therefore Brian asked me to attach the different lsmod outputs from both media which I just did.

Let me make my report of how this currently runs a bit more detailed.

- There's 4 disks in this system, sd{a,b,c,d} - and only those four
- All are 100% consumed in the bios raid with level 10
- The bios raid storage is shown in the installer in the specialized storage tab
- The bios raid storage works (there's windows on it and the dvd installer can write to it)

Sequence:
1) I choose the bios raid storage from the specialized storage tab
2) I get an error msg about my single disks:
"Disks sda, sdb, sdc, sdd contain BIOS RAID metadata, but are not part of any
recognized BIOS RAID sets. Ignoring disks sda, sdb, sdc, sdd."
3) if I continue I'll get the traceback once I try to review the partitioning layout

The fix Brian mentioned steps in AFTER 2) and shows "no disks found" with only "back" and "exit" as options, even if I chose the bios raid storage before.

So while Brian's fix might fix the traceback, it doesn't solve my two problems:
- error message about the single disks is shown
- can't install on bios raid

If there's anything more I can try kindly let me know.

Comment 18 Chris Lumens 2010-10-18 16:25:16 UTC
The lsmod differences are largely iscsi-related, though the livecd does not have the linear module loaded, which might be important:

--- dvd 2010-10-18 12:22:13.000000000 -0400
+++ livecd      2010-10-18 12:22:19.000000000 -0400
@@ -1,35 +1,36 @@
-aes_generic
+acpi_cpufreq
 async_memcpy
 async_pq
 async_raid6_recov
 async_tx
 async_xor
 cbc
-cramfs
+cpufreq_ondemand
 dm_crypt
 dm_multipath
 dm_round_robin
 drm
 drm_kms_helper
-edd
-exportfs
-ext2
 fat
 fcoe
 gf128mul
 i2c_algo_bit
 i2c_core
+i2c_i801
+iTCO_vendor_support
+iTCO_wdt
+ip6_tables
+ip6t_REJECT
+ip6table_filter
 ipv6
-iscsi_ibft
-iscsi_tcp
+joydev
 libfc
 libfcoe
-libiscsi
-libiscsi_tcp
-linear
 lrw
+microcode
 mii
-pcspkr
+mperf
+nf_conntrack_ipv6
 r8169
 radeon
 raid0
@@ -39,11 +40,26 @@
 raid6_pq
 scsi_tgt
 scsi_transport_fc
-scsi_transport_iscsi
+serio_raw
 sha256_generic
+snd
+snd_hda_codec
+snd_hda_codec_atihdmi
+snd_hda_codec_realtek
+snd_hda_intel
+snd_hwdep
+snd_page_alloc
+snd_pcm
+snd_seq
+snd_seq_device
+snd_timer
+soundcore
 squashfs
+sunrpc
 ttm
+uinput
+usb_storage
 vfat
-xfs
+xhci_hcd
 xor
 xts
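
Editorial note: a comparison like the one above can also be produced directly from two saved lsmod outputs. The following is a minimal sketch, assuming each input is the plain text printed by lsmod (a "Module  Size  Used by" header followed by one module per line). The function names module_names and compare_modules are hypothetical and are not part of any tool used in this bug.

```python
def module_names(lsmod_output):
    """Extract the set of module names from the text printed by lsmod."""
    names = set()
    for line in lsmod_output.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def compare_modules(dvd_output, livecd_output):
    """Return (only on DVD, only on live CD) as two sorted lists of names."""
    dvd = module_names(dvd_output)
    livecd = module_names(livecd_output)
    return sorted(dvd - livecd), sorted(livecd - dvd)
```

Comparing sets of names rather than raw lines ignores the size and use-count columns, which differ between boots and would otherwise show up as noise.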

Comment 19 Chris Lumens 2010-10-18 16:28:59 UTC
Just to rule that difference out, I'd try modprobing that module before clicking on the live installer icon.

Comment 20 Sandro Mathys 2010-10-18 17:02:39 UTC
modprobe linear just before starting the installer did not change the behaviour.

As agreed on today's blocker review meeting, I opened a new bug (#643967) for the remaining issues as the original problem (the traceback) was treated with a working patch.

I'm not sure what the current state of that patch is, so I'll leave it to bcl or clumens to change the status of this bug accordingly.

Comment 21 James Laska 2010-10-18 20:41:59 UTC
(In reply to comment #20)
> modprobe linear just before starting the installer did not change the
> behaviour.
> 
> As agreed on today's blocker review meeting, I opened a new bug (#643967) for
> the remaining issues as the original problem (the traceback) was treated with a
> working patch.

Thanks, moving this bug to MODIFIED.  This should be picked up by anaconda-14.21-1 which will be available shortly.  As you note above, we can track the remaining BIOS RAID issue as bug#643967.

Comment 22 Fedora Update System 2010-10-19 03:22:30 UTC
anaconda-14.21-1.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/anaconda-14.21-1.fc14

Comment 23 Fedora Update System 2010-10-19 09:06:52 UTC
anaconda-14.21-1.fc14 has been pushed to the Fedora 14 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update anaconda'
You can provide feedback for this update here: https://admin.fedoraproject.org/updates/anaconda-14.21-1.fc14

Comment 24 Fedora Update System 2010-10-19 22:23:23 UTC
anaconda-14.21-1.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 25 Vasiliy Sharapov 2011-03-03 23:39:35 UTC
Created attachment 482178 [details]
Attached traceback automatically from anaconda.

Comment 26 sunspot007 2011-08-01 18:04:01 UTC
Created attachment 516188 [details]
Attached traceback automatically from anaconda.

Comment 27 sunspot007 2011-08-06 18:20:35 UTC
Created attachment 517011 [details]
Attached traceback automatically from anaconda.