Bug 643967
Summary: | LiveCD fails to use BIOS RAID, DVD works
---|---
Product: | [Fedora] Fedora
Component: | anaconda
Version: | 14
Status: | CLOSED WONTFIX
Severity: | medium
Priority: | low
Hardware: | All
OS: | Linux
Reporter: | Sandro Mathys <sandro>
Assignee: | Anaconda Maintenance Team <anaconda-maint-list>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | anaconda-maint-list, awilliam, dcantrell, hdegoede, jonathan, vanmeeuwen+fedora
Doc Type: | Bug Fix
Last Closed: | 2011-12-21 18:02:14 UTC
Description
Sandro Mathys
2010-10-18 16:57:35 UTC
Blocking F14Blocker after failing this test case: https://fedoraproject.org/wiki/QA:Testcase_Install_to_BIOS_RAID

Created attachment 454194
anaconda-logs.tgz (/tmp/*log*)

I've tested and reproduced this failure using the F-14-TC1-Desktop-x86_64 live image (with anaconda-14.20-1 and python-pyblock-0.49-2 installed). This system uses an nVidia RAID controller.

Created attachment 454212
anaconda-logs.tgz (/tmp/*log*) -- using anaconda-14.20-1, lvm2

Just retested using a custom-built x86_64 live image containing the following updates:
* kernel-2.6.35.6-44.fc14.x86_64
* anaconda-14.20-1.fc14.x86_64
* lvm2-2.02.73-3.fc14.x86_64
* python-pyblock-0.51-1.fc14.x86_64
The problem remains :(

Just voicing opinion here, I think it might be acceptable to common bug the fact that you can't do (some?) BIOS RAID installs from live images, and instead need to use the DVD. I'm going to test my dmraid set here, which has been pretty solid, just for more data points, but as it stands, I'd classify this as Nice to Have as opposed to release blocker.

So I've been testing with TC1, and it has some interesting results. After a fresh creation of the disk array, I can install to it with the Desktop Live image, reboot and things are happy. If I try subsequent installs with the live image, things go south. There is no traceback, and it detects the array OK, but when I try to install to the array, even using the whole disk, the installer reports that there is not enough free space on the device to perform the installation. So, I still think we can just common bugs this.

Hi all,

As the former anaconda BIOS RAID person and someone who clearly still cares about Fedora and anaconda, I've taken the liberty to look into the log attached here and attached to bug 583906. So there are 2 completely different failures being seen here.

1) Looking at the logs from jlaska's first installation attempt on the nvidia-raid set, there is the following:

18:52:08,150 DEBUG storage: device 'nvidia_cgfidbddp2' not in exclusiveDisks
18:52:08,150 DEBUG storage: ignoring nvidia_cgfidbddp2 (/devices/virtual/block/dm-4)

jlaska, it looks like you went the advanced / server storage route and found a bug there with dmraid-based BIOS RAID: even when the raid set is in exclusiveDisks, we wrongly ignore the partitions on the raid set. That would be bug 1.

2) Looking at the first logs of the installation attempted in bug 583906, it says:

devices to scan for multipath: ['sda', 'sda1', 'sda2', 'sda3', 'sda4', 'sdb', 'sdb1', 'sdb2', 'sdb3', 'sdb4', 'sdc', 'sdc1', 'sdc2', 'sdc5', 'sdc6', 'sdd', 'sdd1', 'sdd2', 'sdd5', 'sdd6', 'sr0']

Note there are no md* devices there, so the livecd failed to activate the raid set on boot. This means that either there is an issue / incompatibility between the set and mdraid, but then I would expect DVD installs to also fail, or there is an issue with the initscripts / udev rules which makes them fail to bring up the set. Note that once anaconda is started, it is way too late to bring up the raid set, as quite likely some partitions on the raw disks are already in use directly, making the disks busy and thus making it impossible to activate the raid set at this point. IOW I believe this is not an anaconda bug.

So I think this bug should be split. Given that the logs for the nvidia issue (1) are attached here, I think it would be best to open a second bug for the failure to activate mdraid-based BIOS RAID arrays from the livecd.

Regards,

Hans

Note that we still have rd_NO_MD and rd_NO_DM in the boot parameters for the live CD. That's one related difference between the live image and the DVD that I can think of. We do this specifically so the arrays won't be constructed at boot time, as we felt that wasn't appropriate for the live environment (generally the live environment shouldn't touch permanent storage on the system until you expressly tell it to). We don't have 'noismwraid' (or whatever exactly it is) any more, though.
--
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

dlehman has a proposed patch for this: http://fpaste.org/S0GQ/ Sandro will test soon.

The proposed patch is for James' bug, the one Hans calls 'bug 1', not Sandro's bug, which Hans calls 'bug 2'.
James, can you please test the patch when you get time? Thanks.

Removing the rd_NO_MD and rd_NO_DM options on the boot line didn't change the behavior.

I think I just hit something, though. My BIOS RAID is /dev/md127, but for an unknown reason the livecd seems to create /dev/md0 sometimes. I currently *think*:
- if md0 is present, then I hit the issue and md127 isn't created
- if md0 is not present, I do NOT hit the issue and md127 is created

It's just that md0 shows up, vanishes, shows up again... with no reboot, i.e. while running the livecd. I haven't found out yet what I do to create/remove it, i.e. what triggers md0 showing up/vanishing. Does anyone have a guess what is responsible for md0 (instead of md127)?

Created attachment 454420
/var/log/messages showing md0 and errors
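
For anyone trying to pin down when md0 and md127 appear or vanish, a small helper like the following could be run between steps in the live session. It is only a sketch: it assumes the usual `/proc/mdstat` layout in which each array line starts with `mdN : <state>`, and the function names are made up for illustration rather than taken from anaconda.

```python
import re

# Matches array lines such as "md127 : inactive sda[1](S) sdb[0](S)".
_MD_LINE = re.compile(r"^(md\d+)\s*:\s*(\w+)", re.MULTILINE)


def md_array_states(mdstat_text):
    """Map each mdN array named in /proc/mdstat-style text to its state word."""
    return {name: state for name, state in _MD_LINE.findall(mdstat_text)}


def read_mdstat(path="/proc/mdstat"):
    """Read the kernel's md status file (Linux only; path is the standard one)."""
    with open(path) as f:
        return f.read()

# Typical use on the live system (not executed here):
#   print(md_array_states(read_mdstat()))
```

Comparing the result right after boot, after the first anaconda run, and after cancelling shows which arrays changed state at each step.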

I split jlaska's bug off as https://bugzilla.redhat.com/show_bug.cgi?id=644616 . This report should be for Sandro's bug. Jesse can report another, if he likes, as his issue is clearly not jlaska's or Sandro's :)

hansg, adamw and I were able to find the root cause and a valid workaround for this bug (i.e. my bug, not James', which seems to be a different one after all). Because there's an easy workaround available, we decided to lift blocker status.

#####

Let me first mention the WORKAROUND here: DON'T do anything that might mount, write to or activate your disks/raid arrays before starting or while running the installer. This includes not starting the installer again after it has already been run once. If you violated this rule, reboot before you start the installer!

#####

Now let me try to explain what actually happens here :)

Because I have a raid10, there's both /dev/md0 and /dev/md127. Both are necessary and valid.

1) Now if I boot the livecd, NONE of them are activated.
2) If I start the installer, both are still missing. If I get to the point where the devices are examined, BOTH are activated. This means I do NOT hit the issue here.
3) If I now cancel the installation, BOTH are still activated. (note: this is very different from 1)
4) If I start the installer, ONE is DEactivated, the other is still activated. If I get to the point where the devices are examined, BOTH are activated (or rather, activation is attempted). But because one is already activated (i.e. the disks it consists of are already busy/in use) this will fail. This means I DO hit the issue here.
5) If I now cancel the installation, still only ONE is activated. (note: different from both 1 and 3)
6) If I start the installer, ONE is DEactivated, and with the other still DEactivated, now NONE of them are activated.
<start at 1 again>

#####

So the above-mentioned workaround makes sure the installation always starts with situation 1, as situation 3 is fatal. The fix still needed in this bug needs to deal with situation 3, i.e. instead of deactivating only one of the two raid arrays, ALL should be deactivated when the installer is started.

hans, you said you had a potential fix for this, right? sandro, did you test it yet? I lose track. If we have a fix for this I'd quite like to take it for RC1 if we can make it in time.

Hans made me test some patch which didn't seem to help - not sure what patch that was exactly, but I think he indeed said it should fix this issue in some way. Hans, please clarify - and re-post the patch if it applies here, thanks.

(In reply to comment #14)
> hans, you said you had a potential fix for this, right? sandro, did you test it
> yet? I lose track. If we have a fix for this I'd quite like to take it for RC1
> if we can make it in time.

Erm, no. dlehman has a patch which should fix the issue with running the installer a second time on the same livecd boot. The much bigger issue of the livecd not activating the RAID sets has not been fixed yet (and I don't know if it has even been looked into yet). This (failing to activate the BIOS RAID sets on livecd boot) should be considered a blocker, as it can cause serious data corruption.

Regards,

Hans

Well, if you fix the bigger issue, the workaround won't work anymore and this minor bug here becomes a blocker again as well :)

Hans, is dlehman's fix in anaconda 14.22 already?

(In reply to comment #17)
> Well, if you fix the bigger issue, the workaround won't work anymore and this
> minor bug here becomes a blocker again as well :)

You may have a valid point there.

> Hans, is dlehman's fix in anaconda 14.22 already?

No.

I can no longer reproduce any of the bugs mentioned in this report with F14 Final TC6 i686 Gnome LiveCD.
As a reminder: so far I used a F14 Final TC1.1 i686 KDE LiveCD. So either this issue vanished between TC1.1 and TC6 or it's KDE-only. Unfortunately, there's no TC6 KDE spin available to verify the latter.

Can reproduce with rc1 kde live - not sure what's so different between the kde and gnome spins with regard to bios raid :/ i.e. live boot -> no /dev/md* at all

Wait, ignore that last comment - let's just pretend it's not there at all. Still need to reproduce this, but obviously I need some sleep before I do so.

you got tc1 and rc1 mixed up, didn't you? =)

I wish I had! It's much worse than that - I used the wrong system to reproduce, i.e. there was no such thing as a bios raid in there :/

okay, step AWAY from the crack pipe, sandro =)

So, this is still a bug? Or it's fixed now? What's the status?

on the basis of comment #13 there's an issue with running the installer twice on Intel BIOS RAID (possibly only RAID 0). It'd be good if Sandro could say for sure whether he can still reproduce with RC1, his later posts seem confused.

Okay, I can reproduce my initial issue on both KDE and Gnome RC1 x86_64 live spins right now. The workaround still works. Regarding the raid assembly I see really weird behaviour, and the two spins do indeed behave differently. Right now it seems to be assembled after boot, but not in the same way on both spins. Also, the arrays seem to be activated/deactivated differently after running anaconda several times. And yet I see the original issue with every second run of anaconda. In case what I wrote here doesn't seem to make sense - the behaviour I see really doesn't make much sense to me either.
Maybe it would make sense after another couple of tries with both spins, but I currently lack the time to do that. I hope hansg can eventually reproduce a difference between the two spins and maybe some other weird behaviour, though.

anaconda doesn't really clean up after itself following a run, since it's not expected to be run again. This is yet another thing that sucks about the livecd. If you cannot produce the problem on the initial first run of anaconda, I am inclined to not really do anything about it.
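
The "fails on every second run" pattern Sandro describes can be reproduced with a toy state model. This is purely an illustration of the reported behaviour, not anaconda code: the function names are invented, and the order in which the arrays get deactivated is an assumption (the report only says that ONE of them is deactivated on installer start).

```python
def run_installer(active, arrays, deactivate_all=False):
    """Simulate one installer run; return (succeeded, active_arrays_afterwards)."""
    active = set(active)
    if deactivate_all:
        # Behaviour the report asks for: tear down every array on start.
        active.clear()
    else:
        # Reported behaviour: only one active array is deactivated on start.
        # Which one goes first is an assumption made for this sketch.
        for name in arrays:
            if name in active:
                active.discard(name)
                break
    # Device examination tries to activate all arrays. Per the report, this
    # fails if any array is still active, because its disks are busy.
    if active:
        return False, active
    return True, set(arrays)


def simulate(runs, arrays=("md0", "md127"), deactivate_all=False):
    """Run the installer several times in one live session, starting from boot."""
    active = set()  # right after livecd boot, no array is activated
    results = []
    for _ in range(runs):
        ok, active = run_installer(active, arrays, deactivate_all)
        results.append(ok)
    return results


print(simulate(4))
print(simulate(4, deactivate_all=True))
```

Under these assumptions the first call alternates between success and failure, matching situations 2 and 4 in the explanation of the workaround, while the `deactivate_all=True` variant, which corresponds to the fix Sandro suggests, succeeds on every run.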