Created attachment 321029 [details]
anaconda log of the issue

Description of problem:
Anaconda does not see the nvidia fakeraid set during installation. The test board is an ASUS P5N32-E SLI with the nForce 680i chipset. Outside anaconda, dmraid -ay creates the /dev/mapper nodes as expected and dmraid -s -c identifies the set. Partitions on it can be mounted. Even with the nodes already created in /dev/mapper, anaconda does not offer them in the partitioning stage.

Version-Release number of selected component (if applicable):
Tested with F8 installation media, F9 installation media and live cd, F10 beta installation media and live cd.

How reproducible:
Boot the installation media or live cd and continue up to the partitioning stage.

Actual results:
Anaconda sees each drive composing the raid set separately.

Expected results:
Anaconda should see the raid set and offer partitioning and installation on it. It should also be possible to detect already existing /dev/mapper nodes in case they were created outside the installer.

Additional info:
There is no nvraid metadata corruption on the array; in fact, partitions on it can be mounted manually. The raid set was also tested with the Ubuntu Intrepid Ibex live cd's installer, which can see and install on it once dmraid -ay has been run.
Created attachment 321030 [details] output of dmraid -tay -dddd -vvvv -f nvidia
Mmm. What version of dmraid are you using? I'm still seeing the failures in dmraid. FYI, I'm probably the one that has the wrong dmraid version. Just want to confirm.
Just to be clear: you see this behavior in F8 and F9? Because I tested isw on F8 installs and everything works fine. I couldn't do it with F9 because of the samsung hard drive bug, but I'm pretty sure it works there as well. Only in rawhide do I see failures.
Yes, this happens in F8 and F9 as well. The only difference is that in F8 I get one extra line after 'scanning for dmraids' in the log, as I noticed now:

WARNING: /usr/lib/anaconda/dmraid.py:129 UserWarning: device node created in /tmp isys.makeDevInode (d, dp)

That's the only difference. I also just tried the F9 i686 live cd to rule out architecture: same behaviour. My rawhide live cd comes with 1.0.0.rc14-8.fc10, but I've also tried updating via yum to 1.0.0.rc15-1.fc10, which is the latest offered in the repositories. No change, exactly the same log output.
dmraid.py:129 points to this function:

def nonDegraded(rs):
    log.debug("got raidset %s (%s)" % (rs, string.join(rs.member_devpaths)))
    log.debug(" valid: %s found_devs: %s total_devs: %s" % (rs.valid, rs.rs.found_devs, rs.rs.total_devs))
    if not rs.valid and not degradedOk:
        log.warning("raid %s (%s) is degraded" % (rs, rs.name))
        #raise DegradedRaidWarning, rs
        return False
    return True

This function's output doesn't appear in my log. The set is certainly not degraded, though.
(In reply to comment #4)
> Yes, this happens in F8 and F9 as well. The only difference is that in F8 I get
> one extra line after 'scanning for dmraids' in the log, as I noticed now:

Mmmm, I'm beginning to think that this has to do with your specific dmraid. That is, this bug will be reproducible using nvidia dmraid but will not show when using isw (which is what I am using). With regard to comment #5, anaconda's dmraid module must detect the dmraid sets first, so if those are not being detected, the code in nonDegraded will not execute. Furthermore, the fact that the raid sets are not being detected can be python-pyblock's fault or the dmraid libraries' fault.
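For anyone reading along, here is a minimal sketch of the detection order being described above. It only illustrates the flow; apart from block.getRaidSets and the nonDegraded() function quoted in comment #5, the names are illustrative and not anaconda's actual code:

<snip>
import block

def scanForRaidSets(drives):
    # pyblock/libdmraid groups the member disks into raid sets first; if
    # that step raises (e.g. a GroupingError), the degraded check below
    # never gets a chance to run
    raidsets = block.getRaidSets(drives)
    # only sets that survive detection get filtered by nonDegraded()
    return [rs for rs in raidsets if nonDegraded(rs)]
</snip>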
I see. Perhaps there is an underlying bug in python-pyblock/the dmraid lib that affects both cases? Otherwise they do seem to be separate bugs. Indeed, with my very limited python knowledge, it seems to me that the rs object is not valid. Could someone look into this? I'm not comfortable trying to debug this myself, since it is written in python and I lack knowledge of fakeraid implementations...
I looked into this today. Can you please do the following test and see what comes out?

<snip>
# yum install python-pyblock -y
# python
>>> import block
>>> rs = block.getRaidSets(["/dev/sda", "/dev/sdb", ...])
>>> print rs
</snip>

In my environment it shows the same errors as before. If this test shows a valid list of raid devices, with no errors, I'd bet that the bug is in anaconda :)
Here's what the call to block.getRaidSets outputs:

>>> import block
>>> rs = block.getRaidSets(["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde", "/dev/sdf"])
ERROR: pdc: zero sectors on /dev/sde
ERROR: pdc: setting up RAID device /dev/sde
ERROR: only one argument allowed for this option
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.5/site-packages/block/__init__.py", line 187, in getRaidSets
    for rs in c.get_raidsets(disks):
block.dmraid.GroupingError: nvidia_efeegdab-1

Note: The /dev/sde related errors appear when running dmraid as well. I believe they appear because /dev/sde and /dev/sdf are not part of a raid set. Unplugging the drives removes those 3 errors but has no effect on the issue; that was my first troubleshooting step :). What I also noted, though, is that excluding those two drives from the call parameters didn't make any difference in the output.
Correction on the note: the third error line (ERROR: only one argument allowed for this option) is not present when running dmraid, and it's not related to the non-raid drives.
Verified that it is the same error with the extra drives unplugged:

>>> import block
>>> rs = block.getRaidSets(["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"])
ERROR: only one argument allowed for this option
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.5/site-packages/block/__init__.py", line 187, in getRaidSets
    for rs in c.get_raidsets(disks):
block.dmraid.GroupingError: nvidia_efeegdab-1
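A minimal sketch of how this can be narrowed down a little further, assuming the same python 2.5 environment as above: the exception type shows the failure happens inside pyblock's grouping step, before any anaconda code ever sees a raid set object.

<snip>
import block
import block.dmraid

drives = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]
try:
    raidsets = block.getRaidSets(drives)
except block.dmraid.GroupingError, e:
    # raised while libdmraid groups the member disks into a set,
    # i.e. inside pyblock/libdmraid rather than in anaconda itself
    print "grouping failed for set:", e
else:
    print "detected raid sets:", raidsets
</snip>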
Ok, now I'm pretty sure that it's pyblock. For now at least :) anaconda might need work as well. Let me work on some dmraid library tests and get back to you once I have something you can run on your system.
FWIW I have nvidia dmraid and I'm seeing none of these problems, so this may be even more specific than just one class of dmraid. I'm moving this over to Target unless we get some more evidence that it hits a wider audience.
Could you try RAID 0+1 on your nvidia dmraid? As a troubleshooting step, I did try the set on an EVGA 680i board and on the Asus P5N-E SLI. I see the same behaviour there.
Unfortunately I don't have enough disks to do anything other than a mirror or a two disk stripe. This isn't a preview blocker, hardly a release blocker. I'll move it over to F10Blocker just to get more visibility on it, but when push comes to shove, I don't think we'd delay F10 for this issue.
Although it does prevent installation of Fedora on this configuration (and I am almost certain it has to do with 0+1, and maybe not only on nvidia), I understand what you are saying, and it does sound very reasonable. It is sad that this managed to survive F8 and F9, though. I'll probably have to switch to another distro if this survives F10, as the only way to install Fedora on the dual-boot machines we use would be to install on another drive and then manually move the installation onto the dmraid set. Right now we temporarily use a vmware solution, hoping there will be some breakthrough with this.
I have an Asus Striker Extreme with the nVidia 680i chipset, have the exact same issues, and can confirm the following:

* this issue affects both F8 and F9.
* this problem affects i386 and x64 systems.
* this problem affects RAID 0 and 1 devices.

The bizarre fact is that I was initially able to install F8_i386 but was never able to use rescue mode or reinstall after that. I was able to install F9_x64 thanks to VMware and tar zcf.
note: bug 409931 appears to be the same issue as this bug
No, https://bugzilla.redhat.com/show_bug.cgi?id=409931 is entirely different. In 409931's case the installation freezes when dmraid is enabled. In this case there's no freeze: anaconda just doesn't see the RAID set, and installation can continue normally on non-dmraid drives without requiring the nodmraid boot option.
You're right, but as this bug is affecting nVidia fakeraid and we both have Asus motherboards, I'm thinking that there may be some common ground. From browsing my log files while attempting an install, I can see that anaconda is searching for partition information on the physical disks and not the raid set. If this can be of any help, I have created a summary post: https://bugzilla.redhat.com/show_bug.cgi?id=409931#c37

Also, for the record: when anaconda starts normally, without any extra switches, it fails to 'create' the raid set nodes in /dev/mapper. It is also unable to see the existing nodes that are created when I manually run "dmraid -ay".

As bug 409931 concerns an actual locking crash and this one does not, I'm happy to leave the two separated.

thanks, Phil
Could you please attach your anaconda.log? Also, please let us know which RAID modes you have experienced this with.
Could you also do the test found in comment #8 from Joel Andres Granados and check if the output is identical to comment #11 from Panagiotis Kalogiratos? If not, could you also post the result of the test?
I've tried this in rescue mode, during installation, and even with my live system, and I get a segmentation fault. This affects both raid 0 and 1. Stay tuned and I'll get an anaconda.log sorted.
Do it while booting a live cd. That way you'll have a stable environment in which to install python-pyblock with yum.
Worth a try with dmraid-1.0.0.15.rc1-2.fc10, but I don't know that that will fix nvidia.
Since this particular issue has been present since at least F8, this is not a regression, and we're not going to consider it a blocker for F10.
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle. Changing version to '10'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
I can ack that bug. F10 is not installable using the F10 (Final) live media, at least not on the softraid devices. Bug 468649 prevented me from detecting my ICH9R softraid (in Preview/Rawhide), as reported in bug 468649, comment 10. That part seems to be fixed, because I'm able to manually enable the raid (dmraid -ay), and it creates the necessary nodes. But I have to enable dmraid manually! The live image doesn't try to enable dmraids on boot. Furthermore, anaconda (/usr/bin/liveinst) refuses to use the dm-devices as install disks (it only displays both disks separately, sda and sdb), even after I enabled the raid manually.

Btw:
00:1f.2 RAID bus controller: Intel Corporation 82801 SATA RAID Controller (rev 02)
        Subsystem: Hewlett-Packard Company Unknown device 2819

Is this problem only related to live media?
Fixing version to align with rawhide again. Sorry for the noise.
No, it is not related to live media; it happens with installation media as well. Also, there's nothing fixed here: we were always able to manually create the nodes via dmraid -ay. It's just anaconda that doesn't "see" them.

If you read the discussion thoroughly, you'll see that anaconda doesn't run the dmraid binary or check /dev/mapper for existing devices; it uses python-pyblock and the dmraid libraries to detect the raid set, and apparently that's where the problem is. Checking /dev/mapper for existing dmraid nodes would be added (and useful, imho) functionality, but apparently when it was coded the logic was that if a set can't be detected using the libraries, then it can't exist in /dev/mapper :)
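Just to illustrate what "checking /dev/mapper" would mean, here is a minimal sketch (a hypothetical helper, not anything anaconda currently does) that lists the nodes already created by a manual dmraid -ay:

<snip>
import os

def existingDmraidNodes(mapperdir="/dev/mapper"):
    # device-mapper nodes left behind by a manual `dmraid -ay`;
    # "control" is the device-mapper control node, not a raid set
    return [os.path.join(mapperdir, name)
            for name in os.listdir(mapperdir)
            if name != "control"]

print existingDmraidNodes()
</snip>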
Same problem here with Intel ICH8R RAID using Fedora 10 Final. I currently run Fedora 9, and it sees the RAID.
Apologies for an extended absence, but I have been without net access. This is a little off topic, but I'm an RHCE who has hacked open many an initrd to make Fedora work with my pc and would like to volunteer for the testing of any potential fixes. Who do I contact?

Note: From my testing, this bug adversely affects every method of installing Fedora 8/9/10 i386/x86_64, be it boot iso, pre-upgrade or live media. If there is any extra information required I will try to dig it up.
I have the same problem here with a Dell Precision 5400 machine, Intel 5400 chipset (aka Seaburg). lspci reports "RAID bus controller: Intel Corporation 631xESB/632xESB SATA RAID Controller (rev 09)". Fedora 9 works with the SATA RAID; when I tried to install Fedora 10, it didn't recognize the RAID but reported two independent disks.

To find out if this was only an anaconda issue, I installed Fedora 9 (freshly) and then did a yum-based upgrade to Fedora 10 (as described in http://fedoraproject.org/wiki/YumUpgradeFaq), to avoid having to use the F10 installation media. Unfortunately this didn't solve the problem either. So I'll have to reinstall a fresh F9 and wait (somewhat frustrated) for updated F10 setup media, apparently...
Guys, why is this marked as:

Priority: medium
Severity: medium

In what world is this a "medium" severity bug?!? This is the first Fedora ever that fails miserably in the installer after the first couple of screens on all the boxes I got my hands on (about 4). And all this while the official party line is that it's the first Fedora release that didn't go out with significant bugs outstanding... This is an obvious regression from previous releases, and a serious one at that. Any chance of seeing a fix any time soon?
It's what the moderators and administrators see fit to do, so we should trust that their decision is informed and sanely reasoned. Like I mentioned in a previous post, I am more than willing and able to help guinea pig any potential fixes.

A workaround, if you have a functioning installation of Fedora, is to follow the upgrade path linked by Wolfi in post #33. I attempted this last night and succeeded, with the exception of GRUB. It tried, unsuccessfully, to probe the BIOS drives and therefore could not properly install itself - both grub-install and grub. My solution was to use Ubuntu to manually mount my raid0 and raid1 drives and then run grub with --device-map=/mnt/root/boot/grub/device.map

The reason the yum upgrade worked is that it does not use anaconda and instead uses what the running system does - dmraid and its libs. From what I have deciphered about this bug, Panagiotis in post #30 hit the nail on the head. As systems work very reliably and stably with dmraid (any version) and its supporting libs, we can narrow this issue down to pyblock and its inability to properly use the now updated dmraid libs.

I will try to acquire a copy of F7, as I started having issues from F8, and report back on any con/progress..
Just want to state that I have the same problem here with Intel ICH9R RAID using the Fedora 10 Final DVD. Anaconda is not able to see the raid, even if it is manually activated via "dmraid -ay"; it always sees the single drives. Fedora 8 Final DVD installation works fine and without any manual action.

lspci | grep -i sata
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA AHCI Controller (rev 02)

best regards
For those of you who are having trouble with fakeraid 1 or 0 setups and F-10, I have made 2 updates.img images available which have a good chance of fixing your troubles.

If I understand the original reporter properly, then he is using raid 10 (or 1+0) or 01 (a 4 disk setup); this is currently not supported in Fedora (fixing this is planned for F-11). Plain fakeraid 0 or 1 setups using 2 disks should work, but they do not in certain cases in F-10 (also see bug 474399).

For all those with a 2 disk (raid 0 or 1) setup who are having trouble in F-10, please give these updates.img images a try; chances are good they will fix your issues:
http://people.atrpms.net/~hdegoede/updates474399-i386.img
http://people.atrpms.net/~hdegoede/updates474399-x86_64.img

To use this with an i386 install using isw "hardware" raid, type the following at the installer bootscreen (press <tab> to get to the cmdline editor):
updates=http://people.atrpms.net/~hdegoede/updates474399-i386.img

For an x86_64 install use:
updates=http://people.atrpms.net/~hdegoede/updates474399-x86_64.img

Please let me know if this resolves the issue for you.
Hans, you are correct: the system used is indeed a 4 disk fakeraid 0+1. I will not test your patches since, as you said, they do not apply to this case.

Since you are more knowledgeable on the issue, is there any way to force anaconda to install on a RAID 0+1 set when the device-mapper nodes are already created via dmraid, bypassing the built-in python-pyblock/dmraid lib path?

Also, is RAID 5 currently supported?
(In reply to comment #38)
> Hans, you are correct: the system used is indeed a 4 disk fakeraid 0+1. I will
> not test your patches since, as you said, they do not apply to this case.

Ack.

> Since you are more knowledgeable on the issue, is there any way to force
> anaconda to install on a RAID 0+1 set when the device-mapper nodes are already
> created via dmraid, bypassing the built-in python-pyblock/dmraid lib path?
>
> Also, is RAID 5 currently supported?

I'm afraid the answer to both of those questions is no. Fixing both for F-11 is high on my todo list (but not at the top, I'm afraid). For F-10, all I can do in the cases where dmraid does not work in one go is to advise not using dmraid but instead using software raid.

With that said, I really appreciate all the testing people have been doing for me, thanks! And I hope they will continue to do this; without wide testing coverage, chances are some dmraid types will once again fail in F-11!
> I'm afraid the answer to both of those questions is no. Fixing both for F-11 is
> high on my todo list (but not at the top, I'm afraid). For F-10, all I can do in
> the cases where dmraid does not work in one go is to advise not using dmraid
> but instead using software raid.

Roger that. Unfortunately we cannot use softraid because we need dual-boot capability on the machines we're setting this up on. The rest of our systems are working fine with softraid :)

> With that said, I really appreciate all the testing people have been doing for me,
> thanks! And I hope they will continue to do this; without wide testing coverage,
> chances are some dmraid types will once again fail in F-11!

And we do appreciate your hard work and the time you've devoted to this :)
Panagiotis:

Can you please test with http://jgranado.fedorapeople.org/temp/raid10.img? In my tests this allowed the installer to correctly detect the raid10.

thx
(In reply to comment #41)
> Panagiotis:
>
> Can you please test with http://jgranado.fedorapeople.org/temp/raid10.img? In
> my tests this allowed the installer to correctly detect the raid10.
>
> thx

Just tested with the x86_64 installation media. It freezes after keyboard selection; anaconda.log's last line is:

DEBUG : scanning for dmraid on drives ['sda', 'sdb', 'sdc', 'sdd', 'sde', 'sdf']

Nothing out of the ordinary in anaconda.log or dmesg. You didn't mention whether the image is architecture-specific, so I am downloading the i386 DVD image to test; the i386 F10 image I had available was an outdated rawhide one :/ Will update on that in about 20 minutes.
F10 i386 installation media testing only helped in finding out dmmodule.so is ELFCLASS64 :)
Sorry for the confusion. The image has the python bindings from pyblock; they were built on an x86_64 machine. So if you test an F10 x86_64 install, it will probably work for you.
That is what I tested in comment #42. As I said, it freezes just after I click next on keyboard selection. Let me know how I can troubleshoot further for you.
Additional work is in this next image. I would appreciate it if you tested with it. It works for me; it's x86_64, and it works on F11 and F10.

http://jgranado.fedorapeople.org/temp/raid10-2.img
Exactly the same behaviour as with the previous image. I do not know if it has anything to do with it, but I should let you know that my test machine has 2 extra disks connected to the same controller which are not part of the raid set. I can see this error, related to the above, on tty1:

ERROR: pdc: zero sectors on /dev/sde
ERROR: pdc: setting up RAID device /dev/sde
ERROR: only one argument allowed for this option

This error has always been present, though, as long as those extra disks are connected. Unfortunately I am unable to disconnect these drives right now for the purpose of testing (the chassis is locked and I don't have the keys with me).
Ok, let's try to get past keyboard selection (based on comment 42, that's where it's going boom). Can you please try an install with a ks file that at least defines the keyboard and language, so you get dropped at the root passwd window? Something like:

<snip>
lang en_US.UTF-8
keyboard us
</snip>

Adding this to your ks would get you past the keyboard selection.
I will try setting language and keyboard with kickstart in a couple of hours (no test machine here). Are you sure though that this would achieve anything? Although it gets stuck after clicking next on keyboard selection, anaconda.log shows that it freezes at:

<snip>
DEBUG : scanning for dmraid on drives ['sda', 'sdb', 'sdc', 'sdd', 'sde', 'sdf']
</snip>

Correct me if I'm wrong, but that leads me to believe the freeze is dmraid related, and the fact that it freezes after keyboard selection is just because anaconda doesn't reach the point where the GUI gets updated with the next page.
Created attachment 330782 [details]
Anaconda log of the second image

The only thing the kickstart did was replace the keyboard selection screen with a blank one. I've attached the anaconda.log of raid10-2.img. The log is identical with both images. I did notice something else, however, that was only visible on tty3:

----------------
FATAL: Module dm_mod not found.
FATAL: Module dm_zero not found.
FATAL: Module dm_mirror not found.
FATAL: Module dm_snapshot not found.
----------------

These errors get thrown right after the following error that can be seen in the log:

ERROR : LOOP_CLR_FD failed for /tmp/update-disk /dev/loop7: No such device or address
Panagiotis: Ok, I thought you were on the keyboard selection from stage1, sorry about that. The FATALs are a cause of worry for me. What initrd are you using? Can you post it somewhere so I can use it in my tests and/or explore it? Is it custom made? What fedora is it from?
On further investigation, the FATAL errors are "normal": those modules are now built into the kernel, so the installer will not find them as modules. I'm running out of ideas here :(. Does going to tty2 and executing `dmraid -ay` work at all for you? You would have to install without the kickstart from comment 50. If it does not work for you, it would be interesting to see the output of `dmraid -ay -vvv -ddd`. (Sorry if you have already answered this.)
Created attachment 330882 [details]
dmraid -ay -vvv -ddd output

I am using the standard initrd from the x86_64 F10 installation DVD. I've uploaded it here: http://rapidshare.com/files/193864110/initrd.img

dmraid -ay -vvv -ddd runs fine on tty2 and creates the /dev/mapper nodes. Log attached.
Joel, if you haven't figured out what's wrong, perhaps you could prepare a debugging build with verbose output during that stage, to help you locate the bug in the program flow? That's what I do with obscure bugs I can't replicate myself: extra output at certain checkpoints/control statements to locate exactly where it breaks. Just an idea..
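Something like the following, a minimal sketch of the kind of checkpoint logging I mean, assuming the debug build would wrap the pyblock call (the logger setup and the function name are illustrative, not anaconda's actual code):

<snip>
import logging
import block

log = logging.getLogger("anaconda")

def scanWithCheckpoints(drives):
    # checkpoint before the call that the installer seems to hang in
    log.debug("calling block.getRaidSets(%s)" % drives)
    try:
        raidsets = block.getRaidSets(drives)
    except Exception, e:
        log.debug("block.getRaidSets raised: %s" % e)
        raise
    log.debug("block.getRaidSets returned %d set(s)" % len(raidsets))
    return raidsets
</snip>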
Created attachment 332084 [details] dmraid -ay -vvv -ddd output on my system
Panagiotis: Please try this image: http://jgranado.fedorapeople.org/temp/raid103.img. We have successfully tested it in an nvidia environment and it should work for you. Please post any results.
Created attachment 332201 [details]
TGZ of installation logs

Mobo: Asus Striker Extreme - nVidia 680i chipset
RAID:
  nvidia_afcjhada - RAID0 sda + sdb (2x 74GB Raptors)
  nvidia_bhbhgabe - JBOD sdc + sdd (2x 750GB Seagates - this has changed from my previous mirrored 1TB array)
Partitions:
  nvidia_afcjhadap1 = Windows XP x64
  nvidia_afcjhadap2 = D:\
  nvidia_afcjhadap3 = /
  nvidia_afcjhadap4 = swap
  nvidia_bhbhgabep1 = /boot
  nvidia_bhbhgabep2 = /jbod (data)
  nvidia_bhbhgabep3 = swap
Media: Fedora 10 Final x86_64
Update: raid103.img
Result: semi happy

The patch worked for detecting my striped discs and gave me the option to upgrade my existing installation or do a fresh install, but it failed with my JBOD. This is a problem as I need to have the JBOD working for /boot - unless there is a way to make GRUB use striped disks properly, it appears to insist on using block devices only instead of the RAID devices.

I was reluctant to go past the upgrade/install step as Fedora now formats and wipes discs earlier in the install process and I can't risk losing the data. Plus the crap I have to go through to reinstall Windows if it turns pear shaped is not worth the stress... :)

When I made a mirrored array with sdc/d, created partitions and formatted them, the installation with the updated img worked flawlessly. When I changed back to JBOD for the second set, I was presented with one failure message per partition when it probed for existing installations. Everything appeared to continue as per normal after cancelling these errors - retries did nothing.

I have attached a tgz of all my logs which hopefully can help with tracking down this "new" problem. As anaconda previously locked up before this stage, I cannot say for certain whether it is indeed new. I even redirected the output of dmraid -ay -vvv -ddd to the file "dmraid.out.log".

Well done on the raid fix, and thanks a million for your work thus far.
(In reply to comment #57)
> Created an attachment (id=332201) [details]
> TGZ of installation logs

Hi Phil,

Thanks for testing. Any chance we could discuss your issues somewhat more interactively on IRC? I'm hansg on freenode, you can find me in #anaconda.
Created attachment 332303 [details]
Anaconda log of raid103.img

Joel: You almost have it with this one. It finds the raid set but fails to activate it, then throws criticals from parted. Log attached.
Panagiotis: Totally expected for an F10 install; this was an image for F11. Since we have changed the pyparted package, we had to change pyblock to adjust for the change. I have made an image that should work for F10: go to http://jgranado.fedorapeople.org/temp/raidF10.img. And if you are interested in testing rawhide (which would be very helpful), you can go to http://jgranado.fedorapeople.org/temp/raidF11.img

thx.
Created attachment 332458 [details]
Anaconda log of raidF11.img on rawhide

Joel: Right on the spot, mate :) F10 works like a charm with raidF10.img.

I also tested raidF11.img with the F11 alpha pre-release: same behaviour as the previous image against F10. I then tested the 17th of February rawhide boot.iso against raidF11.img. This seemed to work fine, but when it entered the partitioning stage it crashed. Not sure if the crash is related to our issue here... there were several other glitches with my test system. In any case, the log is attached.
Clarification: raidF11.img on F11 alpha pre-release has the same behaviour as the test on comment #59. I guess it's based on a later image.
A lot of structural changes have gone into the anaconda partitioning code (we are rewriting the whole thing :), so stuff that has been discussed here might not be relevant anymore. I would like to redirect the interested parties to the new dmraid tracking bug that will have all the dmraid related issues for the new anaconda.

There are a number of underlying causes to these problems, all of which we believe have been identified and fixed in rawhide. Unfortunately rawhide is currently not in good enough shape for us to ask you to test it. We hope to organize a dmraid test day within a few weeks, where we will ask the community to test dmraid support in rawhide (the upcoming F-11 development version).

In the meantime we are closing all the open anaconda dmraid bugs against a single master bug, for easier tracking, as all the open bugs have the same underlying cause (2 bugs in pyblock, which have been fixed). If you're interested in participating in the test day, please add yourself to the CC of the master bug; I will add a comment there with a pointer to the announcement for the test day as soon as the date has been set.
*** This bug has been marked as a duplicate of bug 489148 ***