Description of problem:
The system boots fine as a mirrored [RAID 1] array. If one of the two drives in the RAID set is disconnected, the system fails to boot. The log shows:

"device-mapper: reload ioctl failed: No such device or address"
"device-mapper: table ioctl failed: No such device or address"
"nash received SIGSEGV"

Version-Release number of selected component (if applicable):
% uname -r
2.6.27.12-170.2.5.fc10.i686.PAE
% rpm -q nash
nash-6.0.71-3.fc10.i386
% rpm -q dmraid
dmraid-1.0.0.rc15-2.fc10.i386
% rpm -q device-mapper
device-mapper-1.02.27-7.fc10.i386

How reproducible:
100%

Steps to Reproduce:
1. My motherboard is an EVGA X58 SLI. The motherboard supports fakeraid via JMicron JMB363. I have two 1 TB drives configured as RAID 1. The array is partitioned so that I can dual boot: Windows XP is on one partition, Fedora 10 on another. These two 1 TB drives are the only drives in the box.
2. Once the OS was installed correctly, I powered down the system and disconnected one of the drives to simulate a hardware failure of one drive in the RAID set. On power-up, the Fedora 10 installation fails to boot.

Actual results:
device-mapper: reload ioctl failed: No such device or address
device-mapper: table ioctl failed: No such device or address
nash received SIGSEGV!
Backtrace (15):
/bin/nash[0x8054e0d]
[0xe4240c]
/usr/lib/libnash.so.6.0.71[0xfbb3cc]
/usr/lib/libnash.so.6.0.71[nashDmDevGetName+0x5a)[0xfbc31f]
/usr/lib/libnash.so.6.0.71[0xfb861c]
/usr/lib/libnash.so.6.0.71[0xfb874d]
/usr/lib/libnash.so.6.0.71[nashBdevIterNext+0x106)[0xfb8bd9]
/usr/lib/libnash.so.6.0.71[0xfb8e78]
/usr/lib/libnash.so.6.0.71[nashFindFsByUUID+0x2e)[0xfb8efd]
/usr/lib/libnash.so.6.0.71[nashAGetPathBySpec+0x8e)[0xfb9074]
/bin/nash[0x804f6bf]
/bin/nash[0x8054c78]
/bin/nash[0x80553d7]
/lib/libc.so.6(__libc_start_main+0xe5)[0x5fa6e5]
/bin/nash[0x804b2b1]

Expected results:
The system should continue to operate correctly on the remaining drive in the RAID set.

Additional info:
If I boot the Windows XP partition, the system runs correctly on the remaining drive in the array.
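A small aid for reading the backtrace above, offered only as a sketch: backtraces in this style print exported symbols as name+offset, so filtering for that pattern lists the named frames from innermost to outermost. The function name named_frames is made up for this illustration and is not part of nash or any Fedora tool; it assumes a grep that supports -o.

```shell
#!/bin/sh
# Sketch: list the named frames of a backtrace read from stdin,
# innermost frame first. "named_frames" is a hypothetical helper name.
named_frames() {
    # Match "symbol+0xOFFSET", then strip the offset to leave the symbol.
    grep -o '[A-Za-z_][A-Za-z0-9_]*+0x[0-9a-f]*' | sed 's/+0x.*//'
}

# Usage (backtrace.txt is a hypothetical file holding the pasted backtrace):
#   named_frames < backtrace.txt
```

Applied to the backtrace in this report, it lists nashDmDevGetName, nashBdevIterNext, nashFindFsByUUID, nashAGetPathBySpec and __libc_start_main, which suggests the crash is reached while nash resolves a filesystem by UUID and walks the block devices.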
This is most likely caused by the initrd doing its own dm table creation. In rawhide we no longer do that.

If you're interested in testing whether the new rawhide mkinitrd indeed fixes this, try installing mkinitrd-6.0.76 or newer from rawhide and then regenerating your initrd; after that you should still be able to boot.

At least, assuming that dmraid can activate a mirror set even if only one drive is present. Heinz?
Sorry, mkinitrd-6.0.76 does not yet have the changes I was referring to. If you want to test the new mkinitrd way of handling dmraid, use this mkinitrd script:
https://bugzilla.redhat.com/attachment.cgi?id=331781

Before using it, make sure you have nash-6.0.71-4 installed (from updates-testing) and that you've upgraded your dmraid to this version:
http://koji.fedoraproject.org/koji/buildinfo?buildID=82481
I attempted what you suggested, however it still failed. I'm a total mkinitrd noob, so let me retrace my steps to see if I did anything wrong.

I updated nash from updates-testing:
% yum update nash --enablerepo=updates-testing
% rpm -q nash
nash-6.0.71-4.fc10.i386

I installed the dmraid rpm from koji.
% rpm -q dmraid
dmraid-1.0.0.rc15-4.fc11.i386

Then I did the following with mkinitrd. This is where it got a little fuzzy for me. I created a "custom" version of the img file:

% ./mkinitrd /boot/initrd-2.6.27.12-170.2.5.fc10.i686.PAEcustom.img 2.6.27.12-170.2.5.fc10.i686.PAE

Then I modified my /boot/grub/grub.conf to point to the "custom" initrd img file.

Once again I pulled the plug on one of the disks and attempted to boot. I received the following error, which appears to be the same error, except that the listed offsets are slightly different:

device-mapper: reload ioctl failed: No such device or address
device-mapper: table ioctl failed: No such device or address
nash received SIGSEGV!
Backtrace (15):
/bin/nash[0x8054e8f]
[0xa4140c]
/usr/lib/libnash.so.6.0.71[0x1433b8]
/usr/lib/libnash.so.6.0.71[nashDmDevGetName+0x5a)[0x14430b]
/usr/lib/libnash.so.6.0.71[0x140608]
/usr/lib/libnash.so.6.0.71[0x14072b]
/usr/lib/libnash.so.6.0.71[nashBdevIterNext+0x106)[0x140ba9]
/usr/lib/libnash.so.6.0.71[0x140e44]
/usr/lib/libnash.so.6.0.71[nashFindFsByUUID+0x2e)[0x140ec9]
/usr/lib/libnash.so.6.0.71[nashAGetPathBySpec+0x8e)[0x1401040]
/bin/nash[0x804f741]
/bin/nash[0x8054cfa]
/bin/nash[0x8055459]
/lib/libc.so.6(__libc_start_main+0xe5)[0x1676e5]
/bin/nash[0x804b2b1]
(In reply to comment #3)
> I attempted what you suggested, however it still failed. I'm a total mkinitrd
> noob, so let me retrace my steps to see if I did anything wrong.
>
> I updated nash from updates-testing:
> % yum update nash --enablerepo=updates-testing
> % rpm -q nash
> nash-6.0.71-4.fc10.i386
>

Good.

> I installed the dmraid rpm from koji.
> % rpm -q dmraid
> dmraid-1.0.0.rc15-4.fc11.i386
>

Also good, but since my last comment I've learned that you need an even newer version. Please install the one from here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=82600

> Then I did the following with mkinitrd. This is where it got a little fuzzy for
> me. I created a "custom" version of the img file:
>
> % ./mkinitrd /boot/initrd-2.6.27.12-170.2.5.fc10.i686.PAEcustom.img
> 2.6.27.12-170.2.5.fc10.i686.PAE
>

The command is correct, but you need to run it with a special new version of mkinitrd (which will be in rawhide soon). Download this version from here:
https://bugzilla.redhat.com/attachment.cgi?id=331850

Then make a new custom initrd with this version of the mkinitrd script. Note that this is a newer version than the one I linked to in comment #2. This version should actually work with kernel 2.6.27; the version from comment #2 only worked with 2.6.29 or newer.

Can you please give things a try with the new dmraid and the even newer mkinitrd script? When you create the custom initrd, please pass -v to mkinitrd and redirect the output to a log file, like this:

./mkinitrd -v /boot/initrd-2.6.27.12-170.2.5.fc10.i686.PAEcustom.img \
2.6.27.12-170.2.5.fc10.i686.PAE > log

Then attach the log file, so I can check that it is behaving as expected.
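The rebuild step described above can be wrapped in a small script so that the image name and the log are produced the same way on every attempt. This is only a sketch under stated assumptions: the downloaded script is assumed to be saved as ./mkinitrd in the current directory, and the function names custom_initrd_path and build_custom_initrd are invented for illustration. The "custom" suffix follows the image name used in this report.

```shell
#!/bin/sh
# Sketch only: rebuild the "custom" initrd with verbose logging.
# Function names are hypothetical; ./mkinitrd is assumed to be the
# downloaded script from the attachment, saved in the current directory.

# Print the path of the custom initrd image for a given kernel version.
custom_initrd_path() {
    printf '/boot/initrd-%scustom.img\n' "$1"
}

# Build the custom initrd for kernel version $1 and write the verbose
# mkinitrd output to the log file $2 (default: mkinitrd.log).
build_custom_initrd() {
    kver=$1
    log=${2:-mkinitrd.log}
    ./mkinitrd -v "$(custom_initrd_path "$kver")" "$kver" > "$log"
}

# Example, run as root:
#   rpm -q nash dmraid                  # confirm the updated packages first
#   build_custom_initrd "$(uname -r)"   # then attach mkinitrd.log to the bug
```

Nothing is executed when the file is sourced; the build only runs when build_custom_initrd is called explicitly. grub.conf still has to point at the resulting image.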
Created attachment 332141
Log of mkinitrd execution.

This is the log file for the mkinitrd command.
I installed the newer dmraid rpm from koji.
% rpm -q dmraid
dmraid-1.0.0.rc15-5.fc11.i386

I downloaded the newer mkinitrd script. I have attached the log file.

After degrading the array to a single disk, the following error is displayed upon booting:

/dev/sda: "jmicron" and "isw" formats discovered (using isw)!
ERROR: isw device for volume "Mirror0" broken on /dev/sda in RAID set "isw_bdedhfgbae_Mirror0"
ERROR: isw: wrong # of devices in RAID set "isw_bdedhfgbae_Mirror0" [1/2] on /dev/sda
ERROR: no mapping possible for RAID set isw_bdedhfgbae_Mirror0
Unable to access resume device (UUID=e6cbd316-3bb3-4698-aa7d-87ddb484b7b8)
mount: error mounting /dev/root on /sysroot as ext3: No such file or directory

As we've discussed on another issue, it is interesting that both jmicron and isw formats are seen. I initially partitioned the drives with gparted while booted from a Linux rescue CD, but the Fedora installer didn't like that; it saw the header as corrupt. So I let the Fedora installer remove all partitions and then repartition the drive.
OK, well, at least we got rid of the segfault :)

I consider the remaining issue a dmraid issue, for which I've filed bug 485882.

*** This bug has been marked as a duplicate of bug 485882 ***