Bug 485337 - Fedora 10 fails to boot (nash received SIGSEGV) after disconnecting one disk of a two disk fakeraid RAID 1 [mirrored] array
Status: CLOSED DUPLICATE of bug 485882
Product: Fedora
Classification: Fedora
Component: dmraid
Version: 10
Hardware: i386 Linux
Priority: low, Severity: high
Assigned To: LVM and device-mapper development team
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-02-12 17:37 EST by gregjo
Modified: 2013-01-22 15:52 EST
Doc Type: Bug Fix
Last Closed: 2009-02-17 03:24:24 EST

Attachments
Log of mkinitrd execution. (8.27 KB, application/x-unknown)
2009-02-16 17:26 EST, gregjo

Description gregjo 2009-02-12 17:37:49 EST
Description of problem:
The system boots fine as a mirrored [RAID 1] array. If one of the two drives in the RAID is disconnected, the system fails to boot.
The log indicates:
"device-mapper: reload ioctl failed: No such device or address"
"device-mapper: table ioctl failed: No such device or address"
"nash received SIGSEGV"

Version-Release number of selected component (if applicable):
% uname -r
2.6.27.12-170.2.5.fc10.i686.PAE
% rpm -q nash
nash-6.0.71-3.fc10.i386
% rpm -q dmraid
dmraid-1.0.0.rc15-2.fc10.i386
% rpm -q device-mapper
device-mapper-1.02.27-7.fc10.i386

How reproducible:
100%

Steps to Reproduce:
1. My motherboard is an EVGA X58 SLI, which supports fakeraid via a JMicron JMB363 controller. I have two 1 TB drives configured as RAID 1. The array is partitioned so that I can dual boot: Windows XP is on one partition and Fedora 10 is on another. These two 1 TB drives are the only drives in the system.
2. Once the OS was installed correctly, I powered down the system and disconnected one of the drives to simulate a hardware failure of one drive in the RAID. Upon powering up, the Fedora 10 installation fails to boot.

  
Actual results:
device-mapper: reload ioctl failed: No such device or address
device-mapper: table ioctl failed: No such device or address
nash received SIGSEGV! Backtrace (15):
/bin/nash[0x8054e0d]
[0xe4240c]
/usr/lib/libnash.so.6.0.71[0xfbb3cc]
/usr/lib/libnash.so.6.0.71(nashDmDevGetName+0x5a)[0xfbc31f]
/usr/lib/libnash.so.6.0.71[0xfb861c]
/usr/lib/libnash.so.6.0.71[0xfb874d]
/usr/lib/libnash.so.6.0.71(nashBdevIterNext+0x106)[0xfb8bd9]
/usr/lib/libnash.so.6.0.71[0xfb8e78]
/usr/lib/libnash.so.6.0.71(nashFindFsByUUID+0x2e)[0xfb8efd]
/usr/lib/libnash.so.6.0.71(nashAGetPathBySpec+0x8e)[0xfb9074]
/bin/nash[0x804f6bf]
/bin/nash[0x8054c78]
/bin/nash[0x80553d7]
/lib/libc.so.6(__libc_start_main+0xe5)[0x5fa6e5]
/bin/nash[0x804b2b1]

Expected results:
The system should continue to operate correctly on the remaining drive in the RAID.

Additional info:
If I boot from the Windows XP partition, the system runs correctly on the remaining drive in the array.
Comment 1 Hans de Goede 2009-02-12 17:55:35 EST
This is most likely caused by the initrd doing its own dm table creation. In rawhide we no longer do that.

If you're interested in testing whether the new rawhide mkinitrd indeed fixes this, try installing mkinitrd-6.0.76 or newer from rawhide and then regenerating your initrd; after that you should still be able to boot.

At least, that is assuming dmraid can activate a mirror set even if only one drive is present. Heinz?
Comment 2 Hans de Goede 2009-02-12 18:19:07 EST
Sorry, mkinitrd-6.0.76 does not yet have the changes I was referring to. If you want to test the new mkinitrd way of handling dmraid, use this mkinitrd script:
https://bugzilla.redhat.com/attachment.cgi?id=331781

Before using it make sure you have nash-6.0.71-4 installed (from updates-testing) and that you've upgraded your dmraid to this version:
http://koji.fedoraproject.org/koji/buildinfo?buildID=82481
Comment 3 gregjo 2009-02-13 11:12:15 EST
I attempted what you suggested, however it still failed. I'm a total mkinitrd noob, so let me retrace my steps to see if I did anything wrong.

I updated nash from updates-testing:
% yum update nash --enablerepo=updates-testing
% rpm -q nash
nash-6.0.71-4.fc10.i386

I installed the dmraid rpm from koji.
% rpm -q dmraid
dmraid-1.0.0.rc15-4.fc11.i386

Then I did the following with mkinitrd. This is where it got a little fuzzy for me. I created a "custom" version of the img file:

% ./mkinitrd /boot/initrd-2.6.27.12-170.2.5.fc10.i686.PAEcustom.img 2.6.27.12-170.2.5.fc10.i686.PAE

Then I modified my /boot/grub/grub.conf to point to the "custom" initrd img file
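
As a rough sketch of the two steps above (not part of the original report): the snippet below derives the custom initrd path from the kernel version, prints the mkinitrd command instead of running it, and checks whether grub.conf references the new image. The variable names and the `GRUB_CONF` override are illustrative assumptions; `/boot/grub/grub.conf` is the path named in this comment.

```shell
# Kernel version and image name as used in this comment.
kver="2.6.27.12-170.2.5.fc10.i686.PAE"
initrd_name="initrd-${kver}custom.img"
initrd_img="/boot/${initrd_name}"

# Print the build command rather than running it (dry run).
mkinitrd_cmd="./mkinitrd ${initrd_img} ${kver}"
echo "would run: ${mkinitrd_cmd}"

# GRUB_CONF is a hypothetical override so the check can be tried on a copy.
grub_conf="${GRUB_CONF:-/boot/grub/grub.conf}"

# -F treats the image name as a literal string, not a regular expression.
if [ -r "${grub_conf}" ] && grep -qF "${initrd_name}" "${grub_conf}"; then
    grub_status="references"
else
    grub_status="does not reference (or is unreadable)"
fi
echo "${grub_conf} ${grub_status} ${initrd_name}"
```

If the check reports that grub.conf does not reference the image, the boot entry is probably still pointing at the stock initrd.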

Once again I pulled the plug on one of the disks and attempted to boot. I received the following error, which appears to be the same error, except that the listed offsets are slightly different:

device-mapper: reload ioctl failed: No such device or address
device-mapper: table ioctl failed: No such device or address
nash received SIGSEGV! Backtrace (15):
/bin/nash[0x8054e8f]
[0xa4140c]
/usr/lib/libnash.so.6.0.71[0x1433b8]
/usr/lib/libnash.so.6.0.71(nashDmDevGetName+0x5a)[0x14430b]
/usr/lib/libnash.so.6.0.71[0x140608]
/usr/lib/libnash.so.6.0.71[0x14072b]
/usr/lib/libnash.so.6.0.71(nashBdevIterNext+0x106)[0x140ba9]
/usr/lib/libnash.so.6.0.71[0x140e44]
/usr/lib/libnash.so.6.0.71(nashFindFsByUUID+0x2e)[0x140ec9]
/usr/lib/libnash.so.6.0.71(nashAGetPathBySpec+0x8e)[0x1401040]
/bin/nash[0x804f741]
/bin/nash[0x8054cfa]
/bin/nash[0x8055459]
/lib/libc.so.6(__libc_start_main+0xe5)[0x1676e5]
/bin/nash[0x804b2b1]
Comment 4 Hans de Goede 2009-02-15 05:04:50 EST
(In reply to comment #3)
> I attempted what you suggested, however it still failed. I'm a total mkinitrd
> noob, so let me retrace my steps to see if I did anything wrong.
> 
> I updated nash from updates-testing:
> % yum update nash --enablerepo=updates-testing
> % rpm -q nash
> nash-6.0.71-4.fc10.i386
> 

Good.

> I installed the dmraid rpm from koji.
> % rpm -q dmraid
> dmraid-1.0.0.rc15-4.fc11.i386
> 

Also good, but since my last comment I've learned that you need an even newer version. Please install the one from here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=82600

> Then I did the following with mkinitrd. This is where it got a little fuzzy for
> me. I created a "custom" version of the img file:
> 
> % ./mkinitrd /boot/initrd-2.6.27.12-170.2.5.fc10.i686.PAEcustom.img
> 2.6.27.12-170.2.5.fc10.i686.PAE
> 

The command is correct, but you need to do this with a special new version of mkinitrd (which will be in rawhide soon). Download this version from here:
https://bugzilla.redhat.com/attachment.cgi?id=331850

And make a new custom initrd with this version of the mkinitrd script. Note that this is a newer version than the one I linked to in comment #2. This version should actually work with kernel 2.6.27; the version from comment #2 only worked with 2.6.29 or newer.

Can you please give things a try with this new dmraid and the even newer mkinitrd script?

When you create the custom initrd, please pass -v to mkinitrd and redirect the output to a log file, like this:

./mkinitrd -v /boot/initrd-2.6.27.12-170.2.5.fc10.i686.PAEcustom.img \
  2.6.27.12-170.2.5.fc10.i686.PAE > log

And attach the log file, so I can check that it is behaving as expected.
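
One possible refinement, offered as a sketch: the redirection above captures only standard output. If the script also writes diagnostics to standard error (an assumption, not something stated in this thread), adding `2>&1` sends both streams to the log. The `fake_mkinitrd` function is a hypothetical stand-in so the snippet runs without the real script.

```shell
# Hypothetical stand-in for "./mkinitrd -v"; it only emits one line per stream.
fake_mkinitrd() {
    echo "verbose output on stdout"
    echo "diagnostic on stderr" >&2
}

log_file=$(mktemp)

# "> file 2>&1" sends stdout to the file, then points stderr at the same place.
fake_mkinitrd > "${log_file}" 2>&1

log_lines=$(wc -l < "${log_file}" | tr -d ' ')
echo "captured ${log_lines} lines in ${log_file}"
```

With the real script, the same redirection would simply be appended to the command shown above, in place of the plain `> log`.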
Comment 5 gregjo 2009-02-16 17:26:35 EST
Created attachment 332141
Log of mkinitrd execution.

This is the log file for the mkinitrd command.
Comment 6 gregjo 2009-02-16 17:27:08 EST
I installed the newer dmraid rpm from koji.
% rpm -q dmraid
dmraid-1.0.0.rc15-5.fc11.i386

I downloaded the newer mkinitrd script. I have attached the log file. 

After degrading the array to a single disk, the following error is displayed upon booting:

/dev/sda: "jmicron" and "isw" formats discovered (using isw)!
ERROR: isw device for volume "Mirror0" broken on /dev/sda in RAID set "isw_bdedhfgbae_Mirror0"
ERROR: isw: wrong # of devices in RAID set "isw_bdedhfgbae_Mirror0" [1/2] on /dev/sda
ERROR: no mapping possible for RAID set isw_bdedhfgbae_Mirror0
Unable to access resume device (UUID=e6cbd316-3bb3-4698-aa7d-87ddb484b7b8)
mount: error mounting /dev/root on /sysroot as ext3: No such file or directory
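
For readers parsing these messages, a small sketch: the bracketed `[1/2]` in the "wrong # of devices" error appears to give the number of member devices found against the number the RAID set expects (an interpretation of the message, not confirmed in this thread). The snippet extracts both numbers from that log line; the variable names are illustrative.

```shell
# Error line copied from the boot output above.
line='ERROR: isw: wrong # of devices in RAID set "isw_bdedhfgbae_Mirror0" [1/2] on /dev/sda'

# Pull the two numbers out of the bracketed "[found/expected]" field.
counts=$(printf '%s\n' "$line" | sed -n 's|.*\[\([0-9][0-9]*\)/\([0-9][0-9]*\)].*|\1 \2|p')
found=${counts% *}
expected=${counts#* }

# Fewer devices found than expected means the set is missing a member.
if [ "$found" -lt "$expected" ]; then
    state="degraded"
else
    state="complete"
fi
echo "RAID set has ${found} of ${expected} devices: ${state}"
```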

As we've discussed on another issue, it is interesting that both jmicron and isw formats are seen. I initially partitioned the drives with gparted while booting from a Linux rescue CD; however, the Fedora installer didn't like that: it saw the header as corrupt. So I let the Fedora installer remove all partitions and then repartition the drive.
Comment 7 Hans de Goede 2009-02-17 03:24:24 EST
OK, well at least we got rid of the segfault :) I consider the remaining issue a dmraid issue, for which I've filed bug 485882.

*** This bug has been marked as a duplicate of bug 485882 ***
