Bug 471689 - RHEL 5.3 Snapshot 2 fails to activate sw raid devices, unable to install to sw raid mirror
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: anaconda
Version: 5.3
Hardware: i386
OS: Linux
medium
high
Target Milestone: rc
Assignee: Anaconda Maintenance Team
QA Contact: Release Test Team
URL:
Whiteboard:
Duplicates: 446284 470540 471879 472241 472477 472888 473244 474825 475174
Depends On:
Blocks: 475384 475385 475386
 
Reported: 2008-11-15 00:40 UTC by Ed Ciechanowski
Modified: 2018-10-20 01:45 UTC (History)
22 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-20 21:35:48 UTC
Target Upstream Version:
Embargoed:


Attachments
Anaconda.log file (28.02 KB, text/plain)
2008-11-17 20:01 UTC, Ed Ciechanowski
syslog file (39.35 KB, text/plain)
2008-11-17 20:02 UTC, Ed Ciechanowski
screen capture dmraid on first boot error (62.14 KB, image/jpeg)
2008-11-19 22:29 UTC, Ed Ciechanowski
/var/log dir tar to show logs (106.23 KB, application/x-gzip-compressed)
2008-11-19 22:31 UTC, Ed Ciechanowski
first screen shot of error (78.41 KB, image/pjpeg)
2008-11-21 19:59 UTC, Ed Ciechanowski
Second screen shot (66.90 KB, image/jpeg)
2008-11-21 20:01 UTC, Ed Ciechanowski
Here is the init file (2.27 KB, text/plain)
2008-11-21 20:03 UTC, Ed Ciechanowski
5 pictures containing the output of init. only the output relevant to the dmraid messages (1.24 MB, application/x-gzip)
2008-11-24 17:28 UTC, Joel Andres Granados


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:0164 0 normal SHIPPED_LIVE anaconda bug fix and enhancement update 2009-01-20 16:05:24 UTC
Red Hat Product Errata RHBA-2009:0237 0 normal SHIPPED_LIVE mkinitrd bug fix and enhancement update 2009-01-20 16:06:39 UTC

Description Ed Ciechanowski 2008-11-15 00:40:49 UTC
Description of problem:
During the install of RHEL 5.3 Snapshot 2, no SW RAID devices are active to install the OS to. Also, the version of dmraid used during install is old.

After selecting the keyboard type and skipping the registration number, the following error appears:
ERROR 
Error opening /dev/mapper/isw_cdecfjjff_Volume0:
No such device or address

If you hit Alt-F1
<the last few lines are as follows>
error: only one argument allowed for this option
Error
Error: error opening /dev/mapper/isw_cdecfjjff_Volume0: no such device or address 80

If you hit Alt-F2 and try to activate dmraid:
dmraid -ay
raidset "/dev/mapper/cdecfjjff_Volume0" was not activated

If you hit Alt-F3 you see this:
9:20:05 INFO:		moving (1) to step regkey
9:20:11 INFO:		Repopaths is {'base', 'server'}
9:20:11 INFO:		Moving (1) to step find root parts
9:20:11 ERROR:	Activating raid "/dev/mapper/isw_cdecfjjff_Volume0" : failed
9:20:11 ERROR: 	Table: 0 156295168 mirror core 2 131072 nosync 2 /dev/sda 0 /dev/sdb 0 1 handle_errors
9:20:11 ERROR:	Exception: device-mapper: reload ioctl failed: invalid argument
9:20:11 Critical:	parted exception: ERROR: error opening /dev/mapper/isw_cdecfjjff_Volume0: no such device or address

If you hit Alt-F4 (the last few lines are):
<6> device-mapper: multipath: ver 1.0.5 loaded
<6> device-mapper: round-robin: v1.0.0 loaded
<3> device-mapper: table: 253:0 mirror: wrong number of minor arguments
<4> device-mapper: ioctl: error adding target to table

Version-Release number of selected component (if applicable):
During install of RHEL 5.3 snapshot 2.
Also in Alt-F2 you run:
dmraid -V
dmraid version: 		1.0.0.rc13 (2007.09.17) static debug
dmraid library version:	1.0.0.rc13 (2006.09.17)
device-mapper version:	4.11.5
"THESE ARE THE WRONG VERSIONS!"

How reproducible:
Run the install from the RHEL 5.3 Snapshot2 DVD ISO with an ISW SW RAID mirror set up as the only two drives in the system; you can't miss it.

Steps to Reproduce:
1. Create RAID1 in OROM. Use default settings.
2. Boot to install DVD of RHEL 5.3 Snapshot2
3. Select a keyboard type and Skip the registration #.
4. the next screen that comes up shows the error
  
Actual results:
RHEL 5.3 Snapshot2 does not recognize the SW RAID drives set up in the BIOS OROM, so the OS cannot be installed to the mirror.

Expected results:
Expected RHEL 5.3 Snapshot2 to recognize and install the OS to the SW raid mirror. 

Additional info:

Comment 1 Chris Lumens 2008-11-17 03:45:49 UTC
Please attach /tmp/anaconda.log and /tmp/syslog to this bug report.  A picture or something of those error messages on tty1 would be pretty helpful too.  Thanks.

Comment 2 Ed Ciechanowski 2008-11-17 20:01:37 UTC
Created attachment 323792 [details]
Anaconda.log file

attached anaconda.log file

Comment 3 Ed Ciechanowski 2008-11-17 20:02:34 UTC
Created attachment 323793 [details]
syslog file

attached syslog file

Comment 4 Denise Dumas 2008-11-17 20:41:15 UTC
We are seeing device-mapper-related problems showing up in anaconda logs with
Snapshot 2.

05:13:57 ERROR   : Activating raid isw_cdecfjjff_Volume0 failed: 05:13:57 ERROR
  :   table: 0 156295168 mirror core 2 131072 nosync 2 /dev/sda 0 /dev/sdb 0 1
handle_errors
05:13:57 ERROR   :   table: 0 156295168 mirror core 2 131072 nosync 2 /dev/sda
0 /dev/sdb 0 1 handle_errors
05:13:57 ERROR   :   exception: device-mapper: reload ioctl failed: Invalid
argument
05:13:57 ERROR   :   exception: device-mapper: reload ioctl failed: Invalid
argument

lvm-team, could someone please take a look?

Comment 5 Milan Broz 2008-11-17 21:57:31 UTC
Please retest with snapshot3, it should contain fixed dmraid package, also see
https://bugzilla.redhat.com/show_bug.cgi?id=471400#c6

Comment 6 Ed Ciechanowski 2008-11-19 22:29:30 UTC
Created attachment 324109 [details]
screen capture dmraid on first boot error

This is the screen capture of RHEL 5.3 Snapshot3 after installing to a mirror and first reboot will show this error.

Comment 7 Ed Ciechanowski 2008-11-19 22:31:37 UTC
Created attachment 324110 [details]
/var/log dir tar to show logs

Here are the latest log files from RHEL 5.3 Snapshot3. The install goes further but still gets an error on first reboot; see the screen capture. Not sure if the logs will help. Please let me know what else I can provide to help resolve this issue.

Comment 8 Hans de Goede 2008-11-20 14:21:34 UTC
Ed,

As the system does boot, can you please do the following:
mkdir t
cd t
zcat /boot/mkinitrd...... | cpio -i

After that you should have a file called init (amongst others) in the "t" directory, can you please attach that here? Thanks!

Comment 9 Hans de Goede 2008-11-20 16:09:56 UTC
One more information request: can you please press a key when the initial red RHEL5 bootloader screen is shown, then press A to append kernel cmdline arguments, and then remove "quiet" from the cmdline (and press Enter to boot)?

And then take a screenshot of the machine when it fails to boot again.

Thank you.

Comment 10 Ed Ciechanowski 2008-11-21 19:59:51 UTC
Created attachment 324337 [details]
first screen shot of error

I took two screen shots; this is the first one.

Comment 11 Ed Ciechanowski 2008-11-21 20:01:47 UTC
Created attachment 324339 [details]
Second screen shot

Second screen shot; I took two. Let me know if you need earlier messages that do not appear in screen shot 1 or 2.

Comment 12 Ed Ciechanowski 2008-11-21 20:03:34 UTC
Created attachment 324340 [details]
Here is the init file

I believe the command you wanted was /sbin/mkinitrd...... | cpio -i and not /boot/mkinitrd. Let me know if this is what you needed. Thanks again!

Comment 13 Heinz Mauelshagen 2008-11-24 11:47:40 UTC
Running "dmraid -ay -i -p $Name" on the command line works perfectly fine.

Do we have all the necessary blockdev nodes to access the component devices of the RAID set $Name whose creation is requested by the initrd?

Comment 14 Ed Ciechanowski 2008-11-24 17:10:47 UTC
When installing RHEL 5.3 Snapshot3 it looks like the mirror is being written to before the reboot. After the install reboots the first time, the above errors show up at boot. It seems from this point the OS has defaulted back to running off /dev/sda only. After the OS boots (apparently from /dev/sda), running "dmraid -ay" from a terminal gives a message that the raid set was not activated.

Was the question in comment 13 for me? Thanks. What more can I provide that will help resolve this issue?

Comment 15 Joel Andres Granados 2008-11-24 17:28:18 UTC
Created attachment 324509 [details]
5 pictures containing the output of init. only the output relevant to the dmraid messages

I believe we have all the necessary nodes.  This attachment is a tar.gz of the pictures I took of the output of my machine when changing the init script to execute `dmraid -ay -i -p -vvv -ddd "isw_bhdbbaeebb_Volume0"` (sorry for the poor pictures; the only camera I could find was an iPhone).

As seen in the NOTICE messages at the beginning of the output, dmraid successfully identifies /dev/sdb and /dev/sda as containing isw metadata.

Comment 16 Joel Andres Granados 2008-11-24 17:35:52 UTC
(In reply to comment #14)
> When installing RHEL 5.3 Snapshot3 it looks like the mirror is being written to
> before the reboot. 

Can you please expand on this?  What do you mean by "being written to"?  It is normal that just before reboot we would want to use the device to which we install: post-install scripts are running, rpm installation is ending, and so on.  I don't see this as out of the ordinary.

> After the install reboots the first time the above errors
> show up at boot. It seems from this point the OS has defaulted back to running
> off /dev/sda only. After the OS boots, looks like to /dev/sda, 

Yes.  This only happens with mirror RAID.  If you install striped RAID you will get a kernel panic; I assume it is for the same reason.  Only with striped RAID it is not that easy to default to using just one of the block devices.

> running the
> command from a terminal "dmraid -ay" Gives the message raid set as not
> activated. 

Same behaviour here.

> 
> What the question in comment 13 for me? Thanks. What more can I provide that
> will help resolve this issue?

Comment 17 Joel Andres Granados 2008-11-24 17:39:40 UTC
(In reply to comment #13)
> Running "dmraid -ay -i -p $Name" on the command line works perfectly fine.

What is your test case?  I mean: do you install, and after install you see that the command works as expected?

Are you testing on a running system?  What special configuration do you have?

Thx for the info.

Comment 18 Heinz Mauelshagen 2008-11-24 17:49:41 UTC
Joel,

after install, the command works fine for me on a running system.
Can't open your attachment to comment #15.
Are you sure that all block devices (i.e. the component devices making up the RAID set in question) are there when the initrd runs?

Ed,

the question in comment #13 was meant for our anaconda/mkinitrd colleagues.

Comment 19 Joel Andres Granados 2008-11-24 18:21:12 UTC
Try http://jgranado.fedorapeople.org/temp/init.tar.gz; bugzilla somehow mangled this attachment.

Comment 20 Joel Andres Granados 2008-11-24 18:24:30 UTC
(In reply to comment #17)
> (In reply to comment #13)
> > Running "dmraid -ay -i -p $Name" on the command line works perfectly fine.

I see the same behavior when I have the OS installed on a non-raid device and try to activate the raid device after boot.  But when I do the install on the raid device itself and try to use it, it does not work.

Heinz:
Insight on the output of the init script at http://jgranado.fedorapeople.org/temp/init.tar.gz would be greatly appreciated.

Comment 21 Joel Andres Granados 2008-11-24 18:36:36 UTC
Comparing what I see in the pictures with the output of "dmraid -ay -i -p $Name" on a running system, I noticed a slight difference:

Init output:
.
.
.
NOTICE: added DEV to RAID set "NAME"
NOTICE: dropping unwanted RAID set "NAME_Volume0"
.
.
.

Normal output:
.
.
.
NOTICE: added DEV to RAID set "NAME"
.
.
.

The normal output does not have the "dropping unwanted ...." message.

Any ideas?

Comment 22 Hans de Goede 2008-11-24 19:28:41 UTC
(In reply to comment #21)
> Comparing what I see in the pictures with the output of "dmraid
> -ay -i -p $Name" on a running system, I noticed a slight difference:
> 
> Init output:
> .
> .
> .
> NOTICE: added DEV to RAID set "NAME"
> NOTICE: dropping unwanted RAID set "NAME_Volume0"
> .
> .
> .
> 
> Normal output:
> .
> .
> .
> NOTICE: added DEV to RAID set "NAME"
> .
> .
> .
> 
> The normal output does not have the "dropping unwanted ...." message.
> 
> Any ideas?

Joel, when you run dmraid on a running system do you use:
"dmraid -ay" or "dmraid -ay -p NAME_Volume0" ?

Notice how dmraid says:
> NOTICE: added DEV to RAID set "NAME"
> NOTICE: dropping unwanted RAID set "NAME_Volume0"

Note how in one case the _Volume0 is printed and in the other it is not. There have been several comments in other reports about the _Volume0 suffix causing problems.

Joel, if you are using "dmraid -ay" (so without the " -p NAME_Volume0"), try changing the "init" script in the initrd to do the same (so remove the " -p NAME_Volume0"), and then see if the raid array gets recognized at boot.
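The experiment suggested here can be sketched as a one-liner against an unpacked init script. The init line below is a hypothetical stand-in (not the exact line shipped in the initrd), using the set name from this report:

```shell
# Stand-in for the dmraid line inside the unpacked init script.
echo 'dmraid -ay -i -p "isw_cdecfjjff_Volume0"' > init.demo

# Strip the ' -p "NAME_Volume0"' argument, leaving plain activation.
sed -i 's/ -p "[^"]*"//' init.demo
cat init.demo    # dmraid -ay -i
```

On a real system the edited init would then be repacked into the initrd before rebooting.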

Comment 23 Joel Andres Granados 2008-11-25 11:06:28 UTC
(In reply to comment #22)

> Joel, when you run dmraid on a running system do you use:
> "dmraid -ay" or "dmraid -ay -p NAME_Volume0" ?

I use NAME_Volume0; it does not find any sets with just NAME.  But it does print the mangled name in the NOTICE message.

> 
> Notice how dmraid says:
> > NOTICE: added DEV to RAID set "NAME"
> > NOTICE: dropping unwanted RAID set "NAME_Volume0"
> 
> Where in one case the _Volume0 is printed and in the other not. There have been
> several comments in other reports about the _Volume0 causing problems.
> 
> Joel, if you are using "dmraid -ay" (so without the " -p NAME_Volume0", try
> changing the "init" script in the initrd to do the same (so remove the " -p
> NAME_Volume0"), and then see if the raid array gets recognized at boot.

I'll give it a try.

Comment 24 Heinz Mauelshagen 2008-11-25 11:11:36 UTC
Hans' comment #22 is a workaround in case mkinitrd provides the wrong RAID set name to dmraid.

Our policy is to activate boot time mappings *only* in the initrd, hence mkinitrd needs fixing if it provides a wrong RAID set name.

Comment 25 Joel Andres Granados 2008-11-25 13:54:22 UTC
(In reply to comment #24)
> Our policy is to activate boot time mappings *only* in the initrd, hence
> mkinitrd needs fixing if it provides a wrong RAID set name.

The name is correct.  That is not the issue.

Heinz:
When I run `dmraid -ay` from init, the raid set starts correctly.  I think there is something missing from the environment at that point, but I have no idea what.  Any ideas?

Comment 28 Joel Andres Granados 2008-11-25 14:10:10 UTC
The snapshots with the name are in http://jgranado.fedorapeople.org/temp/init.tar.gz.

I'll post the snapshots without the name (the one that works) shortly.

Comment 29 Joel Andres Granados 2008-11-25 14:40:10 UTC
The snapshots with the command `dmraid -ay -ddd -vvv` are in http://jgranado.fedorapeople.org/temp/initWork.tar.gz

Comment 30 Joel Andres Granados 2008-11-25 15:43:42 UTC
(In reply to comment #18)
> Joel,
> 
> after install, the command works fine for me on a running system.

Heinz
can you send me, post somewhere, attach to the bug your initrd image for the test machine.
thx.

Comment 31 Heinz Mauelshagen 2008-11-25 16:20:13 UTC
Joel,

Like I said, I only tested online (on a running system), not from the initrd.

The provided init*.tar.gz snapshots show, with the name, that it is being dropped; i.e. the dmraid library want_set() function drops it, which is only possible when the names in the RAID set and on the command line differ.

Could there be some strange, non-displayable char in the name?

Please provide the initrd used to produce init.tar.gz (i.e. the one *with* the name), thx.
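One quick way to test the non-displayable-character hypothesis is to dump the set name byte by byte, so any hidden control character would show up as an escape in the od output. The literal name below is the one from this report; on a real system you would feed in the name as emitted by dmraid instead:

```shell
# Dump each byte of the RAID set name; a clean name shows only
# printable characters, while a stray control char would appear
# as an escape like \r or \0 in the od -c output.
printf 'isw_cdecfjjff_Volume0' | od -c
```

The same check can be applied to the name embedded in the initrd's init script versus the one dmraid derives from the on-disk metadata.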

Comment 32 Hans de Goede 2008-11-25 21:51:37 UTC
*** Bug 472888 has been marked as a duplicate of this bug. ***

Comment 33 Hans de Goede 2008-12-02 10:46:41 UTC
*** Bug 473244 has been marked as a duplicate of this bug. ***

Comment 34 Hans de Goede 2008-12-02 11:14:55 UTC
We've managed to track down the cause of this to mkinitrd (nash). We've done a new build of mkinitrd / nash, 5.1.19.6-41, which we believe fixes this (it does on our test systems).

The new nash-5.1.19.6-41 will be in RHEL 5.3 snapshot 5, which should become available for testing next Monday.

Please test this with snapshot5 when available and let us know how it goes. Thanks for your patience.

Comment 35 Ben Marzinski 2008-12-02 20:01:29 UTC
*** Bug 471879 has been marked as a duplicate of this bug. ***

Comment 37 Peter Jones 2008-12-02 20:45:56 UTC
*** Bug 446284 has been marked as a duplicate of this bug. ***

Comment 39 Peter Jones 2008-12-02 21:09:25 UTC
This should be fixed with nash-5.1.19.6-41 .

Comment 41 Denise Dumas 2008-12-05 18:18:42 UTC
*** Bug 474825 has been marked as a duplicate of this bug. ***

Comment 42 Chris Ward 2008-12-08 11:53:21 UTC
~~ Snapshot 5 is now available @ partners.redhat.com ~~ 

Partners, RHEL 5.3 Snapshot 5 is now available for testing. Please send us your testing feedback on this important bug fix / feature request AS SOON AS POSSIBLE. If you are unable to test, indicate this in a comment or escalate to your Partner Manager. If we do not receive your test feedback, this bug will be AT RISK of being dropped from the release.

If you have VERIFIED the fix, please add PartnerVerified to the Bugzilla
Keywords field, along with a description of the test results. 

If you encounter a new bug, CLONE this bug and request review from your Partner
Manager. We are no longer accepting new bugs into the release, bar
critical regressions.

Comment 47 Ed Ciechanowski 2008-12-09 00:57:36 UTC
Test results of installs for RHEL 5.3 Snapshot5 to isw raids:

RAID 1 (Mirror): works correctly; a true mirror is created. You can boot to either drive if one is removed. I would close this Bugzilla as verified fixed.


Issues that still exist in this area:
RAID0 (Stripe): produces I/O errors on boot. The striped RAID system seems to boot and work normally except for the I/O errors on boot. Cloned this Bugzilla; see bug 475384.

RAID10 (0+1): an install ERROR appears during installation to a RAID10 isw dmraid array in RHEL 5.3 Snapshot5. Cloned this Bugzilla; see bug 475385.

RAID5: an install ERROR appears during installation to a RAID5 isw dmraid array in RHEL 5.3 Snapshot5. Cloned this Bugzilla; see bug 475386.

Comment 48 Andrius Benokraitis 2008-12-09 03:05:53 UTC
*** Bug 472477 has been marked as a duplicate of this bug. ***

Comment 49 Denise Dumas 2008-12-11 14:04:15 UTC
*** Bug 472241 has been marked as a duplicate of this bug. ***

Comment 50 Denise Dumas 2008-12-11 14:15:06 UTC
*** Bug 470540 has been marked as a duplicate of this bug. ***

Comment 51 Hans de Goede 2008-12-11 16:01:10 UTC
*** Bug 470540 has been marked as a duplicate of this bug. ***

Comment 53 errata-xmlrpc 2009-01-20 21:35:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0164.html

Comment 56 Ben Marzinski 2010-01-05 18:48:46 UTC
*** Bug 475174 has been marked as a duplicate of this bug. ***

