Bug 475386 - RAID10 – Install ERROR appears during installation of RAID10 isw dmraid raid array
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: python-pyblock
Version: 5.4
Hardware: i386   OS: Linux
Priority: high   Severity: high
Target Milestone: rc
Target Release: 5.4
Assigned To: Peter Jones
QA Contact: Alexander Todorov
Keywords: OtherQA, Reopened
Depends On: 471689 496038
Blocks: 480792
Reported: 2008-12-08 19:54 EST by Ed Ciechanowski
Modified: 2009-09-02 06:02 EDT (History)
21 users

Doc Type: Bug Fix
Last Closed: 2009-09-02 06:02:32 EDT


Attachments
Screen capture of RAID5 install error (71.81 KB, image/jpeg), 2008-12-08 19:54 EST, Ed Ciechanowski
Raid 10 Snap6 with network img (120.56 KB, application/x-gzip-compressed), 2008-12-18 15:49 EST, Ed Ciechanowski
retest with .img over NET (17.58 KB, application/x-gzip-compressed), 2008-12-18 19:26 EST, Ed Ciechanowski
/var/log dir tar to show logs (109.16 KB, application/x-gzip-compressed), 2008-12-29 00:37 EST, Ed Ciechanowski
Screen capture of RAID5 boot error (85.97 KB, image/jpeg), 2008-12-29 00:39 EST, Ed Ciechanowski
dmraid recursive patch (2.01 KB, patch), 2009-02-10 05:21 EST, Joel Andres Granados
dmraid patch to make the BIOS and dmraid coincide (414 bytes, patch), 2009-02-10 05:27 EST, Joel Andres Granados

Description Ed Ciechanowski 2008-12-08 19:54:50 EST
Created attachment 326235 [details]
Screen capture of RAID5 install error

+++ This bug was initially created as a clone of Bug #471689 +++

Description of problem:
During the install of RHEL 5.3 Snapshot 2, no SW RAID devices are active to install the OS to. Also, the version of dmraid being used during install is old.

After keyboard type selection, and after one skips the registration number, the following ERROR appears:
ERROR 
Error opening /dev/mapper/isw_cdecfjjff_Volume0:
No such device or address

If you hit Alt-F1
<the last few lines are as follows>
error: only one argument allowed for this option
Error
Error: error opening /dev/mapper/isw_cdecfjjff_Volume0: no such device or address 80

If you hit Alt-F2 and try to activate dmraid:
dmraid -ay
raidset "/dev/mapper/cdecfjjff_Volume0" was not activated

If you hit Alt-F3 you see this:
9:20:05 INFO:		moving (1) to step regkey
9:20:11 INFO:		Repopaths is {'base', 'server'}
9:20:11 INFO:		Moving (1) to step find root parts
9:20:11 ERROR:	Activating raid "/dev/mapper/isw_cdecfjjff_Volume0": failed
9:20:11 ERROR: 	Table: 0 156295168 mirror core 2 131072 nosync 2 /dev/sda 0 /dev/sdb 0 1 handle_errors
9:20:11 ERROR:	Exception: device-mapper: reload ioctl failed: invalid argument
9:20:11 Critical:	parted exception: ERROR: error opening /dev/mapper/isw_cdecfjjff_Volume0: no such device or address

If you hit Alt-F4 (the last few lines are):
<6> device-mapper: multipath: ver 1.0.5 loaded
<6> device-mapper: round-robin: v1.0.0 loaded
<3> device-mapper: table: 253:0 mirror: wrong number of minor arguments
<4> device-mapper: ioctl: error adding target to table
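
For reference, the failing table above can also be fed to device-mapper by hand from the tty2 shell to reproduce the reload error outside of anaconda. A rough sketch (the set name "isw_test" is made up for the experiment):

# try to load the same mirror table anaconda attempted; "isw_test" is an arbitrary name
echo "0 156295168 mirror core 2 131072 nosync 2 /dev/sda 0 /dev/sdb 0 1 handle_errors" | dmsetup create isw_test
# if it succeeds, inspect the mapping and tear it down again
dmsetup table isw_test
dmsetup remove isw_test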

Version-Release number of selected component (if applicable):
During install of RHEL 5.3 snapshot 2.
Also in Alt-F2 you run:
dmraid -V
dmraid version: 		1.0.0.rc13 (2007.09.17) static debug
dmraid library version:	1.0.0.rc13 (2006.09.17) 
device-mapper version:	4.11.5
"THESE ARE THE WRONG VERSIONS!"

How reproducible:
Run an install of the RHEL 5.3 Snapshot2 DVD iso with an ISW SW RAID mirror set up as the only two drives in the system; you can't miss it.

Steps to Reproduce:
1. Create RAID1 in OROM. Use default settings.
2. Boot to install DVD of RHEL 5.3 Snapshot2
3. Select a keyboard type and Skip the registration #.
4. The next screen that comes up shows the error.
  
Actual results:
RHEL 5.3 Snapshot2 does not recognize the SW RAID drives set up in the BIOS OROM, so it cannot install the OS to the mirror.

Expected results:
Expected RHEL 5.3 Snapshot2 to recognize and install the OS to the SW raid mirror. 

Additional info:

--- Additional comment from clumens@redhat.com on 2008-11-16 22:45:49 EDT ---

Please attach /tmp/anaconda.log and /tmp/syslog to this bug report.  A picture or something of those error messages on tty1 would be pretty helpful too.  Thanks.

--- Additional comment from ed.ciechanowski@intel.com on 2008-11-17 15:01:37 EDT ---

Created an attachment (id=323792)
Anaconda.log file

attached anaconda.log file

--- Additional comment from ed.ciechanowski@intel.com on 2008-11-17 15:02:34 EDT ---

Created an attachment (id=323793)
syslog file

attached syslog file

--- Additional comment from ddumas@redhat.com on 2008-11-17 15:41:15 EDT ---

We are seeing device-mapper-related problems showing up in anaconda logs with
Snapshot 2.

05:13:57 ERROR   : Activating raid isw_cdecfjjff_Volume0 failed: 05:13:57 ERROR
  :   table: 0 156295168 mirror core 2 131072 nosync 2 /dev/sda 0 /dev/sdb 0 1
handle_errors
05:13:57 ERROR   :   table: 0 156295168 mirror core 2 131072 nosync 2 /dev/sda
0 /dev/sdb 0 1 handle_errors
05:13:57 ERROR   :   exception: device-mapper: reload ioctl failed: Invalid
argument
05:13:57 ERROR   :   exception: device-mapper: reload ioctl failed: Invalid
argument

lvm-team, could someone please take a look?

--- Additional comment from mbroz@redhat.com on 2008-11-17 16:57:31 EDT ---

Please retest with snapshot3, it should contain fixed dmraid package, also see
https://bugzilla.redhat.com/show_bug.cgi?id=471400#c6

--- Additional comment from ed.ciechanowski@intel.com on 2008-11-19 17:29:30 EDT ---

Created an attachment (id=324109)
screen capture dmraid on first boot error

This is the screen capture of RHEL 5.3 Snapshot3 after installing to a mirror and first reboot will show this error.

--- Additional comment from ed.ciechanowski@intel.com on 2008-11-19 17:31:37 EDT ---

Created an attachment (id=324110)
/var/log dir tar to show logs

Here are the latest log files from RHEL 5.3 Snapshot3. The install goes further but I still get an error on the first reboot; see the screen capture. Not sure if the logs will help. Please let me know what else I can provide to help resolve this issue.

--- Additional comment from hdegoede@redhat.com on 2008-11-20 09:21:34 EDT ---

Ed,

As the system does boot, can you please do the following:
mkdir t
cd t
zcat /boot/mkinitrd...... | cpio -i

After that you should have a file called init (amongst others) in the "t" directory, can you please attach that here? Thanks!
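
(For reference, a complete version of that extraction sequence on an installed RHEL-5 system would look roughly like the sketch below; the exact image filename depends on the installed kernel and is only illustrative here, not the path elided above.)

mkdir t
cd t
# RHEL-5 initrd images are gzip-compressed cpio archives under /boot;
# the filename is illustrative (use the one matching your kernel)
zcat /boot/initrd-$(uname -r).img | cpio -idmv
# the unpacked "init" script is now in the current directory
less init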

--- Additional comment from hdegoede@redhat.com on 2008-11-20 11:09:56 EDT ---

One more information request: can you please press a key when the initial red RHEL5 bootloader screen is shown, then press A to append kernel cmdline arguments, and then remove "quiet" from the cmdline (and press enter to boot)?

And then take a screenshot of the machine when it fails to boot again.

Thank you.

--- Additional comment from ed.ciechanowski@intel.com on 2008-11-21 14:59:51 EDT ---

Created an attachment (id=324337)
first screen shot of error

I took two screen shots, this is the first one.

--- Additional comment from ed.ciechanowski@intel.com on 2008-11-21 15:01:47 EDT ---

Created an attachment (id=324339)
Second screen shoot

Second screen shot; I took two. Let me know if you need the previous messages that do not appear in screen shot 1 or 2.

--- Additional comment from ed.ciechanowski@intel.com on 2008-11-21 15:03:34 EDT ---

Created an attachment (id=324340)
Here is the init file

I believe the command you wanted was /sbin/mkinitrd...... | cpio -i and not /boot/mkinitrd. Let me know if this is what you needed. Thanks again!

--- Additional comment from heinzm@redhat.com on 2008-11-24 06:47:40 EDT ---

Running "dmraid -ay -i -p $Name" on the command line works perfectly fine.

Do we have all the necessary blockdev nodes to access the component devices of the RAID set $Name whose creation is requested by the initrd?
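
(A quick way to check that from the initrd or installer shell, sketched with standard dmraid queries; nothing here is specific to this report:)

ls -l /dev/sda /dev/sdb   # are the component block device nodes present?
dmraid -r                 # does dmraid find the isw metadata on them?
dmraid -s                 # which RAID sets does it assemble from that metadata?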

--- Additional comment from ed.ciechanowski@intel.com on 2008-11-24 12:10:47 EDT ---

When installing RHEL 5.3 Snapshot3 it looks like the mirror is being written to before the reboot. After the install reboots the first time, the above errors show up at boot. It seems from this point the OS has defaulted back to running off /dev/sda only. After the OS boots (apparently off /dev/sda), running "dmraid -ay" from a terminal gives the message that the raid set was not activated.

Was the question in comment 13 for me? Thanks. What more can I provide that will help resolve this issue?

--- Additional comment from jgranado@redhat.com on 2008-11-24 12:28:18 EDT ---

Created an attachment (id=324509)
5 pictures containing the output of init (only the output relevant to the dmraid messages)

I believe we have all the necessary nodes.  This attachment is a tar.gz of the pictures I took of the output of my machine when changing the init script to execute `dmraid -ay -i -p -vvv -ddd "isw_bhdbbaeebb_Volume0"` (sorry for the poor pictures, the only thing I could find was an iPhone).

As seen in the output, in the NOTICE messages of the beginning.  dmraid successfully identifies /dev/sdb and /dev/sda as containing isw metadata.

--- Additional comment from jgranado@redhat.com on 2008-11-24 12:35:52 EDT ---

(In reply to comment #14)
> When installing RHEL 5.3 Snapshot3 it looks like the mirror is being written to
> before the reboot. 

Can you please expand on this? What do you mean by "being written to"? It is normal that just before reboot we would want to use the device to which we install: postinstall scripts, rpm installation finishing....  I don't see this as out of the ordinary.

> After the install reboots the first time the above errors
> show up at boot. It seems from this point the OS has defaulted back to running
> off /dev/sda only. After the OS boots, looks like to /dev/sda, 

Yes, this only happens with mirror raid.  If you install striped RAID you will get a kernel panic.  I assume that it is because of the same reason; only with striped it is not that easy to default to using just one of the block devices.

> running the
> command from a terminal "dmraid -ay" Gives the message raid set as not
> activated. 

Same behaviour here.

> 
> What the question in comment 13 for me? Thanks. What more can I provide that
> will help resolve this issue?

--- Additional comment from jgranado@redhat.com on 2008-11-24 12:39:40 EDT ---

(In reply to comment #13)
> Running "dmraid -ay -i -p $Name" on the command line works perfectly fine.

What is your test case? I mean: do you install, and after the install you see that the command works as expected?

Are you testing on a running system? What special configuration do you have?

Thx for the info.

--- Additional comment from heinzm@redhat.com on 2008-11-24 12:49:41 EDT ---

Joel,

after install, the command works fine for me on a running system.
Can't open your attachment to comment #15.
Are you sure that all block devices (i.e. the component devices making up the RAID set in question) are there when the initrd runs?

Ed,

the question in comment #13 was meant for our anaconda/mkinitrd colleagues.

--- Additional comment from jgranado@redhat.com on 2008-11-24 13:21:12 EDT ---

Try http://jgranado.fedorapeople.org/temp/init.tar.gz; bugzilla somehow screwed this up.

--- Additional comment from jgranado@redhat.com on 2008-11-24 13:24:30 EDT ---

(In reply to comment #17)
> (In reply to comment #13)
> > Running "dmraid -ay -i -p $Name" on the command line works perfectly fine.

I see the same behavior when I have the OS installed on a non-raid device and try to activate the raid device after boot.  But when I do the install on the raid device itself and try to use it, it does not work.

Heinz:
insight on the output of the init that is on http://jgranado.fedorapeople.org/temp/init.tar.gz would be greatly appreciated.

--- Additional comment from jgranado@redhat.com on 2008-11-24 13:36:36 EDT ---

Comparing what I see in the pictures with the output of "dmraid -ay -i -p $Name" on a running system, I noticed a slight difference:

Init output:
.
.
.
NOTICE: added DEV to RAID set "NAME"
NOTICE: dropping unwanted RAID set "NAME_Volume0"
.
.
.

Normal output:
.
.
.
NOTICE: added DEV to RAID set "NAME"
.
.
.

The normal output does not have the "dropping unwanted ...." message.

Any ideas?

--- Additional comment from hdegoede@redhat.com on 2008-11-24 14:28:41 EDT ---

(In reply to comment #21)
> On a comparison between what I see in the pictures and in the output of "dmraid
> -ay -i -p $Name" on a running sysmte.  I noticed a slight difference:
> 
> Init output:
> .
> .
> .
> NOTICE: added DEV to RAID set "NAME"
> NOTICE: dropping unwanted RAID set "NAME_Volume0"
> .
> .
> .
> 
> Normal output:
> .
> .
> .
> NOTICE: added DEV to RAID set "NAME"
> .
> .
> .
> 
> The normal output does not have the "dropping unwanted ...." message.
> 
> Any ideas?

Joel, when you run dmraid on a running system do you use:
"dmraid -ay" or "dmraid -ay -p NAME_Volume0" ?

Notice how dmraid says:
> NOTICE: added DEV to RAID set "NAME"
> NOTICE: dropping unwanted RAID set "NAME_Volume0"

Where in one case the _Volume0 is printed and in the other not. There have been several comments in other reports about the _Volume0 causing problems.

Joel, if you are using "dmraid -ay" (so without the " -p NAME_Volume0"), try changing the "init" script in the initrd to do the same (so remove the " -p NAME_Volume0"), and then see if the raid array gets recognized at boot.
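
(Illustratively, assuming the initrd's init activates the set the way comment #13 describes, the suggested change amounts to the following; the set name is the one from this report:)

# current activation line in the extracted "init" (named set):
dmraid -ay -i -p "isw_cdecfjjff_Volume0"
# suggested experiment (drop the -p argument, activate without naming the set):
dmraid -ay -i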

--- Additional comment from jgranado@redhat.com on 2008-11-25 06:06:28 EDT ---

(In reply to comment #22)

> Joel, when you run dmraid on a running system do you use:
> "dmraid -ay" or "dmraid -ay -p NAME_Volume0" ?

I use NAME_Volume0, it does not find any sets with just NAME.  But it does print the mangled name in the NOTICE message.

> 
> Notice how dmraid says:
> > NOTICE: added DEV to RAID set "NAME"
> > NOTICE: dropping unwanted RAID set "NAME_Volume0"
> 
> Where in one case the _Volume0 is printed and in the other not. There have been
> several comments in other reports about the _Volume0 causing problems.
> 
> Joel, if you are using "dmraid -ay" (so without the " -p NAME_Volume0", try
> changing the "init" script in the initrd to do the same (so remove the " -p
> NAME_Volume0"), and then see if the raid array gets recognized at boot.

I'll give it a try.

--- Additional comment from heinzm@redhat.com on 2008-11-25 06:11:36 EDT ---

Hans' comment #22 is a workaround in case mkinitrd provides the wrong RAID set name to dmraid.

Our policy is to activate boot time mappings *only* in the initrd, hence mkinitrd needs fixing if it provides a wrong RAID set name.

--- Additional comment from jgranado@redhat.com on 2008-11-25 08:54:22 EDT ---

(In reply to comment #24)
> Our policy is to activate boot time mappings *only* in the initrd, hence
> mkinitrd needs fixing if it provides a wrong RAID set name.

The name is correct.  That is not the issue.

Heinz:
When I run `dmraid -ay` from init, the raid set starts correctly.  I think there is something missing from the environment at that point, but I have no idea what.  Any ideas?

--- Additional comment from jgranado@redhat.com on 2008-11-25 09:10:10 EDT ---

The snapshots with the name are in http://jgranado.fedorapeople.org/temp/init.tar.gz.

I'll post the snapshots without the name (the one that works) shortly.

--- Additional comment from jgranado@redhat.com on 2008-11-25 09:40:10 EDT ---

The snapshots with the command `dmraid -ay -ddd -vvv` are in http://jgranado.fedorapeople.org/temp/initWork.tar.gz

--- Additional comment from jgranado@redhat.com on 2008-11-25 10:43:42 EDT ---

(In reply to comment #18)
> Joel,
> 
> after install, the command works fine for me on a running system.

Heinz
Can you send me, post somewhere, or attach to the bug your initrd image for the test machine?
thx.

--- Additional comment from heinzm@redhat.com on 2008-11-25 11:20:13 EDT ---

Joel,

like I said, I only ran online, no initrd test.

The provided init*tar.gz snapshots (the ones *with* the name) show that it is being dropped, i.e. the dmraid library want_set() function drops it, which is only possible when the names in the RAID set and on the command line differ.

Could there be some strange, non-displayable char in the name?

Please provide the initrd used to produce init.tar.gz (i.e. the one *with* the name), thx.

--- Additional comment from hdegoede@redhat.com on 2008-11-25 16:51:37 EDT ---

*** Bug 472888 has been marked as a duplicate of this bug. ***

--- Additional comment from hdegoede@redhat.com on 2008-12-02 05:46:41 EDT ---

*** Bug 473244 has been marked as a duplicate of this bug. ***

--- Additional comment from hdegoede@redhat.com on 2008-12-02 06:14:55 EDT ---

We've managed to track down the cause of this to mkinitrd (nash). We've done a new build of mkinitrd / nash, 5.1.19.6-41, which we believe fixes this (it does on our test systems).

The new nash-5.1.19.6-41, will be in RHEL 5.3 snapshot 5 which should become available for testing next Monday.

Please test this with snapshot5 when available and let us know how it goes. Thanks for your patience.

--- Additional comment from bmarzins@redhat.com on 2008-12-02 15:01:29 EDT ---

*** Bug 471879 has been marked as a duplicate of this bug. ***

--- Additional comment from pjones@redhat.com on 2008-12-02 15:45:56 EDT ---

*** Bug 446284 has been marked as a duplicate of this bug. ***

--- Additional comment from pjones@redhat.com on 2008-12-02 16:09:25 EDT ---

This should be fixed with nash-5.1.19.6-41 .

--- Additional comment from ddumas@redhat.com on 2008-12-05 13:18:42 EDT ---

*** Bug 474825 has been marked as a duplicate of this bug. ***

--- Additional comment from cward@redhat.com on 2008-12-08 06:53:21 EDT ---

~~ Snapshot 5 is now available @ partners.redhat.com ~~ 

Partners, RHEL 5.3 Snapshot 5 is now available for testing. Please send us your testing feedback on this important bug fix / feature request AS SOON AS POSSIBLE. If you are unable to test, indicate this in a comment or escalate to your Partner Manager. If we do not receive your test feedback, this bug will be AT RISK of being dropped from the release.

If you have VERIFIED the fix, please add PartnerVerified to the Bugzilla
Keywords field, along with a description of the test results. 

If you encounter a new bug, CLONE this bug and ask your Partner Manager to review it. We are no longer accepting new bugs into the release, bar critical regressions.

RAID5 – Install ERROR appears during installation of  RAID5 isw dmraid raid array in RHEL 5.3 Snapshot5. SEE ATTACHED .JPG SCREEN CAPTURE
Comment 1 Ed Ciechanowski 2008-12-08 20:01:15 EST
RAID5 – Install ERROR appears during installation of  RAID5 isw dmraid raid
array in RHEL 5.3 Snapshot5. SEE ATTACHED .JPG SCREEN CAPTURE

If logs are needed let me know which ones.
Comment 2 Chris Ward 2008-12-09 02:41:49 EST
Is this a regression or a critical error? It's getting very late to introduce new changes into the release. Please make your case as soon as possible. Otherwise we'll be forced to defer to 5.4. If fixing for 5.4 is OK, please let me know that too.
Comment 3 Hans de Goede 2008-12-09 05:13:15 EST
Ed,

This is caused by the way the new isw raid 5 / raid 10 support in RHEL-5.3 has been implemented, as explained by Joel in bug 475385 comment 3.

The implementation of the new isw raid 5 / raid 10 support is being tracked in bug 437184 (which is still in progress), so I'm closing this as a dup of 437184.

*** This bug has been marked as a duplicate of bug 437184 ***
Comment 4 Joel Andres Granados 2008-12-09 07:40:47 EST
Is the message:
ERROR: only one argument allowed for this option
???
Comment 5 Joel Andres Granados 2008-12-09 07:42:53 EST

*** This bug has been marked as a duplicate of bug 437185 ***
Comment 6 Heinz Mauelshagen 2008-12-18 11:12:02 EST
FYI: this bug will affect *all* RAID10/RAID01 sets independent of
     the metadata format type, not just isw!

There are probably bz duplicates ITR already.
Comment 7 Joel Andres Granados 2008-12-18 11:14:58 EST
Ed:

I have identified this as a pyblock issue.  I want to make sure you and I are on the same page; I have another updates image for you to try out.  This should get past the point where you were seeing the message.  Pls try it out and post your findings on this bug.

http://jgranado.fedorapeople.org/temp/raid10.img
Comment 8 Ed Ciechanowski 2008-12-18 13:12:04 EST
Joel:

I understand. I will test Snap6, RAID 10 install using the image from comment #7 in this bug. I will post my results here.

After trying this the first time I see some partition exception errors.
Do you have remote site information that I can send the exception error to?
It asked for remote site information like: host name, file name, username, password. I would need all this information to send it that way.

I will post results here of syslog, anaconda.log, lsmod, and the exception error (if I can capture it).
Any other information you may need?
Comment 9 Ed Ciechanowski 2008-12-18 15:48:57 EST
Started RAID 10 install with the .img arg like below:
Boot: linux updates=http://jgranado.fedorapeople.org/temp/raid10.img <enter>

At the point of install when the graphical interface shows the Skip and Back buttons for Installation Number, I press 'Skip'.
Install continues to the 'Installation requires partitioning' screen.

Now I see the /mapper/isw_xxxxx_Volume0, and others, in the Select drives box, whereas previously I had only seen /dev/sda, /dev/sdb, etc.
But I wonder why this does not say "/dev/mapper/isw_xxxxxx_Volume0"?

If I accept the default partition layout and continue:
Normal WARNING: Remove all Partitions? Answer: YES

Install continues to Network setup and time settings.
Password for Root is given and accepted.
Choose no SW applications.
BEGIN INSTALLATION? Answer: NEXT

Then I get the EXCEPTION ERROR with choice of
1). Save to Remote
2). Debug
3). OK
I collected the logs at this point.

Installation of RHEL5.3 Snapshot6 goes further now. Still FAILS with raid10 and using above image.
 
Attached is a .tgz of all files from raid10 with network image, .tgz contains:
anaconda.log
anacdump.txt
syslog
buildstamp_out
lsmod_out
uname -a out
dmesg out from install
dmraid_r_out from TTY2
dmraid_s_out from TTY2
dmraid_b_out from TTY2
dmraid_tay_out from TTY2
dmsetup table from TTY2
dmsetup targets from TTY2

Please find attached raid10S6wNET.tgz.
Thanks,
EDC
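
(For anyone reproducing this, the TTY2 outputs listed in comment 9 can be captured with commands along these lines; the output file names are only illustrative:)

# switch to the shell with Alt-F2 during the install
dmraid -r   > /tmp/dmraid_r_out      # block devices carrying RAID metadata
dmraid -s   > /tmp/dmraid_s_out      # discovered RAID sets
dmraid -b   > /tmp/dmraid_b_out      # all block devices
dmraid -tay > /tmp/dmraid_tay_out    # tables dmraid would load (test mode, no activation)
dmsetup table   > /tmp/dmsetup_table
dmsetup targets > /tmp/dmsetup_targets
lsmod > /tmp/lsmod_out
dmesg > /tmp/dmesg_out
# /tmp/anaconda.log and /tmp/syslog are written by the installer itself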
Comment 10 Ed Ciechanowski 2008-12-18 15:49:52 EST
Created attachment 327363 [details]
Raid 10 Snap6 with network img
Comment 11 Ed Ciechanowski 2008-12-18 19:26:05 EST
Created attachment 327394 [details]
retest with .img over NET 

This may help a bit more; please read this comment before comment #9 above.

I get this error earlier than in the previous entry (comment #9) in this sighting. I can consistently reproduce this error when installing with the network image. If I cancel and continue, then I get the exception error as described in comment #9.

Started RAID 10 install with the .img arg like below:
Boot: linux updates=http://jgranado.fedorapeople.org/temp/raid10.img <enter>

At the point of install when the graphical interface shows the Skip and Back buttons for Installation Number, I press 'Skip'.
Install continues to the 'Installation requires partitioning' screen.
"Error
Error opening /tmp/mapper/isw_ceexxxxxx_R10wNET : No such device or address."
I do not know why this would be /tmp/mapper and not /dev/mapper, so I included an ls of each directory.

Gathered logs at this point. 

Attached is a .tgz of all files from raid10 with network image, .tgz contains:
anaconda.log
syslog
lsmod_out
dmesg out from install
dmraid_r_out from TTY2
dmraid_s_out from TTY2
dmraid_b_out from TTY2
dmraid_tay_out from TTY2
dmsetup table from TTY2
dmsetup targets from TTY2
dmsetup info from TTY2
ls of /tmp/mapper directory
ls of /dev/mapper directory

Please find attached imgWr10.tgz

Thanks,
EDC
Comment 12 Joel Andres Granados 2008-12-19 05:46:03 EST
Ed:

This is very good news, in the sense that I am seeing the same behavior as you.  To answer some questions:

1. The /tmp/mapper thing: it's normal.
2. Save to remote: if you have your network configured properly you can choose a host that accepts scp and put the traceback there.  I *don't* have a server that can receive those files, but with all the information you have posted it will be enough.
Comment 15 Ed Ciechanowski 2008-12-29 00:37:19 EST
Created attachment 327906 [details]
/var/log dir tar to show logs

RAID5 does install with RHEL 5.3 RC1 with Default settings. There are errors on boot like in Bug 475384. Attached are the logs from /var/log directory. Also, I will attach a screen shot of the ugly boot error messages.
Comment 16 Ed Ciechanowski 2008-12-29 00:39:32 EST
Created attachment 327907 [details]
Screen capture of RAID5 boot error

Here is the screen capture of the RAID 5 BOOT errors. It looks like the install went fine and everything is up booting to dmraid5.d
Comment 17 Ed Ciechanowski 2008-12-29 00:49:45 EST
RAID10 install does not work as stated above in comment #11. Let me know if you need any logs or files from this install.
Comment 18 Joel Andres Granados 2009-01-05 07:09:19 EST
(In reply to comment #15)
> Created an attachment (id=327906) [details]
> /var/log dir tar to show logs
> 
> RAID5 does install with RHEL 5.3 RC1 with Default settings. There are errors on
> boot like in Bug 475384. Attached are the logs from /var/log directory. Also, I
> will attach a screen shot of the ugly boot error messages.

Ed:
Does this result in an unusable system, or is it just an ugly message?
Comment 19 Ed Ciechanowski 2009-01-05 08:45:47 EST
This results in a usable system. The messages are just ugly. It looks like a RAID5. More testing is needed to make sure it is a true RAID5. Install is working for RAID5.
Comment 20 Krzysztof Wojcik 2009-01-26 11:06:34 EST
I tested RHEL 5.3 RC2.
Issue still exists.
Comment 21 Gary Case 2009-01-27 17:39:09 EST
I didn't see errors during installation of RHEL5.3 on my DQ35JO system in RAID0 mode (NFS network install after booting from CD). There are errors post-install, but those are discussed in https://bugzilla.redhat.com/show_bug.cgi?id=475384. Ed, could you take a look and see if this BZ could be closed?
Comment 22 Joel Andres Granados 2009-01-28 04:52:55 EST
Gary:

All RAID levels except raid{10,01} should work fine for the installer in rhel5.3.  This bug tracks the failure for raid10 and raid01.  Currently I am working on rawhide to address these issues and will backport a patch to rhel5.4 as soon as I have a good approach.
Pls do *not* close this issue.
Comment 23 RHEL Product and Program Management 2009-02-03 18:14:37 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 24 Joel Andres Granados 2009-02-10 05:21:42 EST
Created attachment 331410 [details]
dmraid recursive patch.

A lot of work has gone into rawhide to make raid10 work properly with the distribution.  Here I try to document all that has been done so that it is easier at development time.

1. We need a pyblock patch that handles raid10's recursiveness.
2. We need a dmraid patch that correctly handles the BIOS order of the devices.
3. We will probably need some mkinitrd patches to make it work (not sure yet).
Comment 25 Joel Andres Granados 2009-02-10 05:27:44 EST
Created attachment 331411 [details]
dmraid patch to make the BIOS and dmraid coincide.

On reboot the system was not able to get to GRUB. The reason: dmraid and the BIOS did not see the devices in the same way. This patch fixes the issue.
Comment 26 Joel Andres Granados 2009-02-10 05:30:12 EST
Hans has created an mkinitrd patch to reduce/kill the IOError messages.  It is located at  https://bugzilla.redhat.com/attachment.cgi?id=331409.
Comment 27 Hans de Goede 2009-02-10 06:30:52 EST
(In reply to comment #26)
> Hans has created an mkinitrd patch to reduce/kill the IOError messages.  It is
> located at  https://bugzilla.redhat.com/attachment.cgi?id=331409.

Correction: that is a dmraid patch, not an mkinitrd patch. This patch adds a cmdline option to dmraid which, when used, makes dmraid tell the kernel to forget about (unlearn) the partitions on the raw disks. If we get the patch into the RHEL-5.4 dmraid, and we patch mkinitrd to pass the cmdline option, then the IO errors *should* go away.
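
(For context, a hedged sketch of the effect being described: I believe RHEL-5 initrds already approximate this with nash's rmparts command, run before the set is activated; this is an assumption about the generated init script, not the patched dmraid option itself:)

# inside the nash init script, before dmraid activation (illustrative):
rmparts sda    # make the kernel forget the partitions scanned on the raw disk
rmparts sdb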
Comment 29 Joel Andres Granados 2009-03-31 10:30:26 EDT
This change will be present in python-pyblock-0.26-4
Comment 30 Ed Ciechanowski 2009-04-08 17:21:39 EDT
Hi Hans and Heinz,

It looks like Bugzilla 475386 fixes issues in both python-pyblock and dmraid.

Seems one fix is in python-pyblock-0.26-4, and the patch attached to this bugzilla adds a cmdline option to dmraid which, when used, makes dmraid tell the kernel to forget about (unlearn) the partitions on the raw disks.

If we get the patch into the RHEL-5.4 dmraid, and we patch mkinitrd to pass the cmdline option, then the IO errors *should* go away.

https://bugzilla.redhat.com/show_bug.cgi?id=475386 RAID10 Install ERROR appears during installation of  RAID10 isw dmraid raid array.


Question: 
Heinz, can you include this patch https://bugzilla.redhat.com/attachment.cgi?id=331409 in the next release of dmraid so it can get into RHEL 5.4?
Need me to include you on the CC line?


Hans, can you produce an ISO image of this mkinitrd and dmraid fix? (If you need me to test this, the bug is in NEED INFO state.) I do not have the capability to put this all together into a build and image in a timely manner.

Thanks,
*EDC*
Comment 33 Chris Ward 2009-06-14 19:17:36 EDT
~~ Attention Partners RHEL 5.4 Partner Alpha Released! ~~

RHEL 5.4 Partner Alpha has been released on partners.redhat.com. There should
be a fix present that addresses this particular request. Please test and report back your results here, at your earliest convenience. Our Public Beta release is just around the corner!

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. If you have verified the request functions as expected, please set your Partner ID in the Partner field above to indicate successful test results. Do not flip the bug status to VERIFIED. Further questions can be directed to your Red Hat Partner Manager. Thanks!
Comment 34 Krzysztof Wojcik 2009-06-23 06:26:55 EDT
Issue verified in RHEL5.4 Alpha with PASS result.
Comment 35 Alexander Todorov 2009-07-01 08:45:34 EDT
Moving to VERIFIED based on comment #34
Comment 37 errata-xmlrpc 2009-09-02 06:02:32 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1319.html
