Bug 475386
| Field | Value |
|---|---|
| Summary | RAID10 – Install ERROR appears during installation of RAID10 isw dmraid raid array |
| Product | Red Hat Enterprise Linux 5 |
| Component | python-pyblock |
| Version | 5.4 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Reporter | Ed Ciechanowski <ed.ciechanowski> |
| Assignee | Peter Jones <pjones> |
| QA Contact | Alexander Todorov <atodorov> |
| CC | atodorov, borgan, cward, ddumas, fernando, hdegoede, heinzm, Jacek.Danecki, jane.lv, jgranado, jjarvis, jvillalo, keve.a.gabbert, krzysztof.wojcik, luyu, lvm-team, naveenr, nelhawar, pjones, rpacheco, tao |
| Target Milestone | rc |
| Target Release | 5.4 |
| Keywords | OtherQA, Reopened |
| Hardware | i386 |
| OS | Linux |
| Doc Type | Bug Fix |
| Last Closed | 2009-09-02 10:02:32 UTC |
| Bug Depends On | 471689, 496038 |
| Bug Blocks | 480792 |
Description by Ed Ciechanowski, 2008-12-09 00:54:50 UTC:
RAID5 – Install ERROR appears during installation of RAID5 isw dmraid raid array in RHEL 5.3 Snapshot5. SEE ATTACHED .JPG SCREEN CAPTURE. If logs are needed, let me know which ones.

Is this a regression or a critical error? It is getting very late to introduce new changes into the release. Please make your case as soon as possible; otherwise we'll be forced to defer to 5.4. If fixing for 5.4 is OK, please let me know that too.

Ed, this is caused by the way the new isw RAID 5 / RAID 10 support in RHEL 5.3 has been implemented, as explained by Joel in bug 475385 comment 3. The implementation of the new isw RAID 5 / RAID 10 support is being tracked in bug 437184 (which is still in progress), so I'm closing this as a duplicate of 437184.

*** This bug has been marked as a duplicate of bug 437184 ***

Is the message: "ERROR: only one argument allowed for this option"?

*** This bug has been marked as a duplicate of bug 437185 ***

FYI: this bug will affect *all* RAID10/RAID01 sets, independent of the metadata format type, not just isw! There are probably BZ duplicates ITR already.
Ed: I have identified this as a pyblock issue. I want to make sure you and I are on the same page; I have another updates image for you to try out. This should get past the point where you were seeing the message. Please try it out and post your findings on this bug: http://jgranado.fedorapeople.org/temp/raid10.img

Joel: I understand. I will test the Snap6 RAID 10 install using the image from comment #7 in this bug and post my results here. After trying this the first time I see some partition exception errors. Do you have remote site information that I can send the exception error to? It asked for remote site details like host name, file name, username, and password; I would need all of that to send it that way. I will post the syslog, anaconda.log, lsmod output, and the exception error (if I can capture it) here. Any other information you may need?

Started the RAID 10 install with the .img argument like below:

Boot: linux updates=http://jgranado.fedorapeople.org/temp/raid10.img <enter>

At the point of install where the graphical interface shows the Skip and Back buttons for the Installation Number, I press 'Skip'. Install continues to the 'Installation requires partitioning' screen. Now I see /mapper/isw_xxxxx_Volume0, and others, in the Select drives box, where previously I had only seen /dev/sda, /dev/sdb, and so on. But I wonder why this does not say /dev/mapper/isw_xxxxxx_Volume0? I accept the default partition layout and continue. Normal WARNING: Remove all partitions? Answer: YES. Install continues to network setup and time settings. The password for root is given and accepted. I choose no SW applications. BEGIN INSTALLATION? Answer: NEXT. Then I get the EXCEPTION ERROR with a choice of 1) Save to Remote, 2) Debug, 3) OK. I collected the logs at this point.

Installation of RHEL 5.3 Snapshot6 goes further now, but still FAILS with raid10 using the above image. Attached is a .tgz of all files from the raid10 install with the network image; the .tgz contains:

- anaconda.log
- anacdump.txt
- syslog
- buildstamp_out
- lsmod_out
- uname -a output
- dmesg output from the install
- dmraid_r_out from TTY2
- dmraid_s_out from TTY2
- dmraid_b_out from TTY2
- dmraid_tay_out from TTY2
- dmsetup table from TTY2
- dmsetup targets from TTY2

Please find attached raid10S6wNET.tgz. Thanks, EDC

Created attachment 327363 [details]
Raid 10 Snap6 with network img
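For anyone reproducing this, here is a minimal sketch of how the TTY2 diagnostics listed above can be collected during the install. Only the dmraid/dmsetup invocations are implied by the file names in the tarball; the output paths and redirections are illustrative assumptions about the installer environment.

```sh
# From a shell on TTY2 during the installer run (Ctrl+Alt+F2).
# Output file names mirror the ones in the attached tarball; writing them
# under /tmp is an assumption.
dmraid -r    > /tmp/dmraid_r_out      # raw devices carrying RAID metadata
dmraid -s    > /tmp/dmraid_s_out      # discovered RAID sets and their status
dmraid -b    > /tmp/dmraid_b_out      # block devices and sizes
dmraid -tay  > /tmp/dmraid_tay_out    # tables dmraid would activate (test mode)
dmsetup table   > /tmp/dmsetup_table_out
dmsetup targets > /tmp/dmsetup_targets_out
dmsetup info    > /tmp/dmsetup_info_out
```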
Created attachment 327394 [details]
retest with .img over NET

This may help a bit more; please read this comment before comment #9 above. I get this error earlier than the one described in comment #9, and I can consistently reproduce it when installing with the network image. If I cancel and continue, I then get the exception error as described in comment #9.

Started the RAID 10 install with the .img argument like below:

Boot: linux updates=http://jgranado.fedorapeople.org/temp/raid10.img <enter>

At the point of install where the graphical interface shows the Skip and Back buttons for the Installation Number, I press 'Skip'. Install continues to the 'Installation requires partitioning' screen, where I get:

"Error: Error opening /tmp/mapper/isw_ceexxxxxx_R10wNET: No such device or address."

I did not know why this would be /tmp/mapper and not /dev/mapper, so I included an ls of each directory. Gathered logs at this point. Attached is a .tgz of all files from the raid10 install with the network image; the .tgz contains:

- anaconda.log
- syslog
- lsmod_out
- dmesg output from the install
- dmraid_r_out from TTY2
- dmraid_s_out from TTY2
- dmraid_b_out from TTY2
- dmraid_tay_out from TTY2
- dmsetup table from TTY2
- dmsetup targets from TTY2
- dmsetup info from TTY2
- ls of the /tmp/mapper directory
- ls of the /dev/mapper directory

Please find attached imgWr10.tgz. Thanks, EDC

Ed: this is very good news, in the sense that I am seeing the same behavior as you. To answer some questions: 1. The /tmp/mapper thing is normal. 2. Save to remote: if you have your network configured properly you can choose a host that accepts scp and put the traceback there. I *don't* have a server that can receive those files, but with all the information you have posted it will be enough.

Created attachment 327906 [details]
/var/log dir tar to show logs

RAID5 does install with RHEL 5.3 RC1 with default settings. There are errors on boot like in bug 475384. Attached are the logs from the /var/log directory. I will also attach a screen shot of the ugly boot error messages.

Created attachment 327907 [details]
Screen capture of RAID5 boot error

Here is the screen capture of the RAID 5 boot errors. It looks like the install went fine and everything is up, booting to dmraid5.d.
The RAID10 install does not work, as stated above in comment #11. Let me know if you need any logs or files from this install.

(In reply to comment #15)
> Created an attachment (id=327906) [details]
> /var/log dir tar to show logs
>
> RAID5 does install with RHEL 5.3 RC1 with Default settings. There are errors on
> boot like in Bug 475384. Attached are the logs from /var/log directory. Also, I
> will attach a screen shot of the ugly boot error messages.

Ed: Does this result in an unusable system, or is it just an ugly message?

This results in a usable system; the messages are just ugly. It looks like a RAID5, but more testing is needed to make sure it is a true RAID5. Install is working for RAID5.

I tested RHEL 5.3 RC2. The issue still exists.

I didn't see errors during installation of RHEL 5.3 on my DQ35JO system in RAID0 mode (NFS network install after booting from CD). There are errors post-install, but those are discussed in https://bugzilla.redhat.com/show_bug.cgi?id=475384. Ed, could you take a look and see if this BZ could be closed?

Gary:
All raid except raid{10,01} should work fine for the installer in RHEL 5.3. This bug tracks the failure for raid10 and raid01. I am currently working on rawhide to address these issues and will backport a patch to RHEL 5.4 as soon as I have a good approach.

Please do *not* close this issue.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Created attachment 331410 [details]
dmraid recursive patch.
A lot of work has gone into rawhide to make raid10 work properly with the distribution. Here I try to document all that has been done, so that the work is easier to follow at development time:
1. We need a pyblock patch that handles raid10's recursiveness (see the schematic sketch after this list).
2. We need a dmraid patch that correctly handles the BIOS order of the devices.
3. We will probably need some mkinitrd patches to make it work (not sure yet).
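To make item 1 concrete: an isw RAID10 volume is a stripe laid over mirrors, so activating it yields nested device-mapper devices, and pyblock has to recurse through that nesting instead of assuming one flat mapping. Below is a schematic sketch of such a nesting; all device names, sector counts, and chunk sizes are invented for illustration, and this is not code or output from the affected system.

```sh
# Hypothetical device-mapper layout for an isw RAID10 volume
# (names, sector counts, and chunk sizes are made up).

# Two RAID1 legs, each mirroring a pair of raw disks:
dmsetup create isw_xxxx_Volume0-0 --table \
  "0 419430400 mirror core 2 131072 nosync 2 /dev/sda 0 /dev/sdb 0"
dmsetup create isw_xxxx_Volume0-1 --table \
  "0 419430400 mirror core 2 131072 nosync 2 /dev/sdc 0 /dev/sdd 0"

# A RAID0 stripe over the two mirror legs; this nested device is the
# top-level "Volume0" node the installer is expected to partition:
dmsetup create isw_xxxx_Volume0 --table \
  "0 838860800 striped 2 256 /dev/mapper/isw_xxxx_Volume0-0 0 /dev/mapper/isw_xxxx_Volume0-1 0"
```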
Created attachment 331411 [details]
dmraid patch to make the BIOS and dmraid coincide.
On reboot the system was not able to get to GRUB. The reason: dmraid and the BIOS did not see the devices in the same order. This patch fixes that issue.
Hans has created an mkinitrd patch to reduce/kill the IOError messages. It is located at https://bugzilla.redhat.com/attachment.cgi?id=331409.

(In reply to comment #26)
> Hans has created an mkinitrd patch to reduce/kill the IOError messages. It is
> located at https://bugzilla.redhat.com/attachment.cgi?id=331409.

Correction, that is a dmraid patch, not an mkinitrd patch. This patch adds a cmdline option to dmraid which, when used, makes dmraid tell the kernel to forget about (unlearn) the partitions on the raw disks. If we get the patch into the RHEL-5.4 dmraid, and we patch mkinitrd to pass the cmdline option, then the IO errors *should* go away.

This change will be present in python-pyblock-0.26-4.

Hi Hans and Heinz,

It looks like Bugzilla 475386 covers fixes in both python-pyblock and dmraid. One fix is in python-pyblock-0.26-4, and the patch attached to the Bugzilla adds a cmdline option to dmraid which, when used, makes dmraid tell the kernel to forget about (unlearn) the partitions on the raw disks. If we get the patch into the RHEL-5.4 dmraid, and we patch mkinitrd to pass the cmdline option, then the IO errors *should* go away. https://bugzilla.redhat.com/show_bug.cgi?id=475386 RAID10 Install ERROR appears during installation of RAID10 isw dmraid raid array.

Question: Heinz, can you include this patch https://bugzilla.redhat.com/attachment.cgi?id=331409 in the next release of dmraid so it can get into RHEL 5.4? Need me to include you on the CC line? Hans, can you produce an ISO image with this mkinitrd and dmraid fix? (If you need me to test this, the bug is in NEEDINFO state.) I do not have the capability to put this all together into a build and image in a timely manner. Thanks, *EDC*

~~ Attention Partners: RHEL 5.4 Partner Alpha Released! ~~

RHEL 5.4 Partner Alpha has been released on partners.redhat.com. There should be a fix present that addresses this particular request. Please test and report back your results here, at your earliest convenience. Our Public Beta release is just around the corner! If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have verified the request functions as expected, please set your Partner ID in the Partner field above to indicate successful test results. Do not flip the bug status to VERIFIED. Further questions can be directed to your Red Hat Partner Manager. Thanks!

Issue verified in RHEL 5.4 Alpha with PASS result.

Moving to VERIFIED based on comment #34.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1319.html
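For reference, a minimal sketch of how the dmraid cmdline option described in the correction above (the one that makes dmraid tell the kernel to unlearn the partitions on the raw disks) could be invoked from the initrd. The -Z/--rm_partitions spelling is an assumption based on the option that later appeared in dmraid; the comment above only says that "a cmdline option" was added.

```sh
# Sketch of the mkinitrd-side usage described above (assumed flag name):
# activate all discovered dmraid sets and ask the kernel to drop the
# partition nodes of the underlying raw disks, so stale /dev/sdXN entries
# no longer produce I/O errors during boot.
dmraid -ay -Z
```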