Bug 219536 - When creating two software RAID devices, the mdX numbers don't match to what the kernel sees.
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: anaconda
Version: 5.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Joel Andres Granados
Blocks: 240441
Reported: 2006-12-13 15:36 EST by Konrad Rzeszutek
Modified: 2008-01-25 13:46 EST (History)
Doc Type: Bug Fix
Last Closed: 2008-01-25 13:46:22 EST
Attachments
Screenshot of how I created the partitions manually. You can see that /dev/md0 is created, though the kernel thinks it is /dev/md2. (122.25 KB, image/jpeg)
2006-12-13 15:36 EST, Konrad Rzeszutek
Screenshot of anaconda failing to format /dev/md0 (86.51 KB, image/jpeg)
2006-12-13 15:46 EST, Konrad Rzeszutek
Anaconda log from the install (49.97 KB, text/plain)
2006-12-13 15:47 EST, Konrad Rzeszutek
Syslog from install. (26.44 KB, text/plain)
2006-12-13 15:47 EST, Konrad Rzeszutek
serial output (23.29 KB, text/plain)
2007-07-03 17:32 EDT, Konrad Rzeszutek
RHEL4 kickstart file (1.37 KB, text/plain)
2007-12-14 11:39 EST, Joel Andres Granados
RHEL5 kickstart file (1.19 KB, text/plain)
2007-12-14 11:41 EST, Joel Andres Granados
ks for rhel 4 installation (1.22 KB, text/plain)
2007-12-20 10:35 EST, Joel Andres Granados
The kickstart for the rhel5 installation. (1.19 KB, text/plain)
2007-12-20 10:35 EST, Joel Andres Granados

Description Konrad Rzeszutek 2006-12-13 15:36:14 EST
Description of problem:

When installing RHEL5 on a machine with two disks and creating a
/boot that is RAID0 and a / that is RAID0 (or RAID1), the /dev/mdX names anaconda
uses do not match the ones the kernel assigns.

Version-Release number of selected component (if applicable):


How reproducible:

You can manually create the partitions or use this excerpt from a kickstart file:

install
nfs --server=bigpapi.boston.redhat.com --dir=/vol/engineering/redhat/released/RHEL-5-Server/Beta-2/x86_64/os
key --skip
lang en_US.UTF-8
langsupport --default en_US.UTF-8 en_US.UTF-8
keyboard us
mouse genericwheelps/2 --device psaux
skipx
network --device eth0 --bootproto dhcp
rootpw rhts
firewall --disabled
authconfig --enableshadow --enablemd5
timezone --utc America/New_York
bootloader --location=partition --driveorder=sda,sdb --append="console=ttyS0,115200 console=tty0"
text
cmdline
reboot
repo --name VT --baseurl file:///mnt/source/VT
# The following is the partition information you requested
# Note that any partitions you deleted are not expressed
# here so unless you clear all partitions first, this is
# not guaranteed to work
zerombr yes
clearpart --all --drives=sda,sdb
part raid.13 --size=100 --ondisk=sda --asprimary
part raid.14 --size=100 --ondisk=sdb --asprimary
part swap --size=2048 --ondisk=sda
part swap --size=2048 --ondisk=sdb
part raid.25 --size=100 --grow --ondisk=sda
part raid.24 --size=100 --grow --ondisk=sdb
raid /boot --fstype ext3 --level=RAID1 --device=md0 raid.13 raid.14
raid pv.26 --fstype "physical volume (LVM)" --level=RAID1 --device=md1 raid.24 raid.25
volgroup VGRHEL5 --pesize=32768 pv.26
logvol / --fstype ext3 --name=LVRHEL5 --vgname=VGRHEL5 --size=32512




Steps to Reproduce:
1. Try to install RHEL5 on a machine with two disks.
2. Create two RAID devices.
3. See it fail when formatting the second RAID device (based on the kickstart
file, that would be /boot).
  
Actual results:

Can't format /boot. Reboot.
Expected results:


Format /boot.

Work-around:

If I replace 'md0' with 'md2' in the kickstart file, it works just fine.
But that still doesn't solve the problem, as anaconda will create this
kickstart file with 'md0' when doing this manually.
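
For illustration, the substitution described in the workaround can be sketched in Python. This is not part of anaconda: the function name rename_md_device is hypothetical, and the sketch assumes that only the --device= option on lines starting with "raid" needs to change.

```python
import re


def rename_md_device(kickstart_text, old, new):
    """Return kickstart_text with --device=<old> replaced by --device=<new> on raid lines."""
    # \b keeps "md0" from matching inside a longer name such as "md01".
    pattern = re.compile(r"(--device=)" + re.escape(old) + r"\b")
    out = []
    for line in kickstart_text.splitlines():
        # Only kickstart "raid" commands carry the md device name.
        if line.lstrip().startswith("raid "):
            line = pattern.sub(lambda m: m.group(1) + new, line)
        out.append(line)
    return "\n".join(out)


# The two raid lines from the kickstart excerpt above.
sample = (
    "raid /boot --fstype ext3 --level=RAID1 --device=md0 raid.13 raid.14\n"
    'raid pv.26 --fstype "physical volume (LVM)" --level=RAID1 --device=md1 raid.24 raid.25'
)
print(rename_md_device(sample, "md0", "md2"))
```

This only edits the kickstart text; it does not address why the kernel assembles the array under a different name.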

Additional info:
Comment 1 Konrad Rzeszutek 2006-12-13 15:36:14 EST
Created attachment 143550 [details]
Screenshot of how I created the partitions manually. You can see that /dev/md0 is created, though the kernel thinks it is /dev/md2.
Comment 2 Konrad Rzeszutek 2006-12-13 15:46:33 EST
Created attachment 143551 [details]
Screenshot of anaconda failing to format /dev/md0
Comment 3 Konrad Rzeszutek 2006-12-13 15:47:08 EST
Created attachment 143552 [details]
Anaconda log from the install
Comment 4 Konrad Rzeszutek 2006-12-13 15:47:37 EST
Created attachment 143553 [details]
Syslog from install.
Comment 5 Konrad Rzeszutek 2006-12-13 15:48:09 EST
sh-3.1# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid0 sdb3[1] sda3[0]
      66781696 blocks 256k chunks

md2 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

unused devices: <none>
sh-3.1#
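
For illustration, the mismatch between the names requested in the kickstart (md0, md1) and the names the kernel reports can be checked with a small Python sketch. The function name parse_mdstat is hypothetical, and the parser assumes only the /proc/mdstat layout shown in this comment (an "mdN : active <level> <members>" line per array); other layouts are not handled.

```python
import re

# /proc/mdstat contents copied from this comment.
MDSTAT = """\
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid0 sdb3[1] sda3[0]
      66781696 blocks 256k chunks

md2 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

unused devices: <none>
"""


def parse_mdstat(text):
    """Map each md device to (raid level, sorted member partitions)."""
    arrays = {}
    for line in text.splitlines():
        m = re.match(r"^(md\d+) : active (\S+) (.*)$", line)
        if m:
            name, level, members = m.groups()
            # Strip the "[N]" role suffix from each member, e.g. "sdb3[1]" -> "sdb3".
            parts = sorted(re.sub(r"\[\d+\]$", "", p) for p in members.split())
            arrays[name] = (level, parts)
    return arrays


requested = {"md0", "md1"}  # --device= values from the kickstart raid lines
kernel = parse_mdstat(MDSTAT)
print(kernel)
print("requested but not active:", sorted(requested - set(kernel)))
print("active but not requested:", sorted(set(kernel) - requested))
```

On a live system the same function could be fed the contents of /proc/mdstat instead of the pasted text.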
Comment 6 Konrad Rzeszutek 2006-12-13 15:48:32 EST
sh-3.1#  fdisk -l

Disk /dev/sda: 36.4 GB, 36401479680 bytes
255 heads, 63 sectors/track, 4425 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   fd  Linux raid autodetect
/dev/sda2              14         268     2048287+  82  Linux swap / Solaris
/dev/sda3             269        4425    33391102+  fd  Linux raid autodetect

Disk /dev/sdb: 36.4 GB, 36401479680 bytes
255 heads, 63 sectors/track, 4425 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104391   fd  Linux raid autodetect
/dev/sdb2              14         268     2048287+  82  Linux swap / Solaris
/dev/sdb3             269        4425    33391102+  fd  Linux raid autodetect

Disk /dev/md2: 106 MB, 106823680 bytes
2 heads, 4 sectors/track, 26080 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md2 doesn't contain a valid partition table

Disk /dev/md1: 68.3 GB, 68384456704 bytes
2 heads, 4 sectors/track, 16695424 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md1 doesn't contain a valid partition table
Comment 7 Jeremy Katz 2006-12-15 11:34:20 EST
More concerning is:
<6>md: bind<sdb1>
<6>md: running: <sdb1><sda1>
<6>raid1: raid set md2 active with 2 out of 2 mirrors
<6>md: ... autorun DONE.
<4>md: invalid superblock checksum on sda3
<4>md: sda3 has invalid sb, not importing!
<4>md: autostart failed!

It looks like for some reason, the RAID isn't getting set up properly; what was
previously on the disks?
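
For illustration, the partitions the md driver rejected can be pulled out of a kernel log like the one above with a short Python sketch. The function name bad_superblock_partitions is hypothetical, and the two patterns are assumptions taken only from the message wording quoted in this comment.

```python
import re

# Kernel messages copied from this comment.
LOG = """\
<6>md: bind<sdb1>
<6>md: running: <sdb1><sda1>
<6>raid1: raid set md2 active with 2 out of 2 mirrors
<6>md: ... autorun DONE.
<4>md: invalid superblock checksum on sda3
<4>md: sda3 has invalid sb, not importing!
<4>md: autostart failed!
"""


def bad_superblock_partitions(log_text):
    """Return partitions the md driver rejected because of an invalid superblock."""
    found = set()
    for line in log_text.splitlines():
        m = re.search(r"md: invalid superblock checksum on (\S+)", line)
        if m is None:
            m = re.search(r"md: (\S+) has invalid sb", line)
        if m:
            found.add(m.group(1))
    return sorted(found)


print(bad_superblock_partitions(LOG))
```

Partitions listed this way are only candidates for leftover RAID metadata from an earlier install; the log alone does not confirm what wrote it.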
Comment 8 Konrad Rzeszutek 2007-05-11 13:45:26 EDT
Jeremy, I just noticed today (by looking in the BZ) that you had a question for
me. Let me re-do this setup and see if this is still a problem.
Comment 9 Konrad Rzeszutek 2007-05-15 14:27:12 EDT
Jeremy,

I really can't figure out what was previously on the disks, as this happens with
a box in RHTS, so it might have been RHEL4 or RHEL5. No idea. The kickstart
file does clear the partitions, though.
Comment 10 Konrad Rzeszutek 2007-05-15 14:28:20 EDT
FYI: here is the ticket for the box.

https://engineering.redhat.com/rt3/Ticket/Display.html?id=11575
Comment 11 Konrad Rzeszutek 2007-05-17 14:26:37 EDT
And another:
https://engineering.redhat.com/rt3/Ticket/Display.html?id=11887
Comment 12 Jeff Burke 2007-05-21 15:08:38 EDT
This issue causes test failures in RHTS. The workaround in the original post
doesn't seem to work.

Work-around:

If I replace 'md0' with 'md2' in the kickstart file, it works just fine.
But that still doesn't solve the problem, as anaconda will create this
kickstart file with 'md0' when doing this manually.

Comment 13 RHEL Product and Program Management 2007-05-21 15:24:31 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 14 Chris Lumens 2007-06-01 14:10:44 EDT
Konrad, Jeff - does booting with
updates=http://people.redhat.com/clumens/219536.img fix things?
Comment 15 Konrad Rzeszutek 2007-07-03 17:32:10 EDT
Created attachment 158478 [details]
serial output

Didn't work. Here is the serial log.
Comment 16 Konrad Rzeszutek 2007-07-11 15:01:41 EDT
Changing it to the correct state.
Comment 17 Konrad Rzeszutek 2007-07-18 11:57:23 EDT
FYI: To reproduce this one needs to install RHEL4 beforehand. If I have RHEL5
installed (with the software RAID) and then re-install RHEL5 (with the software
RAID), it installs just fine. It only is a problem with an older RHEL4 software
RAID partition.
Comment 19 Joel Andres Granados 2007-12-06 15:59:25 EST
I could not reproduce:
1. I tried with Xen, first installing RHEL4 with the kickstart specified in
the first comment. No dice.
2. I tried with VMware, using the same process as in 1. It installed correctly.
3. I first kickstart-installed RHEL4 and then set up the partitioning by hand
in the RHEL5 installer. It installed correctly.

Might there be something else to watch for when reproducing?
FYI, I didn't use the workaround.
Comment 20 Alexander Todorov 2007-12-07 05:03:44 EST
(In reply to comment #17)
> FYI: To reproduce this one needs to install RHEL4 beforehand. If I have RHEL5
> installed (with the software RAID) and then re-install RHEL5 (with the software
> RAID), it installs just fine. It only is a problem with an older RHEL4 software
> RAID partition.

Please can you be more precise about the steps to reproduce. It's not clear from
the above statement. Do you install RHEL4 with the kickstart partitioning (e.g.
2 raid devices) and then RHEL5 also with 2 raid devices?
Comment 21 Konrad Rzeszutek 2007-12-07 11:28:16 EST
(In reply to comment #20)
> (In reply to comment #17)
> > FYI: To reproduce this one needs to install RHEL4 beforehand. If I have RHEL5
> > installed (with the software RAID) and then re-install RHEL5 (with the software
> > RAID), it installs just fine. It only is a problem with an older RHEL4 software
> > RAID partition.
> 
> Please can you be more precise about the steps to reproduce. It's not clear from
> the above statement. Do you install RHEL4 with the kickstart partitioning (e.g.
> 2 raid devices) and then RHEL5 also with 2 raid devices?

Yes.
Comment 22 Joel Andres Granados 2007-12-13 13:06:04 EST
I have already run these tests with Xen and VMware: I kickstart-installed
RHEL4 with two disks and then kickstart-installed RHEL5 on top of that (also
with two disks). I also installed RHEL5 interactively to check whether the
issue was unrelated to kickstart. Both tests were run with Xen and VMware,
and I used the kickstart that is attached to this bug. I will test again to
make sure. Konrad, can you reproduce in a virtual machine (VMware or Xen)?
Comment 23 Joel Andres Granados 2007-12-14 11:39:26 EST
Created attachment 289201 [details]
RHEL4 kickstart file

This is the kickstart that I use to install RHEL4 on a Xen guest. It installs
successfully.
Comment 24 Joel Andres Granados 2007-12-14 11:41:39 EST
Created attachment 289241 [details]
RHEL5 kickstart file

This is the kickstart that I use to install RHEL5 on top of the RHEL4 images.
The install ends successfully.
Comment 25 Joel Andres Granados 2007-12-14 11:45:05 EST
Regarding comments 23 and 24:
The process was basically to install using the kickstart file from comment 23
and then install using the kickstart file from comment 24.

1. Any comments about the test process?

I'm going to put the bug on needinfo but I will test once more with a physical
machine with two HDs.
Comment 26 Joel Andres Granados 2007-12-14 11:46:25 EST
Comment on attachment 289201 [details]
RHEL4 kickstart file

># Kickstart file automatically generated by anaconda.
>
>install
>url --url http://cobra02.anaconda.englab.brq.redhat.com/mirror/rhel/RHEL-4/U6/AS/x86_64/tree
>lang en_US.UTF-8
>langsupport --default=en_US.UTF-8 en_US.UTF-8
>keyboard us
>skipx
>network --device eth0 --bootproto dhcp
>rootpw --iscrypted otxStqXje25dM
>firewall --disabled
>authconfig --enablemd5 --enableshadow
>selinux --permissive
>timezone Europe/Prague
>bootloader --location=mbr --driveorder=xvda
>reboot
># The following is the partition information you requested
># Note that any partitions you deleted are not expressed
># here so unless you clear all partitions first, this is
># not guaranteed to work
>
>zerombr yes
>clearpart --all --drives=xvda,xvdb
>#part /boot --size=100 --ondisk=xvda --asprimary
>#part swap --size=1000 --ondisk=xvdb --asprimary
>#part / --fstype ext3  --size=100 --grow --asprimary
>part raid.13 --size=100 --ondisk=xvda --asprimary
>part raid.14 --size=100 --ondisk=xvdb --asprimary
>part swap --size=2048 --ondisk=xvda
>part swap --size=2048 --ondisk=xvdb
>part raid.25 --size=100 --grow --ondisk=xvda
>part raid.24 --size=100 --grow --ondisk=xvdb
>raid /boot --fstype ext3 --level=RAID1 --device=md2 raid.13 raid.14
>raid pv.26 --fstype "physical volume (LVM)" --level=RAID1 --device=md1 raid.24 raid.25
>volgroup VGRHEL5 --pesize=32768 pv.26
>logvol / --fstype ext3 --name=LVRHEL5 --vgname=VGRHEL5 --size=10000
>
>%packages
>@base
>@core
>
>
Comment 27 Joel Andres Granados 2007-12-20 10:35:14 EST
Created attachment 290160 [details]
ks for rhel 4 installation
Comment 28 Joel Andres Granados 2007-12-20 10:35:59 EST
Created attachment 290162 [details]
The kickstart for the rhel5 installation.
Comment 29 Joel Andres Granados 2007-12-20 10:52:22 EST
Just tested using two "real" hard disks on x86_64. The behavior shown in
comment 2 did not occur. There must be another detail that I'm missing to
reproduce this bug. The bug does not appear just by installing RHEL4 and then
RHEL5 with the kickstarts from comments 27 and 28 and the original description.
Comment 30 Joel Andres Granados 2008-01-25 13:46:22 EST
Closing for lack of information.