Red Hat Bugzilla – Bug 174306
'Unconventional' disk druid installation (RAID/non-RAID) scratches MBR on first disk
Last modified: 2008-07-24 13:56:48 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7
Description of problem:
I have a Primergy L100 with two hard disks, /dev/hde and /dev/hdg
(these are the only disks on that system and these are factory settings)
(I have a setup similar to the one below on a Scaleo, but that one was
not set up through the installation procedure but 'after the fact' once
the /boot RAID broke.)
I want to install these disks as follows:
                /dev/hde1  /dev/hde2   /dev/hde3   /dev/hde4
/dev/hde ---> | /boot    | md-RAID1  | md-RAID1  | md-RAID1  |
                /dev/hdg1  /dev/hdg2   /dev/hdg3   /dev/hdg4
/dev/hdg ---> | /boot2   | md-RAID1  | md-RAID1  | md-RAID1  |
forming:                 | /dev/md0  | /dev/md1  | /dev/md2  |
onto which I put:        | swap 1    | swap 2    | LVM       |
and onto that:                                   | root fs etc. |
That is, I do not want to have the /boot partition on an md
device (I have had a few problems with that approach) but the
rest I *do* want to have on a mirrored md device (incl. the swap space).
How do I set this up?
* When disk druid comes up, configure the four partitions above as
'software RAID' partitions on /dev/hde. Just 'force' the first one to
be a primary partition.
* Clone /dev/hde to /dev/hdg.
* Modify the filesystem type of /dev/hde1 and /dev/hdg1 from 'software
RAID' to 'ext3' and set the mount points.
* Bind the other partitions into RAID devices and set up swap and LVM on
them (a rough shell sketch of the equivalent manual setup follows below).
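(For reference, a rough shell sketch of what the equivalent manual setup
would look like outside the installer; mdadm and LVM are assumed to be
available, the volume group / logical volume names and sizes are
placeholders, and the initial partitioning of /dev/hde is omitted:)

  # mirror hde's partition layout onto hdg (what the 'clone' step does)
  sfdisk -d /dev/hde | sfdisk /dev/hdg

  # build the three RAID1 mirrors from the matching partition pairs
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hde2 /dev/hdg2
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/hde3 /dev/hdg3
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/hde4 /dev/hdg4

  # swap on the first two mirrors
  mkswap /dev/md0 && swapon /dev/md0
  mkswap /dev/md1 && swapon /dev/md1

  # LVM on the third mirror; the root fs lives in a logical volume on top
  # ('vg0', 'root' and the 8G size are placeholder values)
  pvcreate /dev/md2
  vgcreate vg0 /dev/md2
  lvcreate -L 8G -n root vg0
  mkfs.ext3 /dev/vg0/root

  # /boot and /boot2 stay plain ext3 on the raw partitions
  mkfs.ext3 /dev/hde1
  mkfs.ext3 /dev/hdg1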
After that, installation proceeds without any problem. However, on the
first reboot, the partition table on /dev/hde is gone. /dev/hdg still
has a valid partitioning, so I suppose something messed up the MBR on
/dev/hde. A 'dd' of the MBR on both disks shows that the /dev/hde MBR is
shifted by 16 NUL bytes. Attached is a dump (blocks obtained with
dd if=/dev/hde bs=512 count=1).
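(For anyone reproducing this, a minimal way to capture and compare the
two MBRs from a shell; the output file names are arbitrary:)

  # dump the first 512-byte sector (MBR) of each disk
  dd if=/dev/hde of=/tmp/mbr-hde.bin bs=512 count=1
  dd if=/dev/hdg of=/tmp/mbr-hdg.bin bs=512 count=1

  # byte-wise comparison; a hex dump makes the 16-byte shift visible
  cmp -l /tmp/mbr-hde.bin /tmp/mbr-hdg.bin
  od -A d -t x1 /tmp/mbr-hde.bin | head
  od -A d -t x1 /tmp/mbr-hdg.bin | head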
I have tried the installation twice, with the same results. Installing
without RAID devices on /dev/hde only works without any problem.
Version-Release number of selected component (if applicable):
Red Hat ES 4.0 Update 2 installation ISO images
Steps to Reproduce:
1. Install as described above
Actual Results: /dev/hde no longer has a valid partition table.
Expected Results: /dev/hde should have a valid partition table.
Created attachment 121519 [details]
Dumps of the MBRs of /dev/hde (bad) and /dev/hdg (good)
Created attachment 121520 [details]
Bad MBR on /dev/hde
Created attachment 121521 [details]
Good MBR on /dev/hdg
Uhhh... there might be something wrong with the disk. It's gone from the disk
druid menu now. Maybe one should add SMART diagnostics to Anaconda? Will
check.
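(Assuming smartmontools is at hand, a quick manual SMART check would look
like this:)

  # overall health self-assessment and full attribute/error-log dump
  smartctl -H /dev/hde
  smartctl -a /dev/hde

  # run a short self-test, then read the result a few minutes later
  smartctl -t short /dev/hde
  smartctl -l selftest /dev/hde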
This problem is getting uncanny: I did a few (7) tests in which the
result was as described above, but now I have had a run of installations
in which the problem has mysteriously gone away.
I have, however, checked that it is not a *fault* in the hardware (it could
be a consistent error in the Promise RAID controller, for example, which,
even if RAID mode is disabled, still controls the hard disks, which is why
they are named /dev/hde and /dev/hdg): I checked both disks and moved them
to another machine of the same type; the same problematic behaviour occurred
until a few hours ago.
I know there *is* something up with the hardware as I have encountered some
problems re-reading the MBR after writing it in fdisk.
To reiterate the case where the MBR on /dev/hde has been destroyed:
* Install as described above, using the RAID cloning feature to set
up software RAID partitions on "/dev/hde" and "/dev/hdg", modify the
partition type of the first RAID partition on each disk to 'ext3'
to get a place to put /boot onto, bind the remaining partitions into
RAID mirrors, format, etc. Later repeats show that it is unimportant
whether there is a swap partition or not; the problem occurs regardless.
* Proceed with installation. No problem until reboot.
* After reboot, nothing bootable can be found, i.e. /dev/hde no longer
has a valid MBR.
* Reboot machine using the RH ES 4 installation CD. Disk druid now
consistently says that "/dev/hde" is gone (looks like it isn't installed).
"/dev/hdg" is visible and correctly partitioned. This "disk invisibility"
problem may or may not be related to the MBR problem, but I suspect some
hardware problem.
* On the console, fdisk *can* find "/dev/hde". The kernel can also find it:
In /tmp/syslog, the last message about that drive is
"hde: unknown partition table".
* I tried to set up a new partitioning on "/dev/hde"
with "fdisk", but when writing the partition table, I get "error 5:
I/O error. Kernel still uses the old table, the new table will be used
at the next reboot".
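(One can also ask the kernel to re-read the table without a reboot; if the
controller is wedged, these commands fail with a similar I/O error, which
would point at the hardware rather than at fdisk:)

  # ask the kernel to re-read the partition table on /dev/hde
  blockdev --rereadpt /dev/hde
  # alternative, from the parted package
  partprobe /dev/hde
  # check what the kernel now believes the partitions are
  cat /proc/partitions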
* I zeroed the disk using 'autoclave' from the Ultimate Boot CD, but after
a reboot into the RH ES 4 installation CD, disk druid could still not see
the disk.
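(The same zeroing can be done with dd from a rescue shell, without a
separate boot CD:)

  # wipe just the MBR (boot code + partition table) of /dev/hde
  dd if=/dev/zero of=/dev/hde bs=512 count=1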
* I wrote some binary garbage into the MBR of that disk using PTS Disk
Editor to check whether the MBR could be properly written. Works.
(However, these DOS-based programs cannot be reliably started on the
present hardware; as said, there is something funny with it.)
After a reboot into the RH ES 4 installation CD, disk druid could still
not see the disk.
* However, an installation aborted before 'package installation' reveals
that the MBRs are correctly written on both disks.
Other things tried:
Installation as above, with md device cloning, but keeping "/boot" on a
software RAID mirror. After reboot, both disks are good but the machine can't
actually boot (i.e. you get the GRUB command line). No real surprise; see the
GRUB note below.
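(For completeness: with /boot on a RAID1 mirror, GRUB has to be installed
into the MBR of *both* disks by hand, since each half of the mirror looks
like a plain ext3 partition to it. The usual incantation from the GRUB
shell, device mapping assumed, would be:)

  grub> device (hd0) /dev/hde
  grub> root (hd0,0)
  grub> setup (hd0)
  grub> device (hd0) /dev/hdg
  grub> root (hd0,0)
  grub> setup (hd0)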
I will do more tests, but this begins to look like some unfathomable thing.
It's the hardware.
There is probably something wrong with the on-board Promise RAID controller.
(Promise Technology, Inc. PDC20265 (FastTrak100 Lite/Ultra100) (rev 02))
If I could solder it off, I would.
The problem boils down to the fact that a simple 'reboot' of the machine
won't work. After a reboot, the hard disks cannot be properly read, i.e.
/dev/hde seems to have no valid boot sector, and even though /dev/hdg at
least shows a valid partitioning, booting from it results in GRUB printing
odd characters and then stopping after printing out "stage2".
You actually have to "power cycle" the machine. After that, /dev/hde and
/dev/hdg are both visible, booting works (though not off an md device) and
the system comes up nicely. The disks have been set up as described on
2005-11-27, with /boot on /dev/hdg instead of /dev/hde to make sure
a /boot is available.
It looks like the only interesting problem is: why does disk druid declare
that the first hard disk does not exist, even though it does?
Well, I guess you may close this bug.
Repeat query for resolving bug (NOTABUG)
Per final comment, and at reporter's suggestion, we're finally closing this nonbug.