Bug 63304 - Installer crash in Disk Druid and swap problems on RAID
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: anaconda
Version: 9
Hardware: i686 Linux
Priority: medium, Severity: high
Assigned To: Jeremy Katz
QA Contact: Mike McLean
Blocks: 61590
Reported: 2002-04-12 02:39 EDT by Need Real Name
Modified: 2007-04-18 12:41 EDT (History)
Doc Type: Bug Fix
Last Closed: 2004-10-04 22:32:41 EDT

Attachments
Anaconda crash dump from floppy [didn't know I should mount it -- attaching /dev/fd0 freezes Netscape :) ] (40.94 KB, text/plain)
2002-04-12 03:02 EDT, Need Real Name
Same crash with second public beta. Message before crash: "invalid partition type ffff" (maybe a different word than "type") (42.39 KB, text/plain)
2002-04-14 03:03 EDT, Need Real Name
Archive of /tmp contents including syslog. (11.72 KB, application/octet-stream)
2002-05-08 02:15 EDT, Need Real Name
Crash Dump - while converting from RAID (36.68 KB, text/plain)
2002-05-21 00:07 EDT, Need Real Name

Description Need Real Name 2002-04-12 02:39:49 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (X11; U; Linux 2.4.18 i686)

Description of problem:
Installer crashed after formatting and initializing RAID sets and choosing
packages.

Installer forced reboot if using swap on RAID due to uninitialized swap space
(but the swap was initialized and the "format" button had been selected).

Installer crashed in Disk Druid after converting two RAID partitions into
regular swap partitions (RAID sets were not defined at the time).


These were three separate incidents but I think there may have been a causal
relationship between them.  Perhaps the first crash left some half-formatted
partitions and the later installs didn't bother to reformat them even though
"format" was selected.  I think the crashes were also all related to using RAID
in some way.  The second issue was reproducible.  I did not try to reproduce the
other two.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Partition two disks identically each with 7 partitions (plus one for the
extended partition)

2. Mark all partitions as RAID

3. For attempts ONE, TWO, and THREE continue here; for attempt FOUR go to step
10.

4. Make RAID sets in pairs (RAID0 for everything except /boot) for all
filesystems.  I used:
 /boot       (RAID1)
 /           (RAID0)
 swap        (RAID0)
 /usr        (RAID0)
 /var        (RAID0)
 /usr/local  (RAID0)
 /home       (RAID0)

5.  Continue with configuration/install

6.  Install fails.  End of steps.

10.  Suspect swap on RAID doesn't work. Convert RAID partitions for swap into
swap partitions.  Also select "skip X configuration" because you are becoming
more cautious.

11.  Attempt to create first RAID set, installer crashes and asks to insert a
floppy and make a bug report.

12.  File this report ;)
	

Actual Results:  attempt ONE:

Filesystems were initialized, I'm not sure if swap was because I wasn't
expecting problems.
I looked at the screen and the installer had crashed (no prompt for floppy or
anything) and it said I could shut down the system.

attempt TWO:

Dialog box stating that the swap space couldn't be enabled probably because the
space had not been initialized even though I had selected "format" in Disk
Druid.

attempt THREE:

Same as above (I was making sure I had not forgotten to select "format").

attempt FOUR:

Installer crash in Disk Druid.


Expected Results:  attempt ONE:

Install would work.

attempt TWO:

Install would format anything missed the first time around and would complete
successfully.

attempt THREE:

Same as TWO.

attempt FOUR:

Swap problem would be bypassed and install would complete successfully.


Additional info:

I'm using two identical 45 GB IDE disks, each the master on its own channel of a
Promise PCI IDE controller.  The drives were each partitioned identically.  There
is also an on-board IDE controller which has the CDROM on one channel and the
Zip drive on the other. 

The install was on a PIII 733 with 256MB of RAM.

The install was performed with a floppy (bootnet.img) and an NFS install from
another system on the local network with copies of the first four ISO images. 
The checksums of the ISOs are fine.  I checked them during attempt THREE.
Comment 1 Need Real Name 2002-04-12 03:02:07 EDT
Created attachment 53584 [details]
Anaconda crash dump from floppy [didn't know I should mount it -- attaching /dev/fd0 freezes Netscape :) ]
Comment 2 Need Real Name 2002-04-14 03:03:35 EDT
Created attachment 53768 [details]
Same crash with second public beta.  Message before crash: "invalid partition type ffff" (maybe a different word than "type")
Comment 3 Michael Fulbright 2002-04-15 18:15:33 EDT
I tried to reproduce your step #1 and did not have it happen on my promise-66
controller with 2 drives.

It certainly seems like something is not correct, but since we cannot get it to
happen it's unclear how to resolve this bug other than 'WORKSFORME'.

Which promise controller do you have?
Comment 4 Michael Fulbright 2002-05-06 14:28:03 EDT
Closing due to inactivity. Please reopen if there is new information regarding
the issue report.
Comment 5 Need Real Name 2002-05-06 14:49:11 EDT
Interesting.

I didn't get an email about the info request.

Here is the information off of the card.  I can look up the firmware revision if
you need that.


 Promise Ultra ATA/66
 SN ID4S61100701

 CK9994V-0
 00-07
Comment 6 Need Real Name 2002-05-08 02:13:10 EDT
The bug is still present in the official 7.3 release.

However, the installer does not crash.

It displays the following error:

"Error mounting device md5 as /:
 Invalid argument

This most likely means this partition
 has not been formatted.

 Press OK to reboot your system."

I selected the "format" option when creating all filesystems.

I will attach a tarfile of the /tmp directory with the modules removed and
loopback image unmounted.
I suspect the syslog and raidtab files would be useful.

I have also copied down the contents of the debugging console screens and some
/proc info.
Comment 7 Need Real Name 2002-05-08 02:15:36 EDT
Created attachment 56604 [details]
Archive of /tmp contents including syslog.
Comment 8 Need Real Name 2002-05-08 02:54:58 EDT
Recap of setup:
2 identical 45GB EIDE drives.
Each with 8 DOS-style partitions (including the extended "container" partition).
Drives partitioned identically.
RAID1 on /boot and swap.
RAID0 on /, /usr, /var, /usr/local, /home.
ext3 used for fs.

Events before error:
After package selection and timezone, GRUB, root pw configuration the
installer showed progress bars for formatting the following filesystems in
the following order:
 /
 /boot
 /home
 /usr
 /usr/local
 /var

The /usr, /usr/local, and /home filesystems took longer than the others as
expected.
There was no mention of swap initialization at this point.



Information collected after the error:

I hit ALT-F2 and recorded the following:
# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 hdg7[1] hde7[0]
         513984 blocks [2/2] [UU]
# cat /proc/swaps
Filename    Type      Size       Used   Priority
/tmp/md1   partition  513976   0         -1
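
For anyone comparing this output against the intended layout, here is a minimal Python sketch (not part of anaconda) that lists the arrays /proc/mdstat reports as active and flags the ones that are absent. The function names and the expected list md0 through md6 are assumptions for illustration; the expected names are taken from the ALT-F3 messages in this comment.

```python
import re

# Sample text copied from the /proc/mdstat output quoted in this comment.
MDSTAT = """\
Personalities : [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 hdg7[1] hde7[0]
         513984 blocks [2/2] [UU]
"""

def active_arrays(mdstat_text):
    """Return {array: (level, [members])} for every active md array."""
    arrays = {}
    for line in mdstat_text.splitlines():
        m = re.match(r"^(md\d+) : active (\S+) (.+)$", line)
        if m:
            name, level, members = m.groups()
            # Strip the "[n]" role index from each member, e.g. "hdg7[1]" -> "hdg7".
            devices = [re.sub(r"\[\d+\]$", "", d) for d in members.split()]
            arrays[name] = (level, devices)
    return arrays

def missing_arrays(mdstat_text, expected):
    """Names from `expected` that are not listed as active in mdstat_text."""
    active = active_arrays(mdstat_text)
    return [name for name in expected if name not in active]

# Hypothetical expectation: seven arrays, md0 through md6.
EXPECTED = ["md%d" % i for i in range(7)]

print(active_arrays(MDSTAT))
print(missing_arrays(MDSTAT, EXPECTED))
```

With the sample above, only md1 (the swap mirror) is reported as active and the other six arrays are flagged as missing, which matches what the installer ran into when it tried to mount md5.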


ALT-F3 showed the following:

* (many module messages)
 ...
* Anaconda floppy device fd0
* Detected 256M of memory
* Swap attempt of 256M to 512M
* (many depcheck messages)
  ...
* missing components of raid device md0.  The raid device needs 2 drive(s) and only 1 (was/were) found. This raid device will not be started.
* missing components of raid device md1.  The raid device needs 2 drive(s) and only 1 (was/were) found. This raid device will not be started.
* missing components of raid device md6.  The raid device needs 2 drive(s) and only 1 (was/were) found. This raid device will not be started.
* missing components of raid device md5.  The raid device needs 2 drive(s) and only 1 (was/were) found. This raid device will not be started.
* missing components of raid device md3.  The raid device needs 2 drive(s) and only 1 (was/were) found. This raid device will not be started.
* missing components of raid device md4.  The raid device needs 2 drive(s) and only 1 (was/were) found. This raid device will not be started.
* missing components of raid device md2.  The raid device needs 2 drive(s) and only 1 (was/were) found. This raid device will not be started.
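
The "missing components" messages can be reduced to a short table of device, drives needed, and drives found. The Python sketch below is only an illustration: the pattern is written against the wording quoted in this comment, it allows any whitespace (including a line break) between words, and the names PATTERN and missing_components are hypothetical.

```python
import re

# Two sample messages in the wording quoted above; the second one has a
# line break in the middle to show that the pattern tolerates wrapped text.
LOG = """\
* missing components of raid device md0.  The raid device needs 2 drive(s) and only 1 (was/were) found. This raid device will not be started.
* missing components of raid device md5.  The raid device needs 2 drive(s) and
only 1 (was/were) found. This raid device will not be started.
"""

PATTERN = re.compile(
    r"missing components of raid device (md\d+)\.\s+"
    r"The raid device needs (\d+) drive\(s\) and\s+only (\d+)"
)

def missing_components(log_text):
    """Return (device, drives_needed, drives_found) for each message."""
    return [(dev, int(need), int(found))
            for dev, need, found in PATTERN.findall(log_text)]

for dev, need, found in missing_components(LOG):
    print("%s: needs %d, found %d" % (dev, need, found))
```

A list like this can then be compared against the arrays that /proc/mdstat reports as active.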


ALT-F4 showed the following:
<6>md: hdg8 [events: 00000002]<6>(write) hdg8's sb offset 20820096
<6>md: hde8 [events: 00000002]<6>(write) hde8's sb offset 20820096
<6>md: md6 stopped.
<6>md: unbind<hdg8,1>
<6>md: export_rdev(hdg8)
<6>md: unbind<hde8,0>
<6>md: export_rdev(hde8)
<6>md: marking sb clean...
<6>md: updating md2 RAID superblock on device
<6>md: hdg3 [events: 00000002]<6>(write) hdg3's sb offset 6144768
<6>md: hde3 [events: 00000002]<6>(write) hde3's sb offset 6144768
<6>md: md2 stopped.
<6>md: unbind<hdg3,1>
<6>md: export_rdev(hdg3)
<6>md: unbind<hde3,0>
<6>md: export_rdev(hde3)
<6>md: marking sb clean...
<6>md: updating md3 RAID superblock on device
<6>md: hdg2 [events: 00000002]<6>(write) hdg2's sb offset 11261440
<6>md: hde2 [events: 00000002]<6>(write) hde2's sb offset 11261440
<6>md: md3 stopped.
<6>md: unbind<hdg2,1>
<6>md: export_rdev(hdg2)
<6>md: unbind<hde2,0>
<6>md: export_rdev(hde2)
<6>md: marking sb clean...
<6>md: updating md4 RAID superblock on device
<6>md: hdg5 [events: 00000002]<6>(write) hdg5's sb offset 4096448
<6>md: hde5 [events: 00000002]<6>(write) hde5's sb offset 4096448
<6>md: md4 stopped.
<6>md: unbind<hdg5,1>
<6>md: export_rdev(hdg5)
<6>md: unbind<hde5,0>
<6>md: export_rdev(hde5)
<3>md: EXT3-fs: unable to read superblock
<6>md: md1: sync done.


ALT-F5 showed the following:
disk 1: /tmp/hdg5, 4096543kB, raid super block at 4096448kB
mke2fs 1.27 (8-Mar-2002)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
1024128 inodes, 2048224 blocks
102411 blocks (5.00%) reserved for the super user
First data block=0
63 block groups
32768 blocks per group, 32768 fragments per group
16256 inodes per group
Superblock backups stored
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 24 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override
tune2fs 1.27 (8-Mar-2002)
Setting maximal mount count to -1
Setting interval between check 0 seconds
e2label: Invalid argument while trying to open /tmp/md5
Couldn't find valid filesystem superblock. 
e2label: Invalid argument while trying to open /tmp/md0
Couldn't find valid filesystem superblock. 
e2label: Invalid argument while trying to open /tmp/md6
Couldn't find valid filesystem superblock. 
e2label: Invalid argument while trying to open /tmp/md2
Couldn't find valid filesystem superblock. 
e2label: Invalid argument while trying to open /tmp/md3
Couldn't find valid filesystem superblock. 
e2label: Invalid argument while trying to open /tmp/md4
Couldn't find valid filesystem superblock. 
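
As a consistency check on the numbers above, the mke2fs figures can be recomputed from the member size that mkraid printed. This is a sketch under stated assumptions: that the mke2fs run shown here was for the array built from hdg5 and hde5 (md4 in the ALT-F4 log), that this array is a two-disk RAID0 so its capacity is the sum of its members, and that the usable size of each member is the superblock offset of 4096448 kB. The 5% reserve is computed with integer arithmetic.

```python
import math

# Figures copied from the ALT-F4 and ALT-F5 transcripts in this comment.
MEMBER_KB = 4096448        # raid superblock offset of hdg5 / hde5, in kB
MEMBERS = 2                # two-disk set
BLOCK_SIZE_KB = 4          # mke2fs "Block size=4096"
BLOCKS_PER_GROUP = 32768
INODES_PER_GROUP = 16256

# Assumption: RAID0, so the array capacity is the sum of the members.
array_kb = MEMBER_KB * MEMBERS
blocks = array_kb // BLOCK_SIZE_KB
groups = math.ceil(blocks / BLOCKS_PER_GROUP)
inodes = groups * INODES_PER_GROUP
reserved = blocks * 5 // 100   # 5.00% reserved for the super user

print(blocks, groups, inodes, reserved)
```

These values agree with the "2048224 blocks", "63 block groups", "1024128 inodes" and "102411 blocks (5.00%) reserved" lines, which is consistent with mke2fs having run against a device of the expected size. It does not by itself explain why the later e2label calls could not find a superblock.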


I also saved the contents of /tmp and attached them to this report.

Let me know if you need any more information.
Comment 9 Need Real Name 2002-05-21 00:04:52 EDT
I have given up on RAID installs with 7.3 (beta and final versions).

However, the install still fails in similar ways.  I guess this bug does not
depend on using software RAID.
I have tried both NFS and CDROM installs of the real release of 7.3.  They fail
in the same manner.

The new configuration will use the disks as follows:

hde (45GB):
      100MB  /boot
    2048MB  /
    2048MB  swap
    8192MB  /var
          rest   /usr
hdg (45GB):
            all   /home

The first attempt crashed.  I was in the middle of changing the partition tables
with the DiskDruid tool.  I had just converted the last partition of hde and was
preparing to work on hdg when the exception window appeared.  I will attach the
crashdump.

Before I rebooted I decided that maybe something bad in the partition tables was
causing my problem so I ran:
 cat /dev/zero > hde
 cat /dev/zero > hdg
for a long time.  fdisk no longer reported a partition table.

So I went back and used DiskDruid to partition the disks and this time it didn't
crash!

However, just like with the RAID configuration, it died because of an "Invalid
argument" when enabling the swap space.  It said this was probably because the
swap space wasn't initialized.  So I went to F2 and manually initialized hde6
with mkswap and tried to do a swapon hde6 and I got the same error.  I tried
using version 0 of the format but that also failed.  I tried to use more to
check the partition and it seemed to contain the magic string (can't remember
what it was but it was all uppercase).  Of course that was a binary file and it
was difficult to tell.
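
Checking for the swap signature is easier with a few lines of script than with more. A Linux swap area keeps a 10-byte magic string in the last 10 bytes of its first page: "SWAP-SPACE" for the old version 0 format and "SWAPSPACE2" for version 1. The sketch below assumes a 4096-byte page (as on i386); the function names are hypothetical and the demonstration runs on a fabricated in-memory page rather than a real device.

```python
PAGE_SIZE = 4096            # assumption: i386 page size
MAGIC_V0 = b"SWAP-SPACE"    # version 0 (old-style) swap area
MAGIC_V1 = b"SWAPSPACE2"    # version 1 (new-style) swap area

def swap_signature(first_page):
    """Return 'v0', 'v1', or None based on the last 10 bytes of the first page."""
    if len(first_page) < PAGE_SIZE:
        raise ValueError("need at least one full page")
    magic = first_page[PAGE_SIZE - 10:PAGE_SIZE]
    if magic == MAGIC_V1:
        return "v1"
    if magic == MAGIC_V0:
        return "v0"
    return None

def read_first_page(path):
    """Read the first page of a device or image file chosen by the caller."""
    with open(path, "rb") as f:
        return f.read(PAGE_SIZE)

# Demonstration on a fabricated page: zero padding followed by the v1 magic.
fake_page = bytes(PAGE_SIZE - 10) + MAGIC_V1
print(swap_signature(fake_page))
```

On the affected machine this would be called as swap_signature(read_first_page(path)) with the path of the swap partition's device node. A result of None right after mkswap would suggest the write never reached the disk, while "v0" or "v1" would point the investigation at swapon instead.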

I'm just about out of ideas and I'm nearly ready to give up on using Red Hat on
this box.

Will you take a look at this bug or not?
Comment 10 Need Real Name 2002-05-21 00:07:04 EDT
Created attachment 58065 [details]
Crash Dump - while converting from RAID
Comment 11 Need Real Name 2002-07-25 17:15:42 EDT
Any news?

Should I try the 7.4 or 8 beta?

Should I just install Debian?
Comment 12 Jeremy Katz 2004-10-04 22:32:41 EDT
This should be better with newer releases.
