Bug 19192 - mke2fs hangs in format of / during install
Summary: mke2fs hangs in format of / during install
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: installer
Version: 7.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Michael Fulbright
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2000-10-16 18:08 UTC by Tony Kocurko
Modified: 2007-04-18 16:29 UTC (History)
0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2000-10-20 17:20:35 UTC
Embargoed:



Description Tony Kocurko 2000-10-16 18:08:21 UTC
This is a repeatable problem.

The installation of RedHat 7.0 hangs with no error message at the point
that the formatting of the file systems begins with the root file system.

Once the install program begins the actual install by formatting the root
file system, I quickly switch to virtual console 5 and see this:

  mke2fs 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
  Filesystem label=/
  OS type: Linux
  Block size=4096 (log=2)
  Fragment size=4096 (log=2)
  510976 inodes, 1020119 blocks
  51005 blocks (5.00%) reserved for the super user
  First data block=0
  32 block groups
  32768 blocks per group, 32768 fragments per group
  15968 inodes per group
  Superblock backups stored on blocks:
          32768, 98304, 163840, 229376, 294912, 819200, 884736

  Writing inode tables: 20/32

I watch as the count in the "Writing inode tables" line, just above,
progresses towards its limit. Then it simply freezes. Virtual console 2,
the shell console, has no reaction to keypresses, while virtual consoles
3, 4 and 5 will each take a RETURN keypress and scroll one line.
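
The figures in the mke2fs output above are internally consistent, which suggests the tool computed a sane layout before the hang. A minimal Python sketch of that cross-check follows. It assumes the sparse-superblock layout (backups in group 1 and in groups that are powers of 3, 5 or 7) and that the reserved-block count is rounded down; the constant and function names are illustrative only.

```python
# Cross-check the layout figures printed by mke2fs in the report.
BLOCKS = 1020119            # "1020119 blocks"
BLOCKS_PER_GROUP = 32768    # "32768 blocks per group"
INODES_PER_GROUP = 15968    # "15968 inodes per group"
RESERVED_PERCENT = 5        # "(5.00%) reserved for the super user"


def is_power_of(n, base):
    """Return True if n is base**k for some k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def backup_groups(group_count):
    """Groups holding a superblock backup (assumed sparse layout)."""
    return [g for g in range(1, group_count)
            if any(is_power_of(g, b) for b in (3, 5, 7))]


# Ceiling division: a partial last group still counts as a group.
group_count = -(-BLOCKS // BLOCKS_PER_GROUP)
inode_count = group_count * INODES_PER_GROUP
reserved_blocks = BLOCKS * RESERVED_PERCENT // 100
backup_blocks = [g * BLOCKS_PER_GROUP for g in backup_groups(group_count)]

print(group_count, inode_count, reserved_blocks)
print(backup_blocks)
```

The computed values can be compared line by line with the console output: block groups, total inodes, reserved blocks, and the list of backup superblock locations.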

  Now, the dirty details: The hard disk is a 4 GB Western Digital Caviar
sitting as master on the primary IDE connector of an ASUS CUSL2 motherboard,
an HP CD-Writer Plus 8100 is the master on the secondary IDE port
of the motherboard, we've got 512 MB of RAM, a 733 MHz PIII (Coppermine),
a 3Com NIC (unattached to the net as of now), a pair of Promise Technology
Ultra100 ATA/100 PCI cards above the NIC, and six Maxtor 80 GB drives,
besides an old Chinon floppy drive.

  I've set the partition table so that it has a 16 MB boot partition, then
a 128 MB swap partition, and the rest of the disk is left as /. I've tried
shrinking the size of the root partition in the belief that things happen
at the vertices and edges of problem spaces. Uh-uh. No luck.

  Is there a way to run mke2fs from virtual console 2? All that I can see
under /tmp is /tmp/hda, as far as the hard drive goes.

  Any ideas would be greatly appreciated.

Cheers,
Tony Kocurko - Memorial U. of Newfoundland

Comment 1 Alan Cox 2000-10-17 15:57:55 UTC
Does the kernel log screen (4 I think) show drive errors at this point?


Comment 2 Tony Kocurko 2000-10-17 16:35:33 UTC
Dear Alan:

  Yes, indeed there is something wrong on Virtual Console 4! All 24 lines have
this same message:

<3>kmem_alloc: Bad slab magic (corrupt) (name=buffer_head)

Interestingly, Virtual Console 2 now lets me use the shell. Hmmmm, things are
becoming curiouser and curiouser.

So far, I've tried another hard drive, another IDE cable, and another operating
system (4.1 FreeBSD), and they all fail to fix the problem. I'm beginning to
suspect the motherboard, although I can't for the life of me see how that could
be the case, since mke2fs is failing while in the middle of checking for bad
blocks.

  As I was writing this, the system had an Oops, according to syslog:
<4>Oops: 0000
<4>CPU:    0
<4>EIP:    0010:[<c011f41d>]
<4>EFLAGS: 00010006
<4>eax: c1f7afc0 ebx: c1f7afc0 ecx: 000c7fec edx: c1f7acb4
<4>esi: 00000400 edi: dfed1740 ebp: 00000282 esp: dee7bc9c
<4>ds: 0018 es: 0018 ss: 0018
<4>Process badblocks (pid: 30, process nr: 13, stackpage=dee7b000)
<4>Stack: 00000000 00000400 c01257a5 dfed1740 00000005 c1f7acb4 00000000 \
                   c012582e
<4>       00000000 00000400 00000400 c1daf000 00000301 dee7bcdc dee7bcdc \
                   dee7a000
<4>       dee7a000 00000000 c0126399 c1daf000 00000400 00000000 00000000 \
                   00000400
<4>Call Trace: [<c01257a5>] [<c012582e>] [<c0126399>] [<c01251d5>] \
                            [<c012537e>] [<c0128723>] [<c01237f0>]
<4>            [<c0123a4a>] [<c0107aac>]
<4>Code: 8b 01 89 03 85 c0 74 2d 8b 73 04 85 f6 75 14 89 19 89 c8 2b
<1>Unable to handle kernel paging request at virtual address 000c7fec
<1>current->tss.cr3 = 1fe5f000, %cr3 = 1fe5f000
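
One detail worth pulling out of the dump above: the faulting virtual address matches one of the registers, and the first code bytes (`8b 01`) decode to a load through `%ecx`. The following Python sketch picks out that register from an excerpt of the log. The `KERNEL_BASE` value assumes the usual 3 GB/1 GB address split on i386, and the helper names are made up for illustration.

```python
import re

# Excerpt of the oops text from the report (log-level prefixes dropped).
OOPS = """\
eax: c1f7afc0 ebx: c1f7afc0 ecx: 000c7fec edx: c1f7acb4
esi: 00000400 edi: dfed1740 ebp: 00000282 esp: dee7bc9c
Unable to handle kernel paging request at virtual address 000c7fec
"""

KERNEL_BASE = 0xC0000000  # assumption: kernel mapped at 3 GB on i386


def registers(text):
    """Map register name -> value for every 'exx: hhhhhhhh' pair."""
    pairs = re.findall(r"\b(e[a-z]{2}): ([0-9a-f]{8})", text)
    return {name: int(value, 16) for name, value in pairs}


def fault_address(text):
    """Extract the address from the 'paging request' line."""
    match = re.search(r"virtual address ([0-9a-f]{8})", text)
    return int(match.group(1), 16)


regs = registers(OOPS)
addr = fault_address(OOPS)
suspects = [name for name, value in regs.items() if value == addr]

print(suspects, hex(addr), addr < KERNEL_BASE)
```

A pointer well below the kernel's address range, dereferenced inside kernel code, is consistent with a corrupted in-memory structure rather than a disk fault, though it does not prove the cause by itself.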

  My next step is to install a hard drive from a colleague who has installed
a generic, minimal linux system just for such testing. If that doesn't boot,
then I need to look elsewhere than the installer.

  I'll continue to worry this problem and let you know what happens.

Cheers,
Tony Kocurko - Memorial U. of Newfoundland


Comment 3 Tony Kocurko 2000-10-17 17:58:34 UTC
With a hard drive, pre-installed with Linux, as /dev/hda, I was able to boot
the system, logon as root and get a bash prompt.

The previous drive, with which I've been having all kinds of headaches, is now
/dev/hdb, as slave to the Linux disk.

fdisk /dev/hdb ran flawlessly, just as it has been during the previous
installation attempts. Now, I've got /dev/hdb1 as the bootable root partition and
/dev/hdb2 as the Linux swap partition.

mke2fs /dev/hdb1 fails miserably! It crashes horribly as it did during the
RedHat 7.0 install described in the earlier posts. Not only that, but I cannot
even boot the system now, and the boys in Computer Science are going to beat
me to death for torching their hard drive! If I die and it looks like an
accident or suicide, tell the police to check out the sysadmins in the
Department of Computer Science.

I think that we're pretty sure now that the problem is the motherboard. I'm
still confused though, since fdisk supposedly is writing to the disk when it
stores the partition table. Why, then, should mke2fs fail when writing inode
tables to the same disk?

Just so I don't sound completely stooopid, the IDE controllers on the ASUS
CUSL2 are ATA/100. But I am correct, aren't I, in assuming that everything
reverts to the lowest common denominator, as far as speed and protocol go?

Cheers(?),
Tony Kocurko - Memorial U. of Newfoundland



Comment 4 Tony Kocurko 2000-10-17 19:21:19 UTC
Downloaded and booted from tomsrtbt.
fdisk /dev/hda found this partition table:

Disk /dev/hda: 255 heads, 63 sectors, 524 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start   End   Blocks   Id  System
/dev/hda1   *         1   458  3678853+  83  Linux native
/dev/hda2           459   524   530145    5  Extended
/dev/hda5           459   524   530113+  82  Linux swap
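
The block counts in the table above follow from the reported geometry, so the partition table itself looks sane. Here is a minimal Python sketch of the arithmetic. It assumes the conventional layout in which the first primary partition and the logical partition each start one track (63 sectors) into their cylinder range, and that fdisk's trailing `+` marks an odd sector count; the function name is hypothetical.

```python
# Reproduce the "Blocks" column from the fdisk geometry in the report.
HEADS, SECTORS = 255, 63
SECTORS_PER_CYLINDER = HEADS * SECTORS   # 16065, as in "Units = ..."


def blocks(start_cyl, end_cyl, skipped_sectors=0):
    """Size in 1 KB blocks; '+' marks a leftover 512-byte sector."""
    cylinders = end_cyl - start_cyl + 1
    sectors = cylinders * SECTORS_PER_CYLINDER - skipped_sectors
    whole, odd = divmod(sectors, 2)      # two 512-byte sectors per block
    return f"{whole}{'+' if odd else ''}"


hda1 = blocks(1, 458, skipped_sectors=SECTORS)    # starts after first track
hda2 = blocks(459, 524)                           # extended container
hda5 = blocks(459, 524, skipped_sectors=SECTORS)  # logical, one track in

print(hda1, hda2, hda5)
```

All three results can be compared with the Blocks column printed by fdisk.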

mke2fs /dev/hda1 failed while writing inode tables, as before, but there was
this additional message: "Kernel panic: VFS: Free block list corrupted"

Of course, I'll get rid of that /dev/hda2 bit, eh?

Back at ya' when I learn more.


Comment 5 Tony Kocurko 2000-10-18 12:17:57 UTC
According to http://www.keylabs.com/linux/results/asus_cusl2.html,
the Asus CUSL2 motherboard with the Intel 815e chipset is compatible with
RedHat Linux 6.2, among other distributions.


Comment 6 Tony Kocurko 2000-10-20 17:18:56 UTC
Very possibly caused by bad RAM.
memtest86 (http://reality.sgy.com/cbrady_denver/memtest86/) is reporting
thousands of errors (we're at 6500 errors, so far, and only 40% through the
third of six memory tests).

This might explain why the install fails repeatedly but at wildly different
points in the process and why I can commit partition table changes (small amount
of memory) but cannot do either badblocks or mke2fs.

Do you want to change the status to RESOLVED?

Cheers,
Tony Kocurko - Memorial University of Newfoundland




Comment 7 Tony Kocurko 2000-10-20 17:20:33 UTC
Of course, the "sgy" in the previous entry's URL reference SHOULD have
been "sgi". duh.


Comment 8 Tony Kocurko 2000-10-23 16:47:04 UTC
Hmmmmmm, how should I put this? Oh, I know - like this: ARRRGGHHHHH!!!!!

The so-called 133 MHz SDRAM that we got was a bit off-brand and had to be
throttled back to 100 MHz. Problem solved. Vendor called to replace the
off-brand memory with either Nat'l Semiconductor or Kingston Tech SDRAM.

Also, note that the Intel 815e chipset bogusly reports memory errors when
one uses the memtest86 program mentioned in an earlier post. Note to self:
In the future, make checking of memory the first thing done on arrival of
new systems.

Transmission over....

