Bug 63848

Summary: Grub hangs at boot with no output
Product: [Retired] Red Hat Public Beta
Reporter: Jason Tibbitts <j>
Component: grub
Assignee: Jeremy Katz <katzj>
Status: CLOSED RAWHIDE
Severity: high
Priority: medium
Version: skipjack-beta2
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2003-01-20 17:07:52 UTC

Description Jason Tibbitts 2002-04-19 15:11:15 UTC
Description of Problem:
After installing Skipjack-beta2 on a machine with a Tyan Tiger S2466N dual
Athlon motherboard (BIOS 1.01) with only one processor and all disk in a single
1.1TB array behind a 3ware 7850 controller, the machine will not boot.

After the BIOS init, the screen clears and presents a single blinking cursor. 
There is no output.  This is very similar to a previous problem booting Red Hat
7.2 with grub on a ServerWorks dual P3 board that was solved by an upgrade to
grub 0.91.

If I make a grub floppy and do "root (hd0,0)" and "configfile /grub/grub.conf"
the menu loads and the machine boots fine.

Version-Release number of selected component (if applicable):
Red Hat Skipjack-beta 3, grub 0.91-3

I pulled 0.91-4 from Rawhide but it looks like the only difference is the splash
screen.

How Reproducible:
On every reboot.

Comment 1 Jeremy Katz 2002-04-19 15:23:54 UTC
Just the obvious question -- is there a newer BIOS available?

Also, does it work if you comment out the splashscreen directive?

Comment 2 Jason Tibbitts 2002-04-19 15:34:22 UTC
This is the newest BIOS available for the board.  There is a beta BIOS, but
given the dismal quality of their release software, I'm not even considering the
beta.

I should add that Grub works fine on this motherboard with disk hooked up to the
IDE channel, so it's not a problem inherent to the motherboard or BIOS.

Commenting out the splashscreen directive doesn't help; I'm pretty certain that
things are failing long before it gets to that point.  I can't rule out it dying
in the BIOS before it gets to run anything, but it does clear the screen which
is something that the BIOS generally doesn't do itself.

Just FYI, boot is well below any barrier, but this disk is utterly huge and this
could confuse any number of things:

Using /dev/sda
Information: The operating system thinks the geometry on /dev/sda is
139508/255/63.  Therefore, cylinder 1024 ends at 8032.499M.
(parted) print
Disk geometry for /dev/sda: 0.000-1094334.500 megabytes
Disk label type: msdos
Minor    Start       End     Type      Filesystem  Flags
1          0.031    125.507  primary   ext3        boot
2        125.508   3122.006  primary   ext3
3       3122.007   5169.353  primary   linux-swap
4       5169.353 -1002818.006  extended              lba
5       5169.384   9264.045  logical   ext3
6       9264.076  17453.430  logical   ext3
7      17453.461  17971.149  logical   ext3
8      17971.181  18488.869  logical   ext3
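
The numbers in the parted output above can be checked by hand. The following is a minimal sketch, assuming 512-byte sectors and that parted's "megabytes" are units of 2^20 bytes; the helper name to_int32 is made up for illustration. It reproduces the cylinder-1024 boundary from the reported geometry, shows that the disk holds more than 2^31 sectors, and shows that the negative end value printed for partition 4 is what you get if the sector number at the end of the last cylinder is squeezed through a signed 32-bit integer. That the extended partition really ends on that cylinder boundary is an assumption; the matching figure only suggests it.

```python
SECTOR_BYTES = 512          # assumption: 512-byte sectors
MIB = 1024 * 1024           # assumption: parted "megabytes" are 2**20 bytes


def to_int32(n):
    """Reinterpret the low 32 bits of n as a signed 32-bit integer."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n >= (1 << 31) else n


# Geometry reported for /dev/sda: cylinders/heads/sectors-per-track.
cylinders, heads, sectors_per_track = 139508, 255, 63
sectors_per_cylinder = heads * sectors_per_track

# Where the first 1024 cylinders end (parted reports about 8032.499M).
cyl1024_mib = 1024 * sectors_per_cylinder * SECTOR_BYTES / MIB

# Total sectors implied by the geometry, compared against 2**31.
total_sectors = cylinders * sectors_per_cylinder
exceeds_int32 = total_sectors > 2**31 - 1

# The same sector count seen through a signed 32-bit integer, in MiB.
wrapped_sectors = to_int32(total_sectors)
wrapped_mib = wrapped_sectors * SECTOR_BYTES / MIB

print(f"cylinder 1024 ends at {cyl1024_mib:.3f} MiB")
print(f"total sectors: {total_sectors} (exceeds int32: {exceeds_int32})")
print(f"end of disk as int32 sectors: {wrapped_mib:.3f} MiB")
```

The last line prints -1002818.006, the same value parted shows as the end of partition 4, so at least one tool in the chain appears to be handling sector numbers as signed 32-bit quantities on this disk.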



Comment 3 Jason Tibbitts 2002-04-19 17:13:24 UTC
Some more information.  I booted to a grub floppy and did:

root (hd0,0)
setup (hd0)

and I get (typed in manually):

 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"... 22 sectors are embedded.
succeeded
 Running "install /grub/stage1 d (hd0) (hd0)1+22 p (hd0,0)/grub/stage2
/grub/grub.conf"... failed

Error 24: Attempt to access block outside partition

Next I'll try it from the grub shell while Linux is booted; perhaps I'll get a
useful kernel message.

Comment 4 Jeremy Katz 2002-04-22 04:11:27 UTC
What are the contents of your /boot/grub/device.map file and what do you have
your BIOS boot order set up as?

Comment 5 Jason Tibbitts 2002-04-22 14:36:29 UTC
/boot/grub/device.map has:

# this device map was generated by anaconda
(fd0)     /dev/fd0
(hd0)     /dev/sda

BIOS boot order is:

Removable Devices
  Legacy Floppy Drives
Hard Drive
  3ware Storage Controller
  Bootable Add-in Cards
CD-ROM
MBA UNDI(Bus2 Slot0)

The last item is the network card.  The machine has no bootable cards besides
the 3ware card and no CD-ROM.  There are no IDE devices (other than the ones
behind the 3ware card, of course).

Comment 6 Jason Tibbitts 2002-11-20 17:57:12 UTC
I just wanted to add that this problem persists with Red Hat 8 (grub-0.92-7).  I
have several machines that cannot boot without making little grub floppies; all
have one or more 3ware cards with as much as 1.4TB on a single card (all of
which appears as one utterly huge /dev/sda).  None have any other attached disk
(other than floppies and perhaps CDs).

Comment 7 Jeremy Katz 2002-12-29 06:15:39 UTC
I've just put grub-0.93-1 up at http://people.redhat.com/~katzj/grub/.  Could
you see if this fixes the problems?

Comment 8 Jason Tibbitts 2003-01-09 18:47:43 UTC
It took me a bit to get a storage server I could play around with.  Finally I
built a new machine, 8x200GB disks on a 3w7500-8 controller in RAID5 mode, dual
Xeon processors, 6GB of RAM, Red Hat 8.  The machine will not boot without a
GRUB floppy (of which I have a standard one used for all of my storage servers).
The /boot/grub/device.map is identical to the one included above.

I downloaded and installed your 0.93 RPM and ran grub-install /dev/sda.  The
failure is the same, although I notice the size of stage1_5 has shrunk a bit:

util3:~> s grub-install /dev/sda
 
 
    GRUB  version 0.93  (640K lower / 3072K upper memory)
 
 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename. ]
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83
grub> setup  --stage2=/boot/grub/stage2 --prefix=/grub (hd0)
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"...  16 sectors are embedded.
succeeded
 Running "install --stage2=/boot/grub/stage2 /grub/stage1 (hd0) (hd0)1+16 p
(hd0,0)/grub/stage2 /grub/grub.conf"... failed
 
Error 24: Attempt to access block outside partition
grub> quit


Please let me know if there is any further testing I can do.

Comment 9 Jeremy Katz 2003-01-09 20:03:02 UTC
So these are on arrays of larger than a terabyte?  Can you try the test package
available at http://people.redhat.com/~katzj/grub/test/ which should have the
fix for this?
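
One reason a terabyte is a natural threshold: with 512-byte sectors, 2^31 sectors is exactly 1 TiB, so any larger array has sector numbers that no longer fit in a signed 32-bit integer. The sketch below is a hypothetical illustration of how a range check could then reject every block; it is not GRUB's code, and the function names, the 512-byte sector size, and the RAID5 capacity estimate (seven data disks of 200 decimal GB each) are all assumptions.

```python
INT32_MAX = 2**31 - 1
SECTOR_BYTES = 512  # assumption: 512-byte sectors


def as_int32(n):
    """Reinterpret the low 32 bits of n as a signed 32-bit integer."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n > INT32_MAX else n


def block_in_range_signed(sector, total_sectors):
    # Hypothetical check that stores the size in a signed 32-bit field.
    return 0 <= sector < as_int32(total_sectors)


def block_in_range_unsigned(sector, total_sectors):
    # The same check with the size kept as a non-negative value.
    return 0 <= sector < total_sectors


# Assumed usable size of 8x200GB in RAID5: seven disks' worth of data.
raid5_sectors = 7 * 200 * 10**9 // SECTOR_BYTES
# A smaller, sub-terabyte array for comparison (500 decimal GB).
small_sectors = 500 * 10**9 // SECTOR_BYTES

for name, total in (("raid5", raid5_sectors), ("small", small_sectors)):
    print(name, total,
          block_in_range_signed(63, total),
          block_in_range_unsigned(63, total))
```

Under these assumptions the large array's size wraps to a negative number, so even sector 63 near the start of the disk fails the signed check, while the sub-terabyte array passes both. That would be consistent with an error about accessing a block outside the partition even though /boot sits at the very beginning of the disk.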

Comment 10 Jason Tibbitts 2003-01-09 20:28:39 UTC
Yes, all of the affected machines have at least 1TB of disk, although I have no
storage servers smaller than that to test on; the small ones are still booting
via LILO.

I downloaded and installed the test package; grub-install ran to completion and
the machine was able to reboot without a boot floppy.  Thanks!

Comment 11 Jeremy Katz 2003-01-20 17:07:52 UTC
Okay, this is in grub-0.93-3