Bug 108923

Summary: grub splashimage mode corrupts chainloader, can destroy partition table
Product: [Retired] Red Hat Linux
Reporter: Laurence Tyler <lgt>
Component: grub
Assignee: Jeremy Katz <katzj>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike McLean <mikem>
Severity: high
Priority: medium    
Version: 7.3   
Target Milestone: ---   
Target Release: ---   
Hardware: i586   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-06-21 23:08:19 UTC

Description Laurence Tyler 2003-11-03 12:46:12 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.78 [en] (X11; U; Linux 2.2.19 i686)

Description of problem:
Machine: Compaq Armada 4131T, RedHat 7.3

Tried to set up grub to boot Compaq diagnostics partition on hda3.
(It's a DOS FAT12 partition with graphic tools for setting up &
testing the laptop - common to older Compaq machines that don't have a
BIOS setup mode).

Partition should be bootable as a 'standard' MSDOS system. Used
something like the following stanza in config file (also tried
interactively):

title Compaq diagnostics
    rootnoverify (hd0,2)
    chainloader +1

Trying to boot this produces (1) corruption on splashimage display,
followed by (2) error message about invalid partition table.

(If 'root' is tried instead of 'rootnoverify' (to mount filesystem),
corruption happens followed by message about bad filesystem.)

At this point, any attempt to manipulate partitions in grub will blow
away the partition table (e.g. use of 'hide' or 'makeactive'). I did
this twice before I realised what exactly was happening - fortunately,
I had a backup the second time.
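
For anyone wanting to check a backup of the first disk sector before trying 'hide' or 'makeactive', here is a minimal Python sketch that decodes the four primary partition entries of a classic MBR. It assumes the standard layout (table at byte offset 446, four 16-byte entries, 0x55AA signature in the last two bytes). The function name parse_mbr and the short TYPE_NAMES table are my own illustration, not part of grub or any Red Hat tool.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # the partition table starts here in a classic MBR
ENTRY_SIZE = 16           # four entries of 16 bytes each
SIGNATURE = b"\x55\xaa"   # bytes 510-511 of a valid MBR

# A few well-known type codes for display only; not an exhaustive list.
TYPE_NAMES = {0x01: "FAT12", 0x06: "FAT16", 0x12: "Compaq diagnostics",
              0x82: "Linux swap", 0x83: "Linux"}


def parse_mbr(sector):
    """Return (slot, active, type, start_lba, sector_count) for each used slot."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one 512-byte sector")
    if sector[510:512] != SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # Layout: status, CHS start (skipped), type, CHS end (skipped),
        # LBA of first sector, number of sectors (little-endian).
        status, ptype, start_lba, count = struct.unpack("<B3xB3xII", raw)
        if ptype != 0:    # type 0 marks an unused slot
            entries.append((i + 1, status == 0x80, ptype, start_lba, count))
    return entries


def describe(entries):
    """Format parsed entries as one human-readable line per partition."""
    return ["slot %d: type 0x%02x (%s)%s, start %d, %d sectors"
            % (slot, ptype, TYPE_NAMES.get(ptype, "unknown"),
               " active" if active else "", lba, count)
            for slot, active, ptype, lba, count in entries]
```

To use it, read the first 512 bytes of the disk or of a saved copy (for example a file written earlier with dd) and pass them to parse_mbr. Reading /dev/hda directly needs root. A ValueError for the signature is a strong hint that the table has already been damaged.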

Note that the grub.conf has the standard RedHat splashimage line:

    splashimage /boot/grub/splash.xpm.gz

To cut to the chase: I realised that the corruption was data (probably
the hda3 boot sector) being loaded directly into video memory by grub,
where it was then overwritten by screen output. It looks as if the
frame buffer for the graphic video mode (VGA16, I think) on my system
overlaps the area grub uses to load the chainloader and other vital
data. I don't know why this is, but obviously the two areas shouldn't
overlap.
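
To make the "shouldn't overlap" point concrete, here is a small Python sketch of the range check involved. The two conventional addresses are standard PC facts: a chainloaded boot sector runs at 0x7C00, and the legacy VGA graphics window is the 64 KiB at 0xA0000. Where grub 0.91 actually places its buffers on this laptop is not known from this report, so the third range (MISPLACED_BUFFER) is purely hypothetical.

```python
def overlaps(a_start, a_len, b_start, b_len):
    """True if half-open ranges [a_start, a_start + a_len) and
    [b_start, b_start + b_len) share at least one byte."""
    return a_start < b_start + b_len and b_start < a_start + a_len


BOOT_SECTOR = (0x7C00, 512)        # conventional load address of a boot sector
VGA_WINDOW = (0xA0000, 0x10000)    # legacy VGA graphics window, 64 KiB

# Hypothetical buffer that straddles the start of the VGA window;
# an assumption for illustration, not a known grub address.
MISPLACED_BUFFER = (0x9FE00, 0x400)

if __name__ == "__main__":
    print(overlaps(*BOOT_SECTOR, *VGA_WINDOW))       # False
    print(overlaps(*MISPLACED_BUFFER, *VGA_WINDOW))  # True
```

With the conventional addresses the two regions are far apart, which suggests that either the frame buffer or grub's scratch area ends up somewhere unexpected on this machine.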

Fix: Leaving out the 'splashimage' line from grub.conf and allowing
grub to use plain text mode cures the problem - I can now boot my
diagnostics and also a second MSDOS partition that I keep around (just
in case I get self-extracting things that I can't run under Linux).

Although I have solved the problem for myself, I thought I should
still report it as the penalty for getting it wrong was (in my case)
loss of the partition table, which took me most of a day to recover by
scanning the disk for start sectors and manually reconstructing the
table. Hence I have given this severity 'high'.
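
For reference, one way such a scan for start sectors might be automated is sketched below in Python. It is not necessarily the procedure used here: it simply walks a disk or image sector by sector and reports every sector that ends in the 0x55AA boot signature, which FAT boot sectors and extended partition tables carry. The function name find_boot_sector_candidates is hypothetical, and hits are only candidates, since ordinary data can end in the same two bytes.

```python
import io


def find_boot_sector_candidates(stream, sector_size=512):
    """Yield the number of every sector whose last two bytes are 0x55 0xAA."""
    lba = 0
    while True:
        sector = stream.read(sector_size)
        if len(sector) < sector_size:
            break                      # end of disk or image
        if sector[-2:] == b"\x55\xaa":
            yield lba
        lba += 1


if __name__ == "__main__":
    # Tiny in-memory image: four sectors, signature on sectors 0 and 2.
    image = bytearray(4 * 512)
    image[510:512] = b"\x55\xaa"
    image[2 * 512 + 510:2 * 512 + 512] = b"\x55\xaa"
    print(list(find_boot_sector_candidates(io.BytesIO(bytes(image)))))  # [0, 2]
```

On a real disk the stream would be the device opened read-only in binary mode, and each candidate would still need checking by hand (file system type, size fields) before being written back into a reconstructed table.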

It's quite possible that this *may* be specific to my laptop's
hardware configuration. On the other hand, it may indicate an
oversight in grub's vga16 and splashimage support that is making some
incorrect assumptions about video cards, and which therefore should be
fixed with some urgency.



Version-Release number of selected component (if applicable):
0.91-4

How reproducible:
Always

Steps to Reproduce:
1. Use an Armada 41xx series laptop that has the diagnostic partition
on disk (primary partition 3, partition type 0x12)

2. Make sure grub.conf contains (amongst others) the following lines:

splashimage /boot/grub/splash.xpm.gz

title Compaq diagnostics
    rootnoverify (hd0,2)
    chainloader +1


3. Reboot machine, select 'Compaq diagnostics' from menu

ALTERNATIVELY:

1) Enter grub interactively, e.g. from grub boot floppy

2) Type the following commands:

   grub>  rootnoverify (hd0,2)
   grub>  chainloader +1
   grub>  boot


Actual Results: Corruption on screen, followed by error message about
invalid partition table. Any use of 'hide', 'unhide' or 'makeactive' at
this point will destroy the on-disk partition table.

Expected Results:  Should boot the MSDOS system in hda3. Expect to see
the Compaq diagnostics startup screen.


Additional info:

I don't know what exactly is needed to diagnose this. Info to hand
about my system:
Compaq Armada 4131T (Pentium 1 133 MHz), 80 MB memory, 10 GB disk
(Hitachi DK23CA-10).
Graphics: Cirrus Logic GD-7548 VGA 1 MB (Ver. 1.2). Screen: 800x600 TFT

If you need to know how/where things are mapped in memory, tell me how
to find out in RedHat 7.3 and I'll post the results. (I do have X11
running successfully)

One final point: LILO boots this partition quite happily as well (from
a floppy disk) using:

other=/dev/hda3
    label=diag
    table=/dev/hda

Comment 1 Jeremy Katz 2003-11-06 19:29:09 UTC
This sounds like a BIOS bug where your BIOS doesn't properly handle
VGA16.  Is this any better with newer releases (since I've rewritten
chunks of that patch since 7.3)?

Comment 2 Jeremy Katz 2004-06-21 23:08:19 UTC
Closing due to lack of activity.  Please reopen if you have further
information to add to this bug report.