Bug 18467

Summary:	LILO hang after RH7.0 Update over RH6.2
Product:	[Retired] Red Hat Linux	Reporter:	Doug Campbell <doug.campbell>
Component:	anaconda	Assignee:	Michael Fulbright <msf>
Status:	CLOSED WONTFIX	QA Contact:	Brock Organ <borgan>
Severity:	high	Docs Contact:
Priority:	low
Version:	7.0	CC:	dr, laurent.crepet, msw
Target Milestone:	---
Target Release:	---
Hardware:	i586
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2000-12-24 17:33:12 UTC	Type:	---

Description Doug Campbell 2000-10-05 20:55:16 UTC

I installed upgrade of boxed set 7.0 over operational RH6.2 system by 
booting from CDROM.  Chose custom install and reviewed what 7.0 had 
determined.  Everything I installed for 6.2 (including LILO) was checked.  
Several new packages (including "sawmill") also checked.  I then proceeded 
to install 7.0 (it required two cds to complete).  Install completed 
successfully, instructing me to remove medium from CD and restart system. 
Upon boot from hard drive, LILO now hangs after printing the "LI" of LILO at 
the system console.

Comment 1 Doug Campbell 2000-10-05 20:57:14 UTC

Note:  LILO was installed in MBR under 6.2

Comment 2 Doug Campbell 2000-10-06 04:58:53 UTC

System successfully boots from diskette.

Contents of lilo.conf:

boot=/dev/sda
map=/boot/map
install=/boot/boot.b
prompt
timeout=50
linear
default=linux
message=/boot/message

other=/dev/sdb1
	label=dos

image=/boot/vmlinuz-2.2.16-22
	label=linux
	initrd=/boot/initrd-2.2.16-22.img
	read-only
	root=/dev/sda1

Results of df -k /
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/sda1               569204    432148    108144  80% /

Results of df -k /usr
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/sda6              1233308   1028548    142112  88% /usr

These are the only partitions used by linux

Comment 3 Doug Campbell 2000-10-06 05:28:34 UTC

Results of lilo -v:
LILO version 21.4-4, Copyright (C) 1992-1998 Werner Almesberger
'lba32' extensions Copyright (C) 1999,2000 John Coffman

Reading boot sector from /dev/sda
Merging with /boot/boot.b
Mapping message file /boot/message
Boot other: /dev/sdb1, on /dev/sdb, loader /boot/chain.b
Fatal: First sector of /dev/sdb1 doesn't have a valid boot signature

Apparently, lack of valid boot signature on SECOND scsi drive (I have two drives
in my chain) now causes lilo difficulties (at least with regard to -v option).
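
For readers unfamiliar with the "valid boot signature" in the error above: by PC convention a boot sector ends with the two bytes 0x55 0xAA at offsets 510 and 511. The following is a minimal Python sketch of that check, not LILO's actual code; the function names are hypothetical, and reading a /dev node normally requires root.

```python
# By PC convention, a bootable sector carries these two bytes at offsets 510-511.
BOOT_SIGNATURE = b"\x55\xaa"


def has_boot_signature(sector: bytes) -> bool:
    """Return True if a 512-byte sector ends with the 0x55 0xAA signature."""
    return len(sector) >= 512 and sector[510:512] == BOOT_SIGNATURE


def read_first_sector(device_path: str) -> bytes:
    """Read the first 512 bytes of a device node or disk image file."""
    with open(device_path, "rb") as dev:
        return dev.read(512)


if __name__ == "__main__":
    import sys

    # Example: python check_sig.py /dev/sdb1  (hypothetical script name)
    for path in sys.argv[1:]:
        ok = has_boot_signature(read_first_sector(path))
        print(f"{path}: {'signature present' if ok else 'no boot signature'}")
```

A partition holding only a Linux filesystem would typically fail this check, which is consistent with the "other=/dev/sdb1" entry being rejected.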

Comment 4 Doug Campbell 2000-10-06 05:39:07 UTC

Whoops, I lied.  /dev/sdb1 has a linux partition:
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/sdb1              1008872     30788    926836   4% /www

I backed up this disk and then unmounted /www and then fdisk'd /dev/sdb1.  It
has no partition table according to fdisk.  I created one spanning entire disk
and then wrote table.  After remount, contents of disk is unaffected.  I will
now reboot to see if lilo works.

Comment 5 Doug Campbell 2000-10-06 05:49:35 UTC

Reboot failed: it still hangs after printing "LI" of "LILO".  Will rewrite MBR of
/dev/sda with lilo and reboot, to see if that fixes things.

Comment 6 Doug Campbell 2000-10-06 06:03:13 UTC

Problem repaired.  Ran /sbin/lilo without arguments, then shutdown system and
rebooted.  Graphical "RedHat" lilo boot screen appeared, and linux then booted
properly.

This problem appears to be in install rather than lilo; install should run lilo
as part of its upgrade process to RH7.0 from earlier versions.

Comment 7 Doug Ledford 2000-10-06 17:39:15 UTC

The installer does run lilo as part of an update.  Obviously, in your case,
there was a problem with sdb and your configuration that caused lilo to fail. 
The only problem was that the install code didn't detect lilo's failure and have
you correct the problem manually.  That's probably as much a feature request as
a bug since the install code can't be made smart enough to correct problems like
yours automatically, so it's just as reasonable to use a rescue disk to fix the
problem as it is to have the installer tell you to switch to vt2 and fix the
problem manually at the shell prompt (and you get better tools to work with on
the rescue disk anyway, so it might even be preferable to leave it as a
situation that needs fixed via a rescue disk).  Adding msw to the Cc:
list so that he is at least aware of what has happened with this problem.

Comment 8 Matt Wilson 2000-10-06 18:33:39 UTC

moving to anaconda

Comment 9 Doug Campbell 2000-10-06 20:15:50 UTC

The problem is definitely with lilo, because the installer did run lilo and 
lilo installed a broken version of itself in the MBR on /dev/sda.  Note my 
statements that the characters "LI" get printed on the console at boot time 
before the system hung.  The broken nature of LILO persisted even after I fixed 
the MBR on /dev/sdb.  I suspect (although I'll never know for sure) that if I 
reran LILO without fixing /dev/sdb, the problems would have continued and I 
would have returned my RH7 package to Fry's for a refund, all the while cursing 
RH under my breath.

I think your target AS A REQUIREMENT is to be as robust as your Microsoft 
competition in installing your system.  As is said in all relationships, 
initial impressions are worth a lot, and if the initial impression your user 
gets from RH is that (a) a previously well running machine is now a pile of 
scrap iron, and (b) (s)he has to call you up for handholding, that colors all 
subsequent experiences.

In examining text of lilo.conf, it is obvious that I once had a DOS 
installation on /dev/sdb.  I deleted that partition about three months ago 
under RH6.2 and replaced it with an ext2 partition.  The associated entry in 
the existing lilo.conf was now invalid. In spite of the invalid entry, 
lilo/RH6.2 continued to work flawlessly.  Upon RH7.0 installation, lilo should 
have detected that erroneous situation (maybe it did) and ignored the entry, 
not even allowing it to be entered into the map. RH install should have 
captured lilo's whinings and shown them to me in a log so I, inexperienced 
OS/2 user that I am, would have understood what I had done wrong and what I 
should do to correct the problem.
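
The request above (capture the boot loader's output, log it, and notice when it fails) can be sketched roughly as follows. This is an illustration in Python, not anaconda's code: the helper name run_and_log and the log path are hypothetical, and it assumes the command signals a fatal error with a non-zero exit status.

```python
import subprocess
import sys


def run_and_log(command, log_path):
    """Run command, append its output to log_path, and return (succeeded, output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    with open(log_path, "a") as log:
        log.write(f"$ {' '.join(command)}\n")
        log.write(result.stdout)
        log.write(result.stderr)
        log.write(f"exit status: {result.returncode}\n")
    # Assumption: a non-zero exit status means the command failed.
    return result.returncode == 0, result.stdout + result.stderr


if __name__ == "__main__":
    # Hypothetical log location; pass the real command on the command line,
    # e.g.: python run_and_log.py /sbin/lilo -v
    ok, output = run_and_log(sys.argv[1:], "/tmp/bootloader.log")
    if not ok:
        print("Boot loader installation failed:")
        print(output)
```

With something like this in place, a message such as "Fatal: First sector of /dev/sdb1 doesn't have a valid boot signature" would reach the user instead of being silently dropped.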

And, for the life of me, I can't understand how RH6.2 and RH7.0 managed to 
mount a filesystem on a drive without a partition table...

Comment 10 Michael Fulbright 2000-10-06 20:30:18 UTC

When anaconda ran LILO in your case, would LILO have generated error messages?

Currently the installer does not deal with these messages well, but bug 13614
addresses this. I would recommend we mark this bug as a dupe of 13614 if
everyone agrees.

Comment 11 Doug Campbell 2000-10-06 20:38:25 UTC

What's anaconda?  I assume it supervises the install, since I saw it start 
during the text phase of initial cdrom boot.  And I don't have permission to 
view bug 13614.  I'll just assume that you are going to do the right thing and 
not make any more waves.

Comment 12 Larry Staberg 2000-10-13 20:18:34 UTC

Something similar happened when I upgraded from 6.2 to 7.0. Two disks: RH6.0 on
diska, RH6.2 on diskb, upgrading diskb to RH 7.0. The upgrade went as
advertised, except that on reboot my old image from 6.2 was loaded, breaking a
few things.
After investigation, the correct lilo.conf had been created, but on examining
the boot list at boot time I found that lilo had not been run, or had not run
successfully. Yet /tmp/update.log did not contain any useful information.
Re-running lilo, which ran successfully, corrected the problem.

I also could not examine bug 13614.

Comment 13 Need Real Name 2000-10-18 20:10:26 UTC

Similar problem.  Main difference is that the only drive is a 30G Maxtor as hda.
Since the only floppy is an LS-120, making a boot disk wasn't an option
(reported already).  After reboot, got LI.  Used the cdrom as a rescue disk,
executed /sbin/lilo and received an error about no lilo.conf in /etc.  After
several unsuccessful attempts to get lilo to work, I decided to create a ZIP
boot disk via another install, setting /boot to the ZIP (SCSI card allows
booting from ZIP).  After a seemingly successful install, I restarted.  The ZIP
starts to access and then I get an Invalid Boot Disk from the BIOS (a
bootable DOS ZIP succeeds).

Comment 14 Michael Fulbright 2000-10-27 16:41:33 UTC

Sorry, I fixed bug 13614 so everyone could see it.

Comment 15 Laurent CREPET 2000-12-05 13:15:43 UTC

I've just upgraded my RH 6.2 box to 7.0, and found the same problem. My case:
   - a lot of kernels already installed in the old /etc/lilo.conf
   - one with an 'alias=linux'

I didn't check the update process, so it also chose to use linux as the label
for the freshly installed RH 7.0 kernel. All the stuff for the SMP and UP
kernels was added to the old /etc/lilo.conf file by anaconda, which then ran
lilo...

But two entries with the same string (linux)... lilo returns some error
messages, not shown by anaconda when upgrading...

I reboot using the rescue system, mount my partition, run some chroot-ed
commands for fixing lilo.conf syntax, re-run lilo and reboot... OUAH !!! It
works !
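
The clash described above (a new label=linux colliding with an existing alias=linux) is the kind of thing that could be caught before lilo is run. A minimal Python sketch, assuming a simple key=value reading of lilo.conf and an exact, case-sensitive comparison of names; the function name is hypothetical and this is not what anaconda does:

```python
from collections import Counter


def duplicate_boot_names(conf_text):
    """Return label/alias names that appear more than once in lilo.conf text."""
    names = []
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() in ("label", "alias"):
            names.append(value.strip().strip('"'))
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == "__main__":
    with open("/etc/lilo.conf") as conf:
        clashes = duplicate_boot_names(conf.read())
    if clashes:
        print("Duplicate label/alias names:", ", ".join(clashes))
```

An installer could refuse to append a new entry, or pick a different label, whenever this list is non-empty.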

Comment 16 drigolin 2000-12-24 17:33:09 UTC

I fixed this by booting from the BootDisk created during installation and 
adding these lines to lilo.conf:

disk=/dev/sda
   bios=0x80
disk=/dev/sdc
   bios=0x81

In this way your LILO will be able to find the right SCSI disk at next reboot...

Comment 17 Michael Fulbright 2001-03-02 17:54:17 UTC

Closing this bug as we have not been able to reproduce it.