RH 7.1 RC 1 partially bad media recovery
----- Message Text -----
Anaconda is having a problem with just PART of the MM CD:
The file /mnt/source/RedHat/RPMS/ypbind-1.7-5.i386.rpm
cannot be opened -- This is due to a missing file, a bad package, or
bad media. Press <return> to try again
Needs a (SKIP this file) option
Rationale: As long as we identify an error condition of this type, offer
to SKIP over it -- perhaps a "GIVE UP" option or other way to gracefully
terminate the partial install with a proper reboot to umount cleanly the
partial system -- although this is more questionable, because the system
may well be non-bootable (I am mid-upgrade [this is on a 486/66 and has
been running over an hour, so I am a bit distressed, and would like to SKIP
at this point]-- we'll see ...)
This really is not an option - it can definitely lead to a system with lurking
disasters waiting to happen.
Bad media just happens :(
Lots of present options also can lead to messed up systems -- the 'proceed
without satisfying dependencies' options during resolution phase in TUI and GUI
installs; the decision NOT to make a recovery floppy; ... I mentioned this on
testers list but had not appended it in here ...
PLEASE reconsider ... there is just NO good recovery option available for a
failure mid-upgrade as things are now with media discovered to be defective
Date: Thu, 12 Jul 2001 00:10:07 -0400 (EDT)
From: R P Herrold <firstname.lastname@example.org>
Cc: Red Hat Beta Team <email@example.com>
Subject: [testers] My first beta install failed
David L. Gehrt wrote
> The pause while awaiting the "running anaconda" seem excessively wrong
> not to have a progress indicator.
> XFree86-libs-4.0.3 failed due to missing file,bad package..." hit return
> and it succeeded
> XFree86-100dpi-fonts-4.0.3 failed due to missing file,bad package..." hit
> return twice and it succeeded
which was closed 'WONT' from the last test cycle -- Also on
well-used 7.1 'official' CD's today, I got a media error.
The ONLY option is 'retry' -- there just HAS to be a 'skip'
option to get PAST defective media.
> These last two comments SOUND like the CD problems I am reading about as
> I catch up on my testers-list email. The first CD booted OK, but then
> these two transient CD reading problems were not particularly serious.
but there is no graceful recovery option to allow getting PAST
this in a 3/4 complete upgrade or install situation ...
I'll reopen 30029 and post this in that Bugzilla
I'm not sure what type of graceful recovery is possible when you have errors
mid-way through an install/upgrade.
I'm guessing you mean there should be a button to push to shutdown the machine
Well, yes, but why not make it more useful?
Understand that, particularly in the case of an upgrade, a partial
update, and also a non-graceful shutdown is very painful to recover
from ... almost without exception (glibc package failure), it is
better to limp through to the end, and then audit with rpm -Va and
1. Offer to SKIP the unreadable package, and dump a note in the
install/upgrade log -- This alerts the sysadm that there was a problem, which
may need manual intervention.
2. At a minimum, in the case of INSTALLS only, offer a shutdown which properly
umounts the (then currently mounted) installation partitions. At this point,
sometimes, an 'upgrade' can be used to complete the install. AGAIN, log the
fact to the install/upgrade log -- so that one can do a post-mortem and figure
out what went wrong.
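
The two suggestions above can be sketched roughly as follows. This is only an illustration in Python, not anaconda's actual code: install_packages, open_package, choose and log_lines are hypothetical names, and the clean umount that should follow an abandon is left to the caller.

```python
def install_packages(paths, open_package, choose, log_lines):
    """Try each package; on a read failure, ask the operator what to do.

    open_package(path) is assumed to raise OSError when the file is unreadable.
    choose(path, error) is assumed to return "retry", "skip" or "abandon".
    Returns (installed, skipped, abandoned).
    """
    installed, skipped = [], []
    for path in paths:
        while True:
            try:
                open_package(path)
            except OSError as err:
                action = choose(path, err)
                if action == "retry":
                    continue  # same behaviour as the current <return> prompt
                if action == "skip":
                    skipped.append(path)
                    # Leave a note so the sysadmin can follow up by hand.
                    log_lines.append(f"SKIPPED {path}: {err}")
                    break
                # Anything else is treated as "abandon": stop here and let
                # the caller unmount the target partitions cleanly.
                log_lines.append(f"ABANDONED at {path}: {err}")
                return installed, skipped, True
            else:
                installed.append(path)
                break
    return installed, skipped, False
```

Whatever ends up in log_lines would be written to the install/upgrade log, so a later audit knows exactly which packages need manual attention.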
[This actually provokes an idea on my part -- Why not 'tee' in all instances
anaconda tracebacks, in the case of install failures, into
/tmp/anaconda-traceback.txt, and/or the install/upgrade log, so that beta
testers, and indeed customers finding errors, can give better reports, even if a
proper floppy, with proper vfat modules, is not at hand. -- will RFE separately]
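
A minimal sketch of the 'tee' idea, assuming the installer is a Python program whose unhandled exceptions reach sys.excepthook. The function name install_traceback_tee is hypothetical; the default path is the one proposed above.

```python
import sys
import traceback


def install_traceback_tee(path="/tmp/anaconda-traceback.txt"):
    """Append every unhandled traceback to `path`, then report it as usual."""
    previous_hook = sys.excepthook

    def hook(exc_type, exc_value, exc_tb):
        text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
        try:
            with open(path, "a") as fh:
                fh.write(text)
        except OSError:
            pass  # a failed write must not hide the original error
        previous_hook(exc_type, exc_value, exc_tb)

    sys.excepthook = hook
    return hook
```

The file copy is written before the normal report, so the traceback survives even when no floppy is available to save it to.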
I think Suggestion #1 is a bad idea, because we don't have a way of knowing the
importance of a given package at that time. What if that package is the kernel
or glibc? If you skip that package, the system is hosed, and you've wasted your
time with the rest of the install. I mean, knowing that the install media has
some corruptions and wanting to go on with the install anyway seems pretty crazy
We get a fair number of requests for this, and I don't really understand why
it's such a big deal. It's *much* easier to verify that your install media is
good before you start the install than it is to catch the countless number of
ways the installer can fail on bad media. If the user wants to go through the
trouble of downloading and burning their own cd, then I think they bear some
responsibility to make sure the download is good before they start the install.
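
For a downloaded image, the check before burning might look like the sketch below. It assumes an MD5 sum is published alongside the image; file_md5 and image_matches are hypothetical helper names. Note that this only verifies the downloaded file, not how well the burned disc reads in the target drive.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches(path, expected_md5):
    """True when the image's digest equals the published one."""
    return file_md5(path) == expected_md5.strip().lower()
```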
If there are defects with the retail cd's, the right thing to do is to take it
back to the store and exchange it, not go on installing with bad media.
Re-opened after 7.2 release (taking out of Deferred state)
The point is being missed:
I had this occur in an ISP environment doing an upgrade; the upgrade, from
Official RH CD's failed half way through an UC -- GLIBC had overwritten the
prior version, but the new kernel (required by that glibc) was not yet in
place. The ypbind which was part of the BASE in the upgrade was destined to
be removed just as soon as the install was complete.
Sure -- I can envision failure modes which are not recoverable -- but if any
but about 5 or 6 packages fail, it is not fatal, just possibly uncomfortable --
Why increase the pain by omitting a Skip, or even a graceful shutdown?
Isn't the pain of a powerswitch shutdown much more than a possibly missing
Just got bitten again on my personal workstation with an AIC-7xxx and SCSI CD
drive combo -- with CD's which work consistently on IDE drives.
It's not the hardware; it's not the disks; it is SCSI controller resets this
and because there is no graceful umount option, or skip option, I'm screwed.
If the MBR changes had happened, but 'not enough' to be able to get booted with
a new kernel, I'd REALLY be in hot water.
This (error recovery) _is_ a big deal, if one wants to be taken seriously in a
marketplace. The Adaptec SCSI driver issues are most likely not going away.
Uggh ... actually, as it turned out, in trying to go from RH 7.1 to RH 7.2 and
choosing to convert to ext3, it edited the fstab and went to a
LABEL=/mountpoint change in the partition information on the HD in question.
Also, it changed the fs type to ext3 and committed the change to the /etc/fstab
and the remaining RH 7.1 kernel, NOT having been updated yet, was unable to
mount ext3 FS type ... Took about 45 minutes to recover it.
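
The failure described above (an fstab naming a filesystem type the still-installed kernel cannot mount) can be detected with a check along these lines. This is a sketch only: unsupported_fstab_entries is a hypothetical name, and it takes the contents of /etc/fstab and /proc/filesystems as strings. Keep in mind that /proc/filesystems lists only filesystems that are built in or whose module is currently loaded, so a miss here is a warning, not proof.

```python
def unsupported_fstab_entries(fstab_text, proc_filesystems_text):
    """Return (device, mountpoint, fstype) for entries the kernel may not mount."""
    supported = set()
    for line in proc_filesystems_text.splitlines():
        fields = line.split()
        if fields:
            supported.add(fields[-1])  # last field is the filesystem name

    problems = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mountpoint, fstype = fields[:3]
        if fstype in ("swap", "auto", "none"):
            continue  # not a concrete filesystem type to check
        # fstab allows a comma-separated list of types to try in turn.
        if not any(t in supported for t in fstype.split(",")):
            problems.append((device, mountpoint, fstype))
    return problems
```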
(attachment in a second)
Created attachment 37663 [details]
snapshot of fstab and fdisk as alluded to ...
Deferred to future release.
Honest, I had nothing to do with this -- this guy is in the local LUG
Date: Fri, 10 May 2002 14:03:50 -0400 (EDT)
From: Scott Merrill <firstname.lastname@example.org>
Subject: [COLUG] RH 7.3 installation
I upgraded a 7.2 system, and elected to add a few extra packages. Since I
only used the first two CDs for my 7.2 installation, I foolishly assumed I
could get by with just the first two discs from 7.3.
What a mistake that was.
I ended up rebooting since the installer did not provide a convenient way
to stop, and my CD burner was in the system I was upgrading! Thankfully
the system rebooted successfully, and everything appears to be in order
(sans those packages from disc 3).
Who would I talk to in order to suggest that the Red Hat installer
indicate which CD a particular package is on? Barring that, I'd like to
suggest a way to cleanly abort the request for the next CD, if possible.
Re-opening for the next cycle.
*** Bug 72481 has been marked as a duplicate of this bug. ***
new datapoint FYI -- Upgrade scenario: with post-limbo (Psyche boxed set image)
CD's passing mediacheck, disk 2 on current production IDE based Dell hardware
came up not reading.
I infer the read pattern of mediacheck is sustained and linear; the read pattern
of an install is sporadic with spindowns and random media location seeks, and by
definition cannot be fully tested with all the possible custom install
combinations; Upgrades would be even MORE random.
and so another piece of media needed to be pulled and burned on a different
burner (which worked, the host being offline for several hours during the process)
A clean umount 'Abandon' option alone for a graceful shutdown when errors are
detected (if not both the Skip and Abandon) sure makes sense.
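
The inferred difference between a linear media check and the scattered reads of an install could be probed with a sketch like this one, which reads every file under a mounted CD in shuffled order. It is an illustration of the idea, not what mediacheck does; random_read_check is a hypothetical name, and the operating system's cache may hide errors on files that were read recently.

```python
import os
import random


def random_read_check(root, seed=0, chunk_size=64 * 1024):
    """Read every file under `root` in shuffled order; return read failures."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    paths.sort()  # fixed starting order so the shuffle is reproducible
    random.Random(seed).shuffle(paths)

    failures = []
    for path in paths:
        try:
            with open(path, "rb") as fh:
                while fh.read(chunk_size):
                    pass  # discard the data; only readability matters
        except OSError as err:
            failures.append((path, str(err)))
    return failures
```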
*** This bug has been marked as a duplicate of 68376 ***
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.