Bug 44584

Summary: infinite loop entered if package is corrupt on cdrom
Product: [Retired] Red Hat Linux
Reporter: efm-redhat
Component: anaconda
Assignee: Brent Fox <bfox>
Status: CLOSED NOTABUG
QA Contact: Brock Organ <borgan>
Severity: low
Priority: medium
Version: 7.0
CC: thull2
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2001-06-14 15:51:54 UTC

Description efm-redhat 2001-06-14 15:34:07 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.76 [en] (X11; U; Linux 2.4.4 i686)

Description of problem:
 Message:
> >
> >   The file
> >   /mnt/source/RedHat/RPMS/xdaliclock-2.18-5.i386.rpm
> >   cannot be opened. This is due to a missing file, a bad
> >   package, or bad media. Press <return> to try again.
> >
> > Running ls on the directory gives a stat error. Worst
> > problem is that the only option is to try again -- in
> > effect, an infinite loop. Only thing I could think of
> > to do was kill something. I had to reload from scratch,
> > making sure xdaliclock wasn't selected.

This was tracked down to a scratch on the cd-rom, making xdaliclock
unreadable.

In my mind, the more serious problem is with the install program
(anaconda?). At the very least, when such a problem occurs, the
user should be given the option of retry/skip/abort. Skip in my
case would have left me with a perfectly valid system, except
that I would have missed one specified novelty package. (Of course,
had the problem occurred elsewhere, skip might result in a system
that is unusable, but odds are against that, and in any case by
not hanging it becomes possible to survey the installed system,
check package dependencies, and provide a clear explanation of
what did or didn't work.) Hangs are never acceptable, nor is
giving the user only one choice when that choice is to bang your
head against a hard failure.



How reproducible:
Always

Steps to Reproduce:
1. Attempt to install with the scratched disk.

Expected Results: It should have realized that the package was corrupt,
and offered a 'skip' or 'retry' option after trying more than once.

Additional info:

Comment 1 Brent Fox 2001-06-14 15:51:49 UTC
Skipping is a bad idea, because we have no way of knowing at that step what the
package in question is.  If that package is, say, the kernel or glibc, skipping
that package would leave your system unbootable.  

If the cd is damaged, you are going to have problems...it's unavoidable.

Comment 2 Tom Hull 2001-06-14 17:41:10 UTC
The bug is that you have an error dialog that does not offer the user any
recourse except to retry. In the case of a hard error, the user can only
stare in disbelief, or kill something. At minimum, there needs to be an
Abort option. In this particular instance, a Skip option would have resulted
in a successful install, since the missing or corrupt package was
inessential (xdaliclock).

Comment 3 Tom Hull 2001-06-14 19:08:34 UTC
Some more background and comments. I originally started writing this
in email response to a cc of the bug report:

FWIW, the CDROM does not appear to be scratched. I tried to read it
on two machines today (including the one that I was loading when the
problem occurred), and I was able to ls and sum the xdaliclock rpm.
I am at a loss to explain why the installer could not stat the file.
When the problem occurred during the install, I went to the shell
window and ran ls on the RedHat/RPMS directory: got a cannot stat
error message, no further output. I also ran ls [a-wyzA-Z]* to skip
around xdaliclock, and that produced a long listing. I repeated this
several times, always with exactly the same error.
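
The check described above (list the directory, then see which entries
cannot be stat'ed) can be sketched in Python. This is only an illustration
of the diagnostic, not anything the installer provides; the function name
unreadable_entries is hypothetical.

```python
import os


def unreadable_entries(directory):
    """Return (name, error) pairs for directory entries that cannot be stat'ed."""
    failures = []
    for name in sorted(os.listdir(directory)):
        try:
            # os.stat fails with OSError when the entry cannot be read.
            os.stat(os.path.join(directory, name))
        except OSError as err:
            failures.append((name, err))
    return failures
```

Run against the mounted RedHat/RPMS directory, a result containing only
the xdaliclock package would match what ls showed during the install.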

At the time I was thinking this was a mastering error, but given that
the CDROM is viable on working Linux systems, my best guess right now
is that the install kernel somehow lost or corrupted the vnode (or
whatever it's called in Linux). But it also could be any number of
other things (driver, ls, some marginal sensitivity in the drive or
disc).

In any case, the bug report that I encouraged Evelyn to file does not
depend on finding the original cause of the error: what I'm concerned
about is what you do once this particular error dialog appears.

> +Resolution: NOTABUG

I disagree. See below.

>  Severity: low

Frequency of occurrence may be low, but severity when it happens is high.

>  Priority: normal
>  Component: anaconda
>  an 'skip' or 'retry' option after trying more than once.

The current error message only has a Retry option (actually, do you want to
retry? [YES]). The user's only recourse is to stare at this idiotic message
in disbelief.

I suggested adding Skip and Abort as options. Once it becomes clear that
Retry won't work, the user can make a decision whether to risk continuing
without the missing/corrupt package, or abort.

> +------- Additional comments from bfox 2001-06-14 11:51:49 -------
> +Skipping is a bad idea, because we have no way of knowing at that step what the
> +package in question is.  If that package is, say, the kernel or glibc, skipping
> +that package would leave your system unbootable.

That's a bogus argument. It amounts to saying: don't try to work around a
problem, since the workaround might not work.

Once you hit a problem, there are three logical options:

 1) Retry
 2) Skip
 3) Abort

You're only offering Retry, which doesn't work. What next? If you offered
an Abort option, that would be a big improvement, because it lets the user
do something -- basically, tell the system that Retry doesn't work -- and
conceivably lets you terminate more gracefully than would be the case if
the user has to pull the plug. Big improvement, even if the result is an
unbootable system. No user should ever have to shoot their system.

A Skip option gives the user the option of trying to salvage the system.
How good/bad an idea skipping is depends on many factors; e.g., what is
being skipped, and how well the user understands the system. In my case
(xdaliclock, which I know I don't need), the Skip option would have been
perfect.
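
As a rough illustration of the three options, here is a minimal sketch in
Python (the language anaconda is written in). It is not anaconda's actual
code: open_package, InstallAborted, and the ask callback are all
hypothetical names, and ask stands in for whatever dialog the installer
would show.

```python
RETRY, SKIP, ABORT = "retry", "skip", "abort"


class InstallAborted(Exception):
    """Raised when the user gives up on the install (hypothetical)."""


def open_package(path, ask):
    """Open a package file, letting the user decide what to do on failure.

    ask(path, error) is an assumed callback that shows the error dialog
    and returns RETRY, SKIP, or ABORT.
    Returns the open file, or None if the user chose to skip the package.
    """
    while True:
        try:
            return open(path, "rb")
        except OSError as err:
            choice = ask(path, err)
            if choice == RETRY:
                continue      # try the same file again
            if choice == SKIP:
                return None   # caller carries on without this package
            # ABORT, or any unrecognized answer, ends the install cleanly.
            raise InstallAborted(path) from err
```

The point of the sketch is only that the loop has an exit other than
success: a hard failure ends in Skip or Abort instead of repeating forever.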

Admittedly, Skip is more work, but, hey, that's what software is for!
You do know the package, and you should know the package list and the
dependency trees, and the user-provided options, all sorts of stuff.
You could estimate the damage of skipping any given package (glibc?
you're dead in the water; xdm? better default init to level 3; xdaliclock?
who cares?).
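
One way to estimate that damage is to walk the dependency information
backwards from the skipped package. The following Python sketch assumes a
simple mapping from each package to the packages it requires; skip_impact
and the example data are hypothetical and are not taken from real RPM
metadata.

```python
def skip_impact(package, requires, selected):
    """Return the selected packages that would break if `package` were skipped.

    requires maps a package name to the set of names it directly depends
    on (an assumed, simplified stand-in for real dependency data).
    """
    broken = set()
    frontier = [package]
    while frontier:
        current = frontier.pop()
        for name in selected:
            if name == package or name in broken:
                continue
            if current in requires.get(name, ()):
                # name needs something that is missing, so it breaks too.
                broken.add(name)
                frontier.append(name)
    return broken


# Illustrative data only.
requires = {
    "glibc": set(),
    "bash": {"glibc"},
    "XFree86": {"glibc"},
    "xdm": {"glibc", "XFree86"},
    "xdaliclock": {"glibc", "XFree86"},
}
selected = set(requires)
print(skip_impact("xdaliclock", requires, selected))     # nothing depends on it
print(sorted(skip_impact("glibc", requires, selected)))  # everything else breaks
```

An empty result suggests Skip is safe to offer; a large one suggests the
dialog should warn strongly or offer only Retry and Abort.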

> +If the cd is damaged, you are going to have problems...it's unavoidable.

In this case the CD is not damaged, but that's neither here nor there.
Bad user interface software in error dialogs cannot be excused simply
because problems are unavoidable; error dialogs exist because problems
are unavoidable.