Bug 774

Summary: Bug report for Red Hat 5.2 install -- corruption of rpm databases
Product: [Retired] Red Hat Linux Reporter: irwin
Component: installer Assignee: David Lawrence <dkl>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: medium    
Version: 5.2   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 1999-03-11 16:57:54 UTC Type: ---

Description irwin 1999-01-11 03:29:44 UTC
Bug report for Red Hat 5.2 install -- corruption of rpm
databases (+ other
install comments later)

I have been running, updating, and administering a Slackware distribution
for the last 3 years, but I thought it was time to replace my much-updated
distribution with a coherent one based on glibc, and I picked Red Hat 5.2.
However, I ran into a severe problem with the Red Hat 5.2
install that I
think you should be aware of.

I was installing using the install boot disk copied from a $10 Red Hat CD
distributed by our local LUG.  As far as I can tell this CD is a snapshot of
your ftp site taken Oct 15 1998.  ls -lt | head -10 on the RPMS directory
gives:

-r--r--r--   1 root     root        36065 Oct 15 00:43 TRANS.TBL
-rw-r--r--   2 root     root        49437 Oct 15 00:28 initscripts-3.78-1.i386.rpm
-rw-r--r--   3 root     root         6765 Oct 14 03:40 xinitrc-1.6-1.noarch.rpm
-rw-r--r--   3 root     root      1563458 Oct 14 03:40 urw-fonts-1.0-3.noarch.rpm
-rw-r--r--   3 root     root       139529 Oct 14 03:40 words-2-11.noarch.rpm
-rw-r--r--   3 root     root        46789 Oct 14 03:40 swatch-2.2-3.noarch.rpm
-rw-r--r--   3 root     root       147598 Oct 14 03:40 termcap-9.12.6-11.noarch.rpm
-rw-r--r--   3 root     root         9033 Oct 14 03:40 timetool-2.3-7.noarch.rpm
-rw-r--r--   3 root     root         6923 Oct 14 03:40 setuptool-1.0-1.noarch.rpm

I don't think my hardware information matters much, but just in case: I have
a Pentium 133, with 2 HDD and a CDROM drive all on a SCSI (ASUS SC200) bus.
My video card is an S3 Trio64, and my monitor is the Sony Multiscan 15sfII.

All the initial stages of the install (**custom class
install**) went well
(impressive compared to what I had to go through with
slackware 3 years
ago).  However, I ran into severe trouble with the package
installs. I had
selected the following components: X, networking, tex, the
extra
documentation, and all aspects of programme development.
However, this came
to 285 Megabytes, and it was all supposed to fit into a 293
Megabyte
partition that I had prepared with disk druid and which I
had subsequently
asked to be formatted and searched for bad blocks.  I ran
into severe
trouble in the rpm databases as a result of me deselecting
part/all of these
components in an attempt to save disk space.  In the next
two paragraphs I
will try and recreate for you as much as I remember about my
package
selection to give you some idea of the mixture of packages
that caused the
problem.
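The tight fit described above (285 Megabytes selected for a 293 Megabyte
partition) is the kind of case a pre-install space check could flag.  The
following is only a minimal sketch of such a check: the function name, the
single-entry selection table, and the 10% reserve are illustrative
assumptions, not the Red Hat installer's actual logic.

```python
# Sketch of a pre-install free-space check. All names and the 10% reserve
# are illustrative assumptions, not the Red Hat installer's actual logic.

def fits_on_partition(package_sizes_mb, partition_mb, reserve_fraction=0.10):
    """Return (fits, needed_mb, usable_mb) for a set of selected packages.

    reserve_fraction models space lost to filesystem overhead plus room
    for the rpm database and temporary files written during the install.
    """
    needed_mb = sum(package_sizes_mb.values())
    usable_mb = partition_mb * (1.0 - reserve_fraction)
    return needed_mb <= usable_mb, needed_mb, usable_mb


# The figures from this report: a 285 MB selection on a 293 MB partition.
selection = {"all selected components": 285}
fits, needed, usable = fits_on_partition(selection, partition_mb=293)
print(f"needed {needed} MB, usable {usable:.1f} MB, fits: {fits}")
```

With any nonzero reserve of this size the selection in this report does not
fit, so a check along these lines would warn before any package is written.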

(1) One comment I have is your install software should allow
deselecting
whole components.  After I got to the selection stage and
saw the huge
number of packages involved, I went back to the components
GUI and
deselected everything but X, but it made no difference to
the size.  I then went back and even deselected X, and in desperation finally
went back two steps to try to re-enter the component GUI.  However, with
everything deselected there, your install software skipped right to the
individual package GUI with the full 285 Megabytes still there.
(2) I then deselected most of the individual packages from
networking, tex,
documentation, and programme development, and **also
deselected some of the
X packages I didn't think I would need** (afterstep was one,
but there were
several more including gtk+ and perhaps other X libraries).
My thought was
that if I deselected too much, the package software would
catch it and
install the packages my selected packages depended on.  Some
of the packages
did notice there were other packages missing, and I vaguely recall a GUI
where I confirmed that the required extra packages should be
installed.
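
The behaviour I was counting on in (2), where deselecting too much is caught
and the needed packages are added back, amounts to computing a dependency
closure.  A minimal sketch follows.  The package names and the REQUIRES table
are hypothetical; a real installer would read this information from the rpm
package headers, and I do not know how the Red Hat installer implements it.

```python
# Sketch of dependency closure over a hand-written package map.
# The REQUIRES table is hypothetical; real data would come from rpm headers.

REQUIRES = {
    "window-manager": ["toolkit", "x-libs"],
    "toolkit": ["x-libs"],
    "x-libs": [],
    "editor": [],
}


def with_dependencies(selected, requires):
    """Return the selected packages plus everything they need, transitively."""
    result = set()
    pending = list(selected)
    while pending:
        pkg = pending.pop()
        if pkg in result:
            continue
        result.add(pkg)
        # A package missing from the table is treated as having no dependencies.
        pending.extend(requires.get(pkg, []))
    return result


def missing_dependencies(selected, requires):
    """Packages that must be added back to keep the selection consistent."""
    return with_dependencies(selected, requires) - set(selected)


# Selecting only the window manager leaves its libraries to be added back.
print(sorted(missing_dependencies(["window-manager"], REQUIRES)))
```

Running the closure once more after every deselection, rather than only
once at the end, would keep the selection consistent at each step.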

The admittedly odd combination of packages I had selected
from the previous
interactive process caused installation errors for the
following packages:
newt, the slang library, tcl, tk, xaw3d, xfree86-libs, and
xpm.
Nevertheless, I carried on to the finish, booted the system,
and attempted
to correct the package problems by re-installing using rpm.

When I tried rpm -Va, it generated a scsi i/o error on
reading (always
reporting the same block somewhere on the disk (lun=1) where
my new
partition containing Red Hat was located).  I got the
identical error
(including block number) whenever I tried to re-install any
of the packages
that had failed during the install phase, and I got **no**
errors if I
attempted to re-install packages that had no problems during
the install
phase.  Although from the lun number I thought I had narrowed the problem
down to something to do with my hard disk, I did check the CD by copying
certain of the "problem" packages (no i/o problems) and by
comparing with
the same packages at your ftp site (no differences).  Thus,
I doubt very
much that I am the victim of media errors on the CD.
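
For anyone repeating the comparison of CD packages against the ftp copies, a
byte-for-byte check can be sketched as below.  This is an illustration only,
not the procedure used for this report; the function names are made up for
the example, and a tool such as cmp does the same job.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True when both files have identical contents (compared by digest)."""
    return file_digest(path_a) == file_digest(path_b)
```

Calling same_contents() on the CD copy and the downloaded copy of one of the
"problem" packages would return True when the media is not at fault.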

I then copied all the rpm data bases in /var/lib/rpm to
another path (no i/o
errors), and rebuilt the rpm data base using the new path to
the rpm data
bases (no i/o errors).  However, any attempt to deal with
the "problem"
packages using the rebuilt rpm data base always generated
the i/o error.

My tentative conclusions from all of these tests: there was
no "real" read
i/o error for any file associated with the rpm software or
rpm packages.  (I
have never had any trouble with my SCSI disks, the partition
was reformatted
and tested for bad blocks as part of the install process,
there were no i/o
errors associated with the SCSI CD when I did things like
"rpm -qil -p
*.rpm", I copied all files I could think of in an attempt to
find the "bad"
file without success, and finally, it seems improbable to me
that the first
ever bad block on my disk happens to fall right in an area
that is critical
to rpm.) I believe the i/o error was generated by a
corrupted rpm database
somehow pointing to an incorrect disk block, and I believe
this corrupted
rpm data base was caused by the unusual combination of
packages that I had
selected.


Once I had drawn this conclusion, I should have made a copy
of the list of
all installed packages for you to try for yourself, but I am
afraid I
foolishly lost this information (see below).  I believe you
could easily
replicate the error by selecting X and deselecting afterstep
and some X
libraries such as gtk+ afterward.  (I don't think the problem could have
been generated by the rest of the components, because I removed virtually
all of their packages, so there would be no leftover dependencies to cause
software troubles.)

What I did instead was to try another fresh install on the
same disk partition
(that is why I don't have a list of the packages).  This
time I chose X alone
and didn't deselect anything afterward.  In 10 minutes I had a working system
with X up and running (and no rpm data base troubles).

So all is well that ends well, but I thought I should let
you know of the
following two major issues:

(1) the install software's inability to deselect whole
components (I think
this is an important problem that should be rectified).

(2) the install software's running into "i/o" problems
(which I think are
associated with an rpm database that becomes corrupted) when
certain
individual X packages are de-selected after the whole X
component was selected.

There were several other minor problems I encountered that
I should mention as well:

(3) The information for 100 dpi X fonts doesn't make
complete sense because
it has obviously been copied from the 75 dpi package
information whose
commentary refers to the higher-resolution 100 dpi package.

(4) telnet, ftp, and probably other packages officially
depend on the
"inetd" package, but such a package does not exist and stops
the
installation of these packages from glint.  After some
investigation I found
inetd was part of netkit-base. Once that was installed,
telnet (and ftp)
could be installed without difficulty.  There is obviously
some
inconsistency here in the naming of the packages that should
be straightened
out.

(5) I found the navigation of the GUIs for the install programme to be
counter-intuitive.  It is obviously a matter of personal taste and what your
experience is with other GUIs, but selection by space bar rather than CR
seemed strange to me.  (In fact I found there was an
undocumented feature
where you could often *but not always* select by CR.) Also,
I think
navigation around the components of the displayed GUI should
strictly be by
the arrow keys without the tab confusing things.  You might
want to copy
something like the arrangement that lynx has with its arrow
navigation.
Anyhow, my feeling is more thought needs to go into this.  I
kept
stumbling over the current navigation, and I am usually
pretty good
about whipping around a GUI.

(6) Xconfigurator is fine to get a lowest-common-denominator
X configuration
that works for almost everybody.  However, once X is working
how do you tune
the configuration to get the maximum vertical refresh rate
consistent with
the resolution you have picked and the maximum horizontal
frequency allowed
by your monitor?  As far as I can see xvidtune only plays
with the pixel
limits for fixed dot-clock frequency.  What is needed is a
fresh version of
that programme that allows you to change the dcf if/when
that is allowed by
the video card (and most modern cards allow a variable dot
clock).  Such a
programme would greatly improve the "saleability" of Linux.
The impression
left by Xconfigurator is you have to put up with a
flickering display if you
want to use RedHat Linux.  Fortunately, I had the modelines
left over from
many X configuration experiments I had done by hand under
slackware for my
particular combination of videocard and monitor so I popped
those into the X
configuration file, and, voila, I had a good looking display
again.  (98 Hz
vertical refresh definitely looks much better to my eyes
than 72 Hz.)

(7) When creating a user account under linuxconf, the
sysadmin is asked to
put in a new password for the user.  But what if you mistype
it or the user
forgets it so you have to enter it at a later date?  I could
find nothing
under linuxconf to allow the sysadmin to set a user password
(except the
submenu that only seems accessible when you first create an
account).
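
On point (6) above, the relation between dot clock, horizontal frequency,
and vertical refresh is simple arithmetic on the modeline totals, and a
tuning tool could be built on it.  The sketch below is a minimal
illustration.  The function names are made up for the example, the timing
used is the standard VESA 1024x768 mode (65 MHz dot clock, 1344 total pixels
per line, 806 total lines) rather than my own modelines, and the 60 kHz
monitor limit is an arbitrary illustrative value, not the specification of
any particular monitor.

```python
# Modeline arithmetic: frequencies follow from the dot clock and the totals.

def horizontal_khz(dot_clock_mhz, htotal):
    """Horizontal scan frequency in kHz: dot clock divided by pixels per line."""
    return dot_clock_mhz * 1000.0 / htotal


def vertical_hz(dot_clock_mhz, htotal, vtotal):
    """Vertical refresh in Hz: dot clock divided by pixels per frame."""
    return dot_clock_mhz * 1e6 / (htotal * vtotal)


def max_dot_clock_mhz(max_horizontal_khz, htotal):
    """Highest dot clock that keeps the horizontal frequency within a limit."""
    return max_horizontal_khz * htotal / 1000.0


# Standard VESA 1024x768 timing: 65 MHz dot clock, htotal 1344, vtotal 806.
print(horizontal_khz(65.0, 1344), vertical_hz(65.0, 1344, 806))

# Raising the dot clock to an assumed 60 kHz horizontal limit (illustrative).
faster = max_dot_clock_mhz(60.0, 1344)
print(faster, vertical_hz(faster, 1344, 806))
```

With the totals held fixed, the vertical refresh scales directly with the
dot clock, which is why a tool that can change the dot clock is needed to
reach the monitor's limits.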

Despite the two major issues and the grab-bag of minor
issues, you are to be
congratulated on a fine distribution.  Once I found a way to
get around the
show-stopping major issues, the installation was a breeze,
and the
subsequent configuration using linuxconf was nice also.  rpm
is great, and I
also enjoyed using glint for the straightforward package
installs. I will
dual-boot for a while to my old distribution for things I
haven't configured yet, but I can foresee I will quickly
become attached to the new distribution
because of its enhanced ease of administration and package
upgrading.

All the best,

Alan W. Irwin

Comment 1 David Lawrence 1999-03-11 16:57:59 UTC
Thank you for your input on this matter. We are working to improve the
install process regarding detection of free disk space so that these
things will not happen. This should hopefully be a lot better in our
upcoming releases of Red Hat.