Bug 112648 - cannot reinstall rh9, rpm and many other commands now simply segfault
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: rpm
Version: 9
Hardware: i686 Linux
Priority: high
Severity: high
Assignee: Jeff Johnson
QA Contact: Mike McLean
Whiteboard: triage|leonardjo|closed|notabug
Reported: 2003-12-26 04:38 UTC by jim barchuk
Modified: 2005-10-31 22:00 UTC (History)

Last Closed: 2004-05-12 02:30:35 UTC

Attachments
strace date --version (2.79 KB, text/plain)
2003-12-29 13:03 UTC, jim barchuk
strace diff --version (2.80 KB, text/plain)
2003-12-29 13:05 UTC, jim barchuk
strace rpm --version (5.92 KB, text/plain)
2003-12-29 13:06 UTC, jim barchuk

Description jim barchuk 2003-12-26 04:38:25 UTC
PLEASE DO NOT REPLY TO jb@jbarchuk.com. Please reply to
jbarchuk@att.net or I will never see a reply because the system is
hosed as described below.

Description of problem:

As can be seen in my history it's been months since I've been able to
run up2date. *All* aspects of rpm simply 'stopped working' for no
apparent reason. But there have been no large security upgrades so not
overly concerned.

The common mentions of rm /var/lib/rpm/__* ; rpm -vv --rebuilddb never
had any effect.

Tried rebuilding from scratch as described in
http://www.introcomp.co.uk/linux/rpm_rebuild.html with no
success. I added rpm --initdb as I thought that was necessary but no
difference.

Tried rm -f /var/lib/rpm/* and /var/lib/up2date/* and running update
off distribution CD, selecting all packages. It says it's installing
many gigs of stuff. /var/lib/rpm has a full set of files.
/var/log/rpmpkgs is zero-length. 

Now, rpm, up2date, diff, sendmail, and many other programs simply
return 'Segmentation fault' and no further info.

Some things work such as httpd, sshd, and handrolled BIND.

Version-Release number of selected component (if applicable):

System was an RH9 'install', plus all up2dates until recently.

Additional info:

I need a description of how to reinstall RH9 in such a way that it
actually, ummmm... 'reinstalls.'

To be perfectly honest, I expect no answer or fix. For all the install
and upgrade questions I've asked since RH4.x not one has ever been
solved. AAMOF this RH9 'install' was required because there was never
any answer to why I couldn't upgrade from 7.2. Yet Another Disaster.
Whatever.

Have a :) day!

jbarchuk@att.net <-- NOT jb@jbarchuk.com

Comment 1 Jeff Johnson 2003-12-26 14:25:33 UTC
First things first.

What does
    rm -f /var/lib/rpm/__db*
    rpm -qa -vv
say?

If segfault, try
    rm -f /var/lib/rpm/Pubkeys
    rpm -qa -vv



Comment 2 jim barchuk 2003-12-26 16:43:16 UTC
Hi jbj!

Been there, done that, have a lifetime's worth of t-shirts. :) Rpm and
similar are not the issue. Reinstalling RH9 is the issue. Please read
-exactly- what I wrote:

> Tried rm -f /var/lib/rpm/* and /var/lib/up2date/* and running update
> off distribution CD, selecting all packages.

After that -tons- of critical stuff segfaults. A few examples, at boot
time:

'Setting local time' failed
/sbin/mkkerneldoth: line 9: 302 segfault
/etc/rc.d/rc.sysinit: line 807: 317 segfault

Obviously the prob is something much lower level than simple rpm or
sendmail. Lots of segfaults at shutdown time too. I'm amazed the thing
boots at all. :) Yet, sshd and many other things behave perfectly.

There must be some directory tree(s) that I need to -delete- first
before running CD upgrade. 

But I hesitate to do that because some things act weird at upgrade.
For instance, I run named and httpd residing in totally separate
areas (/chroot/named and /usr/local/apache). I -expected- to need to
reinstall saved copies of /etc/rc.d/init.d/named and ~/httpd because
upgrade should overwrite them. But it doesn't. Upgrade reports it's
installing -gigs- worth of stuff, but it obviously isn't.

Yet don't want to delete whole trees such as /bin because I have
plenty of other stuff of my own in there that I'd rather not spend
days reinstalling.

How do I tell upgrade to actually -do- an upgrade?

Tnx. Have a :) day!

jbarchuk@att.net <-- I modified my RHN config and that apparently worked

Comment 3 Jeff Johnson 2003-12-26 17:22:44 UTC
Again, first things first. If rpm is non-functional, you
ain't going any place. Please try the commands I suggested.

Deleting stuff ain't the right approach, understanding
how the system is broken is.

Is any command at all functional? How about statically
linked commands like /sbin/sash and /sbin/sln?

Does bash "work"?

Are the segfaults random or reproducible? If random, then
your problems are likely to be hardware, not software (just my
guess).

Comment 4 jim barchuk 2003-12-27 02:14:13 UTC
HiHi jbj!

> Again, first things first. If rpm is non-functional, you
> ain't going any place. Please try the commands I suggested.

rpm --help segfaults. It is a symptom not the problem.

> Deleting stuff ain't the right approach, understanding
> how the system is broken is.

Yes it would be the right approach if it reinstalled rpm and up2date
and such and everything they need to work.

As I said CD-upgrade does not overwrite /etc/rc.d/init.d/named or
~/httpd. How do I force the upgrade to write the files it claims it's
writing?

> Is any command at all functional? How about statically
> linked commands like /sbin/sash and /sbin/sln?

> Does bash "work"?

As I said named, httpd and sshd work. Pine works via ssh, I can create
mail but not send it because sendmail is one thing that's down. Those
are just examples. I'm logged in via ssh and writing this in an editor
and will copy/paste to the web form.

> Are the segfaults random or reproducible?

All the boot time and run time segfaults I mentioned are 100%
'reliable.'

> If random, then your problems are likely to be hardware, not
> software (just my guess).

Again, rpm stopped working months ago as you can see by my up2date
history. Haven't had any instability oddities at all either before or
after that till now.

If I'd had any other oddities along the way I'd of course consider
hardware. But not the slightest glitch.

The segfaults started the *instant* I asked the CD-upgrade to
reinstall the system, the very first time it rebooted, because it
apparently doesn't reinstall -everything- as demonstrated by the named
and httpd init scripts not being overwritten.

Again, how do I ask CD-upgrade to reinstall everything on the CD? With
that I'd have a working rpm and up2date and such and could get back to
a fully working state in a few minutes.

At the moment I don't care why it doesn't, only that it does do it.

My guess is that CD-upgrade 'declines' to -downgrade- some things, for
whatever reasons it decides to do that.

As I said, before one reinstall I:

rm -f /var/lib/rpm/*
rm -f /var/lib/up2date/*

After the reinstall ~/rpm was full of files that looked fine. But rpm
segfaults. That leads me to believe that although the basic rpm was
reinstalled there might be other libs or associated/required files
that aren't reinstalled. So the simple answer, regardless of
understanding, is to reinstall -everything- and start with a clean
slate. I hate it because it's the typical MS-style answer, but it's the
easiest fix.

Have a :) day!

jb


Comment 5 Jeff Johnson 2003-12-27 04:57:11 UTC
OK, rpm --help segfault seems first ;-)

Can you attach strace -o /tmp/xxx rpm --help output?

You might try rpm --version, that does not use libpopt,
might work.

Comment 6 Jeff Johnson 2003-12-27 05:03:46 UTC
Yes, not all files are overwritten. Files marked %config(noreplace)
are preserved if locally modified; new versions of those files
are installed with a .rpmnew extension instead.

IIRC, the 2 files that you mention are both %config(noreplace).
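
A minimal sketch of how one might see which config files an upgrade preserved rather than overwrote, assuming the .rpmnew behavior described above. The helper name list_rpmnew is hypothetical (it is not an rpm command), and /etc is only a default starting point.

```shell
#!/bin/sh
# List every *.rpmnew file under a directory (default: /etc) next to the
# live file it was meant to replace.
# list_rpmnew is a hypothetical helper name, not part of rpm.
list_rpmnew() {
    dir="${1:-/etc}"
    find "$dir" -type f -name '*.rpmnew' 2>/dev/null | sort |
    while IFS= read -r new; do
        # Strip the .rpmnew suffix to get the path of the preserved file.
        live="${new%.rpmnew}"
        if [ -e "$live" ]; then
            echo "$live (kept) <- $new (packaged version)"
        else
            echo "$new (no matching live file)"
        fi
    done
}
```

Running `list_rpmnew /etc/rc.d/init.d` after an upgrade would show whether a file such as the named init script was preserved with a packaged copy saved beside it; `diff` on the two paths then shows what the upgrade would have changed.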

If you do wish to jumpstart, try
    mv /var/lib/rpm /var/lib/rpm-ORIG
and then use anaconda to install, not upgrade, prolly
a minimal install.

If you don't make a new file system on your existing
partition, then the minimal install will replace only
necessary files, leaving everything else on the file
system intact. You can then use the just installed rpm
to upgrade other pkgs.


Comment 7 jim barchuk 2003-12-29 13:03:48 UTC
Created attachment 96722 [details]
strace date --version

Comment 8 jim barchuk 2003-12-29 13:05:27 UTC
Created attachment 96723 [details]
strace diff --version

Comment 9 jim barchuk 2003-12-29 13:06:32 UTC
Created attachment 96724 [details]
strace rpm --version

Comment 10 jim barchuk 2003-12-29 13:07:35 UTC
Hello Jeff!

> Can you attach strace -o /tmp/xxx rpm --help output?

To give some similarities/differences I attached strace --version
output for date, diff, and rpm.

The only obvious oddity I see is 'open("/etc/ld.so.preload"... No such
file or directory.' I see tons of mentions of that via google ranging
from incompatible lib versions to server cracks. But have no copy of
that in my archived copies of /etc/*. Unless it was a symlink that
didn't get archived properly?

Tnx. Have a :) day!

jb


Comment 11 Jeff Johnson 2003-12-29 17:01:00 UTC
You appear to have a damaged glibc installation. Repairing this takes
more than replacing a few files in /etc.

Many files in /lib will need checking and replacing
from a glibc package.

It's probably just as easy to reinstall as to diagnose
and repair imho.

Otherwise read about repairing a system from a rescue disk.
Basically you boot from CDROM, mount your existing file systems,
and use the stripped down version of rpm in the busybox program
to reinstall glibc packages.
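
A rough sketch of the reinstall step described above, written as a dry run that only prints the command. Everything specific here is an assumption: the helper name print_rescue_reinstall is hypothetical, the mount point and package paths must be supplied by the person doing the repair, and a stripped-down rpm in a rescue environment may not accept the same --root, -Uvh and --replacepkgs options as the full rpm.

```shell
#!/bin/sh
# Print (do not run) an rpm command that reinstalls the given package
# files into a system mounted at ROOT from a rescue environment.
# ASSUMPTION: the rescue rpm supports --root and --replacepkgs like the
# full rpm does. print_rescue_reinstall is a hypothetical helper name.
print_rescue_reinstall() {
    if [ "$#" -lt 2 ]; then
        echo "usage: print_rescue_reinstall ROOT PACKAGE.rpm..." >&2
        return 1
    fi
    root="$1"
    shift
    # --replacepkgs lets rpm reinstall a version it believes is present.
    echo "rpm --root $root -Uvh --replacepkgs $*"
}
```

The printed command can be reviewed and then typed by hand once the installed file systems are mounted; printing instead of executing is a deliberate choice, since a mistyped root on a damaged system is hard to undo.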



Comment 12 Leonard den Ottolander 2004-04-17 21:57:02 UTC
All traces end with

getrlimit(0x3, 0xbfffdaf4)              = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

If this is a broken glibc installation this can be closed NOTABUG.
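
The check behind that observation can be sketched as a small helper that reports the last system call logged before the SIGSEGV line in an strace output file. It assumes traces written with `strace -o FILE` without `-f` (so lines carry no PID prefix); the name last_call_before_segv is hypothetical.

```shell
#!/bin/sh
# Print the name of the last system call logged before a SIGSEGV line in
# an strace output file; print nothing if the trace has no SIGSEGV.
# last_call_before_segv is a hypothetical helper name.
last_call_before_segv() {
    awk '
        # Stop at the signal line and report the call seen just before it.
        /^--- SIGSEGV/ { if (prev != "") print prev; exit }
        # Remember the name of each system call as it goes by.
        /^[a-z_0-9]+\(/ { prev = $0; sub(/\(.*/, "", prev) }
    ' "$1"
}
```

Run against each saved trace in turn, identical output for unrelated programs such as date, diff and rpm points at something they all share (the C library) rather than at any one program.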


