I agree it seems most remiss of up2date to abort the entire upgrade, leaving essential packages half updated or not updated at all, if the upgrade of just one RPM on which nothing else depends fails. I've now cloned this issue as an up2date bug; the pvm bug that caused the original issue is now fixed with pvm-3.4.5-8_EL3, but perhaps the up2date team could look into fixing this nasty up2date behaviour in a future release.
As mentioned earlier, I reported a similar case in August of 2004 in which other packages failed, causing up2date to halt its progress. The result was similar: packages were not installed and services were affected. This was reported as bug 129294. It was largely ignored, and it has now reared its ugly head again, creating quite a bit of frustration. I feel that this point needs to be pushed: had it been addressed originally, it would not have been a problem this time around, and it is rather disheartening to see that others are affected by it.
up2date pretty much just builds up a single rpm transaction, and relies upon that being successful. Fatal errors to rpm are fatal errors to up2date. What this really is, is an RFE against rpm/rpmlib to provide better recovery from failed transactions. It could be something totally internal to rpm, but it may be that tools using rpm, such as up2date or yum, will have to be extended once the fundamental bits are in place in rpmlib. Resetting summary, component, and owner.
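To make the "one RPM on which nothing else depends" case concrete, here is a minimal sketch of how a tool could identify such packages in an update set before building the transaction. The data model (a dict mapping each package to the packages it requires) and the function name are assumptions for illustration only; this is not how up2date or rpmlib represent a transaction, and the dependency data is made up.

```python
def independent_packages(requires):
    """Return packages in the update set that no other package in the set requires.

    `requires` maps a package name to the set of package names it depends on.
    Hypothetical data model, not up2date's or rpmlib's.
    """
    needed = set()
    for pkg, deps in requires.items():
        # Ignore self-references; collect everything some other package needs.
        needed.update(d for d in deps if d != pkg)
    return sorted(p for p in requires if p not in needed)


# Illustrative update set; the dependency edges are invented for the example.
update_set = {
    "pvm": set(),
    "openssl": set(),
    "openssh": {"openssl"},
    "rpm": {"libelf"},
    "libelf": set(),
}

if __name__ == "__main__":
    print(independent_packages(update_set))
```

A failure in one of the returned packages would not, by itself, leave another package in the set with an unmet requirement, which is the situation described in the first comment. Whether rpm could actually continue past such a failure is the open question of this bug.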
What exactly failed?
Had the exact same problem today going from RHEL U7 to U8. PVM was not installed on my system.
What *exactly* failed? "Me too" doesn't help any more than incoherent ramblings about "better recovery" features do.
Created attachment 133311: screen grab from RHEL3 U7 to U8 up2date
Find attached. After up2date ran, just about every RPM that was to be upgraded was instead erased or in an odd state. RPM itself was hosed, with packages (libelf) missing. I was able to get around this by going through the console, copying over libelf from another RHEL system, using that to get the RPM packages installed from the up2date spool directory, and then putting the rest of the packages back as best I could. Fortunately it's a demo system, but it still took the better part of an hour to get things mostly back to where they should have been.
Was libgcc actually installed twice or is that an artifact of cut-n-paste?
In fact, why are identical packages duplicated in the same transaction? That will surely hose an rpmdb, and should have been detected and prevented when adding packages to the transaction.
> Was libgcc actually installed twice or is that an artifact of cut-n-paste? Not an artifact. In fact, duplicate versions of libgcc were in /var/spool/up2date (along with other packages). I've started upgrading other systems and checked their up2date spool directories, and they had nothing in them. All those upgrades have completed without issue, so my guess is this is the root cause. BTW, rpmdb was not hosed. Once I copied over the libelf files from another system and reinstalled the libelf and rpm rpms, I was able to piece the system back together by hand.
Identical packages being erased in the same transaction should never ever happen, and will exhibit error messages like "error: error(-30990) setting header #1431 record for Packages removal", because already-erased entries in the indices are being removed again. The duplicate identical packages need to be investigated. Sure, the rpmdb is not hosed, but it may not contain accurate information either.
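For illustration, the kind of check being asked for ("detected and prevented when adding packages to the transaction") could look roughly like the sketch below. It is plain Python over (name, version, release, arch) tuples and does not call rpmlib or up2date; the function name and the version numbers are hypothetical placeholders, not values from the attached output.

```python
from collections import Counter


def find_duplicates(packages):
    """Return the package tuples that appear more than once in the list.

    Each entry is a (name, version, release, arch) tuple. This is a
    hypothetical pre-flight check, not an rpmlib or up2date function.
    """
    counts = Counter(packages)
    return sorted(pkg for pkg, n in counts.items() if n > 1)


# Placeholder versions; only the duplication pattern matters here.
queued = [
    ("libgcc", "1.0", "1", "i386"),
    ("libelf", "1.0", "1", "i386"),
    ("libgcc", "1.0", "1", "i386"),
]

if __name__ == "__main__":
    for dup in find_duplicates(queued):
        print("refusing to queue duplicate:", "-".join(dup[:3]) + "." + dup[3])
```

Whether such a check belongs in the application that builds the transaction or in rpmlib itself is exactly the disagreement in the comments that follow.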
This should be reassigned back to up2date to investigate how multiple identical packages are being added to a transaction.
No, this is definitely an rpm bug. It might have been triggered by a different bug in up2date, and a new bug report can be opened for the up2date bug.
What is the bug? rpmlib is data driven; if an application chooses to construct a transaction with multiple erasures of identical packages -- even though this is not supported by rpmlib -- then it's an application bug, not an rpmlib bug. Or a feature request for rpm. Now which is it?
Normally Linux should be considered something like a loaded gun pointed at your foot while you have an itchy trigger finger. And with a lot of other applications, I'd agree that if you really want to do something stupid, then by all means go for it. RPM is not and should not be considered one of those packages. Every other time you try to do something stupid with RPM, it prevents you from doing it, or at least makes you confirm that yes, in fact, you want to do this dumb thing. Allowing this case is inconsistent with RPM's normal behavior, so either RPM should allow you to do dumb things without warning, or the cases where dumb things are allowed without warning should be fixed. Given RPM's history, it would be easier to do the latter. To answer your question: Neither. It's a bug in rpm. There may be bugs in other applications (up2date) that trigger it, but there's no way that rpm should accept a transaction like this in the first place.
So fix the bug in rpm. Or convince RH to fix the bug. I see something broken in up2date, not rpmlib, your expectations notwithstanding.
> Or convince RH to fix the bug. Right. That's why this was filed. Can I get a response on this from someone with a redhat.com address? I'm getting a bit tired of hearing excuses from what appears to be a professional staller.
I've been able to reproduce this bug on RHEL 4 while applying Update 4. (However, this case was not due to a bad package from RHN.) The examples I have are from an x86_64 machine. The administrator of these machines had previously run the following commands in their kickstart:

    /bin/mv -f /usr/share/ssl/certs /usr/share/ssl/localcerts
    /bin/ln -s /afs/bp/contrib/openssl/ssl/certs /usr/share/ssl/certs

So /usr/share/ssl/certs was now a symlink rather than the original directory that the package put there. Update 4 contained an erratum for openssl, and the following errors occurred:

    Installing /var/spool/up2date/openssl-0.9.7a-43.10.x86_64.rpm...
    error: unpacking of archive failed on file /usr/share/ssl/certs: cpio: chown
    Shutting down NFS mountd: [FAILED]
    Shutting down NFS daemon: [FAILED]
    Shutting down NFS quotas: [FAILED]
    Shutting down NFS services: [ OK ]
    Shutting down RPC idmapd: [ OK ]
    Stopping NFS statd: [ OK ]
    groupdel: group rpm does not exist
    There was a fatal RPM install error. The message was:
    There was a rpm unpack error installing the package: openssl-0.9.7a-43.10

Because this happened in the middle of the entire Update 4 transaction, all of the affected machines were completely hosed. Most services (like sshd and crond) were no longer running after the update. Most of the system was nonfunctional due to missing files and libraries. The RPM database was inconsistent with what was applied to the file system. This has become a very serious issue for NCSU, and we need to see some movement on getting this condition fixed.
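As a possible stopgap for sites with similar local changes, a pre-update check could flag paths that a package expects to be directories but that have been replaced by symlinks, which is the condition the report above associates with the unpack failure. This is only a sketch: the function name and the `root` parameter are hypothetical, and the list of paths has to be supplied by the administrator rather than read from the rpm database.

```python
import os


def replaced_directories(expected_dirs, root="/"):
    """Return the expected directories that are currently symlinks.

    `expected_dirs` is a list of absolute paths the administrator believes
    packages own as real directories. `root` allows checking an alternate
    tree (for example a chroot); both names are assumptions for this sketch.
    """
    hits = []
    for d in expected_dirs:
        path = os.path.join(root, d.lstrip("/"))
        # islink() does not follow the link, so a symlink to a directory
        # is still reported.
        if os.path.islink(path):
            hits.append(d)
    return hits


if __name__ == "__main__":
    for d in replaced_directories(["/usr/share/ssl/certs"]):
        print("warning: %s is a symlink, not a directory" % d)
```

Running this before the update would only warn about the altered path; it does nothing to make the transaction itself recoverable.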
Created attachment 134389: output from failed up2date run. This attachment contains the output from up2date showing all packages being upgraded and the error in the transaction.
This bug is filed against RHEL 3, which is in maintenance phase. During the maintenance phase, only security errata and select mission critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed. For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/ If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.