Bug 492947
| Summary: | /etc/passwd moved to /etc/passwd.rpmsave during update transaction. | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Lennart Poettering <lpoetter> |
| Component: | rpm | Assignee: | Panu Matilainen <pmatilai> |
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | rawhide | CC: | bnocera, erik-fedora, fche, ffesti, herrold, james.leddy, jlayton, jnovy, jpriddy, jreiser, jspaleta, mitr, mschmidt, n3npq, notting, pebolle, pmatilai, rjones, rvandolson, yersinia.spiros |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2009-04-03 07:59:39 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 446452 | | |
| Attachments: | | | |

Description
Lennart Poettering
2009-03-30 20:57:11 UTC
If you *really* want to do more than vent, you will need to supply the details that convinced you that yum/rpm deleted /etc/passwd. Otherwise *shrug* ...

During the upgrade the old passwd got renamed to /etc/passwd.rpmsave. Also, before the yum upgrade everything was fine, afterwards it wasn't.

Good, that's a start. So /etc/passwd wasn't removed, just renamed. Still a largish flaw, I'm not quibbling, just stating facts.

The .rpmsave suffix is appended iff the file resolution is FA_BACKUP.

The resolution FA_BACKUP is assigned iff a file is marked %config (likely the case for /etc/passwd), and is modified with respect to the digest in the rpmdb.

Since digests have just changed from MD5 -> SHA256 (so "changed" cannot be computed through usual means), what likely has happened is that /etc/passwd ended up being renamed with .rpmsave because of the switch from MD5 to SHA256 in recent versions of F11 beta.

Yes, still a flaw, but also not exactly an easy problem to solve once the decision to switch from MD5 has been made.

hth

Enjoy!

BTW. A lot of other important config files got renamed this way too. inittab, nsswitch.conf and more. I had to rename quite a few files beneath /etc back to their original names to get back to a bootable system.

Likely the same %config digest check cause. Lather rinse repeat.

Can you attach your yum.log from the transaction in question, just to clarify what packages were updated?

Sometimes I find that SELinux is involved. As part of recovery, I boot to single user mode (by appending " single" to the kernel command line) and run "restorecon -R /" before proceeding. Or if things are really bad, or just to make sure things work, then boot into rescue mode and create the file .autorelabel in the root directory (chroot /mnt/sysimage; touch /.autorelabel), then reboot into that root.

John: I don't have SELinux enabled.

How on earth does a file marked %config _ever_ get renamed away, regardless of whether the checksum is computable? That's just insane.

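For readers following the digest hypothesis from comment #3, here is a minimal Python sketch of the rule exactly as it is stated there (back up iff the file is %config and its digest differs from the recorded one). `FileAction` and `decide_action` are hypothetical names made up for illustration; this is not rpm's code or API, and later comments in this bug point to a different root cause (incorrectly recorded file states of skipped files).

```python
import hashlib
from enum import Enum


class FileAction(Enum):
    """Hypothetical stand-in for rpm's file resolutions (not rpm's real API)."""
    BACKUP = "rename existing file to <name>.rpmsave"
    OTHER = "any other resolution"


def decide_action(is_config: bool, recorded_digest: str, on_disk_digest: str) -> FileAction:
    """Rule as stated in comment #3: FA_BACKUP iff %config and modified."""
    modified = recorded_digest != on_disk_digest
    return FileAction.BACKUP if (is_config and modified) else FileAction.OTHER


# Unchanged content, but the recorded digest is MD5 while the check uses SHA-256.
content = b"root:x:0:0:root:/root:/bin/bash\n"
recorded = hashlib.md5(content).hexdigest()
current = hashlib.sha256(content).hexdigest()
print(decide_action(True, recorded, current))
```

Under this naive rule an untouched file looks "modified" as soon as the two digests come from different algorithms, which is the scenario the comment speculates about.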
Created attachment 337550 [details]
My current yum.log
Bill, that attachment is my complete yum.log. Not sure which one was the transaction that went wrong.

(In reply to comment #3)
> Good, that's a start. So /etc/passwd wasn't removed, just renamed.
> Still a largish flaw, I'm not quibbling, just stating facts.
>
> The .rpmsave suffix is appended iff the file resolution is FA_BACKUP.
>
> The resolution FA_BACKUP is assigned iff a file is marked %config
> (likely the case for /etc/passwd), and is modified with respect to the
> digest in the rpmdb.
>
> Since digests have just changed from MD5 -> SHA256 (so "changed"
> cannot be computed through usual means), what likely has
> happened is that /etc/passwd contents ended up being renamed
> with .rpmsave because of the switch from MD5 to SHA256 in
> recent versions of F11 beta.
>
> Yes still a flaw, but also not exactly an easy problem to solve once the
> decision to switch from MD5 has been made.

I don't know much about rpm, but how hard would it be to write a script to enter all new SHA values into the database?

And why Apr 1? Are these just the first two packages to use SHA among many? If so, wouldn't we be in for a world of hurt?

What does this have to do with April 1st? I posted this bug two days ago.

I assume that there will need to be some way to regenerate the hashes before F11 is released. Otherwise F10 -> F11 upgrades will break the same way, correct?

The switch to SHA-256 was done a few weeks ago, and /etc/passwd is %config(noreplace). I did test the case of %config(noreplace) - I even prepared a patch to make sure that %config(noreplace) is not moved to .rpmsave (see #479869).

Could someone retry the test cases in https://bugzilla.redhat.com/show_bug.cgi?id=479869#c3 , please?

(In reply to comment #16)
> Could someone retry the test cases in
> https://bugzilla.redhat.com/show_bug.cgi?id=479869#c3 , please?

(You'll need to define %_binary_filedigest_algorithm to "1" to create MD5 packages, to "8" to create SHA-256 packages.)

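As a small illustration of the two algorithm values mentioned above: "1" and "8" match the hash algorithm IDs that OpenPGP (RFC 4880) assigns to MD5 and SHA-256. The sketch below maps those two IDs to Python's `hashlib` and computes a digest for some file content. `DIGEST_ALGORITHMS` and `file_digest` are hypothetical helper names, not part of rpm.

```python
import hashlib

# OpenPGP hash algorithm IDs -> hashlib names. Only the two values
# mentioned in this thread are listed; this table is illustrative.
DIGEST_ALGORITHMS = {1: "md5", 8: "sha256"}


def file_digest(data: bytes, algorithm_id: int) -> str:
    """Return the hex digest of data under the given algorithm ID."""
    name = DIGEST_ALGORITHMS[algorithm_id]
    return hashlib.new(name, data).hexdigest()


content = b"root:x:0:0:root:/root:/bin/bash\n"
md5_hex = file_digest(content, 1)
sha256_hex = file_digest(content, 8)
# The two hex strings differ in length, so they can never compare equal.
print(len(md5_hex), len(sha256_hex))
```

A digest stored under one algorithm therefore cannot be compared directly with one computed under the other; any comparison has to recompute the on-disk digest with the algorithm that was used when the package was built.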
(In reply to comment #16)
> The switch to SHA-256 was done a few weeks ago

Can someone provide a link to the discussion I presume existed, where the idea that new RPMs carry both MD5 and SHA1 hashes must have been proposed and rejected?

(In reply to comment #14)
> What does this have to do with April 1st? I posted this bug two days ago.

Whoops, was looking at the mailing list.

(In reply to comment #18)
> (In reply to comment #16)
> > The switch to SHA-256 was done a few weeks ago
>
> Can someone provide a link to the discussion I presume existed,
> where the idea that new RPMs carry both MD5 and SHA1 hashes
> must have been proposed and rejected?

This is about file signatures (I presume somewhere in the rpm database). RPMs themselves can already be signed by both MD5 and SHA1, iirc.

Lennart,

From the yum log you did a significant number of updates on the 30th.

Can you do a quick scan of your /etc/ to see how many rpmsave files are on the system and see if other noreplace config files were impacted? Maybe this is some sort of hiccup with the setup package specifically and not a general case noreplace problem.

-jef

(In reply to comment #17)
> (In reply to comment #16)
> > Could someone retry the test cases in
> > https://bugzilla.redhat.com/show_bug.cgi?id=479869#c3 , please?
> (You'll need to define %_binary_filedigest_algorithm to "1" to create MD5
> packages, to "8" to create SHA-256 packages.)

Has it been carried forward? From the yum logs:

Mar 04 21:51:01 Updated: rpm-libs-4.6.0-11.fc11.x86_64
Mar 18 12:24:04 Updated: rpm-libs-4.7.0-0.beta1.3.fc11.x86_64
Mar 30 19:37:44 Updated: rpm-libs-4.7.0-0.beta1.7.fc11.x86_64

(In reply to comment #21)
> Lennart,
>
> From the yum log you did a significant number of updates on the 30th.
>
> Can you do a quick scan of your /etc/ to see how many rpmsave files are on the
> system and see if other noreplace config files were impacted?
> Maybe this is some sort of hiccup with the setup package specifically and not
> a general case noreplace problem.

As mentioned above quite a few config files got renamed. inittab, group, the shadow files, nsswitch.conf. All in all about 20 files or so. Since I renamed them all back I cannot tell you this in more detail, sorry.

(In reply to comment #9)
> How on earth does a file marked %config _ever_ get renamed away, regardless of
> whether the checksum is computable. That's just insane.

That is the real question here. Digest change shouldn't affect %config(noreplace) behavior at all, and even for %config without noreplace it shouldn't have been *renamed* with no /etc/passwd left around at all. I'll try to figure out what might cause such behavior, but if somebody can come up with an actual reproducer, that'd be most helpful (can't reproduce this with a single package with a config(noreplace) file in it, digest change or no).

Looking at the yum.log from comment #10:

...
Mar 30 21:13:13 Updated: 1:gnome-applets-2.25.92-4.fc11.x86_64
Mar 30 21:13:16 Updated: libselinux-devel-2.0.79-4.fc11.x86_64
Mar 30 21:13:16 Updated: gnome-common-2.26.0-1.fc11.noarch
Mar 30 21:13:17 Updated: libselinux-2.0.79-4.fc11.i586
Mär 30 21:25:04 Erased: tkinter
Mär 30 21:25:07 Erased: PersonalCopy-Lite-patches
Mär 30 21:25:09 Erased: audit-libs
Mär 30 21:25:11 Erased: opal
...
Mär 30 21:28:35 Erased: setup
...
Mär 30 21:29:55 Erased: libgcc
...

So setup, and a big pile of other stuff like libgcc, bash, etc. has been *removed* - no wonder there's no /etc/passwd or much of anything left. There's a twelve minute break in times between the last "Updated" and the first "Erased" which almost certainly means a new transaction starting there (iirc you can't even perform simultaneous install/update/erase with yum).

Lennart, please check your root's command history and see what exact yum commands have been run around that time.
From the log it seems that there's been what amounts to a self-destruct 'yum remove' command issued on March 30th after the big update.

(In reply to comment #25)
> So setup, and big pile of other stuff like libgcc, bash, etc has been *removed*
> - no wonder there's no /etc/passwd or much anything left. There's a twelve
> minute time break in times between last "Updated" and the first "Erased" which
> almost certainly means a new transaction starting there (iirc you can't even
> perform simultaneous install/update/erase with yum).

Also notable:
- the sudden change in arch for libselinux (x86_64 -> i586)
- the change in language of the month in the printed messages (English -> German?)

Hmm, the 12 minute time jump might be an artifact of how yum logs things: "Cleanup" actions (i.e. erase caused by upgrade) aren't logged, only obsoletions and real removals are logged as "Erased". In any case, the issue here is that something added a huge pile of erasure elements that shouldn't have been there. Now we just need to figure out whether it is rpm or yum and what triggers it; so far I haven't been able to reproduce.

I am seeing some other strangeness in logging times though (this from a chrooted test-upgrade from f10 to rawhide):

...
Apr 02 14:25:18 Updated: 1:gdm-user-switch-applet-2.26.0-7.fc11.x86_64
Apr 02 14:25:32 Installed: gnome-bluetooth-2.27.1-4.fc11.x86_64
Apr 02 14:25:33 Updated: bluez-4.34-1.fc11.x86_64
Apr 02 07:27:06 Erased: libdhcp4client
Apr 02 07:27:24 Erased: bluez-gnome
Apr 02 07:27:35 Erased: pulseaudio-core-libs

Here's my guess:
1. yum update was run - something (probably a crash of something like X or selinux or dbus) took down the system mid-transaction
2. depending on the version of yum he had on there at the time - he ran yum-complete-transaction which finished the erasure portion of the update process
3. something in the erasure portion went wrong.

Lennart, does that sound familiar?

Created attachment 337802 [details]
yum.log
Seth, your guess fairly much precisely matches what just triggered it for me (yeah, I picked last night to upgrade from F-10 to rawhide; I'm a masochist).
Yum died because the X session restarted itself (and brought me back to a new session, bizarrely. The X server didn't crash and restart and bring me back to the gdm login screen). I restarted it with yum-complete-transaction.
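The timestamp reasoning applied to the yum.log excerpts in this thread (a jump between the last "Updated" and the first "Erased" entry) can be checked mechanically. Below is a rough Python sketch that parses lines in the format quoted here and reports large jumps between consecutive entries. The regular expression, `parse_line`, `find_gaps`, and the 300-second threshold are all assumptions made for illustration; the month token is ignored on purpose because it is localized in the log ("Mar" vs "Mär").

```python
import re
from typing import List, NamedTuple, Tuple

# Assumed line shape: "<month> <day> HH:MM:SS <Action>: <package>"
LINE_RE = re.compile(r"^(\S+)\s+(\d{1,2}) (\d{2}):(\d{2}):(\d{2}) (\w+): (.+)$")


class Entry(NamedTuple):
    day: int
    seconds: int  # seconds since midnight
    action: str
    package: str


def parse_line(line: str) -> Entry:
    """Parse one yum.log line; the localized month name is discarded."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized yum.log line: {line!r}")
    _month, day, hh, mm, ss, action, package = m.groups()
    return Entry(int(day), int(hh) * 3600 + int(mm) * 60 + int(ss), action, package)


def find_gaps(entries: List[Entry], threshold_s: int = 300) -> List[Tuple[Entry, Entry, int]]:
    """Return (previous, current, delta_seconds) for same-day jumps >= threshold."""
    gaps = []
    for prev, cur in zip(entries, entries[1:]):
        delta = cur.seconds - prev.seconds
        if prev.day == cur.day and abs(delta) >= threshold_s:
            gaps.append((prev, cur, delta))
    return gaps


log = [
    "Mar 30 21:13:17 Updated: libselinux-2.0.79-4.fc11.i586",
    "Mär 30 21:25:04 Erased: tkinter",
]
for prev, cur, delta in find_gaps([parse_line(l) for l in log]):
    print(f"{prev.action} -> {cur.action}: {delta} s")
```

A negative delta, as in the chrooted test-upgrade excerpt where "Erased" entries carry earlier times than the preceding "Updated" ones, is reported as well, since the comparison uses the absolute value.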
Similar situation for me. I was yum updating over an ssh session and lost connectivity. When I logged back in, I ran yum-complete-transaction and then rebooted, at which point the boot failed due to missing /etc/passwd.

(In reply to comment #25)
> Looking at the yum.log from comment #10:
>
> ...
> Mar 30 21:13:13 Updated: 1:gnome-applets-2.25.92-4.fc11.x86_64
> Mar 30 21:13:16 Updated: libselinux-devel-2.0.79-4.fc11.x86_64
> Mar 30 21:13:16 Updated: gnome-common-2.26.0-1.fc11.noarch
> Mar 30 21:13:17 Updated: libselinux-2.0.79-4.fc11.i586
> Mär 30 21:25:04 Erased: tkinter
> Mär 30 21:25:07 Erased: PersonalCopy-Lite-patches
> Mär 30 21:25:09 Erased: audit-libs
> Mär 30 21:25:11 Erased: opal
> ...
> Mär 30 21:28:35 Erased: setup
> ...
> Mär 30 21:29:55 Erased: libgcc
> ...
>
> So setup, and big pile of other stuff like libgcc, bash, etc has been *removed*
> - no wonder there's no /etc/passwd or much anything left. There's a twelve
> minute time break in times between last "Updated" and the first "Erased" which
> almost certainly means a new transaction starting there (iirc you can't even
> perform simultaneous install/update/erase with yum).
>
> Lennart, please check your root's command history and see what exact yum
> commands have been run around that time. From the log it seems that there's
> been what amounts to self-destruct 'yum remove' command issued on March 30th
> after the big update.

Dude, I am not stupid. I did a "yum upgrade", that's all. And bash is still there. No clue why yum claims they got removed. Possibly some multi-arch issue? Or an upgrade? (i.e. first install new package, then remove old package?) Also bash, libgcc are not /etc/passwd.

(In reply to comment #31)
> Dude, I am not stupid. I did a "yum upgrade" that's all.

Nobody intentionally removes half the system, I'm just trying to find out where the bug is.
Others have had a similar experience with an upgrade crashing somewhere in the middle and then tried yum-complete-transaction, which ended up erasing things it certainly shouldn't have (see comments 28-30). So just to be certain: did you use yum-complete-transaction on the system? If not, then we're back to the drawing board here.

In my case, I was doing "yum --nogpgcheck localupdate *.fc11.$ARCH.rpm *.fc11.noarch.rpm" from directory /var/cache/pungi/rawhide/packages after "rm -f $(repomanage -o .)". This would update at least several dozen packages. The rpm_debug_check and transaction check succeeded, and the replacements started. Then yum failed after (during?) replacement of bash (which was something like the fifth or sixth package replaced), claiming that some package (dependency?) was already installed (or something). Inspecting the results, "rpm -q bash" showed that two different versions of bash (for the same $ARCH) were installed. I removed the older one by hand with "rpm --erase". I tried re-invoking the localupdate, and received the notice that there were pending transactions, recommending yum-complete-transaction. So I invoked yum-complete-transaction, and saw that yum proposed to remove hundreds of packages. I said, "No, thank you" to that. Instead I invoked "yum-complete-transaction --cleanup-only" and then "yum update", suffering the re-download of all those packages, but saving sanity. This whole process happened about three times total in the last week or so, on both i386 and x86_64; but the most recent runs of localupdate from my pungi caches succeeded. [I considered filing a bug report, but I was more interested in testing Rescue mode of my newly-composed DVDs, and thought that explaining it all would be messy, and that my use case would be ridiculed as outrageous.]

Hint: FA_SKIP != FA_BACKUP.

Reproducer: Take an upgrade transaction. Split the installs from the erases. Run the upgrade transaction, terminate after installs are done.
(wait a bit so that yum.log shows clearly that we're gonna run a new & different transaction) Now run the erases. Have fun!

I've also been hit by this bug. Starting point was Rawhide which was fully updated (all packages which were available before/during the beta freeze). After the unfreeze, I ran 'yum update' and during the installation of 'glibc-common' the computer froze. After a reboot I first manually installed glibc*, which was hanging around in /var/cache/yum, using 'rpm -Uvh --replacefiles --replacepkgs' (so that my glibc installation was sane; translations didn't work anymore). Afterwards I ran 'yum-complete-transaction'. This caused files like /etc/passwd to be renamed to .rpmsave.

Fixed in rpm-4.7.0-0.beta1.9.fc11: file states of skipped files (such as %config(noreplace) on upgrade, and %ghost files) weren't getting recorded correctly, causing them to be inappropriately handled. The fixed rpm won't magically fix the incorrect states already in rpmdb, but further updates will correct the issue, and the issue should only be generally seen in special circumstances like when running yum-complete-transaction.

The bug making the session abort but not logout is here: https://bugzilla.redhat.com/show_bug.cgi?id=494046