Bug 755335
Summary: | Shutting down while auto-updating breaks the system | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Paolo Leoni <paolo.leoni84>
Component: | gnome-settings-daemon | Assignee: | Bastien Nocera <bnocera>
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | urgent | Docs Contact: |
Priority: | urgent | |
Version: | 17 | CC: | abo, akozumpl, awilliam, bnocera, collura, corey, david.w.holland+rhbugzilla, fedora, hughsient, jonathan, j.wielicki, kparal, mclasen, mkasik, rhughes, robatino, rstrode, rvitale, smparrish, tflink
Hardware: | All | |
OS: | Linux | |
Whiteboard: | AcceptedBlocker | |
Fixed In Version: | | Doc Type: | Bug Fix
Last Closed: | 2012-05-15 02:09:49 UTC | Type: | ---
Bug Blocks: | 752650 | |
Description
Paolo Leoni
2011-11-20 19:10:53 UTC
A colleague of mine hit this with F17, and I easily reproduced it myself. I installed F17 Beta, booted, and waited 5 minutes until security updates started to install. Then I powered off my computer properly using the system menu. The result is this:

> # yum history
> Loaded plugins: langpacks, presto, refresh-packagekit
> ID | Login user | Date and time | Action(s) | Altered
> -------------------------------------------------------------------------------
> 2 | System <unset> | 2012-04-13 08:32 | I, U | 15 **
> 1 | System <unset> | 2012-04-11 00:21 | Install | 1106
> Warning: RPMDB altered outside of yum.
> ** Found 11 pre-existing rpmdb problem(s), 'yum check' output follows:
> expat-2.1.0-1.fc17.x86_64 is a duplicate with expat-2.0.1-12.fc17.x86_64
> libpurple-2.10.2-1.fc17.x86_64 is a duplicate with libpurple-2.10.1-4.fc17.x86_64
> 1:libsmbclient-3.6.4-82.fc17.1.x86_64 is a duplicate with 1:libsmbclient-3.6.3-80.fc17.1.x86_64
> libtasn1-2.12-1.fc17.x86_64 is a duplicate with libtasn1-2.7-3.fc17.x86_64
> 1:libwbclient-4.0.0-39alpha18.fc17.x86_64 is a duplicate with 1:libwbclient-3.6.3-80.fc17.1.x86_64
> 1:openssl-1.0.0h-1.fc17.x86_64 is a duplicate with 1:openssl-1.0.0g-4.fc17.x86_64
> rpm-4.9.1.3-1.fc17.x86_64 is a duplicate with rpm-4.9.1.2-14.fc17.x86_64
> rpm-build-libs-4.9.1.3-1.fc17.x86_64 is a duplicate with rpm-build-libs-4.9.1.2-14.fc17.x86_64
> rpm-libs-4.9.1.3-1.fc17.x86_64 is a duplicate with rpm-libs-4.9.1.2-14.fc17.x86_64
> rpm-python-4.9.1.3-1.fc17.x86_64 is a duplicate with rpm-python-4.9.1.2-14.fc17.x86_64
> taglib-1.7.1-1.fc17.x86_64 is a duplicate with taglib-1.7-3.fc17.x86_64

> # yum history info 2
> Loaded plugins: langpacks, presto, refresh-packagekit
> Transaction ID : 2
> Begin time : Fri Apr 13 08:32:30 2012
> Begin rpmdb : 1106:f61cfb0978ff109356a1e75e83d944432f8dc537
> User : System <unset>
> Return-Code : ** Aborted **
> Transaction performed with:
> Installed PackageKit-yum-0.7.3-1.fc17.x86_64 @koji-override-0/$releasever
> Installed rpm-4.9.1.2-14.fc17.x86_64 @koji-override-0/$releasever
> Installed yum-3.4.3-18.fc17.noarch @koji-override-0/$releasever
> Installed yum-presto-0.7.1-2.fc17.noarch @koji-override-0/$releasever
> Packages Altered:
> ** Updated expat-2.0.1-12.fc17.x86_64 @koji-override-0/$releasever
> Update 2.1.0-1.fc17.x86_64 installed
> ** Updated freetype-2.4.8-2.fc17.x86_64 @koji-override-0/$releasever
> ** Update 2.4.8-3.fc17.x86_64 @?fedora
> ** Install kernel-3.3.1-5.fc17.x86_64 @?updates-testing
> ** Updated libpng-2:1.5.9-1.fc17.x86_64 @koji-override-0/$releasever
> ** Update 2:1.5.10-1.fc17.x86_64 @?fedora
> ** Updated libpurple-2.10.1-4.fc17.x86_64 @koji-override-0/$releasever
> Update 2.10.2-1.fc17.x86_64 installed
> ** Updated libsmbclient-1:3.6.3-80.fc17.1.x86_64 @koji-override-0/$releasever
> Update 1:3.6.4-82.fc17.1.x86_64 installed
> ** Updated libtasn1-2.7-3.fc17.x86_64 @koji-override-0/$releasever
> Update 2.12-1.fc17.x86_64 installed
> ** Updated libtiff-3.9.5-2.fc17.x86_64 @koji-override-0/$releasever
> ** Update 3.9.5-3.fc17.x86_64 @?fedora
> ** Updated libwbclient-1:3.6.3-80.fc17.1.x86_64 @koji-override-0/$releasever
> Update 1:4.0.0-39alpha18.fc17.x86_64 installed
> ** Updated openssl-1:1.0.0g-4.fc17.x86_64 @koji-override-0/$releasever
> Update 1:1.0.0h-1.fc17.x86_64 installed
> ** Updated rpm-4.9.1.2-14.fc17.x86_64 @koji-override-0/$releasever
> Update 4.9.1.3-1.fc17.x86_64 installed
> ** Updated rpm-build-libs-4.9.1.2-14.fc17.x86_64 @koji-override-0/$releasever
> Update 4.9.1.3-1.fc17.x86_64 installed
> ** Updated rpm-libs-4.9.1.2-14.fc17.x86_64 @koji-override-0/$releasever
> Update 4.9.1.3-1.fc17.x86_64 installed
> ** Updated rpm-python-4.9.1.2-14.fc17.x86_64 @koji-override-0/$releasever
> Update 4.9.1.3-1.fc17.x86_64 installed
> ** Updated taglib-1.7-3.fc17.x86_64 @koji-override-0/$releasever
> Update 1.7.1-1.fc17.x86_64 installed

I don't know yet what consequences this brings.
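For anyone trying to gauge the damage after an interrupted update, here is a minimal sketch that pulls the duplicate pairs out of saved `yum check` output like the one quoted above. It only parses text; it does not call yum or touch the rpmdb. The function name `find_duplicates` and the `SAMPLE` excerpt are illustrative assumptions, not part of any yum tooling.

```python
import re

# Matches lines such as
# "1:openssl-1.0.0h-1.fc17.x86_64 is a duplicate with 1:openssl-1.0.0g-4.fc17.x86_64".
# An optional leading "> " is tolerated so quoted output can be pasted in as-is.
DUPLICATE_RE = re.compile(r"^(?:>\s*)?(\S+) is a duplicate with (\S+)$")


def find_duplicates(text):
    """Return (package, duplicate_of) pairs found in 'yum check' style output."""
    pairs = []
    for line in text.splitlines():
        match = DUPLICATE_RE.match(line.strip())
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


# Short excerpt of the output quoted in this report.
SAMPLE = """\
** Found 11 pre-existing rpmdb problem(s), 'yum check' output follows:
expat-2.1.0-1.fc17.x86_64 is a duplicate with expat-2.0.1-12.fc17.x86_64
1:openssl-1.0.0h-1.fc17.x86_64 is a duplicate with 1:openssl-1.0.0g-4.fc17.x86_64
rpm-4.9.1.3-1.fc17.x86_64 is a duplicate with rpm-4.9.1.2-14.fc17.x86_64
"""

if __name__ == "__main__":
    for package, duplicate_of in find_duplicates(SAMPLE):
        print(f"both {package} and {duplicate_of} are recorded as installed")
```

Each pair is a package whose new version was installed while the old version was never cleaned up, which matches the "Update ... installed" entries in the aborted transaction.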
PackageKit applications seemed to be totally broken (even after a system restart). "Software sources" showed no repositories at all. "Add/Remove software" did nothing when I tried to install new software. Only "Software Update" informed me that unfinished transactions remained and that I should run "yum-complete-transaction" as root. When I ran that, I received "Transaction size changed - this means we are not doing the same transaction as we were before. Aborting and disabling this transaction." Even though that did not fix any rpm errors ("yum check" still reports them), it "fixed" the PackageKit applications - now I can see repositories and install software.

Now imagine I had been installing a new kernel and grubby was modifying grub.cfg when I shut down the system (actually a new kernel was scheduled in my case, I was just quicker). Or installing a new X.org or some other critical path package. It's very easy to get an unbootable system this way.

This problem probably affects F15-F17, since all of them have auto-update enabled for security updates. The users who have not disabled it (is this the 'majority'?) can break their system at every power-off. It is so damn easy to do!

We need some technical means to:
1) reject system reboot/power-off with a GUI notification and ideally a progress bar
-or-
2) abort and revert the transaction quickly before the system shuts down

Until we are able to do that, I think the best approach is not to provide any means for system auto-update. That means changing the defaults and also removing the options from the PackageKit GUI and config files.

Because I see this issue as very important (a high chance of breaking the system of any Fedora user out there is important), I propose this for discussion at the F17 blocker meeting. We might want to be sure we release F17 with the auto-update feature disabled (and even unavailable) so that this does not happen to any new Fedora users.
I can of course investigate more how often this might happen, but as I say, I reproduced this on my first attempt. I can't fully imagine what can break and what cannot; I'm sure the rpm/packagekit guys can. If they can swear that the system boot/core functionality can't break for some technical reason, I'd be very relieved.

Could using smaller transactions (a few packages at a time as needed) with rollbacks on shutdown (and blocking shutdown until rollback is complete) mitigate this? It seems to me to be a cleanish solution.

Discussed at 2012-04-20 blocker review meeting - http://meetbot.fedoraproject.org/fedora-bugzappers/2012-04-20/fedora-bugzappers.2012-04-20-17.01.log.txt . Accepted as a blocker per Alpha criterion "The installed system must be able to download and install updates with yum and the default graphical package manager in all release-blocking desktops" - as we understand it, this bug can leave the system in a state where it can no longer install updates.

Richard, could you please provide your reading of this? Any reason to think it's not as bad as it sounds? Any ideas for mitigating the problem? Thanks!

A good approach could be a simple notification with a message like "Installing updates, please don't power off manually." Yes, it's the style of the "famous" proprietary O.S., but, imho, it's a simple and safe solution.

I hit this multiple times with previous Fedora versions: powering off the machine while a new kernel is being installed, and the next boot is broken of course. I certainly wouldn't have done it if I had known that updates were being installed. Something like this would be nice:

You want to shut down the computer while updates are being installed
[x] wait till it's finished & poweroff
[ ] cancel

Just an observation, from someone who was looking at the list of F17 blocker bugs... The discussion seems to be focusing on powering off. Rebooting the system in the middle of updating would be a problem too.
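The "wait till it's finished & poweroff / cancel" dialog suggested above boils down to a small decision function. The following is only a sketch of that logic under stated assumptions: `UpdateState`, `Choice`, `Action` and `handle_shutdown_request` are hypothetical names, and nothing here talks to PackageKit, gnome-shell or the session manager. The same function is meant to cover both power-off and reboot requests.

```python
from dataclasses import dataclass
from enum import Enum


class Choice(Enum):
    """What the user picked in the (hypothetical) dialog."""
    WAIT_THEN_SHUTDOWN = "wait"
    CANCEL = "cancel"


class Action(Enum):
    """What the session should do with the shutdown/reboot request."""
    SHUTDOWN_NOW = "shutdown now"
    SHUTDOWN_AFTER_UPDATE = "shutdown after update"
    STAY_ON = "stay on"


@dataclass
class UpdateState:
    running: bool        # True while an update transaction is in progress
    percentage: int = 0  # progress to show in the dialog


def handle_shutdown_request(state, ask_user):
    """Decide how to handle a power-off or reboot request.

    ask_user is a callback that shows the message and returns a Choice.
    It is only called when an update is actually running.
    """
    if not state.running:
        return Action.SHUTDOWN_NOW
    choice = ask_user(
        f"Updates are being installed ({state.percentage}% done)."
    )
    if choice is Choice.WAIT_THEN_SHUTDOWN:
        return Action.SHUTDOWN_AFTER_UPDATE
    return Action.STAY_ON


if __name__ == "__main__":
    busy = UpdateState(running=True, percentage=40)
    print(handle_shutdown_request(busy, lambda msg: Choice.WAIT_THEN_SHUTDOWN))
```

Note that a check like this in one desktop's dialog would not protect against `reboot` typed on the command line; it only models the user-facing choice.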
Has there been any movement on this? Any suggestions on workarounds? We're getting awfully close to RC1 and the final code freeze and this will block F17 release.

I had asked Richard to give it a look, but haven't seen anything back yet

(In reply to comment #9)
> I had asked Richard to give it a look, but haven't seen anything back yet

Right, I've spent quite a bit of time on this in the last week. I agree this is a valid bug, but we've had this behaviour for the last 2 released versions of Fedora. When HAL was still the central thing to shut down the system we used to inhibit HAL to stop the machine shutting down when any running transaction was uncancellable.

So, to fix this we've got three choices:

1. Stop the updates being auto-applied by changing one gsettings key, but this has the drawback that we lose the auto-update functionality. This is probably the right thing to do if you really think this bug is a blocker. Note, this kind of corruption on shutdown is probably also going to happen with other things that write databases without locking.

2. Add a systemd unit file to PackageKit that causes the shutdown to wait until all the transactions are finished or cancelled. I've got some experimental code that does this, although it's had zero testing. The patch is here if anyone wants to give me some quick sanity review: http://people.freedesktop.org/~hughsient/temp/0001-Only-allow-a-system-shutdown-when-there-are-no-trans.patch

3. We switch 100% to a two-phase update process that a lot of people think is a good idea. The idea is the session downloads the packages when the user is idle, and then we ask the user to restart to actually do the update process. This can either be done in the initrd at boot time (which has the advantage that it's a known-clean environment) or at shutdown (which would upset people running for the airport).

I'm at LGM until Sunday, so I've only got limited networking and hardware. Ideas welcome. Thanks.

Both 2 and 3 need some session component.
For 2, we need to have some indication that shutdown will get 'stuck' in the power off dialog. For 3, we need an explicit 'power off and apply updates' menuitem.

> I agree this is a valid bug, but we've had this behaviour for the last 2 released versions of Fedora.

Pretty scary.

> 1. Stop the updates being auto-applied by changing one gsettings key,

I think changing the default is not enough, we also need to remove the relevant option in packagekit gui. Otherwise people can just enable it without knowing the consequences. And it's pretty hard to warn them about it; what do we say - "pgrep for yum every time you want to power off"?

> 2. Add a systemd unit file to PackageKit that causes the shutdown to wait until all the transactions are finished or cancelled. I've got some experimental code that does this, although it's had zero testing.

As Matthias says, we need some indicator. E.g. a text on the plymouth screen saying "Installing updates and powering off... [percentage]". Hey, that sounds familiar. Fortunately you'd see that only if you power off the computer in the middle of an update process, not every time, like Microsoft does.

> 3. We switch 100% to a two-phase update process that a lot of people think is a good idea. The idea is the session downloads the packages when the user is idle, and then we ask the user to restart to actually do the update process. This can either be done in the initrd at boot time (which has the advantage that it's a known-clean environment) or at shutdown (which would upset people running for the airport).

It upsets people both ways: sometimes you need the PC to start fast, sometimes you need it to shut down fast. Would this apply only for automatically installed updates, or for all updates (even if I run yum manually)? Does it apply only for system core packages, or all packages?

> Ideas welcome. Thanks.

Installing updates on boot/shutdown wastes people's time and makes them annoyed.
If we are able to install updates during normal system run, we should do so. We just need to make sure the process is either not interrupted, or its interruption doesn't break anything.

Can we somehow improve yum transactions so that the process can be killed any time and it's either fully aborted or it can be continued later on? Btrfs snapshots would really help here I believe, but we're not there yet.

If we can't improve yum transactions, we need to place safeguards. A systemd unit file and plymouth indicator sound good. We can further improve it in GNOME by changing "Shut down" to "Finish installing updates ([percentage]) and shut down" or similar, which gives more information to the user before he calls a poweroff/reboot action.

All of that needs some development time, so option 1 (removing the auto-updates option) might be needed for F17. I don't have any better ideas ATM.

This is not the best place for discussing longer-term ideal solutions, which involve more fundamental changes such as:
- separating download and installation
- discriminating between system and application updates
- installing system updates outside the running system

For F17, we can only make simple changes at this point. Even adding new warnings will run into problems with lack of translations. But we should do that anyway. The easiest is probably to do it directly in gnome-shell, in endSessionDialog.js. It will not cover everything, e.g. running reboot on the commandline will not give you a warning.

A good long-term solution could be option #2. Btw, I actually think option #1 is the better one for solving this blocker issue in time for the F17 final release. The auto-updates feature would be disabled in this release, but the most important thing is that data corruption caused by update operations will be avoided this way.

(In reply to comment #13)
> The easiest is probably to do it directly in gnome-shell, in endSessionDialog.js. It will not cover everything, e.g.
> running reboot on the commandline will not give you a warning.

I assume you're talking about option #2 then. If we put the info dialog solely into gnome-shell, other DEs will still be affected, right? Fallback mode, KDE, XFCE, etc - all those use PackageKit and have automatic updates enabled by default, am I correct?

(In reply to comment #15)
> (In reply to comment #13)
> > The easiest is probably to do it directly in gnome-shell, in endSessionDialog.js. It will not cover everything, e.g. running reboot on the commandline will not give you a warning.
>
> I assume you're talking about option #2 then. If we put the info dialog solely into gnome-shell, other DEs will still be affected, right? Fallback mode, KDE, XFCE, etc - all those use PackageKit and have automatic updates enabled by default, am I correct?

We have no single place to handle this across desktops, atm. In F18, systemd will have an inhibit/delay api for shutdown.

Anyway, automatic updates are a feature of the gnome-settings-daemon updates plugin, afaik, so other desktops should not be affected, unless they implement the same thing.

So I had a look at other DEs. Fallback mode (obviously), KDE and XFCE are also affected by this. But KDE does not have automatic updates enabled by default, so it's only affected if the user changes the setting. Attaching screenshots.

KDE has some notifications about installing updates, but please note that *they don't pop up by default*, just a small "i" icon appears on the panel and you have to click it to know the details. You can reboot in the middle of the process and break your system, no safeguards there.

I haven't looked at other DEs.

Created attachment 582197 [details]
KDE default config
Created attachment 582198 [details]
KDE notifications after security update
Created attachment 582199 [details]
KDE notifications while updating (don't pop up automatically)
Created attachment 582200 [details]
KDE broken rpmdb after reboot during updating
Created attachment 582201 [details]
XFCE default config
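The question raised earlier in the thread, whether a transaction could be killed at any time and then either aborted or continued later, is essentially asking for a journaled transaction. Below is a minimal sketch of that idea in plain Python. It is not how yum or rpm work; `run_transaction`, `write_journal`, the JSON journal layout and the `stop_after` knob (which simulates a power-off) are all assumptions made for illustration. The sketch also assumes each step is idempotent, because a crash between applying a step and recording it would cause that step to be applied again on resume.

```python
import json
import os
import tempfile


def write_journal(path, journal):
    """Write the journal atomically: temp file, fsync, then rename over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as handle:
        json.dump(journal, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp, path)


def run_transaction(path, steps, apply_step, stop_after=None):
    """Apply steps in order, recording progress in a journal after each one.

    If a journal already exists at path, the transaction recorded there is
    resumed and the steps argument is ignored. stop_after simulates an
    interruption after that many steps in this run.
    Returns True when the transaction completed, False when interrupted.
    """
    if os.path.exists(path):
        with open(path) as handle:
            journal = json.load(handle)
    else:
        journal = {"steps": list(steps), "done": 0}
        write_journal(path, journal)

    applied = 0
    while journal["done"] < len(journal["steps"]):
        if stop_after is not None and applied >= stop_after:
            return False  # interrupted; the journal stays on disk for a later resume
        apply_step(journal["steps"][journal["done"]])
        journal["done"] += 1
        applied += 1
        write_journal(path, journal)

    os.remove(path)  # finished cleanly, nothing left to resume
    return True


if __name__ == "__main__":
    journal_path = os.path.join(tempfile.mkdtemp(), "journal.json")
    log = []
    run_transaction(journal_path, ["expat", "openssl", "rpm"], log.append, stop_after=2)
    run_transaction(journal_path, [], log.append)  # resumes the recorded transaction
    print(log)
```

Because the journal stores the full planned step list, resuming does not depend on recomputing the same transaction, which is the situation the "Transaction size changed" error in the description ran into.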
Just a quick note from the blocker review meeting today: it'd be good if we could get whatever's going to be done about this done soon, so we have a bit of time to make sure it works, and it's not a potentially de-stabilising change at a late RC stage. thanks!

-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers

(In reply to comment #23)
> Just a quick note from the blocker review meeting today: it'd be good if we could get whatever's going to be done about this done soon, so we have a bit of time to make sure it works, and it's not a potentially de-stabilising change at a late RC stage. thanks!

Lennart is working on a proper solution for F18, but he's not going to backport it to F17 any time soon, so we'll need a workaround. My vote is just to turn off automatic updates by changing one gsettings key and make a note in the release notes. If you want a test package implementing that, let me know.

I agree with Richard Hughes.

fine with me too

Not fully fine with me. We should also remove the UI widget so that users can't enable it. If a tool offers you an option, you expect it to work reasonably, not to potentially destroy your system. The majority of users won't read the release notes. How difficult is it to remove the widget? We would need to do it in both the GTK and Qt versions of the tool.

Disabling it by default would be enough to make this non-blocker for me, I think. But I take Kamil's point, too.

(In reply to comment #28)
> Disabling it by default would be enough to make this non-blocker for me

Agreed. I've done a test build here: http://koji.fedoraproject.org/koji/taskinfo?taskID=4062745

I'd appreciate some sanity testing from someone else.
From a release notes point of view, I was thinking along the lines of:

Fedora will not update security packages automatically due to the risk of shutting down whilst half way through causing database corruption. Use the GUI tool to do updates. A better solution will be present in Fedora 18.

..or something like that. I suck at prose. Thanks.

I've just tested your package: Updates Settings now shows "Automatically install: nothing", and that seems to be the real behaviour. Has someone tested it on KDE? Btw, it's OK for me. Thanks to all. Paolo

One thing I notice: if you run 'Software Settings' and look at the 'Update Settings' tab, the 'Check for updates:' drop-down - which I'd expect to read 'Never' - is instead empty.

oh wait, never mind, that's a false alarm. that's my changed gsettings key to make update checking happen more often. the 'Automatically install:' drop-down says 'Nothing', which is correct. I think the change is good.

gnome-settings-daemon-3.4.1-4.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/gnome-settings-daemon-3.4.1-4.fc17

After the F17 final release I think this bug can be moved to the F18 branch in order to continue with a more structural fix based on Lennart's solution (we now have a workaround for F17 but, technically, the bug is still unsolved).

1. Do updates in initrd.(*)
2. Snapshot the volumes/filesystems before the update.
3. Make it possible to abort and roll back the update if the user is in a hurry.

It's not generally possible to roll back the filesystem if an update fails while the system is running. You don't want to roll back your mysql data (while mysqld is running, even) just because a yum transaction was aborted. But in initrd it might be safe to do so.

*) Sometimes it's best to just apply the update while the system is up though, so inhibiting shutdown is needed and a good start.
But, wasn't there an idea of moving yum transactions into a daemon of its own?

confirming this is fixed in tc4, setting verified.

I have this problem too with Fedora 16. I would suggest that any fix should be backported, as I am currently working with a heavily damaged package database (any attempt to fully fix it led to yum wanting to uninstall glibc as a dependency), since I accidentally shut down my machine during an update. cheers

Because the F17 schedule is so far along, the only fix for this issue adopted in the next release is changing the default option in the "Update settings" dialog; we'll have a better solution before the F18 final release. Btw, you can also manually disable the dangerous option from the "Update settings" dialog in F16, in order to avoid any future damage.

gnome-settings-daemon-3.4.1-4.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.

Paolo: can we ship an update for F16 which flips the setting? We can change it if the user never adjusted the default, right?

Yes, I think an update for F16 could also be useful. It should limit further damage caused by aborted updates for those who use *Power Off* instead of *Suspend*. F15 is near EOL, so I think it isn't a problem.

Closing bug, as this update was pushed stable and was in TC5.
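On the question of flipping the setting only for users who never adjusted it: in a settings store where a value is "the user's override if one exists, otherwise the shipped default", changing the shipped default has exactly that effect. The toy model below sketches this behaviour. The `Settings` class, the key name `auto-update` and the values `security`, `all` and `none` are hypothetical placeholders, not the real gnome-settings-daemon schema.

```python
class Settings:
    """Toy settings store: a value is the user's override, else the shipped default."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)  # values shipped by the package
        self.overrides = {}             # values the user explicitly chose

    def get(self, key):
        return self.overrides.get(key, self.defaults[key])

    def set(self, key, value):
        """Record an explicit user choice."""
        self.overrides[key] = value

    def ship_new_default(self, key, value):
        """Simulate a package update that changes the shipped default."""
        self.defaults[key] = value


if __name__ == "__main__":
    # Hypothetical key and values, for illustration only.
    untouched = Settings({"auto-update": "security"})
    customised = Settings({"auto-update": "security"})
    customised.set("auto-update", "all")

    for store in (untouched, customised):
        store.ship_new_default("auto-update", "none")

    print(untouched.get("auto-update"))   # follows the new default
    print(customised.get("auto-update"))  # keeps the user's explicit choice
```

One consequence of this model is that a user who explicitly selected automatic updates keeps them after the update, so a release note or warning would still be needed for that group.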