Bug 1178978
Summary: | offline update fails due to separate /var volume mounting too late | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Chris Murphy <bugzilla> | ||||||||||
Component: | systemd | Assignee: | systemd-maint | ||||||||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||
Severity: | unspecified | Docs Contact: | |||||||||||
Priority: | unspecified | ||||||||||||
Version: | 21 | CC: | johannbg, jonathan, jsynacek, kalevlember, kparal, lnykryn, miminar, msekleta, rdieter, rhughes, samuel-rhbugs, smparrish, s, systemd-maint, vpavlin, zbyszek | ||||||||||
Target Milestone: | --- | ||||||||||||
Target Release: | --- | ||||||||||||
Hardware: | Unspecified | ||||||||||||
OS: | Unspecified | ||||||||||||
Whiteboard: | |||||||||||||
Fixed In Version: | systemd-216-20.fc21 | Doc Type: | Bug Fix | ||||||||||
Doc Text: | Story Points: | --- | |||||||||||
Clone Of: | Environment: | ||||||||||||
Last Closed: | 2015-02-15 03:05:36 UTC | Type: | Bug | ||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||
Documentation: | --- | CRM: | |||||||||||
Verified Versions: | Category: | --- | |||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
Embargoed: | |||||||||||||
Attachments: |
|
Description
Chris Murphy
2015-01-05 20:00:25 UTC
Created attachment 976572 [details]
journal with systemd debug
systemd.log_level=debug
journalctl -b -l -o short-monotonic --no-pager
So I tried to manually initiate it this way, and I get some text in the terminal window as show, and then a reboot. But nothing was actually installed, and upon reboot, login, and checking the restart dialog in shell it no longer has pending updates as an option. # systemctl start packagekit-offline-update.service Calling manager for StartUnit on packagekit-offline-update.service, replace Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 error=n/a Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=GetUnit cookie=2 reply_cookie=0 error=n/a Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/packagekit_2doffline_2dupdate_2eservice interface=org.freedesktop.DBus.Properties member=Get cookie=3 reply_cookie=0 error=n/a Adding /org/freedesktop/systemd1/job/1402 to the set Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=2 reply_cookie=0 error=n/a Got D-Bus request: org.freedesktop.systemd1.Manager.UnitNew() on /org/freedesktop/systemd1 Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=3 reply_cookie=0 error=n/a Got D-Bus request: org.freedesktop.systemd1.Manager.UnitNew() on /org/freedesktop/systemd1 Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=4 reply_cookie=0 error=n/a Got D-Bus request: org.freedesktop.systemd1.Manager.UnitNew() on /org/freedesktop/systemd1 Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=5 reply_cookie=0 error=n/a Got D-Bus request: org.freedesktop.systemd1.Manager.JobNew() on /org/freedesktop/systemd1 Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=8 reply_cookie=0 error=n/a Got D-Bus request: org.freedesktop.systemd1.Manager.JobRemoved() on /org/freedesktop/systemd1 Got result done/Success for job packagekit-offline-update.service [root@f21v mnt]# Connection to localhost closed by remote host. For example, this file is present still: /var/cache/PackageKit/metadata/updates/packages/btrfs-progs-3.17.3-1.fc21.x86_64.rpm and yet: # rpm -q btrfs-progs btrfs-progs-3.17-1.fc21.x86_64 OK figured it out. If /var is a mountpoint instead of a directory, systemd mounts it too late for systemd offline update to see /var/.updated. [ 26.851413] f21s2.localdomain systemd[506]: Executing: /bin/mount -n /dev/disk/by-uuid/9b3a4e65-0486-4537-9bf7-422e388b4d7f /var -t btrfs -o autodefrag,compress,subvol=var [ 26.942918] f21s2.localdomain systemd[1]: Mounted /var. [ 28.017738] f21s2.localdomain systemd[1]: Starting of systemd-update-done.service requested but condition failed. Ignoring. [root@f21s2 ~]# systemctl status systemd-update-done.service Showing one /org/freedesktop/systemd1/unit/systemd_2dupdate_2ddone_2eservice Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/systemd_2dupdate_2ddone_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=1 reply_cookie=0 error=n/a ConditionNeedsUpdate 1 0 /var -1 ConditionNeedsUpdate 1 0 /etc -1 ● systemd-update-done.service - Update is Completed Loaded: loaded (/usr/lib/systemd/system/systemd-update-done.service; static) Active: inactive (dead) start condition failed at Thu 2015-01-08 23:35:28 MST; 4min 20s ago none of the trigger conditions were met Docs: man:systemd-update-done.service(8) Journal filter: (((OBJECT_SYSTEMD_UNIT=systemd-update-done.service AND _UID=0) OR (UNIT=systemd-update-done.service AND _PID=1) OR (COREDUMP_UNIT=systemd-update-done.service AND _UID=0 AND MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1) OR _SYSTEMD_UNIT=systemd-update-done.service) AND _BOOT_ID=e0c74b1140fa4d23b37f651ce7b5f108) Jan 08 23:35:27 f21s2.localdomain systemd[1]: Merged into installed job systemd-update-done.service/start as 178 Jan 08 23:35:28 f21s2.localdomain systemd[1]: ConditionNeedsUpdate=|/var failed for systemd-update-done.service. Jan 08 23:35:28 f21s2.localdomain systemd[1]: ConditionNeedsUpdate=|/etc failed for systemd-update-done.service. Jan 08 23:35:28 f21s2.localdomain systemd[1]: Starting of systemd-update-done.service requested but condition failed. Ignoring. Jan 08 23:35:28 f21s2.localdomain systemd[1]: Job systemd-update-done.service/start finished, result=done Jan 08 23:35:28 f21s2.localdomain systemd[1]: Started Update is Completed. I don't know why it says /var and /etc failed. /etc is a directory and is available at the time / is mounted. I don't know if this is related, and is causing the delayed /var mount? But it's still a bug in Fedora 21, so /var isn't empty before the var volume is mounted at /var. https://bugzilla.redhat.com/show_bug.cgi?id=1098611 Possible work around: $ systemctl start packagekit-offline-update.service This has worked successfully for me 3 out of 4 attempts; I don't know why it didn't work the 4th time, but the system simply wasn't updated. So it seems to be a fail safe suggestion that's worth trying. Also, this has the added benefit of using packages already downloaded and approved by the user in Software. Long term work around depends on a systemd update to get the service to depend on /var mounting first. Sorry for not looking this earlier.
systemd-update-done.service has a dependency on local-fs.target, which in turn has a dependency on var.mount. (You can see that with "systemctl list-dependencies --after systemd-update-done").
I don't see why you think that the check is performed before /var is mounted. The log shows that systemd reports mount as finished about a second before the condition is checked.
> I don't know why it says /var and /etc failed. /etc is a directory
> and is available at the time / is mounted.
"Condition... failed" only means that the condition was not met. In this case, that the timestamp on /etc/.updates was not newer than /etc.
Anyway, systemd-update-done.service cannot be the source of your problems. All it does is update the timestamp. So if it wasn't working, then the symptom would probably be repeated attempts to do the update, not a failure of the update.
packagekit-offline-update.service hooks into system-update.target, which is a different beast altogether. It probably failed, but it is done as a separate boot. By the time you get to the normal system, that failed attempt is boot -1. Can you attach the logs for that?
(In reply to Zbigniew Jędrzejewski-Szmek from comment #5) > packagekit-offline-update.service hooks into system-update.target, which is > a different beast altogether. It probably failed, but it is done as a > separate boot. By the time you get to the normal system, that failed attempt > is boot -1. Can you attach the logs for that? Do you mean journalctl -b-1 could be the same kernel boot as -b but is a restarted systemd without the computer rebooting? What I've attached is the journal recorded for the boot that should have had an offline-upgrade start, but one didn't start. 1. System is off. 2. Boot system. 3. Use gnome-software to check for updates, it has updates and offers a Restart & Install button. I click that button. 4. The computer reboots. At the grub menu I edit the 1st grub menuentry adding systemd.log_level=debug. And I boot that entry. An update does not commence, there is no additional reboot, the system arrives at a login window. I use journalctl -b to write out this boot and attached that journal. There's no possibility there's an additional boot that I missed. I click "Install & Restart" and I get one reboot, and end up at a login window. There's supposed to be an additional boot. See the description on http://freedesktop.org/wiki/Software/systemd/SystemUpdates/. It seems that you don't have this additional boot, which means that for some reason the "system update" thing is not starting for you at all. /usr/lib/systemd/system-generators/systemd-system-update-generator runs and exits successfully, so most likely it simply does not pick up the update for some reason. After a failed update, do you have /system-update symlink? Yes. system-update -> /var/cache Created attachment 984005 [details]
journal restart&install
This is a journal for a boot where I checked updates in Software, which found updates to download, then once it was possible I clicked Restart & Install.
Created attachment 984006 [details]
journal following c9
Following Restart & Install, there is one boot which takes me directly to gdm. There's no additional reboot, no attempt to even update the system at all. This is the journal for this boot.
Created attachment 984007 [details]
journal with /var directory instead of /var volume
Following the state in c10, I copy the contents of /var volume to /var directory, and change fstab so that the separate /var volume no longer mounts. Reboot from CLI. Now I get a successful update followed by an unprompted auto reboot. That journal is attached, captured with journalctl -b-1 from the following boot after updates applied.
I notice this, so the update did fail I guess, although I don't think it's related.
pk-offline-update[569]: failed to update system: Failed to find NetworkManager;1:0.9.10.1-1.4.20150115git.fc21;x86_64;updates
(In reply to Chris Murphy from comment #10) > Created attachment 984006 [details] > journal following c9 > > Following Restart & Install, there is one boot which takes me directly to > gdm. There's no additional reboot, no attempt to even update the system at > all. This is the journal for this boot. UHHH, what in the world? [ 2.583215] f21s2.localdomain kernel: Switched to clocksource tsc [ 11.196008] f21s2.localdomain systemd-sysctl[143]: Setting 'kernel/shmmax' to '4294967295' There is a 9 second missing gap in this boot entry. Whereas the boot in c11 has a ton of things recorded in this time period including pk-offline-update entries. journalctl --verify shows nothing wrong with the journal. This is strange. (In reply to Chris Murphy from comment #8) > Yes. > system-update -> /var/cache Oh, this is indeed obvious. You were (almost) right the first time. /system-update symlink points to /var/cache, but it's not present at the time systemd-update-generator runs. I'm not sure what the proper fix would be. It is expected that the update image is in /var, which makes the whole design ill-fitted for a separate /var. I guess that systemd-update-generator should simply check for the presence of the symlink, and accept a dangling symlink too. Not pretty, but should work. Patch sent to the mailing list: http://lists.freedesktop.org/archives/systemd-devel/2015-January/027397.html. Fixed in git. systemd-216-20.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/systemd-216-20.fc21 Package systemd-216-20.fc21: * should fix your issue, * was pushed to the Fedora 21 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing systemd-216-20.fc21' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2015-1793/systemd-216-20.fc21 then log in and leave karma (feedback). systemd-216-20.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report. |