Bug 1178978 - offline update fails due to separate /var volume mounting too late
Summary: offline update fails due to separate /var volume mounting too late
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 21
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-01-05 20:00 UTC by Chris Murphy
Modified: 2015-05-25 11:12 UTC (History)
16 users (show)

Fixed In Version: systemd-216-20.fc21
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-15 03:05:36 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
journal with systemd debug (790.80 KB, text/plain)
2015-01-05 20:05 UTC, Chris Murphy
no flags Details
journal restart&install (1.53 MB, text/plain)
2015-01-25 20:54 UTC, Chris Murphy
no flags Details
journal following c9 (1.02 MB, text/plain)
2015-01-25 20:58 UTC, Chris Murphy
no flags Details
journal with /var directory instead of /var volume (164.37 KB, text/plain)
2015-01-25 21:18 UTC, Chris Murphy
no flags Details

Description Chris Murphy 2015-01-05 20:00:25 UTC
Description of problem: When clicking Restart & Install from GNOME Software, or when restarting from GNOME shell leaving "Install pending software updates" ticked, updates are not applied upon reboot.

Uncertain if this is a systemd or PackageKit bug. It could be related to non-standard /boot on Btrfs (as a subvolume).


Version-Release number of selected component (if applicable):
PackageKit-1.0.3-4.fc21.x86_64
systemd-216-12.fc21.x86_64
gnome-shell-3.14.2-1.fc21.x86_64


How reproducible:
Always


Steps to Reproduce:
1. Fedora 21 clean installation with /boot, /var, /, /home as mountpoints backed by their own subvolumes: boot, var, root, home.
2. Login and wait for update notification.
3. Restart either from GNOME Software Restart & Install button; or gshell's restart dialog with pending updates checked.

Actual results:
Updates not applied.

Expected results:
Updates applied during reboot.

Additional info:

After restarting, where an update should have been performed (but didn't):

# systemctl status packagekit-offline-update.service
Showing one /org/freedesktop/systemd1/unit/packagekit_2doffline_2dupdate_2eservice
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/packagekit_2doffline_2dupdate_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=1 reply_cookie=0 error=n/a
● packagekit-offline-update.service - Updates the operating system whilst offline
   Loaded: loaded (/usr/lib/systemd/system/packagekit-offline-update.service; static)
   Active: inactive (dead)
Root directory /var/log/journal added.
Considering /var/log/journal/80c3462249aa4a85be8e5d060239ba11.
Directory /var/log/journal/80c3462249aa4a85be8e5d060239ba11 added.
File /var/log/journal/80c3462249aa4a85be8e5d060239ba11/system.journal added.
Journal filter: (((OBJECT_SYSTEMD_UNIT=packagekit-offline-update.service AND _UID=0) OR (UNIT=packagekit-offline-update.service AND _PID=1) OR (COREDUMP_UNIT=packagekit-offline-update.service AND _UID=0 AND MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1) OR _SYSTEMD_UNIT=packagekit-offline-update.service) AND _BOOT_ID=59f65c5ef6f74d898f148ab7635df871)

Jan 05 12:54:53 f21v.localdomain systemd[1]: Collecting packagekit-offline-update.service
Root directory /var/log/journal removed.
Directory /var/log/journal/80c3462249aa4a85be8e5d060239ba11 removed.
mmap cache statistics: 442 hit, 4 miss



# journalctl -b -l -o short-monotonic | grep -i offline
Root directory /var/log/journal added.
Considering /var/log/journal/80c3462249aa4a85be8e5d060239ba11.
Directory /var/log/journal/80c3462249aa4a85be8e5d060239ba11 added.
File /var/log/journal/80c3462249aa4a85be8e5d060239ba11/system.journal added.
File /var/log/journal/80c3462249aa4a85be8e5d060239ba11/user-1000.journal added.
Journal filter: _BOOT_ID=59f65c5ef6f74d898f148ab7635df871
[   75.549329] f21v.localdomain systemd[1]: Got message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/packagekit_2doffline_2dupdate_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=1 reply_cookie=0 error=n/a
[   75.551818] f21v.localdomain systemd[1]: SELinux access check scon=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcon=system_u:object_r:systemd_unit_file_t:s0 tclass=service perm=status path=/usr/lib/systemd/system/packagekit-offline-update.service cmdline=systemctl status packagekit-offline-update.service: 0
[   75.557928] f21v.localdomain systemd[1]: Collecting packagekit-offline-update.service
[   75.562901] f21v.localdomain systemd[1]: Failed to send unit change signal for packagekit-offline-update.service: Transport endpoint is not connected
[   75.563753] f21v.localdomain systemd[1]: Failed to send unit remove signal for packagekit-offline-update.service: Transport endpoint is not connected
[   89.815933] f21v.localdomain systemd[1]: Got message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/packagekit_2doffline_2dupdate_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=1 reply_cookie=0 error=n/a
[   89.816462] f21v.localdomain systemd[1]: SELinux access check scon=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcon=system_u:object_r:systemd_unit_file_t:s0 tclass=service perm=status path=/usr/lib/systemd/system/packagekit-offline-update.service cmdline=systemctl status packagekit-offline-update.service: 0
[   89.821843] f21v.localdomain systemd[1]: Collecting packagekit-offline-update.service
[   89.825853] f21v.localdomain systemd[1]: Failed to send unit change signal for packagekit-offline-update.service: Transport endpoint is not connected
[   89.826707] f21v.localdomain systemd[1]: Failed to send unit remove signal for packagekit-offline-update.service: Transport endpoint is not connected
Root directory /var/log/journal removed.
Directory /var/log/journal/80c3462249aa4a85be8e5d060239ba11 removed.
mmap cache statistics: 505371 hit, 7 miss


# ausearch -m AVC
<nothing for today>

Comment 1 Chris Murphy 2015-01-05 20:05:38 UTC
Created attachment 976572 [details]
journal with systemd debug

systemd.log_level=debug
journalctl -b -l -o short-monotonic --no-pager

Comment 2 Chris Murphy 2015-01-05 20:15:51 UTC
So I tried to manually initiate it this way, and I get some text in the terminal window as show, and then a reboot. But nothing was actually installed, and upon reboot, login, and checking the restart dialog in shell it no longer has pending updates as an option.

# systemctl start packagekit-offline-update.service
Calling manager for StartUnit on packagekit-offline-update.service, replace
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 error=n/a
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=GetUnit cookie=2 reply_cookie=0 error=n/a
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/packagekit_2doffline_2dupdate_2eservice interface=org.freedesktop.DBus.Properties member=Get cookie=3 reply_cookie=0 error=n/a
Adding /org/freedesktop/systemd1/job/1402 to the set
Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=2 reply_cookie=0 error=n/a
Got D-Bus request: org.freedesktop.systemd1.Manager.UnitNew() on /org/freedesktop/systemd1
Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=3 reply_cookie=0 error=n/a
Got D-Bus request: org.freedesktop.systemd1.Manager.UnitNew() on /org/freedesktop/systemd1
Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=4 reply_cookie=0 error=n/a
Got D-Bus request: org.freedesktop.systemd1.Manager.UnitNew() on /org/freedesktop/systemd1
Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=5 reply_cookie=0 error=n/a
Got D-Bus request: org.freedesktop.systemd1.Manager.JobNew() on /org/freedesktop/systemd1
Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=8 reply_cookie=0 error=n/a
Got D-Bus request: org.freedesktop.systemd1.Manager.JobRemoved() on /org/freedesktop/systemd1
Got result done/Success for job packagekit-offline-update.service
[root@f21v mnt]# Connection to localhost closed by remote host.


For example, this file is present still:
/var/cache/PackageKit/metadata/updates/packages/btrfs-progs-3.17.3-1.fc21.x86_64.rpm

and yet:

# rpm -q btrfs-progs
btrfs-progs-3.17-1.fc21.x86_64

Comment 3 Chris Murphy 2015-01-09 06:45:01 UTC
OK figured it out.

If /var is a mountpoint instead of a directory, systemd mounts it too late for systemd offline update to see /var/.updated.


[   26.851413] f21s2.localdomain systemd[506]: Executing: /bin/mount -n /dev/disk/by-uuid/9b3a4e65-0486-4537-9bf7-422e388b4d7f /var -t btrfs -o autodefrag,compress,subvol=var
[   26.942918] f21s2.localdomain systemd[1]: Mounted /var.

[   28.017738] f21s2.localdomain systemd[1]: Starting of systemd-update-done.service requested but condition failed. Ignoring.

[root@f21s2 ~]# systemctl status systemd-update-done.service
Showing one /org/freedesktop/systemd1/unit/systemd_2dupdate_2ddone_2eservice
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/systemd_2dupdate_2ddone_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=1 reply_cookie=0 error=n/a
ConditionNeedsUpdate 1 0 /var -1
ConditionNeedsUpdate 1 0 /etc -1
● systemd-update-done.service - Update is Completed
   Loaded: loaded (/usr/lib/systemd/system/systemd-update-done.service; static)
   Active: inactive (dead)
           start condition failed at Thu 2015-01-08 23:35:28 MST; 4min 20s ago
           none of the trigger conditions were met
     Docs: man:systemd-update-done.service(8)
Journal filter: (((OBJECT_SYSTEMD_UNIT=systemd-update-done.service AND _UID=0) OR (UNIT=systemd-update-done.service AND _PID=1) OR (COREDUMP_UNIT=systemd-update-done.service AND _UID=0 AND MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1) OR _SYSTEMD_UNIT=systemd-update-done.service) AND _BOOT_ID=e0c74b1140fa4d23b37f651ce7b5f108)

Jan 08 23:35:27 f21s2.localdomain systemd[1]: Merged into installed job systemd-update-done.service/start as 178
Jan 08 23:35:28 f21s2.localdomain systemd[1]: ConditionNeedsUpdate=|/var failed for systemd-update-done.service.
Jan 08 23:35:28 f21s2.localdomain systemd[1]: ConditionNeedsUpdate=|/etc failed for systemd-update-done.service.
Jan 08 23:35:28 f21s2.localdomain systemd[1]: Starting of systemd-update-done.service requested but condition failed. Ignoring.
Jan 08 23:35:28 f21s2.localdomain systemd[1]: Job systemd-update-done.service/start finished, result=done
Jan 08 23:35:28 f21s2.localdomain systemd[1]: Started Update is Completed.

I don't know why it says /var and /etc failed. /etc is a directory and is available at the time / is mounted.

I don't know if this is related, and is causing the delayed /var mount? But it's still a bug in Fedora 21, so /var isn't empty before the var volume is mounted at /var.
https://bugzilla.redhat.com/show_bug.cgi?id=1098611

Comment 4 Chris Murphy 2015-01-18 21:06:22 UTC
Possible work around:
$ systemctl start packagekit-offline-update.service

This has worked successfully for me 3 out of 4 attempts; I don't know why it didn't work the 4th time, but the system simply wasn't updated. So it seems to be a fail safe suggestion that's worth trying. Also, this has the added benefit of using packages already downloaded and approved by the user in Software.

Long term work around depends on a systemd update to get the service to depend on /var mounting first.

Comment 5 Zbigniew Jędrzejewski-Szmek 2015-01-25 02:04:14 UTC
Sorry for not looking this earlier.

systemd-update-done.service has a dependency on local-fs.target, which in turn has a dependency on var.mount. (You can see that with "systemctl list-dependencies --after systemd-update-done").

I don't see why you think that the check is performed before /var is mounted. The log shows that systemd reports mount as finished about a second before the condition is checked.

> I don't know why it says /var and /etc failed. /etc is a directory
> and is available at the time / is mounted.
"Condition... failed" only means that the condition was not met. In this case, that the timestamp on /etc/.updates was not newer than /etc.

Anyway, systemd-update-done.service cannot be the source of your problems. All it does is update the timestamp. So if it wasn't working, then the symptom would probably be repeated attempts to do the update, not a failure of the update.

packagekit-offline-update.service hooks into system-update.target, which is a different beast altogether. It probably failed, but it is done as a separate boot. By the time you get to the normal system, that failed attempt is boot -1. Can you attach the logs for that?

Comment 6 Chris Murphy 2015-01-25 09:31:53 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #5)
> packagekit-offline-update.service hooks into system-update.target, which is
> a different beast altogether. It probably failed, but it is done as a
> separate boot. By the time you get to the normal system, that failed attempt
> is boot -1. Can you attach the logs for that?

Do you mean journalctl -b-1 could be the same kernel boot as -b but is a restarted systemd without the computer rebooting?

What I've attached is the journal recorded for the boot that should have had an offline-upgrade start, but one didn't start.

1. System is off.
2. Boot system.
3. Use gnome-software to check for updates, it has updates and offers a Restart & Install button. I click that button.
4. The computer reboots. At the grub menu I edit the 1st grub menuentry adding systemd.log_level=debug. And I boot that entry. An update does not commence, there is no additional reboot, the system arrives at a login window. I use journalctl -b to write out this boot and attached that journal.

There's no possibility there's an additional boot that I missed. I click "Install & Restart" and I get one reboot, and end up at a login window.

Comment 7 Zbigniew Jędrzejewski-Szmek 2015-01-25 15:13:40 UTC
There's supposed to be an additional boot. See the description on http://freedesktop.org/wiki/Software/systemd/SystemUpdates/.
It seems that you don't have this additional boot, which means that for some reason the "system update" thing is not starting for you at all. /usr/lib/systemd/system-generators/systemd-system-update-generator runs and exits successfully, so most likely it simply does not pick up the update for some reason.

After a failed update, do you have /system-update symlink?

Comment 8 Chris Murphy 2015-01-25 20:49:29 UTC
Yes. 
system-update -> /var/cache

Comment 9 Chris Murphy 2015-01-25 20:54:56 UTC
Created attachment 984005 [details]
journal restart&install

This is a journal for a boot where I checked updates in Software, which found updates to download, then once it was possible I clicked Restart & Install.

Comment 10 Chris Murphy 2015-01-25 20:58:08 UTC
Created attachment 984006 [details]
journal following c9

Following Restart & Install, there is one boot which takes me directly to gdm. There's no additional reboot, no attempt to even update the system at all. This is the journal for this boot.

Comment 11 Chris Murphy 2015-01-25 21:18:10 UTC
Created attachment 984007 [details]
journal with /var directory instead of /var volume

Following the state in c10, I copy the contents of /var volume to /var directory, and change fstab so that the separate /var volume no longer mounts. Reboot from CLI. Now I get a successful update followed by an unprompted auto reboot. That journal is attached, captured with journalctl -b-1 from the following boot after updates applied.

I notice this, so the update did fail I guess, although I don't think it's related.
pk-offline-update[569]: failed to update system: Failed to find NetworkManager;1:0.9.10.1-1.4.20150115git.fc21;x86_64;updates

Comment 12 Chris Murphy 2015-01-25 22:01:13 UTC
(In reply to Chris Murphy from comment #10)
> Created attachment 984006 [details]
> journal following c9
> 
> Following Restart & Install, there is one boot which takes me directly to
> gdm. There's no additional reboot, no attempt to even update the system at
> all. This is the journal for this boot.

UHHH, what in the world?
[    2.583215] f21s2.localdomain kernel: Switched to clocksource tsc
[   11.196008] f21s2.localdomain systemd-sysctl[143]: Setting 'kernel/shmmax' to '4294967295'

There is a 9 second missing gap in this boot entry. Whereas the boot in c11 has a ton of things recorded in this time period including pk-offline-update entries. journalctl --verify shows nothing wrong with the journal. This is strange.

Comment 13 Zbigniew Jędrzejewski-Szmek 2015-01-26 00:09:57 UTC
(In reply to Chris Murphy from comment #8)
> Yes. 
> system-update -> /var/cache
Oh, this is indeed obvious. You were (almost) right the first time. /system-update symlink points to /var/cache, but it's not present at the time systemd-update-generator runs.

I'm not sure what the proper fix would be. It is expected that the update image is in /var, which makes the whole design ill-fitted for a separate /var. I guess that systemd-update-generator should simply check for the presence of the symlink, and accept a dangling symlink too. Not pretty, but should work.

Comment 14 Zbigniew Jędrzejewski-Szmek 2015-01-26 06:37:28 UTC
Patch sent to the mailing list: http://lists.freedesktop.org/archives/systemd-devel/2015-January/027397.html.

Comment 15 Lennart Poettering 2015-01-29 01:59:42 UTC
Fixed in git.

Comment 16 Fedora Update System 2015-02-05 20:53:27 UTC
systemd-216-20.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/systemd-216-20.fc21

Comment 17 Fedora Update System 2015-02-07 03:57:32 UTC
Package systemd-216-20.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-216-20.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-1793/systemd-216-20.fc21
then log in and leave karma (feedback).

Comment 18 Fedora Update System 2015-02-15 03:05:36 UTC
systemd-216-20.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.