Bug 1185604 - fedup upgrade fails within the initramfs-fedup env
Summary: fedup upgrade fails within the initramfs-fedup env
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 21
Hardware: All
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker https://fedoraproject...
: 1190415 1199849 1200880 (view as bug list)
Depends On:
Blocks: F22BetaBlocker
TreeView+ depends on / blocked
 
Reported: 2015-01-25 06:43 UTC by Chris Murphy
Modified: 2015-04-02 15:37 UTC (History)
31 users (show)

Fixed In Version: systemd-216-24.fc21
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-04-02 15:37:37 UTC
Type: Bug


Attachments (Terms of Use)
image1.jpg (155.62 KB, image/jpeg)
2015-01-25 06:46 UTC, Chris Murphy
no flags Details
journal1.txt (1.07 MB, text/plain)
2015-01-25 06:47 UTC, Chris Murphy
no flags Details
journal2.txt (240.01 KB, text/plain)
2015-01-25 06:49 UTC, Chris Murphy
no flags Details
fedup.log (508.28 KB, text/plain)
2015-01-25 06:49 UTC, Chris Murphy
no flags Details
grub.cfg (5.68 KB, text/plain)
2015-01-25 06:49 UTC, Chris Murphy
no flags Details
fstab (436 bytes, text/plain)
2015-01-25 06:49 UTC, Chris Murphy
no flags Details
Screenshot of hung VM (45.87 KB, image/png)
2015-03-04 14:20 UTC, James Hogarth
no flags Details
journal output for failed fedup with selinux permissive (1.17 MB, text/plain)
2015-03-04 14:27 UTC, James Hogarth
no flags Details
'strace -fp 1' from the point of upgrade-prep.sh finishing till the final hang using the upgrade-wait trick Will described (376.75 KB, text/plain)
2015-03-04 14:44 UTC, James Hogarth
no flags Details
Realized last strace got killed when debug shell stopped. Disowned process to get fuller view of strace of pid1 (999.36 KB, text/plain)
2015-03-04 17:43 UTC, James Hogarth
no flags Details
coredump found in /var/lib/systemd/coredump from first fedup attempt (266.01 KB, application/x-xz)
2015-03-04 18:32 UTC, James Hogarth
no flags Details
coredump found in /var/lib/systemd/coredump from second fedup attempt (256.66 KB, application/x-xz)
2015-03-04 18:33 UTC, James Hogarth
no flags Details
coredump found in /var/lib/systemd/coredump from third fedup attempt (250.55 KB, application/x-xz)
2015-03-04 18:33 UTC, James Hogarth
no flags Details
screenshot of crash with systemd.log_target=kmsg systemd.log_level=debug (43.19 KB, image/png)
2015-03-05 16:46 UTC, James Hogarth
no flags Details
switch-root at rescue.target to a yum --installroot style f22 container (20.83 KB, image/png)
2015-03-05 17:05 UTC, James Hogarth
no flags Details
output of fedup f21 server to f22 with systemd.log_level=debug systemd.log_target=console (258.06 KB, text/plain)
2015-03-05 20:25 UTC, James Hogarth
no flags Details
coredump from the previous fedup of f21 server linked to the console debug log output (205.70 KB, application/x-xz)
2015-03-05 20:26 UTC, James Hogarth
no flags Details
Full backtrace of coredump from comment #26 (4.41 KB, text/plain)
2015-03-06 21:32 UTC, Will Woods
no flags Details
backtrace with ssse3 disabled in the VM (4.75 KB, text/plain)
2015-03-07 01:42 UTC, James Hogarth
no flags Details
Full backtrace with correct libselinux debuginfo (4.74 KB, text/plain)
2015-03-11 22:54 UTC, Will Woods
no flags Details
0001-Fix-SEGV-during-switch-root-from-system-with-SELinux.patch (1.20 KB, patch)
2015-03-13 21:19 UTC, Will Woods
no flags Details | Diff

Description Chris Murphy 2015-01-25 06:43:48 UTC
Description of problem:

Following successful fedup command, reboot 'System Upgrade (fedup)' fails in the initramfs and completely hangs the system.


Version-Release number of selected component (if applicable):
fedup-0.9.0-2.fc21.noarch
systemd-216-16.fc21.i686

How reproducible:
3 out of 4 boots fail


Steps to Reproduce:
1. Clean Fedora 21 i686 installation
/boot is XFS
/ is Btrfs
2. fedup --network 22 --instrepo https://kojipkgs.fedoraproject.org/mash/rawhide-20150124/rawhide/i386/os/ --nogpgcheck --debuglog fedup.log
3. Reboot


Actual results:

Hang after upgrade prep complete, switching root...

Expected results:

Successful upgrade


Additional info:

Comment 1 Chris Murphy 2015-01-25 06:46:29 UTC
Created attachment 983927 [details]
image1.jpg

JPG photo of screen at the time of the hang. This boot has systemd.log_level=debug enabled. This boot is the same as for journal1.txt.

Comment 2 Chris Murphy 2015-01-25 06:47:45 UTC
Created attachment 983928 [details]
journal1.txt

'journalctl -b-1 -l -o short-monotonic' following the failed boot, so it contains the output with systemd.log_level=debug. This is the same boot for image1.txt.

Comment 3 Chris Murphy 2015-01-25 06:49:11 UTC
Created attachment 983929 [details]
journal2.txt

rd.debug was used for this failed boot

Comment 4 Chris Murphy 2015-01-25 06:49:24 UTC
Created attachment 983930 [details]
fedup.log

Comment 5 Chris Murphy 2015-01-25 06:49:37 UTC
Created attachment 983931 [details]
grub.cfg

Comment 6 Chris Murphy 2015-01-25 06:49:47 UTC
Created attachment 983932 [details]
fstab

Comment 7 Will Woods 2015-01-26 21:47:27 UTC
Okay, so.. this looks like a systemd problem. Here's what I see in the log:

> upgrade-prep.sh[645]: upgrade prep complete, switching root...

So upgrade-prep.sh has finished, and systemd is executing upgrade-prep.service's
ExecStopPost command:

 ExecStopPost=-/usr/bin/systemctl --no-block isolate upgrade-switch-root.target

> systemctl[686]: Calling manager for StartUnit on upgrade-switch-root.target, isolate
> systemd[1]: Accepted new private connection.
> systemd[1]: Accepted new private connection.
> systemd-cgroups-agent[685]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/agent interface=org.freedesktop.systemd1.Agent member=Released cookie=1 reply_cookie=0 error=n/a
> systemctl[686]: Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 error=n/a
> systemd[1]: Got message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/agent interface=org.freedesktop.systemd1.Agent member=Released cookie=1 reply_cookie=0 error=n/a
> systemd[1]: Got message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 error=n/a
> systemd[1]: SELinux access check scon=system_u:system_r:init_t:s0 tcon=system_u:object_r:systemd_unit_file_t:s0 tclass=service perm=start path=/usr/lib/systemd/system/upgrade-switch-root.target cmdline=/usr/bin/systemctl --no-block isolate upgrade-switch-root.target: 0

Permissions are OK, we're now isolating upgrade-switch-root.target:

> systemd[1]: Trying to enqueue job upgrade-switch-root.target/start/isolate
> systemd[1]: Installed new job upgrade-switch-root.target/start as 214
> systemd[1]: Installed new job upgrade-switch-root.service/start as 215

Now systemd installs stop jobs for active units:

> systemd[1]: Installed new job local-fs.target/stop as 218
> [etc.]
> systemd[1]: Installed new job dm-event.socket/stop as 310

And then we actually enqueue/run upgrade-switch-root.target/start:

> systemd[1]: Enqueued job upgrade-switch-root.target/start as 214
> systemd[1]: Sent message type=method_return sender=n/a destination=n/a object=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 error=n/a
> systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=2 reply_cookie=0 error=n/a

Something goes wrong, though:

> systemd[1]: Failed to send unit change signal for upgrade-switch-root.target: Connection reset by peer

Similarly, we fail to send signals for the rest of the jobs:

> systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=3 reply_cookie=0 error=n/a
> systemd[1]: Failed to send job change signal for 310: Transport endpoint is not connected
> [etc.]
> systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=36 reply_cookie=0 error=n/a
> systemd[1]: Failed to send job change signal for 214: Transport endpoint is not connected
> systemd[1]: Got disconnect on private connection.

Eventually stuff shuts down and we try to switch-root to the new systemd, but:

> systemd[1]: Serializing state to /run/systemd
> systemd[1]: Failed to send reloading signal: Connection reset by peer
> systemd[1]: Switching root.
> systemd[1]: Failed to send unit remove signal for sys-kernel-config.mount: Transport endpoint is not connected
> systemd[1]: Failed to send unit remove signal for local-fs.target: Transport endpoint is not connected
[etc.]

And that's where the journal ends.

My guess is that something has changed about dbus (either permissions or use of kdbus or something) between F21 and rawhide, and so either F21 systemd or rawhide systemd doesn't know how to talk to the other one properly.

Reassigning to rawhide systemd, since AFAIK F21 systemd/fedup are doing the right thing to initiate switch-root to the rawhide upgrade.img.

Comment 8 Lennart Poettering 2015-02-04 19:30:30 UTC
The messages about dbus signals don't really matter, they are shown probably mostly because a bus connection was shortliving and systemd wanted to notify it about changes, but didn't reap the connection yet. Ignore them. 

This might be an selinux issue, dunno?

Only way to figure out what is going on is by booting with the emergency shell and then stracing PID 1 over the transition or so...

Comment 9 Chris Murphy 2015-02-04 21:03:44 UTC
Problem is the failure and subsequent forced reboot happen too quickly to attach strace; and I don't think strace is on the install media or in the initramfs either.

Comment 10 Will Woods 2015-02-05 08:24:40 UTC
(In reply to Lennart Poettering from comment #8)
> This might be an selinux issue, dunno?

that'd be my guess too, but it's hard to debug over bugzilla.

> Only way to figure out what is going on is by booting with the emergency
> shell and then stracing PID 1 over the transition or so...

The fedup debug shell is enabled by default, on tty2.

(In reply to Chris Murphy from comment #9)
> Problem is the failure and subsequent forced reboot happen too quickly to
> attach strace; and I don't think strace is on the install media or in the
> initramfs either.

At the point of failure we're attempting to switch from your real root back to initramfs, so your system tools are all available.

You could edit /usr/libexec/upgrade-prep.sh and add whatever you want to happen.

Sometimes I add:

  touch /run/upgrade-wait
  while [ -f /run/upgrade-wait ]; do sleep 1; done

then use the shell on tty2 to set up debugging.

(If you're doing this in a VM, this is a good time to take a snapshot of the system.)

When you're done, "rm /run/upgrade-wait" makes the upgrade continue.

Comment 11 Adam Williamson 2015-02-12 16:39:51 UTC
*** Bug 1190415 has been marked as a duplicate of this bug. ***

Comment 12 Jaroslav Reznik 2015-03-03 16:45:48 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 13 James Hogarth 2015-03-04 14:19:32 UTC
To negate selinux issues I set enforcing=0 on the kernel command line for fedup and it still hangs with "systemd-coredump[884]: Failed to log coredump: Connection refused"

This was a freshly installed F21 Workstation VM picking default behaviour and a full yum update.

Comment 14 James Hogarth 2015-03-04 14:20:19 UTC
Created attachment 997869 [details]
Screenshot of hung VM

Comment 15 James Hogarth 2015-03-04 14:27:38 UTC
Created attachment 997871 [details]
journal output for failed fedup with selinux permissive

Comment 16 James Hogarth 2015-03-04 14:44:25 UTC
Created attachment 997882 [details]
'strace -fp 1' from the point of upgrade-prep.sh finishing till the final hang using the upgrade-wait trick Will described

Comment 17 James Hogarth 2015-03-04 17:43:25 UTC
Created attachment 997989 [details]
Realized last strace got killed when debug shell stopped. Disowned process to get fuller view of strace of pid1

Comment 18 James Hogarth 2015-03-04 18:25:10 UTC
It would appear from the strace systemd-216 is serializing to /var/run/systemd correctly but something is happening when systemd-219 gets switched to during the switch-root to the upgrade environment.

Unfortunately by that point whatever explodes kills the system in a sufficient way gathering logs becomes tricky (the strace attached was piped to ssh to get it off the VM and onto the parent host to minimize loss).

What is a good way to test a simple case of the systemd-216 to systemd-219 transition in a clean a way as possible to eliminate (or confirm?) that part of the upgrade process?

Comment 19 James Hogarth 2015-03-04 18:32:39 UTC
Created attachment 997999 [details]
coredump found in /var/lib/systemd/coredump from first fedup attempt

Comment 20 James Hogarth 2015-03-04 18:33:24 UTC
Created attachment 998000 [details]
coredump found in /var/lib/systemd/coredump from second fedup attempt

Comment 21 James Hogarth 2015-03-04 18:33:46 UTC
Created attachment 998001 [details]
coredump found in /var/lib/systemd/coredump from third fedup attempt

Comment 22 James Hogarth 2015-03-05 16:46:41 UTC
Created attachment 998464 [details]
screenshot of crash with systemd.log_target=kmsg systemd.log_level=debug

Lennart on the systemd-devel mailing list pointed out at the time of the switch-root the strace gets killed as part of switching root so the bit we would be interested in gets lost.

This makes it remarkably more tricky to get decent debug information.

It was requested that an attempt be made with the logs sent to the kernel buffer to avoid the journald and strace being killed prior to interesting bit issue.

As can been seen in the screenshot pid1 gets a SEGV and then execution freezes.

Comment 23 James Hogarth 2015-03-05 17:05:55 UTC
Created attachment 998469 [details]
switch-root at rescue.target to a yum --installroot style f22 container

Following on from the previous thought I'd check behaviour in a similar but minimal test case.

Using yum --installroot (as per the systemd-nspawn man page) created a container area.

From here used the '/usr/bin/systemctl --no-block --force switch-root /srv/mycontainer' command used by the upgrade-system-root.service in fedup to switch root into that.

The segfault can be seen more clearly than the previous example - assuming this is a valid reduced test case.

Comment 24 James Hogarth 2015-03-05 17:36:47 UTC
Actually I'm not entirely convinced this is a valid test case for a reduced sample in this situation ...

Doing a installroot of F21 and calling switch-root into that also fails with the same segfault.

So unless systemd-216 has a an issue (rather than the transition from 216 to 219) carrying out switch-root it could be unrelated.

But then switch-root is used for start and stop so I'd have expected to see an issue there if it was that fundamental. 

So we're still stuck at something happening in the transition between 216 and 219 after the request to switch root has occurred but with little in the way of debug due to how the processes interact during this transition.

Any further ideas on how to get some worthwhile diagnostics?

Comment 25 James Hogarth 2015-03-05 20:25:55 UTC
Created attachment 998543 [details]
output of fedup f21 server to f22 with systemd.log_level=debug systemd.log_target=console

This is a bit better than the screenshots ...

From virt-builder fedora-21 as a starting point fedup to TC8 server was carried out in a VM with the console directed to ttyS0 and virsh console used to connect to this.

The output of the console was copied from the terminal session directly.

Comment 26 James Hogarth 2015-03-05 20:26:56 UTC
Created attachment 998544 [details]
coredump from the previous fedup of f21 server linked to the console debug log output

Comment 27 James Hogarth 2015-03-05 21:08:32 UTC
After setting systemd.crash_shell=true and going through the standard fedup sequence I had a poke around ... the process list from the crash shell (so this is after switch-root) looks like:

sh-4.3# UID        PID  PPID CLS PRI STIME TTY          TIME CMD
root         1     0 TS   19 15:30 ?        00:00:04 /usr/lib/systemd/systemd --switched-root --system --deserialize 19
root         2     0 TS   19 15:30 ?        00:00:00 [kthreadd]
root         3     2 TS   19 15:30 ?        00:00:00 [ksoftirqd/0]
root         5     2 TS   39 15:30 ?        00:00:00 [kworker/0:0H]
root         6     2 TS   19 15:30 ?        00:00:00 [kworker/u4:0]
root         7     2 TS   19 15:30 ?        00:00:00 [rcu_sched]
root         8     2 TS   19 15:30 ?        00:00:00 [rcu_bh]
root         9     2 TS   19 15:30 ?        00:00:00 [rcuos/0]
root        10     2 TS   19 15:30 ?        00:00:00 [rcuob/0]
root        11     2 FF  139 15:30 ?        00:00:00 [migration/0]
root        12     2 FF  139 15:30 ?        00:00:00 [watchdog/0]
root        13     2 FF  139 15:30 ?        00:00:00 [watchdog/1]
root        14     2 FF  139 15:30 ?        00:00:00 [migration/1]
root        15     2 TS   19 15:30 ?        00:00:00 [ksoftirqd/1]
root        16     2 TS   19 15:30 ?        00:00:00 [kworker/1:0]
root        17     2 TS   39 15:30 ?        00:00:00 [kworker/1:0H]
root        18     2 TS   19 15:30 ?        00:00:00 [rcuos/1]
root        19     2 TS   19 15:30 ?        00:00:00 [rcuob/1]
root        20     2 TS   39 15:30 ?        00:00:00 [khelper]
root        21     2 TS   19 15:30 ?        00:00:00 [kdevtmpfs]
root        22     2 TS   39 15:30 ?        00:00:00 [netns]
root        23     2 TS   39 15:30 ?        00:00:00 [perf]
root        24     2 TS   19 15:30 ?        00:00:00 [kworker/0:1]
root        26     2 TS   39 15:30 ?        00:00:00 [writeback]
root        27     2 TS   14 15:30 ?        00:00:00 [ksmd]
root        28     2 TS    0 15:30 ?        00:00:00 [khugepaged]
root        29     2 TS   39 15:30 ?        00:00:00 [crypto]
root        30     2 TS   39 15:30 ?        00:00:00 [kintegrityd]
root        31     2 TS   39 15:30 ?        00:00:00 [bioset]
root        32     2 TS   39 15:30 ?        00:00:00 [kblockd]
root        33     2 TS   19 15:30 ?        00:00:00 [kworker/1:1]
root        34     2 TS   39 15:30 ?        00:00:00 [ata_sff]
root        36     2 TS   39 15:30 ?        00:00:00 [md]
root        37     2 TS   39 15:30 ?        00:00:00 [devfreq_wq]
root        44     2 TS   19 15:30 ?        00:00:00 [kworker/u4:1]
root        61     2 TS   19 15:30 ?        00:00:00 [kswapd0]
root        62     2 TS   19 15:30 ?        00:00:00 [fsnotify_mark]
root        72     2 TS   39 15:30 ?        00:00:00 [kthrotld]
root        73     2 TS   39 15:30 ?        00:00:00 [acpi_thermal_pm]
root        74     2 TS   19 15:30 ?        00:00:00 [scsi_eh_0]
root        75     2 TS   39 15:30 ?        00:00:00 [scsi_tmf_0]
root        76     2 TS   19 15:30 ?        00:00:00 [scsi_eh_1]
root        77     2 TS   39 15:30 ?        00:00:00 [scsi_tmf_1]
root        79     2 TS   39 15:30 ?        00:00:00 [kpsmoused]
root        80     2 TS   39 15:30 ?        00:00:00 [dm_bufio_cache]
root        82     2 TS   39 15:30 ?        00:00:00 [ipv6_addrconf]
root        84     2 TS   39 15:30 ?        00:00:00 [deferwq]
root       118     2 TS   39 15:30 ?        00:00:00 [kmpathd]
root       119     2 TS   39 15:30 ?        00:00:00 [kmpath_handlerd]
root       133     2 TS   19 15:30 ?        00:00:00 [kauditd]
root       266     2 TS   39 15:30 ?        00:00:00 [rpciod]
root       296     2 TS   39 15:30 ?        00:00:00 [kmpath_rdacd]
root       328     2 TS   19 15:30 ?        00:00:00 [kworker/0:3]
root       402     1 TS   19 15:30 ?        00:00:05 @usr/sbin/plymouthd --mode=boot --pid-file=/var/run/plymouth/pid --attach-to-session
root       430     2 TS   39 15:30 ?        00:00:00 [ttm_swap]
root       431     2 TS   39 15:30 ?        00:00:00 [qxl_gc]
root       448     2 TS   39 15:30 ?        00:00:00 [kworker/1:1H]
root       455     2 TS   19 15:30 ?        00:00:00 [jbd2/vda3-8]
root       456     2 TS   39 15:30 ?        00:00:00 [ext4-rsv-conver]
root       569     1 TS   19 15:31 ?        00:00:00 [systemd-journal] <defunct>
root       626     2 TS   39 15:31 ?        00:00:00 [kworker/0:1H]
root       657     2 TS   19 15:31 ?        00:00:00 [jbd2/vda1-8]
root       658     2 TS   39 15:31 ?        00:00:00 [ext4-rsv-conver]
root       710     1 TS   19 15:31 ?        00:00:00 [systemctl] <defunct>
root       723     1 TS   19 15:32 ?        00:00:00 /bin/sh

Based on the output of mount it doesn't appear that the root switch completes with the failure happening after that:

sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=1002552k,nr_inodes=250638,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
configfs on /sys/kernel/config type configfs (rw,relatime)
/dev/vda3 on / type ext4 (rw,relatime,seclabel,data=ordered)
tmpfs on /system-upgrade-root type tmpfs (rw,relatime,seclabel,mode=755)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime,seclabel)
mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel)
tmpfs on /tmp type tmpfs (rw,seclabel)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel)
/dev/vda3 on /system-upgrade-root/sysroot type ext4 (rw,relatime,seclabel,data=ordered)
devtmpfs on /system-upgrade-root/sysroot/dev type devtmpfs (rw,nosuid,seclabel,size=1002552k,nr_inodes=250638,mode=755)
tmpfs on /system-upgrade-root/sysroot/dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
devpts on /system-upgrade-root/sysroot/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
mqueue on /system-upgrade-root/sysroot/dev/mqueue type mqueue (rw,relatime,seclabel)
hugetlbfs on /system-upgrade-root/sysroot/dev/hugepages type hugetlbfs (rw,relatime,seclabel)
proc on /system-upgrade-root/sysroot/proc type proc (rw,nosuid,nodev,noexec,relatime)
systemd-1 on /system-upgrade-root/sysroot/proc/sys/fs/binfmt_misc type autofs (rw,rsysfs on /system-upgrade-root/sysroot/sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
securityfs on /system-upgrade-root/sysroot/sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /system-upgrade-root/sysroot/sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /system-upgrade-root/sysroot/sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
pstore on /system-upgrade-root/sysroot/sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)
configfs on /system-upgrade-root/sysroot/sys/kernel/config type configfs (rw,relatime)
selinuxfs on /system-upgrade-root/sysroot/sys/fs/selinux type selinuxfs (rw,relatime)
debugfs on /system-upgrade-root/sysroot/sys/kernel/debug type debugfs (rw,relatime,seclabel)
tmpfs on /system-upgrade-root/sysroot/run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /system-upgrade-root/sysroot/tmp type tmpfs (rw,seclabel)
/dev/vda1 on /system-upgrade-root/sysroot/boot type ext4 (rw,relatime,seclabel,data=ordered)

Given the pid count and the fact the switch-root appears to have failed I'm guessing the systemd that actually segfaulted was the new systemd-219 version given the state of pid 1 in the process list above.

This is very reproducible with very simple steps ... 

1. virt-builder fedora-21 --update --install fedup,strace
2. start image virt virsh/kvm/etc
3. in guest fedup --network 22 --instrepo=https://path/to/instrepo/os/ --nogpgcheck
4. Edit grub.cfg to append systemd.log_level=debug systemd.log_target=console systemd.chrash_shell=true enforcing=0
5. connect via virsh console to see console output
6. reboot and watch fedup fail with the segfault and coredump

What is the best way to find out why systemd-219 in the fedup environment is segfaulting and failing to start and be switched to correctly?

Comment 28 Will Woods 2015-03-06 21:29:13 UTC
(In reply to James Hogarth from comment #27)
> Given the pid count and the fact the switch-root appears to have failed I'm
> guessing the systemd that actually segfaulted was the new systemd-219
> version given the state of pid 1 in the process list above.

No, if you use crash_shell then systemd forks /bin/bash and then just goes into an infinite sleep. So it makes sense that PID 1 is still "alive".

Also, most importantly, the core file from comment #26 is for systemd-216, not systemd-219.

(You could probably verify this by checking `ls -i /proc/1/maps` before the switch-root and after the crash. If the inode number has changed, systemd re-exec'd itself.)

I'm attaching a full backtrace for the coredump in comment #26, but the short version is this:

Core was generated by `/usr/lib/systemd/systemd --switched-root --system --deserialize 22'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f4a008ddf99 in raise () from /lib64/libpthread.so.0
#0  0x00007f4a008ddf99 in raise () from /lib64/libpthread.so.0
#1  0x00007f4a01e608f8 in crash.lto_priv.223 (sig=11) at ../src/core/main.c:168
#2  <signal handler called>
#3  0x00007f4a006584e0 in __strcmp_ssse3 () from /lib64/libc.so.6
#4  0x00007f4a01984be5 in selabel_sub () from /lib64/libselinux.so.1
#5  0x00007f4a01984d30 in selabel_lookup_common () from /lib64/libselinux.so.1
#6  0x00007f4a0198511b in selabel_lookup_raw () from /lib64/libselinux.so.1
#7  0x00007f4a01e6322f in mac_selinux_create_file_prepare (path=path@entry=0x7ffd2b2e9aa0 "/system-upgrade-root/mnt", mode=mode@entry=16384) at ../src/shared/selinux-util.c:320
#8  0x00007f4a01e6358a in mkdir_label (path=0x7ffd2b2e9aa0 "/system-upgrade-root/mnt", mode=493) at ../src/shared/label.c:44
#9  0x00007f4a01e63625 in mkdir_p_internal (prefix=0x0, _mkdir=0x7f4a01e63570 <mkdir_label>, mode=493, path=0x7ffd2b2e9aa0 "/system-upgrade-root/mnt") at ../src/shared/mkdir.c:123
#10 mkdir_p_label (path=path@entry=0x7ffd2b2e9aa0 "/system-upgrade-root/mnt", mode=mode@entry=493) at ../src/shared/mkdir-label.c:44
#11 0x00007f4a01e83387 in switch_root.constprop.199 (new_root=0x7f4a02cb75e0 "/system-upgrade-root", mountflags=8192, detach_oldroot=true, oldroot=0x7f4a01ebb97b "/mnt") at ../src/shared/switch-root.c:56
#12 0x00007f4a01ddc2fb in main (argc=5, argv=<optimized out>) at ../src/core/main.c:1889

So - it looks like switch_root() tries to mkdir_p_label("/system-upgrade-root/mnt"), but the attempt to look up the correct SELinux label for that directory causes a SEGV.

This probably doesn't happen in the normal initramfs-to-system switch-root because we don't have SELinux policy loaded at that point.

This also would explain why you can reproduce the problem with a F21-to-F21 switch-root.

Comment 29 Will Woods 2015-03-06 21:32:30 UTC
Created attachment 999004 [details]
Full backtrace of coredump from comment #26

Generated with `gdb --core=core.systemd.* --batch -ex 'bt full'`

Comment 30 James Hogarth 2015-03-06 21:50:07 UTC
(In reply to Will Woods from comment #28)

> So - it looks like switch_root() tries to
> mkdir_p_label("/system-upgrade-root/mnt"), but the attempt to look up the
> correct SELinux label for that directory causes a SEGV.
> 
> This probably doesn't happen in the normal initramfs-to-system switch-root
> because we don't have SELinux policy loaded at that point.
> 
> This also would explain why you can reproduce the problem with a F21-to-F21
> switch-root.

Thanks for that analysis of the coredump Will...

Interesting my testing was with enforcing=0 passed to the fedup kernel and it still blew up.

I'll give it a go with SELinux completely disabled and see how it behaves and try to work backwards from there.

Comment 31 James Hogarth 2015-03-06 22:40:23 UTC
So configuring the test system as permissive (and enforcing=0) incurs the crash as previously mentioned.

Running a test upgrade with the test system disabled and selinux=0 the upgrade happens mostly as expected.

It hung at the end after reboot target when it failed to unmount /sysroot:

About to execute: /bin/umount /sysroot -n
systemd-journald.service: Got notification message from PID 11735 (FDSTORE=1)
systemd-journald.service: got FDSTORE=1
systemd-journald.service: added fd to fd store.
Got disconnect on private connection.
Received SIGCHLD from PID 11742 (umount).
Child 11742 (umount) died (code=exited, status=32/n/a)
Child 11742 belongs to sysroot.mount
sysroot.mount mount process exited, code=exited status=32
sysroot.mount changed unmounting -> mounted
Job sysroot.mount/stop finished, result=failed

Forcing a restart of the VM started F22 as expected and booting after a relabel and setting selinux back to enforcing works.

Comment 32 James Hogarth 2015-03-06 23:23:06 UTC
Will the corefile being 216 doesn't seem to line up with that stacktrace you've pulled out...

It refers to src/shared/switch-root.c and the mkdir_p_label indeed but that's in 219:

https://github.com/systemd/systemd/blob/v219/src/shared/switch-root.c

In 216 switch-root was carried out at src/core/switch-root.c and used mkdir_p not mkdir_p_label:

https://github.com/systemd/systemd/blob/v216/src/core/switch-root.c

I'm popping some waits in the various upgrade shell scripts at the appropriate points to try and nail down exactly what is failing where...

It does start to look like the selinux policy database may need an update before fedup works for some reason though ... or systemd needs to handle the lack of label in the policy database for /system-upgrade-root/mnt used for the temporary pivot in a better way than a segfault somehow ;)

Comment 33 James Hogarth 2015-03-06 23:35:10 UTC
Ah my mistake there ...

Although it's systemd-216 in F21 by the RPM version I just grabbed the SRPM and it's on the 217 (plus patches) codebase... nice and confusing ;)

And indeed 217 is when switch-root was moved from core to shared:

https://github.com/systemd/systemd/blob/v217/src/shared/switch-root.c

And indeed it is carrying out the mkdir_p_label there ...

So it looks like this is the first time we're doing a switch-root with mkdir_p_label and with policy in place (since as you mentioned policy isn't loaded in the fedup initrd environment).

So either looking to sort out the policy issues or handling this edge case better should resolve the behaviour.

Comment 34 James Hogarth 2015-03-07 01:42:57 UTC
Created attachment 999038 [details]
backtrace with ssse3 disabled in the VM

So as a long shot somewhat given the previous backtrace I disabled SSSE3 in the VM and tried the upgrade. 

It failed with a similar backtrace (attached) with the strcmp in selabel_sub appearing to the the initial point of failure that then bubbles up the chain.

I'm inclined to think at this point it's not a systemd issue in and of itself but rather the change of mkdir_p to mkdir_p_label used by switch-root.c in the systemd-217 version in F21 has highlighted some underlying issue in libselinux since the labelling is now checked.

At any rate the actual initial failure appears to be in libselinux with the rest bubbling out from there...

For now a workaround of disabling selinux, carrying out the fedup process, enabling selinux and relabelling the filesystem appears to work to test the upgrade itself.

I'd suggest this gets reassigned to libselinux for the maintainers in that area to take a look.

Comment 35 James Hogarth 2015-03-09 11:04:57 UTC
I'm trying to put together a minimal test case of the same code path but annoyingly this *doesn't* fail like systemd does:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <selinux/label.h>
#include <sys/stat.h>

int main(int argc, char *argv[])  {
  char *path;
  int r;
  security_context_t filecon = NULL;
  static struct selabel_handle *label_hnd = NULL;

  path = argv[1];
  label_hnd = selabel_open(SELABEL_CTX_FILE, NULL, 0);

  freecon(filecon);


  r = selabel_lookup_raw(label_hnd, &filecon, path, S_IFDIR);

  if (r < 0 && errno == ENOENT) {
    r = 0;
    return 0;
  }
  if (r < 0) {
    r = -errno;
  }

  printf("%s\n", filecon);

return 0;
}

The function selabel_lookup_raw (which is the one called by mac_selinux_create_file_prepare in src/shared/selinux-util.c from mkdir_label) works fine in this minimal test case which I've tried to get reflecting the real one... 

Tested in both the fedup environment just before switch-root and just local in F21 ...

Starting to run out of ideas here.

Comment 36 Will Woods 2015-03-09 20:21:12 UTC
(In reply to James Hogarth from comment #31)
> So configuring the test system as permissive (and enforcing=0) incurs the
> crash as previously mentioned.

Yes. We still look up (and apply) labels in permissive mode. The system just doesn't enforce the rules.

> Running a test upgrade with the test system disabled and selinux=0 the
> upgrade happens mostly as expected.

Which makes sense because we don't have any policy to look up labels in.

> It hung at the end after reboot target when it failed to unmount /sysroot:
>
> [...]
> 
> Forcing a restart of the VM started F22 as expected and booting after a
> relabel and setting selinux back to enforcing works.

Unrelated; please file that in a separate bug if you want to track that.

(In reply to James Hogarth from comment #32)
> It does start to look like the selinux policy database may need an update
> before fedup works for some reason though ... or systemd needs to handle the
> lack of label in the policy database for /system-upgrade-root/mnt used for
> the temporary pivot in a better way than a segfault somehow ;)

Maybe the lookup is happening using the on-disk policy (which is F21's) but the *loaded* policy is F22s? I'm having trouble remembering the gritty details of how labels get loaded and applied, but I'd thought they got pulled out of the in-memory policy..

Comment 37 Will Woods 2015-03-10 17:51:00 UTC
*** Bug 1199849 has been marked as a duplicate of this bug. ***

Comment 38 Will Woods 2015-03-11 18:12:21 UTC
*** Bug 1200880 has been marked as a duplicate of this bug. ***

Comment 39 Will Woods 2015-03-11 22:54:55 UTC
Created attachment 1000710 [details]
Full backtrace with correct libselinux debuginfo

So here's an improved backtrace of that coredump, with with the correct libselinux debuginfo installed.

The proximal cause of the SEGV is that there's garbage in systemd's selinux labeling handle (label_hnd) - specifically, in label_hnd->dist_subs.

libselinux reads /etc/selinux/targeted/contexts/files/file_contexts.subs_dist
into the dist_subs linked list when you do selabel_open(SELABEL_CTX_FILE, ..).

Normally that list looks like this:

  {src[]="/var/roothome", slen=13, dst[]="/root", next=<ptr>}
  {src[]="/var/home",     slen=9,  dst[]="/home", next=<ptr>}
  ...12 more items...
  {src[]="/run",          slen=4,  dst[]="/var/run", next=NULL}

In the coredump, the dist_subs items look like this:

  {src[]="`",        slen=13, dst[]=" \222\253\002J\177", next=<ptr>}
  {src[]="\300\005", slen=9,  dst[]="\260\221\253\002J\177", next=<ptr>}
  ...10 more items...
  {src[]="\340",     slen=19, dst[]="Ќ\253\002J\177", next=<ptr>}

So: garbage data in src and dest, but the slen values are normal.. The item after this one, the 'next' pointer points back into itself(??), and then we get a garbage item that has src=0x0, which makes strncmp SEGV.

Comment 40 Will Woods 2015-03-13 20:04:07 UTC
I think this turns out to be a use-after-free bug in F21 systemd (v216).

When systemd v216 shuts down, it does label_finish() before switch_root():
  https://github.com/systemd/systemd/blob/v216/src/core/main.c#L1819-1845

But label_finish() doesn't set label_hnd=NULL:
  https://github.com/systemd/systemd/blob/v216/src/shared/label.c#L205

So mkdir_p_label() ends up using the freed label_hnd (which now contains a bunch of garbage), leading to the segfault.

In master that function is named mac_selinux_finish(), but it has the same bug:
  https://github.com/systemd/systemd/blob/master/src/shared/selinux-util.c#L118

I'm building a systemd to test the following fix:

diff --git a/src/shared/label.c b/src/shared/label.c
index 25a8b36..ecce6aa 100644
--- a/src/shared/label.c
+++ b/src/shared/label.c
@@ -203,6 +203,8 @@ void label_finish(void) {
 
         if (label_hnd)
                 selabel_close(label_hnd);
+
+        label_hnd = NULL;
 #endif
 }

Comment 41 Will Woods 2015-03-13 21:19:39 UTC
Created attachment 1001497 [details]
0001-Fix-SEGV-during-switch-root-from-system-with-SELinux.patch

Confirmed - the attached patch fixes the segfault. Test system is happily upgrading to F22 Alpha.

Patch is against systemd master but applies to F21 systemd just fine.

A scratch build should be available from koji soon:
  http://koji.fedoraproject.org/koji/taskinfo?taskID=9226067

I'll send the patch to the systemd-devel mailing list shortly.

Comment 42 Sandro Mathys 2015-03-14 12:32:34 UTC
+1 for the patch / scratch build. Issue reproduced and fix confirmed.

Comment 43 Fedora Update System 2015-03-15 04:00:41 UTC
systemd-219-9.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/systemd-219-9.fc22

Comment 44 Fedora Update System 2015-03-16 01:40:55 UTC
Package systemd-219-9.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-219-9.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-3949/systemd-219-9.fc22
then log in and leave karma (feedback).

Comment 45 Will Woods 2015-03-16 14:58:26 UTC
...that update won't fix the bug, though.

The actual bug is in F21 systemd (v216). It will SEGV any time you attempt to switch-root when SELinux policy has been loaded. (It just happens to be the case that fedup is the only thing that does this on a regular basis.)

This will require a F21 update to fix. (And I should probably make F21 fedup Require: the new systemd version.)

Comment 46 Will Woods 2015-03-16 15:10:45 UTC
To clarify - yes, the actual systemd bug needs fixing in both F21 and F22.
It's now fixed in F22, and upstream git:
  http://cgit.freedesktop.org/systemd/systemd/commit/?id=f5ce2b4
but *this* report concerns the failure to upgrade from F21 - so closing this bug requires a patch to F21 systemd.

(And probably I should make fedup Require the new systemd version, when that appears..)

Comment 47 Zbigniew Jędrzejewski-Szmek 2015-03-16 15:39:17 UTC
Yes, I know. I just didn't get to making the F21 update yet.

Comment 48 Will Woods 2015-03-16 15:52:07 UTC
No worries - just want to be sure everyone's clear! (There's been a lot of confusion about where the crash actually happened, etc.)

Comment 49 Chris Murphy 2015-03-16 18:14:18 UTC
(In reply to Will Woods from comment #45)
> The actual bug is in F21 systemd (v216).

Is the bug also in systemd-208-31.fc20, or in systemd-204-21.fc19? Officially we support F20->F22 upgrades. And I think we'd like to offer F19->F22 upgrades even though F19 isn't supported anymore.

Comment 50 Will Woods 2015-03-17 14:41:36 UTC
AFAICT no; the bug was introduced in this commit, which (as part of a refactor) replaced mkdir_p() with mkdir_p_label() in switch_root():

  http://cgit.freedesktop.org/systemd/systemd-stable/commit/?h=v216-stable&id=5a4bf02

So it should only be present in v216-stable/v217 and later systemd.

(Also, since what you're upgrading *to* is mostly irrelevant, if the bug was present in F19/F20 we would have hit this on upgrades to F20/F21..)

Comment 51 Mykola Dvornik 2015-03-18 20:21:29 UTC
Is it still planned to be solved? IMHO, this is one of the most serious bugs, since it ruins the entire QA idea.

Comment 52 Adam Williamson 2015-03-18 20:56:07 UTC
Mykola: https://bugzilla.redhat.com/show_bug.cgi?id=1185604#c47

Comment 53 Fedora Update System 2015-03-21 05:02:39 UTC
systemd-219-9.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 54 Adam Williamson 2015-03-21 05:39:23 UTC
As discussed above the F22 update does not fix this. Re-opening.

Comment 55 Fedora Update System 2015-03-26 14:30:33 UTC
systemd-216-23.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/systemd-216-23.fc21

Comment 56 Fedora Update System 2015-03-30 07:04:05 UTC
Package systemd-216-24.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-216-24.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-4991/systemd-216-24.fc21
then log in and leave karma (feedback).

Comment 57 Adam Williamson 2015-04-01 23:53:08 UTC
The fix for this has been verified by multiple testers, and is only awaiting a stable push.

Comment 58 Fedora Update System 2015-04-02 15:37:37 UTC
systemd-216-24.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.