Bug 1702358 - Booting ends in shutdown
Summary: Booting ends in shutdown
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-04-23 14:54 UTC by Zdenek Kabelac
Modified: 2019-04-29 01:41 UTC (History)
6 users (show)

Fixed In Version: systemd-241-8.git9ef65cb.fc30
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-29 01:41:48 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
serial console log from failing boot (64.01 KB, text/plain)
2019-04-23 14:54 UTC, Zdenek Kabelac
no flags Details
Console screen capture (180.02 KB, image/jpeg)
2019-04-24 10:27 UTC, Zdenek Kabelac
no flags Details
Log of succesful boot for comparision (262.26 KB, text/plain)
2019-04-24 10:30 UTC, Zdenek Kabelac
no flags Details
gdb bt full (6.24 KB, text/plain)
2019-04-24 20:46 UTC, Zdenek Kabelac
no flags Details
/usr/lib/dracut/modules.d/01systemd-debug-shell/module-setup.sh (619 bytes, text/plain)
2019-04-24 21:01 UTC, Zdenek Kabelac
no flags Details

Description Zdenek Kabelac 2019-04-23 14:54:24 UTC
Created attachment 1557734 [details]
serial console log from failing boot

Description of problem:

From today's upgrade of my Rawhide's systemd  it seems my 5.1-rcX kernel no longer boots and ends in some strange shutdown sequence.

It seems I can boot 5.0 kernel and 4.20 I still have on my desktop.

I can also boot  5.1-rcX kernel when I use 'trick' with  rd.break=pre-pivot and just typing exit (ctrl+d) - in this case I also normally boot.

I'm attaching grab of serial console debug log from failing boot.


Version-Release number of selected component (if applicable):
systemd-242-2.fc31.x86_64  
kernel 5.1.0-0.rc5.git2.2.fc31.x86_64

How reproducible:


Steps to Reproduce:
1. trying boot
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Zbigniew Jędrzejewski-Szmek 2019-04-23 15:35:51 UTC
Do you have kernel-modules-5.1.0 installed?

Comment 2 Zbigniew Jędrzejewski-Szmek 2019-04-23 19:03:52 UTC
I thought I could reproduce this with kernel-core-5.1.0-0.rc5.git2.1.fc31.x86_64, but that turned out to be a false positive. (I installed kernel-core, but not kernel-modules, so I was missing i915.ko, so gdm would start and fail. But apart from missing graphical session, everything seems to be in order. Once I provide the missing modules, I get the full graphical session without issue.)

In the attached trace, the "shutdown sequence" is just preparation for the switch from initramfs to real root.
Normally, we would expect a message from the new systemd instance soon after that.
I have no idea why it doesn't appear in the log. The simplest possibility is that the
new systemd instance does not log to the serial console. Do you see anything more on
the tty?

Comment 3 Zdenek Kabelac 2019-04-24 10:27:37 UTC
Created attachment 1558136 [details]
Console screen capture

The only visible message getting afterwards is  'random: crng init done'  (not even sure why it takes 44 seconds??)

(Hardware is Lenovo T61 C2D, 4G RAM)

Comment 4 Zdenek Kabelac 2019-04-24 10:30:23 UTC
Created attachment 1558137 [details]
Log of succesful boot for comparision

Just for that case it would help - this is the log of 'working'  pre-pivot boot.

Comment 5 Zbigniew Jędrzejewski-Szmek 2019-04-24 13:58:55 UTC
What happens if you add init=/bin/bash ? You should get a shell in the real root file system. Then 'exec /sbin/init' should bring up the machine.

Comment 6 Zdenek Kabelac 2019-04-24 20:44:01 UTC
So I've made longer 'testing' session with many new results:

1.)  When I've accidentally regenerated initramfs for my 5.0  kernel - the problem started to show-up with this kernel as well - this gives big hint the failure is likely caused by new systemd (as kernel has not changed).

2.)  When trying to capture trace with serial console and using 'debug' kernel parameter - it often makes the boot to pass - so it's some sort of race (I'll attach later main areas to look at).

3.) Thanks to mschmidt@  he provided me a dracut module for having 'Alt+F9' systemd debug shell available in dracut - great feature in fact - this allowed further decrypt that in case of 'blocked' boot there is  100% CPU occupied by systemd doing a simple 'accept/poll' busy loop.

As there is no easy way to actually run gdb inside ramdisk (even when /sysroot is mounted) - I've used 'kill-ABRT 1' trick to get coredump out of systemd with this shortend result (I'll attach full)

(gdb) bt 
#0  0x00007f1e080811ab in kill () at ../sysdeps/unix/syscall-template.S:78
#1  0x000055c305710b8f in crash (sig=6) at ../src/core/main.c:198
#2  <signal handler called>
#3  0x00007f1e081452ab in accept4 (fd=fd@entry=54, addr=addr@entry=..., addr_len=addr_len@entry=0x0, 
    flags=flags@entry=526336) at ../sysdeps/unix/sysv/linux/accept4.c:32
#4  0x00007f1e07e82ca5 in flush_accept (fd=54) at ../src/basic/socket-util.c:1245
#5  0x000055c3056b7b3c in flush_ports (s=<optimized out>) at ../src/core/socket.c:2259
#6  socket_enter_running (cfd=-1, s=<optimized out>) at ../src/core/socket.c:2281
#7  socket_dispatch_io (source=<optimized out>, fd=<optimized out>, revents=<optimized out>, 
    userdata=<optimized out>) at ../src/core/socket.c:2978
#8  0x00007f1e07e14137 in source_dispatch (s=s@entry=0x55c30732e740) at ../src/libsystemd/sd-event/sd-event.c:2821
#9  0x00007f1e07e1441d in sd_event_dispatch (e=0x55c3072b2310) at ../src/libsystemd/sd-event/sd-event.c:3234
#10 0x00007f1e07e145a8 in sd_event_run (e=0x55c3072b2310, timeout=18446744073709551615)
    at ../src/libsystemd/sd-event/sd-event.c:3291
#11 0x000055c3057102c9 in manager_loop (m=<optimized out>) at ../src/core/manager.c:2928
#12 invoke_main_loop (m=<optimized out>, ret_reexecute=<optimized out>, ret_retval=<optimized out>, 
    ret_shutdown_verb=<optimized out>, ret_fds=<optimized out>, ret_switch_root_dir=<optimized out>, 
    ret_switch_root_init=<optimized out>, ret_error_message=<optimized out>) at ../src/core/main.c:1838
#13 0x000055c305663591 in main (argc=4, argv=0x7fff6e228b78) at ../src/core/main.c:2621

Comment 7 Zdenek Kabelac 2019-04-24 20:46:27 UTC
Created attachment 1558390 [details]
gdb  bt full

Comment 8 Zdenek Kabelac 2019-04-24 20:51:19 UTC
End of 'FAILING' boot:

[  OK  ] Stopped target Remote File Systems (Pre).
         Starting Setup Virtual Console...
[  OK  ] Stopped target System Initialization.
[  OK  ] Stopped Create Volatile Files and Directories.
[  OK  ] Stopped target Local File Systems.
[  OK  ] Stopped Apply Kernel Variables.
         Stopping udev Kernel Device Manager...
[  OK  ] Stopped udev Coldplug all Devices.
[  OK  ] Stopped target Swap.
[  OK  ] Stopped target Slices.
[    9.392162] systemd-vconsole-setup[319]: Successfully forked off '(setfont)' as PID 322.
[    9.394837] audit: type=1131 audit(1556118185.423:12): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-udev-trigger comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    9.700601] systemd-vconsole-setup[317]: /usr/bin/setfont succeeded.
[    9.700617] systemd-vconsole-setup[317]: Executing "/usr/bin/loadkeys -q -C /dev/tty1 -u us"...
[    9.701001] systemd-vconsole-setup[317]: Successfully forked off '(loadkeys)' as PID 324.
[    9.736930] systemd-vconsole-setup[319]: /usr/bin/setfont succeeded.
[    9.736944] systemd-vconsole-setup[319]: Executing "/usr/bin/loadkeys -q -C /dev/tty1 -u us"...
[    9.737239] systemd-vconsole-setup[319]: Successfully forked off '(loadkeys)' as PID 326.
[    9.824159] systemd-vconsole-setup[317]: /usr/bin/loadkeys succeeded.
[  OK  ] Started Plymouth switch root service.
[  OK  ] Started Cleaning Up and Shutting Down Daemons.
[  OK  ] Stopped udev Kernel Device Manager.
[    9.889977] systemd-vconsole-setup[319]: /usr/bin/loadkeys succeeded.
[   28.903834] random: crng init done
[   28.904790] random: 7 urandom warning(s) missed due to ratelimiting






Same part of boot that is passing just fine

[    8.745716] systemd-vconsole-setup[283]: Sysfs UTF-8 flag enabled
[    8.754543] systemd-vconsole-setup[283]: UTF-8 kbdmode enabled on /dev/tty1
[    8.756721] systemd-fstab-generator[286]: Found entry what=/dev/disk/by-uuid/dc4977c8-a38f-41c1-81bf-4f3ad41374d4 where=/sysroot type=n/a
[    8.761541] systemd-vconsole-setup[283]: Executing "/usr/bin/setfont -C /dev/tty1 latarcyrheb-sun16"...
[    8.777505] systemd-fstab-generator[286]: Creating /run/systemd/generator/systemd-fsck-root.service
[    8.784026] systemd-vconsole-setup[283]: Successfully forked off '(setfont)' as PID 293.
[    8.799056] systemd-fstab-generator[286]: Parsing /etc/fstab...
[    8.807516] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    8.809231] systemd-fstab-generator[286]: Parsing /sysroot/etc/fstab...
[    8.823977] systemd-vconsole-setup[283]: /usr/bin/setfont succeeded.
[    8.831478] systemd-vconsole-setup[283]: Executing "/usr/bin/loadkeys -q -C /dev/tty1 -u us"...
[    8.842012] systemd-vconsole-setup[283]: Successfully forked off '(loadkeys)' as PID 305.
[    8.973055] systemd-vconsole-setup[283]: /usr/bin/loadkeys succeeded.
[    8.988779] systemd-vconsole-setup[283]: KD_FONT_OP_GET failed while trying to get the font metadata: Function not implemented
[    9.008707] systemctl[282]: Got message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=401 reply_cookie=1 signature=n/a error-name=n/a error-message=n/a
[    9.008948] systemd-vconsole-setup[283]: Fonts will not be copied to remaining consoles
[    9.027843] systemctl[282]: Bus n/a: changing state RUNNING → CLOSED
[    9.046455] systemd[314]: initrd-parse-etc.service: Executing: /usr/bin/systemctl --no-block start initrd-fs.target
[    9.064348] systemctl[314]: Bus n/a: changing state UNSET → OPENING
[    9.064680] [drm] RC6 disabled, disabling runtime PM support
[    9.073072] systemctl[314]: Bus n/a: changing state OPENING → AUTHENTICATING
[    9.076910] [drm] Initialized overlay support.
[    9.083876] systemctl[314]: Executing dbus call org.freedesktop.systemd1.Manager StartUnit(initrd-fs.target, replace)
[    9.098984] systemctl[314]: Bus n/a: changing state AUTHENTICATING → RUNNING
[    9.100073] [drm] Initialized i915 1.6.0 20190207 for 0000:00:02.0 on minor 0
[    9.106481] systemctl[314]: Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 signature=ss error-name=n/a error-message=n/a
[    9.113818] ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)
[    9.136957] systemctl[314]: Got message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=2 reply_cookie=1 signature=o error-name=n/a error-message=n/a
[    9.145082] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input8
[    9.162440] systemctl[314]: Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=GetUnit cookie=2 reply_cookie=0 signature=s error-name=n/a error-message=n/a
[    9.194828] systemctl[314]: Got message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=3 reply_cookie=2 signature=o error-name=n/a error-message=n/a
[    9.213738] systemctl[314]: Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1/unit/initrd_2dfs_2etarget interface=org.freedesktop.DBus.Properties member=Get cookie=3 reply_cookie=0 signature=ss error-name=n/a error-message=n/a
[    9.238955] systemctl[314]: Got message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=4 reply_cookie=3 signature=v error-name=n/a error-message=n/a

Comment 9 Zdenek Kabelac 2019-04-24 20:56:50 UTC
If I'm not mistaken the busy-loop in systemd was showing EOPNOTSUPP as error code.

Anyway - does anything of this make now some more sense ? 

Should I focus on something particular ?

Comment 10 Zdenek Kabelac 2019-04-24 21:01:56 UTC
Created attachment 1558396 [details]
/usr/lib/dracut/modules.d/01systemd-debug-shell/module-setup.sh

File to placed in dracut modules.d subdir.

Generated ramdisk is then capable to make available 'Alt+F9' systemd's debug shell (so it works before rootfs switch).

All credits goes to mschmidt.

Comment 11 Zbigniew Jędrzejewski-Szmek 2019-04-24 21:49:38 UTC
Looks like https://github.com/systemd/systemd/pull/12346. Thanks for debugging this.

Comment 12 Fedora Update System 2019-04-26 11:30:54 UTC
systemd-241-8.git9ef65cb.fc30 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-7322053e74

Comment 13 Fedora Update System 2019-04-27 22:30:28 UTC
systemd-241-8.git9ef65cb.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-7322053e74

Comment 14 Fedora Update System 2019-04-29 01:41:48 UTC
systemd-241-8.git9ef65cb.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.