Bug 2059633

Summary: systemd doesn't record messages to the journal during boot
Product: Red Hat Enterprise Linux 8 Reporter: Renaud Métrich <rmetrich>
Component: systemdAssignee: Michal Sekletar <msekleta>
Status: CLOSED ERRATA QA Contact: Frantisek Sumsal <fsumsal>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 8.5CC: anrussel, dtardon, fkrska, jamacku, msekleta, pdwyer, qguo, systemd-maint-list
Target Milestone: rcKeywords: Triaged, ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: systemd-239-71.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2164049 2164050 (view as bug list) Environment:
Last Closed: 2023-05-16 09:07:15 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2164049, 2164050    

Description Renaud Métrich 2022-03-01 15:15:18 UTC
Description of problem:

Many debugging messages such as "Activating default unit: xxx" are not recorded in the journal when booting, even with LogLevel=debug.

See below a "standard boot":
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# journalctl -b | grep "Activating default"
--> nothing
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

However, by specifying `systemd.log_target=kmsg` on the kernel command line, those messages are recorded (because they are pushed to kmsg buffer, then read by the journal later):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# dmesg | grep "Activating default"
[    1.138475] systemd[1]: Activating default unit: default.target
[   24.949661] systemd[1]: Activating default unit: default.target

# journalctl -b | grep "Activating default"
Mar 01 16:09:11 vm-relabel8 systemd[1]: Activating default unit: default.target
Mar 01 16:09:35 vm-relabel8 systemd[1]: Activating default unit: default.target
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

This is super annoying when debugging boot issues.


Version-Release number of selected component (if applicable):

systemd-239-45.el8.x86_64 or later

How reproducible:

Always

Steps to Reproduce:
1. Enable debug level `systemd.log_level=debug` on kernel command line
2. Boot

Actual results:

No "Activating default unit" message (it's at debug level)

Expected results:

"Activating default unit" message

Additional info:

Comment 1 David Tardon 2022-03-02 12:41:25 UTC
I can reproduce this reliably, except that changing the log target makes no difference.

Comment 2 Renaud Métrich 2022-03-02 12:59:13 UTC
"systemd.log_level=debug systemd.log_target=kmsg log_buf_len=15M printk.devkmsg=on" solves the issue for me.

which is expected since systemd then writes to /dev/kmsg, which already exists just after switching root.
Whereas for some time the /run/systemd/journal/dev-log socket is not there or somehow usable, which causes systemd to fallback to /dev/ksmg, as seen by stracing systemd early:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
13:50:07.182971 sendmsg(46<UNIX:[22636->12503]>, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="PRIORITY=7\nSYSLOG_FACILITY=3\nCODE_FILE=../src/core/service.c\nCODE_LINE=406\nCODE_FUNC=on_fd_store_io\nUNIT=systemd-journald.servic"..., iov_len=203}, {iov_base="MESSAGE=", iov_len=8}, {iov_base="systemd-journald.service: Received EPOLLHUP on stored fd 75 (stored), closing.", iov_len=78}, {iov_base="\n", iov_len=1}], msg_iovlen=4, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 290 <0.000007>

--> last message before switching root sent to dev-log socket

 :
13:49:31.725316 writev(3</dev/kmsg<char 1:11>>, [{iov_base="<31>", iov_len=4}, {iov_base="systemd", iov_len=7}, {iov_base="[1]: ", iov_len=5}, {iov_base="Activating default unit: default.target", iov_len=39}, {iov_base="\n", iov_len=1}], 5) = 56 <0.000006>
 :
13:49:31.799750 epoll_ctl(4<anon_inode:[eventpoll]>, EPOLL_CTL_ADD, 41<UNIX:[12491,"/run/systemd/journal/dev-log"]>, {EPOLLIN, {u32=2066902816, u64=93898641927968}}) = 0 <0.000010>

--> listening on the socket but systemd-journald not yet there

13:49:32.138939 epoll_ctl(4<anon_inode:[eventpoll]>, EPOLL_CTL_DEL, 41<UNIX:[12491,"/run/systemd/journal/dev-log"]>, NULL) = 0 <0.000003>

--> systemd-journald now there, stop listening on the socket

 :
13:49:32.139009 writev(3</dev/kmsg<char 1:11>>, [{iov_base="<31>", iov_len=4}, {iov_base="systemd", iov_len=7}, {iov_base="[1]: ", iov_len=5}, {iov_base="systemd-journald.socket: Changed listening -> running", iov_len=53}, {iov_base="\n", iov_len=1}], 5) = 70 <0.000002>

--> last message sent to /dev/kmsg

13:49:32.140149 sendmsg(46<UNIX:[22636->12503]>, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="PRIORITY=7\nSYSLOG_FACILITY=3\nCODE_FILE=../src/core/service.c\nCODE_LINE=3697\nCODE_FUNC=service_notify_message\nUNIT=systemd-journa"..., iov_len=212}, {iov_base="MESSAGE=", iov_len=8}, {iov_base="systemd-journald.service: Got notification message from PID 1097 (WATCHDOG=1)", iov_len=77}, {iov_base="\n", iov_len=1}], msg_iovlen=4, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 298 <0.000331>

--> first message sent to dev-log socket after switch root
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Comment 5 Renaud Métrich 2022-11-10 13:47:14 UTC
Hello,

Please fix this issue ASAP, it's making us (Support and Customers) lose a lot of time when troubleshooting boot issues, which are usually due to ordering cycles which do not appear because of that bug.

Renaud.

Comment 11 Plumber Bot 2023-01-27 12:00:56 UTC
fix merged to github main branch -> https://github.com/redhat-plumbers/systemd-rhel8/pull/358

Comment 17 errata-xmlrpc 2023-05-16 09:07:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (systemd bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2985