Bug 1715699

Summary: LiveOS boot, journalctl is missing many early messages
Product: [Fedora] Fedora Reporter: Chris Murphy <bugzilla>
Component: systemdAssignee: systemd-maint
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: bugzilla, fgrose, gmarr, kparal, lnykryn, mattdm, mkolman, msekleta, robatino, satellitgo, sgallagh, ssahani, s, systemd-maint, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: RejectedBlocker PrioritizedBug
Fixed In Version: systemd-243~rc1-1.fc31 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-07-30 20:35:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachments:
Description Flags
dmesg
none
journal without problem
none
F30 dmesg
none
F30 journal
none
journald forward to kmsg full none

Description Chris Murphy 2019-05-31 02:54:05 UTC
Description of problem:

About 20 seconds of early boot journal messages are missing when I invoke 'journalctl -b -o short-monotonic' in a LiveOS booted system

Version-Release number of selected component (if applicable):
systemd-242-3.git7a6d834.fc31.x86_64



How reproducible:
Always


Steps to Reproduce:
https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20190527.n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-Rawhide-20190527.n.0.iso
1. Boot the ISO in a UEFI qemu-kvm with boot params: systemd.log_level=debug
2. journalctl -b -o short-monotonic
3.

Actual results:

[root@localhost ~]# journalctl -b -o short-monotonic
-- Logs begin at Thu 2019-05-30 22:21:52 EDT, end at Fri 2019-05-31 02:21:50 EDT. --
[   21.466302] localhost systemd-tmpfiles[1091]: Running remove action for entry C /etc/nsswitch.conf
[   21.599077] localhost systemd-tmpfiles[1091]: Running remove action for entry C /etc/pam.d


Expected results:

I expect to see everything starting with what I see in dmesg

[root@localhost ~]# dmesg
[    0.000000] Linux version 5.2.0-0 .....




Additional info:


Coincidentally I get a hint when 'systemctl status zram-swap.service'

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.


Even though this is a LiveOS boot the journal has been flushed to "persistent" /var/log/journal, which is really a devicemapper rw overlay. But that really shouldn't matter.

[root@localhost ~]# ls -l /var/log/journal/32cb8f7ec57642fe807217f3cce87598/
total 24596
-rw-r-----+ 1 root systemd-journal 8388608 May 30 22:21 system@f08bc9b274404513902cc16bbd5a38b4-000000000000188e-00058a2906e6489c.journal
-rw-r-----+ 1 root systemd-journal 8388608 May 30 22:42 system.journal
-rw-r-----+ 1 root systemd-journal 8388608 May 30 22:26 user-1000.journal
[root@localhost ~]# 


OK? So system.journal got too big probably because of the debug option, and it get rotated, which is fine, but why is it dropping all of these messages? Could it have something to do with time confusion? The LiveOS VM starts with some idea of time, and during boot I'm guessing chronyd gets the correct time.


[root@localhost ~]# journalctl --list-boots
 0 6089528010014ef2a1ab62e80438132c Fri 2019-05-31 02:21:37 EDT—Thu 2019-05-30 22:44:20 EDT
[root@localhost ~]# 


dmesg shows the lines right before during and after the first line reported by journalctl; doesn't seem related.

[   20.607527] iTCO_wdt: Found a ICH9 TCO device (Version=2, TCOBASE=0x0660)
[   20.631571] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
[   22.571700] Rounding down aligned max_sectors from 4294967295 to 4294967288
[   22.587210] db_root: cannot open: /etc/target
[   23.037280] iscsi: registered transport (iser)

There are a ton of systemd messages, all the same:

[   23.639817] systemd-journald[933]: Journal effective settings seal=no compress=yes compress_threshold_bytes=512B


If I tar those three journal files, and move them to a Fedora 30 system with systemd-241-8.git9ef65cb.fc30.x86_64, and point journalctl -D to that dir containing the journal files, I get the same result. So it's like they really aren't there and somehow they really did get dropped.

Comment 1 Chris Murphy 2019-05-31 02:55:02 UTC
Created attachment 1575476 [details]
dmesg

Comment 2 Chris Murphy 2019-05-31 03:40:49 UTC
Unfortunately the 'virsh console' trick using either systemd.log_target=console or systemd.journald.forward_to_console=1 does not help. There is too big of a gap between the end of what console shows, and the point where the journal somehow gets reset/dropped.


[   44.318819] 8021q: adding VLAN 0 to HW filter on device enp1s0

Fedora 31 (Workstation Edition)
Kernel 5.2.0-0.rc1.git3.1.fc31.x86_64 on an x86_64 (ttyS0)

localhost login: root
[   61.443308] fuse: init (API version 7.30)
[root@localhost ~]# journalctl -b -o short-monotonic
-- Logs begin at Thu 2019-05-30 23:36:50 EDT, end at Thu 2019-05-30 23:36:56 ED>
[   55.359850] localhost chronyd[1511]: Selected source 69.164.213.136
[   55.391811] localhost chronyd[1511]: System clock TAI offset set to 37 secon>
[   55.392021] localhost chronyd[1511]: System clock wrong by -14399.943210 sec>
[   55.392092] localhost chronyd[1511]: System clock was stepped by -14399.9432>
[   55.401592] localhost systemd-machined[1198]: Got message type=signal sender>


Why is the clock this wrong in a VM? I'd expect it gets its "hardware" clock time from the host.

Comment 3 Chris Murphy 2019-05-31 05:27:48 UTC
Created attachment 1575555 [details]
journal without problem

1 out of ~20 attempts, the problem doesn't happen. A race?

[   11.674192] localhost dracut-initqueue[622]: + dd if=/dev/null of=/overlay bs=1024 count=1 seek=33554432
[   11.674192] localhost dracut-initqueue[622]: + '[' -n '' -a -n '' ']'
[   11.783653] localhost dracut-initqueue[622]: ++ losetup -f --show /overlay
[   11.789944] localhost dracut-initqueue[622]: + OVERLAY_LOOPDEV=/dev/loop2
[   11.789944] localhost dracut-initqueue[622]: + over=/dev/loop2
[   11.789944] localhost dracut-initqueue[622]: + '[' -z '' ']'
[   11.789944] localhost dracut-initqueue[622]: + '[' -n '' ']'
[   11.789944] localhost dracut-initqueue[622]: + base=/dev/loop1
[   11.789944] localhost dracut-initqueue[622]: + '[' -n '' ']'
[   11.789944] localhost dracut-initqueue[622]: + '[' -z '' ']'
[   11.855020] localhost dracut-initqueue[622]: + echo 0 13635584 snapshot /dev/loop1 /dev/loop2 PO 8

This is suspicious. What is this?

And later...

[liveuser@localhost ~]$ losetup
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE                               DIO LOG-SEC
/dev/loop1         0      0         0  1 /LiveOS/rootfs.img                        0     512
/dev/loop2         0      0         0  0 /overlay (deleted)                        0     512
/dev/loop0         0      0         0  1 /run/initramfs/live/LiveOS/squashfs.img   0     512


So a temporary overlay that's later deleted? I don't see what deletes it. Or if it gets used for anything.

Comment 4 Chris Murphy 2019-06-06 04:14:52 UTC
I can reproduce this in a default virt-manager qemu-kvm (UEFI) and booting the first menu entry without any additional options. First journalctl entry is 15.852705 seconds.

Comment 5 Chris Murphy 2019-06-06 04:16:11 UTC
Comment 4, I used Fedora-Workstation-Live-x86_64-30-1.2.iso

So it's not a new problem...

Comment 6 Fedora Blocker Bugs Application 2019-06-06 04:16:28 UTC
Proposed as a Blocker for 31-beta by Fedora user chrismurphy using the blocker tracking app because:

 Seems to violate basic criterion.
https://fedoraproject.org/wiki/Basic_Release_Criteria

A system logging infrastructure must be available, enabled by default, and working.

Comment 7 Chris Murphy 2019-06-19 01:43:04 UTC
Created attachment 1581997 [details]
F30 dmesg

systemd.log_level=debug systemd.log_target=kmsg log_buf_len=10M rd.debug rd.udev.debug

This excerpt shows journal files being deleted, that seems bad.

[   24.904819] systemd-journald[922]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
[   24.908286] systemd-journald[922]: Reserving 147626 entries in hash table.
[   24.910792] systemd-journald[922]: Vacuuming...
[   24.921802] systemd-journald[922]: Deleted archived journal /var/log/journal/05d0a9c86a0e4bbcb36c5e0082b987ee/system@818f3f40f19849e08a1b37b9c1e304f1-0000000000000001-00058ba31bec725e.journal (8.0M).
[   24.921808] systemd-journald[922]: Vacuuming done, freed 8.0M of archived journals from /var/log/journal/05d0a9c86a0e4bbcb36c5e0082b987ee.

Comment 8 Chris Murphy 2019-06-19 01:48:08 UTC
Created attachment 1581998 [details]
F30 journal

F30 journal

systemd.log_level=debug systemd.log_target=kmsg log_buf_len=10M rd.debug rd.udev.debug

This is the entire journal for the same boot as comment 7's dmesg attachment. The journal begins at:
[   24.901381] localhost-live systemd[1]: Time has been changed

This really seems like a systemd-journald bug that it's deleting journals, but then I also wonder if it's appropriate for install media boots to store journals in /var rather than /run, just because why write to persistent media?

Comment 9 Chris Murphy 2019-06-19 03:22:03 UTC
/var/log/journal exists on LiveOS media and therefore systemd-journald flushes there; for whatever reason I can't figure out, it thinks the ~750MB / is small enough that in some cases it vacuums older journals, and I suspect that's why the journal is missing entries. They've been deleted.

/var/log/journal does not exist on netinstalls, instead /run/log/journal is used, and this problem doesn't reproduce there.

I've asked on systemd-devel@ list if /var/log/journal should exist with LiveOS install media. If not, then two things:

a anaconda when creating Live media, needs to not create /var/log/journal
b anaconda when installing to a real system, needs to create /var/log/journal

And then sytsemd-journald's storage default of auto will use the correct location in each case.

Comment 10 Chris Murphy 2019-06-20 17:28:28 UTC
Upstream discussion.
https://lists.freedesktop.org/archives/systemd-devel/2019-June/042941.html

Comment 11 Stephen Gallagher 2019-07-15 13:39:14 UTC
I don't think I would block on this; the logs work once the system is up and running, so it's really only the early-boot stuff that gets lost. Bad, but not necessarily worth blocking on (it won't matter to 99% of users). It's absolutely a bug worth fixing, but I don't think it's worth slipping the release for (particularly since we know from this conversation that it has been like this since at least F30 if not earlier).

Comment 12 Chris Murphy 2019-07-15 14:50:13 UTC
I agree regular users aren't likely affected by this.

But QA could at anytime run into an early startup bug, and unravelling it will be made difficult by this bug. Maybe this is a "conditional blocker" where a future condition could mean this bug is blocker worthy? I'm not sure how that helps progress the bug though. If QA finds itself in need of these early journal messages, now they're stuck waiting for two fixes

Summary for the blocker review meeting since this bug is a bit of a mess:

1.  for some unknown reason systemd-journald thinks from the outset there's no space for journals.

[    9.179050] localhost systemd-journald[933]: System journal
(/var/log/journal/05d0a9c86a0e4bbcb36c5e0082b987ee) is 8.0M, max 8.0M,
0B free.

2. that means only one system journal can exist at a time

3. anything that causes system journal to reach 8+ MiB triggers journal rotation, and the current journal is erased, e.g. using any of systemd, dracut, or udev debug options. It's likely any process that spews a lot of messages also will trigger this, i.e. anaconda itself dumps quite a lot of messages into the journal during the installation process.


This is the last upstream message on the issue.
https://lists.freedesktop.org/archives/systemd-devel/2019-June/042950.html


And hilariously, trying to debug this bug is hindered by the bug.

Comment 13 Chris Murphy 2019-07-15 15:05:30 UTC
It's kinda misleading to say only early debug messages are dropped. That's true in my case.

But anything that produces many messages can cause journal rotation and loss of the current journal's entries. There could be a late problem with installation, a crash, late device discovery - that could only at that time bump the journal over 8MiB, triggering journal rotation, leading to *all* prior messages up to that point being lost.

Comment 14 Geoffrey Marr 2019-07-15 19:04:45 UTC
Discussed during the 2019-07-15 blocker review meeting: [1]

The decision to classify this bug as a "RejectedBlocker" was made as we do not believe it breaks the release criteria, it's just an unfortunate issue. The system logging function is present and does broadly work. This can be re-considered if further information or a different justification arises.

[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2019-07-15/f31-blocker-review.2019-07-15-16.05.txt

Comment 15 Chris Murphy 2019-07-15 23:57:04 UTC
This bug now happens without any debug options set - just boot the image choosing "Start Fedora..."

Fedora-Workstation-Live-x86_64-Rawhide-20190712.n.0.iso
Fedora-Workstation-Live-x86_64-Rawhide-20190715.n.0.iso

# journalctl -o short-monotonic
-- Logs begin at Mon 2019-07-15 17:10:35 EDT, end at Mon 2019-07-15 21:10:34 EDT. --
[   25.172215] localhost-live systemd[1486]: Starting D-Bus User Message Bus Socket.

Everything before that is missing. I don't see how that is consistent with the "and working" part of the release criterion.

I also just realized I reported this in bug 1676333 during the Fedora 30 cycle, but must've lost track of it and never nominated it as a blocker before. But the idea all bugs get grandfathered and can never be blocker bugs, is not reasonable.

The interesting thing in that dup bug is: "OK I'm not seeing this with systemd-241~rc2-2.fc30.x86_64 in the 20190211 Live media."

Comment 16 Chris Murphy 2019-07-15 23:58:39 UTC
*** Bug 1676333 has been marked as a duplicate of this bug. ***

Comment 17 Chris Murphy 2019-07-19 02:42:31 UTC
Created attachment 1591900 [details]
journald forward to kmsg full

1. Boot this in VM (qemu-kvm) Fedora-Workstation-Live-x86_64-Rawhide-20190715.n.0.iso, which has systemd-242-5.git7a6d834.fc31.x86_64

2. # cat /proc/cmdline
BOOT_IMAGE=/images/pxeboot/vmlinuz root=live:CDLABEL=Fedora-WS-Live-rawh-20190715-n-1 rd.live.image rd.debug rd.udev.debug systemd.log_level=debug systemd.log_target=kmsg log_buf_len=16M printk.devkmsg=on

3. # df -h
...
/dev/mapper/live-rw  6.4G  5.7G  648M  91% /

----excerpts from attached with comments----

30,17657,11742052,-;systemd[1]: Starting Switch Root...
...
47,19636,16754831,-;systemd-journald[905]: Fixed min_use=1.0M max_use=648.7M max_size=81.0M min_size=512.0K keep_free=973.1M n_max_files=100

The max_use matches with `df` but the keep_free makes no sense to me where it's coming from. That's above the available space. That's not reasonable.

Comment 18 Zbigniew Jędrzejewski-Szmek 2019-07-19 17:10:56 UTC
https://github.com/systemd/systemd/pull/13120

Comment 19 Fedora Blocker Bugs Application 2019-07-23 16:32:53 UTC
Proposed as a Blocker for 31-beta by Fedora user chrismurphy using the blocker tracking app because:

 See comment 15, the problem happens on Workstation Live without any boot time modifications, and it is a release blocking image, and logging clearly doesn't work.

Comment 20 Matthew Miller 2019-07-26 19:45:19 UTC
It seems likely that the patch in comment #18 will address this, making it irrelevant,  but in the event it doesn't, I suggest we make this a prioritized bug and not a blocker.

Comment 21 Chris Murphy 2019-07-26 22:28:15 UTC
I think the systemd keep_free behavior is the most central bug, but it's triggered because the free space on Fedora Lives has been progressively diminishing.

From `df -h` "avail" column
Fedora 31 (20190715), 656M
Fedora 30, 859M
Fedora 29, 1.1G
Fedora 28, 1.5G

The root file system volume size is hard set on line 13:
https://pagure.io/fedora-kickstarts/blob/master/f/fedora-live-workstation.ks

But the OS itself (the payload if you will) keeps getting bigger, so free space keeps shrinking each release. The two issues collided starting with Fedora 30, where this bug is triggered only if I enabled debug options. That causes the journal to get bigger faster, triggering rotation and the aggressive vacuuming. Then in Fedora 31, latest rawhides, this now happens on a normal boot without adding any debug options.

For instance, if I use l-i-t-d to create media with persistent overlay, the problem still happens. But if I then only change journald.conf to set SystemKeepFree=10M and reboot, the problem doesn't happen.

Therefore two ways to make sure this doesn't actually block:
a. fix systemd
b. increase root file system size

Comment 22 Geoffrey Marr 2019-07-29 18:28:11 UTC
Discussed during the 2019-07-29 blocker review meeting: [1]

The decision to classify this bug as a "RejectedBlocker" was made as this is not a blocker as the logging requirement applies only post-install. Opinions on whether this bug should be a blocker if the criteria were to apply to live environments were split, so we may revisit the bug once more if the criteria are changed and the bug is not fixed by then.

[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2019-07-29/f31-blocker-review.2019-07-29-16.02.txt

Comment 23 Kamil Páral 2019-07-30 08:38:44 UTC
Nominating for a PrioritizedBug, since this was rejected as a blocker. Chris, please provide a short summary of the problem, thanks.

Comment 24 Zbigniew Jędrzejewski-Szmek 2019-07-30 20:35:31 UTC
Built in rawhide.