Red Hat Bugzilla – Bug 1294002
failed to look up boot -1 no such boot ID in journal
Last modified: 2016-12-04 23:45 EST
Created attachment 1109064 [details]
Description of problem:
Using -b switch I can't see previous boots, I just have to omit the -b switch and then scroll through a long pile to find what I'm looking for.
[root@f23a ~]# journalctl --list-boots
-5 eb5bc3dfefb849a5935ace8474373181 Sun 2015-12-20 03:44:32 MST—Sun 2015-12-20 03:47:58 MST
-4 22feafe74d6a4b9d940459cecc96dbf4 Sun 2015-12-20 21:25:41 MST—Mon 2015-12-21 00:13:10 MST
-3 99f2036a416446d8a2f0f49e2ca6d05a Mon 2015-12-21 01:03:24 MST—Mon 2015-12-21 01:04:13 MST
-2 78ddbd9a3e5244e29a230dc0b15ad2a5 Tue 2015-12-22 12:07:02 MST—Tue 2015-12-22 22:23:07 MST
-1 6b67a7de83a54b9d8be283dcccce9eef Wed 2015-12-23 09:22:54 MST—Wed 2015-12-23 13:29:58 MST
0 e9db32cba47c45029752b628b6379097 Wed 2015-12-23 17:46:53 MST—Wed 2015-12-23 17:56:52 MST
So journalctl -b-1 should show the last boot but it fails.
Version-Release number of selected component (if applicable):
With the journal files I currently have, always.
If I erase them and start clean, and never have any crashes, then -b-1, -b-2 etc will work OK. But as soon as there's an unclean shutdown, there is some kind of journal problem always. It might be this problem as reported, or it might be journal entries out of order.
Steps to Reproduce:
1. journalctl -b-1
Failed to look up boot -1: No such boot ID in journal
Since other options appear to find the journal entries (they aren't completely corrupted, I can find them other ways), this one ought to work also so there must be some cross indexing or hint that broke in these log files.
journalctl by itself show all the boots in one long list that I can appear to scroll through. Also this seems to work:
[root@f23a ~]# journalctl --since 2015-12-23
You can see at the end that something (I think it was lvmthinp) soaked all the memory and resulted in a bunch of oom killers that never worked enough to get back control so the powerkey was used (that's the "crash" in this case).
The journal logs are on btrfs on a hard drive partition, default mount options, nothing fancy. Btrfs has no complaints on mount, scrub comes up clean also.
HUH. This is vaguely interesting, and also not expected behavior.
If I use 'journalctl --since 2015-12-23' it ends with:
Dec 23 17:42:01 f23a.localdomain systemd-logind: Power key pressed.
lines 11375-11429/11429 (END)
But this is not the end of the journal. There is another boot after this which I can access with 'journalctl -b' and starts with
Dec 23 17:46:53 localhost systemd-journal: Runtime journal is using 8.0M (max allowed 193.6M, trying to leave 290.4M free of 1.8G available → current limit 193.6M).
I would expect --since 2015-12-23 would eventually roll into this boot, but instead it ends at a previous boot.
And --list-boots lists the current boot, which --since isn't rolling into.
Yeah, the root issue seems to be this:
Can't iterate through firstname.lastname@example.org~, ignoring: Bad message
File email@example.com~ removed.
Failed to look up boot -1: No such boot ID in journal
So we find the journal file we want to show, but than we refuse to show it because it is corrupted. Instead, we should show as much as possible. This would help in various other cases, not only here.
First thing is to fix journalctl to actually allow looking at files:
This is still a problem in systemd-229-9.fc24.x86_64 where -b-1, -b-2, -b-3, -b-4 are showing maybe a dozen lines rather than showing the actual start to finish boot that includes all the bios and memory initialization stuff from the kernel, initramfs unpacking, systemd and services starting.
For example -b-4
[root@f24s ~]# journalctl -b-4
-- Logs begin at Sat 2016-01-30 22:04:43 MST, end at Wed 2016-08-10 08:40:16 MDT. --
Aug 01 22:30:46 f24s systemd: Reached target Timers.
Aug 01 22:30:46 f24s systemd: Starting D-Bus User Message Bus Socket.
Aug 01 22:30:46 f24s systemd: Reached target Paths.
Aug 01 22:30:47 f24s systemd: Listening on D-Bus User Message Bus Socket.
Aug 01 22:30:47 f24s systemd: Reached target Sockets.
Aug 01 22:30:47 f24s systemd: Reached target Basic System.
Aug 01 22:30:47 f24s systemd: Reached target Default.
Aug 01 22:30:47 f24s systemd: Startup finished in 146ms.
Aug 01 22:31:20 f24s systemd: Stopped target Default.
Aug 01 22:31:20 f24s systemd: Stopped target Basic System.
Aug 01 22:31:20 f24s systemd: Stopped target Paths.
Aug 01 22:31:20 f24s systemd: Stopped target Sockets.
Aug 01 22:31:20 f24s systemd: Closed D-Bus User Message Bus Socket.
Aug 01 22:31:20 f24s systemd: Reached target Shutdown.
Aug 01 22:31:20 f24s systemd: Starting Exit the Session...
Aug 01 22:31:20 f24s systemd: Stopped target Timers.
Aug 01 22:31:21 f24s systemd: Received SIGRTMIN+24 from PID 2234 (kill).
That's it, and it's the wrong date.
-4 c9d3d01eb9a9483aa9f295bc22b96415 Wed 2016-07-13 11:28:18 MDT—Thu 2016-07-14 21:23:55 MDT
Boot -4 is Jul 13, two weeks before the information -b-4 switch is showing. So it's still totally busted near as I can tel.
If I use # journalctl --since="2016-07-13 11:28:18" copy/pasting directly from the list-boots output, I get exactly what I want. So the info I want is clearly in the journal files, it's clearly accessible by --since, but not by -b.
It's even worse with current systemd git :(
$ ./journalctl -D /var/tmp/d98f033569044c0887c819c792012e6c --list-boots
-9 eb5bc3dfefb849a5935ace8474373181 Sun 2015-12-20 05:44:32 EST—Sun 2015-12-20 05:47:58 EST
-8 22feafe74d6a4b9d940459cecc96dbf4 Sun 2015-12-20 23:25:41 EST—Mon 2015-12-21 02:13:10 EST
-7 99f2036a416446d8a2f0f49e2ca6d05a Mon 2015-12-21 03:03:24 EST—Mon 2015-12-21 03:04:13 EST
-6 78ddbd9a3e5244e29a230dc0b15ad2a5 Tue 2015-12-22 14:07:02 EST—Wed 2015-12-23 00:23:07 EST
-5 6b67a7de83a54b9d8be283dcccce9eef Wed 2015-12-23 11:22:54 EST—Wed 2015-12-23 15:29:58 EST
-4 ddd6ad5f748d40728d5326a471c17ed0 Wed 2015-12-23 15:30:21 EST—Wed 2015-12-23 16:27:43 EST
-3 16ed3611de4743369a53373a062eb7dd Wed 2015-12-23 16:28:06 EST—Wed 2015-12-23 16:29:58 EST
-2 96f9e3146557409c9f585f0e11c3f5ee Wed 2015-12-23 16:35:33 EST—Wed 2015-12-23 17:22:55 EST
-1 f4d7c894b77545f0ba844bcecb2c187b Wed 2015-12-23 17:23:19 EST—Wed 2015-12-23 17:25:19 EST
0 e9db32cba47c45029752b628b6379097 Wed 2015-12-23 19:46:53 EST—Wed 2015-12-23 20:02:00 EST
$ ./journalctl -D /var/tmp/d98f033569044c0887c819c792012e6c -b -1
Data from the specified boot (-1) is not available: No such boot ID in journal
$ ./journalctl -D /var/tmp/d98f033569044c0887c819c792012e6c -b -2
Data from the specified boot (-2) is not available: No such boot ID in journal
Well in some ways it's a better failure to return nothing with a bogus error, rather than returning incomplete data from a totally different boot.
*** Bug 1295612 has been marked as a duplicate of this bug. ***
This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 23 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged change the 'version' to a later Fedora
version prior this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Still a problem with systemd-231-10.fc25.x86_64.
Created attachment 1227967 [details]
[chris@f25s ~]$ sudo journalctl --list-boots
-2 d6b4e63d8a19482eb9e4f7383e191bb3 Fri 2016-12-02 22:53:11 MST—Sat 2016-12-03 00:01:04 MST
-1 44de0ff7a2674f9d9dc7533d49d0c0f5 Sat 2016-12-03 00:20:09 MST—Sun 2016-12-04 21:33:26 MST
0 bbc2079f1cc448469b9a8a4839af1c59 Sun 2016-12-04 21:34:47 MST—Sun 2016-12-04 21:38:03 MST
# journalctl --verify
e18fd8: Unused data (entry_offset==0)
e18fd8: Invalid hash (920d2338a477f20 vs. edb8b54ac6b13cdc
e18fd8: Invalid object contents: Bad message
File corruption detected at /firstname.lastname@example.org~:e18fd8 (of 16777216 bytes, 88%).
FAIL: /email@example.com~ (Bad message)
There are only four journal files. I know that boots -2 and -1 were working before I had to do a forced reboot with echo b > /proc/sysrq-trigger, but now they aren't working when I do journalctl -b-2 and -b-1 it shows a handful of lines instead of what's full in the log using either --since or just omitting -b, so basically -b X is still broken.
Attaching a tar.bz2 of all four journal files on this cleanly installed F25 system.