Hide Forgot
Description of problem: at https://bugzilla.redhat.com/show_bug.cgi?id=1292447 you can see systemd floods err messages like this: [203270.192700] systemd-journald[35206]: Failed to write entry (20 items, 704 bytes), ignoring: Invalid argument into dmesg! and my machine hang! My question is: Why systemd-journald drop it's log message into dmesg? when systemd-journald error occurs Version-Release number of selected component (if applicable): How reproducible: when journalctl --verify check error jounal log files; Steps to Reproduce: 1.journalctl --verify 2.systemctm restart systemd-journald 3.journalctl -f Actual results: Expected results: I think error happened at systemd-journald,the error message should not drop into dmesg? because it will make machine hang, what do you think of it ? Additional info:
On error we write to kernel ring buffer, because that is the only place where we can write the logs so they could be analyzed by an administrator. This will not make the machine hang. Also your reproducer does not work for me. [0 root@qeos-232 ~]# journalctl --verify PASS: /run/log/journal/d20fb34accd74a158e40457bf1d641bf/system.journal [0 root@qeos-232 ~]# systemctl restart systemd-journald [0 root@qeos-232 ~]# journalctl -f -- Logs begin at St 2016-09-07 08:22:49 EDT. -- zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com polkitd[635]: Registered Authentication Agent for unix-process:1332:38929 (system bus name :1.27 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale cs_CZ.UTF-8) zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com systemd[1]: Stopping Flush Journal to Persistent Storage... zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com systemd-journal[476]: Journal stopped If you have some stable reproducer, will you be willing to test a sneak preview of systemd for 7.3? It contains quite a lot of journald fixes. https://copr.fedorainfracloud.org/coprs/lnykryn/systemd-rhel-staging/ But please note that this build is not supported in any way and should not be used on production machines.
(In reply to Lukáš Nykrýn from comment #2) > On error we write to kernel ring buffer, because that is the only place > where we can write the logs so they could be analyzed by an administrator. > This will not make the machine hang. > > Also your reproducer does not work for me. > [0 root@qeos-232 ~]# journalctl --verify > PASS: /run/log/journal/d20fb34accd74a158e40457bf1d641bf/system.journal > > [0 root@qeos-232 ~]# systemctl restart systemd-journald > [0 root@qeos-232 ~]# journalctl -f > -- Logs begin at St 2016-09-07 08:22:49 EDT. -- > zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com polkitd[635]: Registered > Authentication Agent for unix-process:1332:38929 (system bus name :1.27 > [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path > /org/freedesktop/PolicyKit1/AuthenticationAgent, locale cs_CZ.UTF-8) > zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com systemd[1]: Stopping Flush > Journal to Persistent Storage... > zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com systemd-journal[476]: > Journal stopped > > > If you have some stable reproducer, will you be willing to test a sneak > preview of systemd for 7.3? It contains quite a lot of journald fixes. > https://copr.fedorainfracloud.org/coprs/lnykryn/systemd-rhel-staging/ > But please note that this build is not supported in any way and should not > be used on production machines. Thanks for your reply. First,,This phenomenon has the probability to appear, the premise is when excute journalctl --verify get message like this: 3a19c98: unused data (entry_offset==0) 3a19ce8: invalid object File corruption detected at /var/log/journal/918f61018af84ac098688cb2af83f6be/system.journal:3a19ce8 (of 67108864 bytes, 90%). FAIL: /var/log/journal/918f61018af84ac098688cb2af83f6be/system.journal (Cannot assign requested address) PASS: /var/log/journal/918f61018af84ac098688cb2af83f6be/user-120063.journal then,drop floods err message into dmesg like this: dmesg | tail [989939.739946] systemd-journald[34605]: Failed to write entry (12 items, 328 bytes), ignoring: Cannot assign requested address [989940.206097] systemd-journald[34605]: Failed to write entry (23 items, 575 bytes), ignoring: Cannot assign requested address [989940.217646] systemd-journald[34605]: Failed to write entry (23 items, 615 bytes), ignoring: Cannot assign requested address [989940.259922] systemd-journald[34605]: Failed to write entry (23 items, 604 bytes), ignoring: Cannot assign requested address [989940.740271] systemd-journald[34605]: Failed to write entry (23 items, 595 bytes), ignoring: Cannot assign requested address [989940.751819] systemd-journald[34605]: Failed to write entry (23 items, 595 bytes), ignoring: Cannot assign requested address [989940.763329] systemd-journald[34605]: Failed to write entry (17 items, 380 bytes), ignoring: Cannot assign requested address [989942.711210] systemd-journald[34605]: Failed to write entry (23 items, 584 bytes), ignoring: Cannot assign requested address [989942.722793] systemd-journald[34605]: Failed to write entry (23 items, 624 bytes), ignoring: Cannot assign requested address Because of droping flood err message into "kernel ring buffer" with printk and then causing some compoment or module unstable , Finally when I do Pressure test on my operation system ,it makes my OS hang.when I fix this bug(https://github.com/systemd/systemd/pull/1663/files#r42934549) it will not hang! I kown this is a systemd bug already fixed :https://github.com/systemd/systemd/pull/1663/files#r42934549 1. I just adoubt that ,When systemd-journal err occurs ,should err message must drop into "kernel ring buffer"? 2. another compoment of systemd err occurs ,it's err message also drop into "kernel ring buffer"? 3. Have some swicth to turn it off? 4. when my systemd-journald err occurs about : 989939.739946] systemd-journald[34605]: Failed to write entry (12 items, 328 bytes), ignoring: Cannot assign requested address [989940.206097] systemd-journald[34605]: Failed to write entry (23 items, 575 bytes), ignoring: Cannot assign requested address [989940.217646] systemd-journald[34605]: Failed to write entry (23 items, 615 bytes), ignoring: Cannot assign requested address [989940.259922] systemd-journald[34605]: Failed to write entry (23 items, 604 bytes), ignoring: Cannot assign requested address "journalctl -f" can get normal log message like authpriv,sudo,su ... can "Failed to write entry " err message drop into "journalctl -f "?
Again, kernel buffer is a place where we can sore data in that we can't write those data elsewhere, one of the use-cases is the early boot. This behavior can't be turned off. The bug you mentioned should be fixed in rhel-7.3. If you are still experiencing some issues that could be reproduced with the latest build from copr, feel free to reopen this bug.