Bug 1373888 - when systemd-journald err occurs ,Why systemd drop systemd-journald 's error log message into dmesg at rhel7.2?
Summary: when systemd-journald err occurs ,Why systemd drop systemd-journald 's error...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: systemd
Version: 7.2
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: systemd-maint
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-09-07 11:26 UTC by muahao
Modified: 2016-09-08 09:12 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-08 09:12:08 UTC
Target Upstream Version:


Attachments (Terms of Use)

Description muahao 2016-09-07 11:26:18 UTC
Description of problem:

at https://bugzilla.redhat.com/show_bug.cgi?id=1292447 you can see systemd  
floods err messages like this: [203270.192700] systemd-journald[35206]: Failed to write entry (20 items, 704 bytes), ignoring: Invalid argument   into dmesg!
and my machine hang!

My question is:
Why systemd-journald drop it's log message into dmesg? 
when systemd-journald error occurs

Version-Release number of selected component (if applicable):


How reproducible:
when journalctl --verify  check error jounal log files;


Steps to Reproduce:
1.journalctl --verify
2.systemctm restart systemd-journald
3.journalctl -f 

Actual results:


Expected results:
I think error happened at systemd-journald,the error message should not drop into dmesg? because it will make machine hang, what do you think of it ?

Additional info:

Comment 2 Lukáš Nykrýn 2016-09-07 12:39:21 UTC
On error we write to kernel ring buffer, because that is the only place where we can write the logs so they could be analyzed by an administrator. This will not make the machine hang.

Also your reproducer does not work for me.
[0 root@qeos-232 ~]# journalctl --verify
PASS: /run/log/journal/d20fb34accd74a158e40457bf1d641bf/system.journal                                                                                              
[0 root@qeos-232 ~]# systemctl restart systemd-journald
[0 root@qeos-232 ~]# journalctl -f 
-- Logs begin at St 2016-09-07 08:22:49 EDT. --
zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com polkitd[635]: Registered Authentication Agent for unix-process:1332:38929 (system bus name :1.27 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale cs_CZ.UTF-8)
zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com systemd[1]: Stopping Flush Journal to Persistent Storage...
zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com systemd-journal[476]: Journal stopped


If you have some stable reproducer, will you be willing to test a sneak preview of systemd for 7.3? It contains quite a lot of journald fixes.
https://copr.fedorainfracloud.org/coprs/lnykryn/systemd-rhel-staging/
But please note that this build is not supported in any way and should not be used on production machines.

Comment 3 muahao 2016-09-08 02:56:41 UTC
(In reply to Lukáš Nykrýn from comment #2)
> On error we write to kernel ring buffer, because that is the only place
> where we can write the logs so they could be analyzed by an administrator.
> This will not make the machine hang.
> 
> Also your reproducer does not work for me.
> [0 root@qeos-232 ~]# journalctl --verify
> PASS: /run/log/journal/d20fb34accd74a158e40457bf1d641bf/system.journal      
> 
> [0 root@qeos-232 ~]# systemctl restart systemd-journald
> [0 root@qeos-232 ~]# journalctl -f 
> -- Logs begin at St 2016-09-07 08:22:49 EDT. --
> zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com polkitd[635]: Registered
> Authentication Agent for unix-process:1332:38929 (system bus name :1.27
> [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path
> /org/freedesktop/PolicyKit1/AuthenticationAgent, locale cs_CZ.UTF-8)
> zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com systemd[1]: Stopping Flush
> Journal to Persistent Storage...
> zář 07 08:29:18 qeos-232.lab.eng.rdu2.redhat.com systemd-journal[476]:
> Journal stopped
> 
> 
> If you have some stable reproducer, will you be willing to test a sneak
> preview of systemd for 7.3? It contains quite a lot of journald fixes.
> https://copr.fedorainfracloud.org/coprs/lnykryn/systemd-rhel-staging/
> But please note that this build is not supported in any way and should not
> be used on production machines.


Thanks for your reply.

First,,This phenomenon has the probability to appear, the premise is when excute journalctl --verify  get message like this:

3a19c98: unused data (entry_offset==0)
3a19ce8: invalid object
File corruption detected at /var/log/journal/918f61018af84ac098688cb2af83f6be/system.journal:3a19ce8 (of 67108864 bytes, 90%).
FAIL: /var/log/journal/918f61018af84ac098688cb2af83f6be/system.journal (Cannot assign requested address)
PASS: /var/log/journal/918f61018af84ac098688cb2af83f6be/user-120063.journal

 
then,drop floods err message into dmesg like this:

dmesg | tail

[989939.739946] systemd-journald[34605]: Failed to write entry (12 items, 328 bytes), ignoring: Cannot assign requested address
[989940.206097] systemd-journald[34605]: Failed to write entry (23 items, 575 bytes), ignoring: Cannot assign requested address
[989940.217646] systemd-journald[34605]: Failed to write entry (23 items, 615 bytes), ignoring: Cannot assign requested address
[989940.259922] systemd-journald[34605]: Failed to write entry (23 items, 604 bytes), ignoring: Cannot assign requested address
[989940.740271] systemd-journald[34605]: Failed to write entry (23 items, 595 bytes), ignoring: Cannot assign requested address
[989940.751819] systemd-journald[34605]: Failed to write entry (23 items, 595 bytes), ignoring: Cannot assign requested address
[989940.763329] systemd-journald[34605]: Failed to write entry (17 items, 380 bytes), ignoring: Cannot assign requested address
[989942.711210] systemd-journald[34605]: Failed to write entry (23 items, 584 bytes), ignoring: Cannot assign requested address
[989942.722793] systemd-journald[34605]: Failed to write entry (23 items, 624 bytes), ignoring: Cannot assign requested address


Because of droping flood err message into "kernel ring buffer" with printk and then causing some compoment or module unstable , Finally when I do Pressure test on my operation system ,it makes my OS hang.when I fix this bug(https://github.com/systemd/systemd/pull/1663/files#r42934549)  it will not hang!

I kown this is a systemd bug already fixed :https://github.com/systemd/systemd/pull/1663/files#r42934549

1. I just adoubt that ,When systemd-journal err occurs ,should err message must drop into "kernel ring buffer"?

2. another compoment of systemd err occurs ,it's err message also drop into "kernel ring buffer"?

3. Have some swicth to turn it off?

4. when my systemd-journald err occurs about :
989939.739946] systemd-journald[34605]: Failed to write entry (12 items, 328 bytes), ignoring: Cannot assign requested address
[989940.206097] systemd-journald[34605]: Failed to write entry (23 items, 575 bytes), ignoring: Cannot assign requested address
[989940.217646] systemd-journald[34605]: Failed to write entry (23 items, 615 bytes), ignoring: Cannot assign requested address
[989940.259922] systemd-journald[34605]: Failed to write entry (23 items, 604 bytes), ignoring: Cannot assign requested address

"journalctl -f" can get normal log message like authpriv,sudo,su ...
can "Failed to write entry " err message drop into "journalctl -f "?

Comment 4 Lukáš Nykrýn 2016-09-08 09:12:08 UTC
Again, kernel buffer is a place where we can sore data in that we can't write those data elsewhere, one of the use-cases is the early boot. This behavior can't be turned off.

The bug you mentioned should be fixed in rhel-7.3.

If you are still experiencing some issues that could be reproduced with the latest build from copr, feel free to reopen this bug.


Note You need to log in before you can comment on or make changes to this bug.