Bug 847207
Summary: | journald should say when it drops messages when syslog can't keep up with the log stream | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Jan Friesse <jfriesse> |
Component: | systemd | Assignee: | systemd-maint |
Status: | CLOSED NEXTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 17 | CC: | abeekhof, asalkeld, fdinitto, johannbg, lnykryn, lpoetter, metherid, msekleta, notting, plautrba, rhel, systemd-maint, theinric, vpavlin |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2012-10-07 22:44:18 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Attachments: |
Description
Jan Friesse
2012-08-10 05:36:48 UTC
Created attachment 603422 [details]
node4-messages.xz - /var/log/messages
Created attachment 603423 [details]
node4-corosynclog.xz - /var/log/cluster/corosync.log
libqb just goes through the targets and calls it's write method. So if you are getting the file message then libqb is calling syslog(). Unfortunately syslog() has a void return so I can't check for failed log message. I suspect the kernel buffer is full (/dev/log) and it's just getting dropped. Is this happening with threaded or non-threaded logging? Created attachment 603845 [details]
diagnostic patch for libqb
Hi Honza
Can you apply the attached patch to libqb and re-run?
It should log to syslog on shutdown the number of logs written
by each target.
Like this:
Aug 13 13:26:15 elf corosync[19489]: corosync:[0] wrote 35 messages
Aug 13 13:26:15 elf corosync[19489]: corosync:stderr[1] wrote 2 messages
Aug 13 13:26:15 elf corosync[19489]: corosync:/var/log/cluster/corosync.log[4] wrote 35 messages
Note: "corosync:[0]" is syslog.
So if libqb counts the number of messages sent to syslog == number sent
to file then the problem is syslog else it is probably a libqb bug.
-Angus
(In reply to comment #4) > Created attachment 603845 [details] > diagnostic patch for libqb > > Hi Honza > > Can you apply the attached patch to libqb and re-run? > > It should log to syslog on shutdown the number of logs written > by each target. > > Like this: > Aug 13 13:26:15 elf corosync[19489]: corosync:[0] wrote 35 messages > Aug 13 13:26:15 elf corosync[19489]: corosync:stderr[1] wrote 2 messages > Aug 13 13:26:15 elf corosync[19489]: > corosync:/var/log/cluster/corosync.log[4] wrote 35 messages > > Note: "corosync:[0]" is syslog. > > So if libqb counts the number of messages sent to syslog == number sent > to file then the problem is syslog else it is probably a libqb bug. > > -Angus Angus, I will try patch and let you know when CTS will hit bug (I'm not able to reproduce it locally, and it happens really from time to time - like 0.5 times per whole CTS -> at least 103 starts + shut downs multiply 4 nodes). And this is happening WITHOUT threaded mode enabled (as current master is, I didn't yet applied patch to make some targets threaded). Created attachment 605107 [details]
/var/log/cluster/corosync.log - with diagnostic patch
Created attachment 605108 [details]
/var/log/messages - with diagnostic logs
Finally able to reproduce again. As can be seen, 22:14:13 at corosync.log contains "main.c:273 Completed service synchronization, ready to provide s ervice." but /var/log/messages doesn't. On exit, libqb says: corosync:[0] wrote 91 messages corosync:/var/log/cluster/corosync.log[4] wrote 91 messages So probably it's not libqb problem (and also not corosync one). I will try busybox syslogd (which seems to be simple/stupid enough to simply work) and post results. Ok, so I've tried busybox syslogd and result was the same, some messages was missing. It's for sure not problem in libqb. So changing component to rsyslog (even it's probably NOT rsyslogd bug, but somewhere bellow -> glibc, kernel, ..., but I'm pretty sure rsyslogd maintainer is more competent to find out good component then I'm). Created attachment 605595 [details]
Quiet reliable reproducer for syslog problem
This is quiet reliable reproducer for syslog problem. It DOESN'T use libqb, but only pure openlog/syslog/closelog. It creates 20 childs and each of child write 10000 messages in form
session=SESSION_ID, int_pid=INTERNAL_PID, msg_no=MESSAGE_NUMBER, rand string=RAND_STRING
where SESSION_ID is random number generated at start of program and written to stdout at start (so it's easy to grep). INTERNAL_PID is internal child pid and can be <0, 19>. MESSAGE_NUMBER is self explaining and can be <0, 9999> (numbers are increasing and ordered). RAND_STRING is random string of characters from ASCII table starting from character '0'
So usage is following:
1.) gcc test.c
2.) ./a.out
result is "session=SESSION_ID started with 20 forks and 10000 msgs = 200000"
3.) when program finishes, it's time to find out HOW many messages was written, so:
cat /var/log/messages | grep 'syslogbug.*session=SESSION_ID,' | wc -l
so concrete example:
./a.out
session=1175375171 started with 20 forks and 10000 msgs = 200000
[root@cts-node1 test]# cat /var/log/messages | grep 'syslogbug.*session=1175375171,' | wc -l
199952
There should be 200000 lines!
Also it's important to say that test.c is really only reproducer. SAME behavior happens even when less messages are logged (also with bigger pauses between messages) and also in single threaded/single process apps.
And just for sure, please note that $SystemLogRateLimitInterval 0 is set in rsyslog.conf. It's fc17 + updates from Fri.
Just a note, fc16 + updates is NOT affected. I've tried to reproduce this with rsyslog-5.8.10-2.fc17.x86_64 Unsuccessfully at first. I had to decrease the delay between messages from 10 ms to 1 ms. This alone didn't help either, but something came up when running the test in a loop: # while sleep 1; do date && ./a.out |& tee /tmp/sess-id && \ grep -c "$(sed -e 's/\(session=[^ ]*\) .*/\1/' /tmp/sess-id)" /var/log/messages; done First several runs completed with 200002 messages but then some losses started to appear. I've tried tweaking some directives and this one had the most impact: $MainMsgQueueSize 200000 My guess is that this is purely a performance issue: the main queue gradually fills up and messages that can't be queued are dropped. There's not much to do about this besides making the processes block or enlarging the queues. Please bear in mind that on f17, there stands journald inbetween rsyslog and the process attempting to log a message: /etc/rsyslog.d/listen.conf: $SystemLogSocketName /run/systemd/journal/syslog One weird note. FC17 + ext2 is NOT affected. FC17 + ext4 is affected. (In reply to comment #12) > I've tried to reproduce this with > rsyslog-5.8.10-2.fc17.x86_64 > Unsuccessfully at first. I had to decrease the delay between messages from > 10 ms to 1 ms. > This alone didn't help either, but something came up when running the test > in a loop: > > # while sleep 1; do date && ./a.out |& tee /tmp/sess-id && \ > grep -c "$(sed -e 's/\(session=[^ ]*\) .*/\1/' /tmp/sess-id)" > /var/log/messages; done > > First several runs completed with 200002 messages but then some losses > started to appear. > I've tried tweaking some directives and this one had the most impact: > > $MainMsgQueueSize 200000 > > My guess is that this is purely a performance issue: the main queue > gradually fills up and messages that can't be queued are dropped. There's > not much to do about this besides making the processes block or enlarging > the queues. > I believe this is expected. I mean, syslog is api which returns void. In other words, there is NO way how to find out if log message was logger correctly or not. In such case I would expect such API cannot ever fall. If it falls (like no space on disk, ...) there should be log ether on console, dmesg, ... Also why RHEL 6 isn't affected? > Please bear in mind that on f17, there stands journald inbetween rsyslog and > the process attempting to log a message: > /etc/rsyslog.d/listen.conf: $SystemLogSocketName /run/systemd/journal/syslog Ya, if you feel it's problem in systemd, please reassign to journald ... (as I noted in comment, busybox syslogd is ALSO affected, so it's probably not problem of rsyslog, but you should be able to identify correct component better then I'm) > I've tried tweaking some directives and this one had the most impact:
>
> $MainMsgQueueSize 200000
This really doesn't solve problem for me. Only thing which really does (WEIRD) is ext2.
So I've spent a little more time testing issue, and found following: Command: for i in `seq 1 5`;do ses_id=`./a.out 2>&1 | cut -d ' ' -f 1 | cut -d '=' -f 2`; cat /var/log/messages | grep 'syslogbug.*session='$ses_id',' | wc -l;done systemd-journald.conf unchanged ext4: 199996 199998 199997 199997 199999 ext3: 200000 200000 200000 200000 200000 ext2: 200000 200000 199996 200000 200000 After "tweaking" systemd-journald.conf: RuntimeMaxUse=10 RuntimeMaxFileSize=100K ext4: 200000 200000 195150 197479 199973 In other words, problem seems to be not fs specific and really looks like systemd-journald problem. > I believe this is expected. I mean, syslog is api which returns void. In > other words, there is NO way how to find out if log message was logger > correctly or not. In such case I would expect such API cannot ever fall. I don't believe this is the case. There are many ways in which it can fail. When applications are logging directly to rsyslog, the daemon can be configured to either make the applications block when attempting to write to the socket or drop the excessive messages. The first approach ensures messages get at least to the daemon, but this can also cause system hangs if the daemon gets overrun or stalls for some other reason. Things get even more complicated with journald involved. > If it falls (like no space on disk, ...) there should be log ether on console, > dmesg, ... If the socket itself fills and messages on the sending side are dropped because of that, rsyslog has no way of detecting it. You can try to compare logs from rsyslog and journal [1] to se if there are some discrepancies. > Ya, if you feel it's problem in systemd, please reassign to journald Feel free to do so. [1] http://www.freedesktop.org/software/systemd/man/systemd-journald.service.html (In reply to comment #17) > > I believe this is expected. I mean, syslog is api which returns void. In > > other words, there is NO way how to find out if log message was logger > > correctly or not. In such case I would expect such API cannot ever fall. > > I don't believe this is the case. There are many ways in which it can fail. > > When applications are logging directly to rsyslog, the daemon can be > configured to either make the applications block when attempting to write to > the socket or drop the excessive messages. What is the configuration option for that? I would like to see how it's implemented, because I've didn't know that it's possible to make socket non-blocking on both side when one side wants blocking. The first approach ensures > messages get at least to the daemon, but this can also cause system hangs if > the daemon gets overrun or stalls for some other reason. I hope blocking behavior is default. > Things get even more complicated with journald involved. > > > If it falls (like no space on disk, ...) there should be log ether on console, > > dmesg, ... > > If the socket itself fills and messages on the sending side are dropped > because of that, rsyslog has no way of detecting it. > But I'm not using any non-blocking API. So if socket (/dev/log) buffer is full, application simply blocks. So whoever created /dev/log don't need to detect that. > You can try to compare logs from rsyslog and journal [1] to se if there are > some discrepancies. > > > Ya, if you feel it's problem in systemd, please reassign to journald > > Feel free to do so. Will do. > > [1] > http://www.freedesktop.org/software/systemd/man/systemd-journald.service.html When journald forwards messages to syslog and syslog is stuck it will not wait for the syslog daemon. (In reply to comment #19) > When journald forwards messages to syslog and syslog is stuck it will not > wait for the syslog daemon. So what you wanted to say is, that there is no way how to reliably log message in non-failure environment? (Different part of the cluster stack here) Personally I'm fine with messages being lost/discarded[1], as long as it doesn't happen silently. Basically, I'd like to see the existing rsyslog behaviour preserved by journald. The pattern we currently look for is: "rsyslogd.* imuxsock lost .* messages from pid .* due to rate-limiting", I think its important that journald log something similar. [1] Pacemaker no longer sends info/debug/trace logs to syslog anyway, so we rarely get rate-limited. If I /really/ want every single log, I write directly to a logfile. So, here's what I propose: a) we should give rsyslog more room to catchup with log messages without having to freeze journald for that too. I filed https://bugzilla.redhat.com/show_bug.cgi?id=857385 requesting a new sockopt so that we can queue much more messages in the syslog socket before we start droppping things. b) I'll change journald to count how many messages we drop and write a message about it as first thing into syslog when syslog takes messages again. Both together should make delivery much more reliable while still making things robust and introspectable. b) is now fixed in git, and soon in F18. systemd-190-1.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/systemd-190-1.fc18 Package systemd-191-2.fc18, rtkit-0.11-3.fc18: * should fix your issue, * was pushed to the Fedora 18 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing systemd-191-2.fc18 rtkit-0.11-3.fc18' as soon as you are able to, then reboot. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2012-14581/rtkit-0.11-3.fc18,systemd-191-2.fc18 then log in and leave karma (feedback). Package glibc-2.16-17.fc18, systemd-192-1.fc18, selinux-policy-3.11.1-23.fc18, rtkit-0.11-3.fc18: * should fix your issue, * was pushed to the Fedora 18 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing glibc-2.16-17.fc18 systemd-192-1.fc18 selinux-policy-3.11.1-23.fc18 rtkit-0.11-3.fc18' as soon as you are able to, then reboot. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2012-14581/selinux-policy-3.11.1-23.fc18,rtkit-0.11-3.fc18,systemd-192-1.fc18,glibc-2.16-17.fc18 then log in and leave karma (feedback). Package glibc-2.16-17.fc18, rtkit-0.11-3.fc18, systemd-193-1.fc18: * should fix your issue, * was pushed to the Fedora 18 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing glibc-2.16-17.fc18 rtkit-0.11-3.fc18 systemd-193-1.fc18' as soon as you are able to, then reboot. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2012-14581/rtkit-0.11-3.fc18,systemd-193-1.fc18,glibc-2.16-17.fc18 then log in and leave karma (feedback). |