Bug 1292447
Summary: | journald stop logging | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Kevin Cousin <kevin> | |
Component: | systemd | Assignee: | systemd-maint | |
Status: | CLOSED ERRATA | QA Contact: | Branislav Blaškovič <bblaskov> | |
Severity: | high | Docs Contact: | ||
Priority: | urgent | |||
Version: | 7.2 | CC: | bblaskov, dror, ealcaniz, ellis, fkrska, giany007, grenier, jaearick, jaroslaw.polok, jerome.le-tanou, jscalf, jscotka, kevin, knappch, kompastver, mathieu+redhat, mdshaikh, mmatsuya, morozsm, msekleta, ochalups, pasik, pdwyer, qe-baseos-daemons, rcernin, rhbugzilla, rmeggins, rpiddapa, rsawhill, sknauss, ssahani, systemd-maint-list, systemd-maint, tbowling, tfrazier, tis, wholesale, wliu | |
Target Milestone: | rc | Keywords: | OtherQA, ZStream | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | systemd-219-21.el7 | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1331339 1363687 | Environment: | |
Last Closed: | 2016-11-04 00:48:53 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1203710, 1289485, 1313485, 1331339, 1363687 |
Description
Kevin Cousin
2015-12-17 13:38:20 UTC
The issue appeared after upgrading to 7.2. Here is the dmesg output:

[92658.758413] systemd-journald[17718]: Failed to write entry (21 items, 547 bytes), ignoring: Cannot assign requested address
[92658.818970] systemd-journald[17718]: Failed to write entry (21 items, 888 bytes), ignoring: Cannot assign requested address
[92658.834527] systemd-journald[17718]: Failed to write entry (21 items, 594 bytes), ignoring: Cannot assign requested address
[92658.854182] systemd-journald[17718]: Failed to write entry (21 items, 847 bytes), ignoring: Cannot assign requested address
[92658.870043] systemd-journald[17718]: Failed to write entry (21 items, 580 bytes), ignoring: Cannot assign requested address
[92658.875305] systemd-journald[17718]: Failed to write entry (21 items, 895 bytes), ignoring: Cannot assign requested address

Hello,

We have the same problem on an up-to-date RHEL 7.2 server. This server hosts a proxy server (squid-3.3.8) and we send all access logs to syslog (rsyslog-7.4.7). After a while, we get no more logs and instead see thousands of journald error lines:

---------------------------------------
# dmesg
...
[333311.366625] systemd-journald[810]: Failed to write entry (20 items, 616 bytes), ignoring: Cannot assign requested address
[333311.401662] systemd-journald[810]: Failed to write entry (20 items, 633 bytes), ignoring: Cannot assign requested address
[333311.476621] systemd-journald[810]: Failed to write entry (20 items, 615 bytes), ignoring: Cannot assign requested address
...
---------------------------------------

---------------------------------------
# systemctl status systemd-journald
● systemd-journald.service - Journal Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static; vendor preset: disabled)
   Active: active (running) since Thu 2015-12-17 14:06:59 CET; 3 days ago
     Docs: man:systemd-journald.service(8)
           man:journald.conf(5)
 Main PID: 810 (systemd-journal)
   Status: "Processing requests..."
   CGroup: /system.slice/systemd-journald.service
           └─810 /usr/lib/systemd/systemd-journald

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
---------------------------------------

---------------------------------------
# journalctl --verify
4f7160: invalid object
File corruption detected at /run/log/journal/54ddca845fdd4b4c95e9da7b01a9b28f/system.journal:4f7160 (of 8388608 bytes, 62%).
FAIL: /run/log/journal/54ddca845fdd4b4c95e9da7b01a9b28f/system.journal (Cannot assign requested address)
PASS: /run/log/journal/54ddca845fdd4b4c95e9da7b01a9b28f/system ...
---------------------------------------

If we remove the corrupted file and restart the service, we receive logs again, but the problem reappears after a while.

Do you have any idea how to fix this permanently?

Thank you.

Bonjour ;-)

In the meantime, I am using the following workaround on my servers:

/etc/rsyslog.conf

#$ModLoad imjournal # provides access to the systemd journal
$ModLoad imklog # reads kernel messages (the same are read from journald)
#$OmitLocalLogging on
#$IMJournalStateFile imjournal.state

/etc/systemd/journald.conf

[Journal]
#Storage=none
Storage=persistent
ForwardToSyslog=yes

If you choose Storage=none, journald will not try to log anything itself. Use Storage=persistent if you still want to see whether journald is OK or not.

The previous workaround did not help in my case. I had to revert the systemd packages and rsyslog. Still, this is not a proper solution. I find this issue critical.

This bug is a major pain for us and I hope it gets fixed soon. It seems to happen most often on our web servers (which generate a lot of syslog traffic). I have found no way to get rsyslog restarted and working again without a reboot. The usual things of "systemctl rsyslog restart" and "systemctl daemon-reload" do not work. Is there any way to recover without a reboot?

More detailed steps to reproduce would be much appreciated. I can't reproduce this on an up-to-date RHEL-7.2.
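
The recovery step reported earlier in this thread (find the journal file that `journalctl --verify` marks as FAIL, remove it, restart journald) starts with picking the failed paths out of the verify output. A minimal sketch of that first step follows. The function name `list_failed_journals` is hypothetical and not part of systemd; the sketch assumes FAIL lines have the shape shown in the output above, and it only prints paths without deleting anything.

```shell
# Hypothetical helper, not from the thread: read `journalctl --verify` output
# on stdin and print the path of every journal file reported as FAIL.
list_failed_journals() {
    # A FAIL line looks like "FAIL: <path> (<error text>)"; keep only <path>.
    sed -n 's/^FAIL: \(.*\) (.*)$/\1/p'
}
```

It could be used as `journalctl --verify 2>&1 | list_failed_journals`; the `2>&1` is there on the assumption that the PASS/FAIL lines may be written to stderr. Review the printed paths before removing any file, since several commenters report that the corruption comes back after a while.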

I recently tried turning compression off in /etc/systemd/journald.conf. After deleting all the journal files and restarting journald, the problem went away.

I've backported a possible fix for this issue. In case anyone is willing to test, feel free to grab these test rpms.

http://people.redhat.com/~msekleta/systemd-219-19.el7.0.bz1292447.0.src.rpm/

I ran Michal's rpms yesterday for a while on a test box (you have to download and update all of the rpms), and the system ran fine. But it puts out very little syslogging, so this is not a fair test.

Red Hat put out some systemd patches yesterday, probably related to the glibc bug:

systemd.x86_64 219-19.el7_2.4
systemd-libs.x86_64 219-19.el7_2.4
systemd-sysv.x86_64 219-19.el7_2.4

I would guess that Michal's fixes are NOT in these updates, correct?

> Redhat put out some systemd patches yesterday, probably related to the glibc
> bug:
>
> systemd.x86_64 219-19.el7_2.4
> systemd-libs.x86_64 219-19.el7_2.4
> systemd-sysv.x86_64 219-19.el7_2.4
>
> I would guess that Michal's fixes are NOT in these updates, correct?
There was nothing journal or glibc related in that update.

One of my servers has been running Michal's rpms since yesterday. There is a lot of syslogging and the journal is still OK.

I installed Michal's rpms last week and my logs look fine.

Thanks for providing test results. Based on your input I am granting devel_ack for 7.3.

Michal's patch seems to solve the issue on my side too.

FWIW, I have noticed a correlation between oom-killer and rsyslog problems a couple of hours later. We run coldfusion on some web servers, which uses java, which eats memory. Due to memory leaks, oom-killer has to whack java every so often. systemctl will autorestart coldfusion. After that, rsyslog will fail within a few hours.

pushed to staing https://github.com/lnykryn/systemd-rhel/commit/f45b66a348f5778bd391ad1b0a0e09bf5789b415?bz=1292447

https://github.com/lnykryn/systemd-rhel/commit/d205f5f85569e2dddca96362ce2db4e2a0b99d00 -> post

This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

*** Bug 1276563 has been marked as a duplicate of this bug. ***

When does Red Hat plan to release an official patch for this bug? It is still killing me and I don't want to run the previously available rpms on critical production systems. I was hoping a real patch would be out by now....

There is a new leak in 219-19.el7_2.7. I installed the patched RPMs linked above and the problem did not go away. I don't know how to tell you to reproduce it; it's a tiny VPS with 128 MB of RAM, and systemd-journald slowly eats up all of that RAM. If I kill the process or use systemctl to restart it, the memory usage goes back down to normal, and then slowly goes back up. I don't know what you'd need from me to help with this, but if you let me know how, I can do it (or at least find out how).

Had the same issue. This workaround worked for me:

echo "Compress=no" >> /etc/systemd/journald.conf
systemctl restart systemd-journald; systemctl restart rsyslog.service

Basically, I disabled compression.
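
The `echo ... >>` one-liner above appends another `Compress=no` line every time it is run. A minimal sketch of an idempotent variant follows. The function name `set_compress_off` is hypothetical, the sketch assumes GNU sed (for `-i`) and assumes `[Journal]` is the only section in the file, so that appending at the end is equivalent. It only applies the workaround reported in this thread; it is not the fix.

```shell
# Hypothetical helper, not from the thread: make sure the given journald.conf
# contains exactly one uncommented Compress= line, set to "no".
set_compress_off() {
    conf="$1"
    if grep -q '^Compress=' "$conf"; then
        # An uncommented setting already exists: rewrite it in place (GNU sed -i).
        sed -i 's/^Compress=.*/Compress=no/' "$conf"
    else
        # No active setting yet: append one, as the original one-liner does.
        echo 'Compress=no' >> "$conf"
    fi
}
```

It could be called as `set_compress_off /etc/systemd/journald.conf`, followed by the same two `systemctl restart` commands as in the comment above.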

Because we can't reproduce this problem internally, we need more debugging information from a customer. I'd start by stracing journald. Please run the following command,

strace -o journal-strace.log -tt -p $(pidof systemd-journald)

and in a new terminal run "logger foobar" 3 times, then Ctrl-C strace and upload journal-strace.log.

Looking at the list of installed rpms, I would guess that there could be some conflict between the journal and some 3rd party security software. Is it possible to try the system with those services turned off?

I have exactly the same problem on one of my nodes: it stops logging and dmesg is full of "systemd-journald[17718]: Failed to write entry (21 items, 895 bytes), ignoring: Cannot assign requested address".

I tried turning off compression, but it doesn't work for me. I ran strace as Michal Sekletar suggested, plus lsof and a more verbose strace; see here: https://gist.github.com/alvelcom/d3b9e8f201db607018b8736f7688b6aa

---------------------------------------
$ journalctl --verify
2e227a8: invalid object
File corruption detected at /run/log/journal/56745016c55f4dae8b17425e32eba949/system.journal:2e227a8 (of 50331648 bytes, 96%).
FAIL: /run/log/journal/56745016c55f4dae8b17425e32eba949/system.journal (Cannot assign requested address)
7fffee8: unused data (entry_offset==0)
PASS: /run/log/journal/56745016c55f4dae8b17425e32eba949/system
PASS: /run/log/journal/56745016c55f4dae
---------------------------------------

I deleted /run/log/journal/56745016c55f4dae8b17425e32eba949/system.journal and restarted systemd-journald, and after these actions journald seems to be working again.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2016-2216.html

Dropping the stale needinfo. If our input is still needed, please set the needinfo again.