Bug 2180883
| Summary: | rsyslog stops sending logs to elasticsearch | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Darin Sorrentino <dsorrent> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Martin Magr <mmagr> |
| Status: | ON_DEV --- | QA Contact: | Leonid Natapov <lnatapov> |
| Severity: | high | Docs Contact: | mgeary <mgeary> |
| Priority: | high | ||
| Version: | 17.0 (Wallaby) | CC: | cjanisze, jelynch, lmadsen, mburns, mmagr, mrunge, pgrist |
| Target Milestone: | z2 | Keywords: | Triaged |
| Target Release: | 17.1 | Flags: | astillma: needinfo? (mmagr), pgrist: needinfo? (mmagr), jelynch: needinfo? (mmagr) |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Known Issue | |
| Doc Text: | Currently, Logrotate archives all log files once a day, after which Rsyslog stops sending logs to Elasticsearch. Workaround: Add "RsyslogReopenOnTruncate: true" to your environment file during deployment so that Rsyslog reopens all log files on log rotation. Currently, RHOSP 17.1 uses an older puppet-rsyslog module, which results in an incorrectly configured Rsyslog. Workaround: Manually apply patch [1] in `/usr/share/openstack-tripleo-heat-templates/deployment/logging/rsyslog-container-puppet.yaml` before deployment to configure Rsyslog correctly. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
The proposed upstream patch fails on the gate. Martin, can you please take a look?

Failed QA. There is a newer puppet-rsyslog upstream, so the rsyslog::config class parameter works there, but downstream we need to use rsyslog::server instead.

A workaround exists, moving this to z1.

Hi Martin,

The target milestone is z2, but can we include this doc text as a Known Issue in 17.1 GA, and then update it to a Bug Fix in z2? The workaround suggests applying "patch [1]", but I don't see a link for [1]. Is it available downstream for 17.1 GA?

Original doc text:

Issue one:
Cause: Logrotate archives all log files once a day.
Consequence: Rsyslog stops sending logs to Elasticsearch.
Workaround (if any): Add "RsyslogReopenOnTruncate: true" to the environment file during deployment.
Result: Rsyslog reopens all log files on log rotation.

Issue two:
Cause: OSP-17.1 uses an older puppet-rsyslog module.
Consequence: Rsyslog is incorrectly configured.
Workaround: Manually apply patch [1] in /usr/share/openstack-tripleo-heat-templates/deployment/logging/rsyslog-container-puppet.yaml before deployment.
Result: Rsyslog is configured correctly.

New doc text:

Currently, Logrotate archives all log files once a day and Rsyslog stops sending logs to Elasticsearch. Workaround: Add "RsyslogReopenOnTruncate: true" to your environment file during deployment so that Rsyslog reopens all log files on log rotation.

Currently, RHOSP 17.1 uses an older puppet-rsyslog module with an incorrectly configured Rsyslog. Workaround: Manually apply patch [1] in `/usr/share/openstack-tripleo-heat-templates/deployment/logging/rsyslog-container-puppet.yaml` before deployment to configure Rsyslog correctly.
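
For illustration, the first workaround from the doc text might look like the following environment file. This is only a sketch: the file name `rsyslog-reopen.yaml` is hypothetical, and placing the parameter under `parameter_defaults` assumes the usual TripleO convention for overriding template parameters. Only the parameter name and value come from this bug.

```yaml
# rsyslog-reopen.yaml (hypothetical file name)
# Assumption: RsyslogReopenOnTruncate is overridden like any other
# TripleO parameter, under parameter_defaults.
parameter_defaults:
  # Ask Rsyslog to reopen log files after Logrotate rotates them
  RsyslogReopenOnTruncate: true
```

Such a file would typically be passed to the deployment command with an additional `-e` option alongside the existing environment files.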
Description of problem:

After a stack update/deployment, logs appear to be sent to Elasticsearch fine. Sometime over the next 24 hours, they stop for no apparent reason. Restarting rsyslog has no impact.

Focused on controller-0 to troubleshoot the issue. Noticed the following message at startup:

    imfile: no working or state file directory set, imfile will create state files in the current working directory

Checked the default config for rsyslog, which shows it should be using /var/spool/rsyslog; however, that directory was empty. Digging into the tripleo rsyslog-container-puppet.yaml, I saw this:

    - name: create persistent state directory for rsyslog
      file:
        path: /var/lib/rsyslog.container
        state: directory
        setype: container_file_t

Added the line

    global(workDirectory="/var/lib/rsyslog")

to 50_openstack_logs.conf and restarted the rsyslog container. This resulted in imstate files being created in /var/lib/rsyslog, but the following message then showed up multiple times in the podman logs:

    rsyslogd: imfile error: message received is larger than max msg size; message will be split and processed as another message [v8.2102.0-101.el9_0.1]

Added the following directive to rsyslog.conf:

    $MaxMessageSize 8k

Restarted the container and the logs finally started showing up in Elasticsearch.

Made the same changes to controller-1 and controller-2. After a 24-hour period, controller-0 continues to send logs to Elasticsearch. The other two controllers have stopped again and I am unsure why. Restarting them seems to have no impact.

Version-Release number of selected component (if applicable):

Environment recently upgraded to 17.0.1; however, this was not working on 17.0 either.

How reproducible:

100%

Steps to Reproduce:
1. Configure rsyslog to send to Elasticsearch
2. Wait 24 hours

Actual results:

Logs stop showing in Elasticsearch

Expected results:

Logs show in Elasticsearch

Additional info: