Bug 1519679
Summary: | logging-fluentd not using output-ops-extra-localfile.conf after update from v3.6.173.0.21 to v3.6.173.0.49 | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jatan Malde <jmalde> |
Component: | Logging | Assignee: | Noriko Hosoi <nhosoi> |
Status: | CLOSED ERRATA | QA Contact: | Anping Li <anli> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | | |
Version: | 3.6.1 | CC: | aos-bugs, nhosoi, rmeggins, ronny.pettersen, rromerom |
Target Milestone: | --- | | |
Target Release: | 3.6.z | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | Cause: A logic error in the fluentd startup script meant that when an ops cluster was first disabled and then enabled, the proper ops configuration file was not enabled.<br>Consequence: Sub-configuration files whose names start with output-ops-extra- were never included from the ops configuration file.<br>Fix: The logic error was fixed.<br>Result: When an ops cluster is first disabled and then enabled, the proper ops configuration file is enabled, and its sub-configuration files are included as well. | | |
Story Points: | --- | | |
Clone Of: | | Environment: | |
Last Closed: | 2018-01-23 17:58:09 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Jatan Malde
2017-12-01 07:32:22 UTC
Comment 2 (Ruben Romero Montes):

The problem lies in the fact that once the daemonset/logging-fluentd exists, it is not updated or replaced (not even the env variables), as seen here: https://github.com/openshift/openshift-ansible/blob/release-3.6/roles/openshift_logging_fluentd/tasks/main.yaml#L154-L186

Therefore, if aggregated logging is deployed without the OPS cluster and later redeployed with the OPS cluster (i.e. `openshift_logging_use_ops=true`), the OPS_HOST env variable will keep the value `logging-es`. That causes the fluentd start script to conclude that OPS is not deployed, so it uses filter-post-z-retag-one.conf instead of filter-post-z-retag-two.conf. The consequence is that all logs (ops and non-ops) go to the non-ops outputs, ignoring the ops ones.

Comment 5 (Anping Li):

Verified with openshift3/logging-fluentd/images/v3.6.173.0.83-2. After adding the ops stack:

1) The fluentd environment has OPS_HOST=logging-es-ops.

2) filter-post-z-retag-one.conf was replaced with filter-post-z-retag-two.conf. The following configuration is added to route system-level logs to the ops ES stack:

```
<match journal.** system.var.log** **_default_** **_openshift_** **_openshift-infra_**>
  @type rewrite_tag_filter
  @label @OUTPUT
  rewriterule1 message .+ output_ops_tag
  rewriterule2 message !.+ output_ops_tag
</match>
```

3) Kibana can view the project logs and the operations logs from before the switch; kibana-ops can view the operations logs from after the switch.

(In reply to Ruben Romero Montes from comment #2)
> The problem lies in the fact that once the daemonset/logging-fluentd exists
> it is not updated or replaced (not even the env variables) as seen here:
> https://github.com/openshift/openshift-ansible/blob/release-3.6/roles/openshift_logging_fluentd/tasks/main.yaml#L154-L186
>
> Therefore if the aggregated logging is deployed without OPS cluster and
> later on with the OPS cluster (i.e. `openshift_logging_use_ops=true`) the
> OPS_HOST env variable will remain with value `logging-es`.

Hi @Ruben, I revisited your comment #2 and am worried that the customer's case may not be addressed by PR #774. The customer's system is configured with OPS, but both the application logs and the system logs are sent to the same Elasticsearch instance, logging-es. Right?

Now I wonder whether the problem is this: 1) logging is deployed with no ops, then 2) redeployed with ops via ansible with `openshift_logging_use_ops=true`, but the OPS_HOST value remains `logging-es`? The customer expects it to be updated to `logging-es-ops`, but that did not happen? Thanks.

(In reply to Anping Li from comment #5)

Thank you, Anping, for the verification. I'd assume the behaviour is acceptable for the customer.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0113
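For context, the selection described in comment #2 amounts to the startup script comparing the main and ops Elasticsearch endpoints and enabling one of the two retag filters. The sketch below is a simplified illustration of that logic, not the actual run.sh shipped in the logging-fluentd image; the configs.d paths and the copy-based enablement are assumptions about the image layout.

```bash
#!/bin/bash
# Minimal sketch of the config selection the fluentd startup script
# performs (simplified; not the actual run.sh from the image).

CFG_DIR=/etc/fluent/configs.d  # assumed path

# If the ops endpoint equals the main endpoint, fluentd treats the
# deployment as a single cluster and routes all logs together.
if [ "${ES_HOST}" = "${OPS_HOST}" ] && [ "${ES_PORT}" = "${OPS_PORT}" ]; then
    # Single cluster: one retag filter; the ops outputs stay disabled,
    # so output-ops-extra-*.conf files are never included.
    cp "${CFG_DIR}/openshift/filter-post-z-retag-one.conf" "${CFG_DIR}/dynamic/"
else
    # Separate ops cluster: the two-way retag filter sends system logs
    # to the ops outputs, which in turn pull in output-ops-extra-*.conf.
    cp "${CFG_DIR}/openshift/filter-post-z-retag-two.conf" "${CFG_DIR}/dynamic/"
fi
```

With OPS_HOST stuck at `logging-es`, the first branch is taken even though an ops cluster exists, which is exactly the mis-routing reported above.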
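Until a daemonset carries the fixed configuration, one way to confirm and work around the stale variable from comment #2 is to inspect and reset the env on the daemonset directly. This is an illustrative workaround, not a step taken in this bug; the `logging` namespace, the `component=fluentd` label, and the OPS_PORT value are assumptions based on a default 3.6 logging deployment.

```bash
# Show the env currently set on the daemonset; OPS_HOST will still read
# logging-es if the daemonset predates the ops deployment.
oc set env daemonset/logging-fluentd --list -n logging

# Point fluentd at the ops cluster (service name and port are the
# assumed 3.6 defaults and may differ in a customized deployment).
oc set env daemonset/logging-fluentd OPS_HOST=logging-es-ops OPS_PORT=9200 -n logging

# On 3.6, daemonset pods are not rolled automatically (OnDelete update
# strategy), so delete the pods to have them recreated with the new env.
oc delete pod -l component=fluentd -n logging
```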
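The checks from the verification in comment #5 can also be reproduced from the CLI. The config path under /etc/fluent and the pod label below are assumptions about the image and deployment layout.

```bash
# Pick one running fluentd pod.
POD=$(oc get pods -n logging -l component=fluentd \
      -o jsonpath='{.items[0].metadata.name}')

# 1) The pod should see the ops endpoint, not logging-es.
oc exec -n logging "${POD}" -- printenv OPS_HOST

# 2) The two-way retag filter (the one containing the output_ops_tag
#    rewrite rules) should be present in the loaded configuration.
oc exec -n logging "${POD}" -- grep -rl output_ops_tag /etc/fluent/configs.d/
```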