Description of problem:
logging-fluentd is not using output-ops-extra-localfile.conf after an update from v3.6.173.0.21 to v3.6.173.0.49. The logs were not written to the file inside the fluentd pod.

Version-Release number of selected component (if applicable):
- OCP v3.6
- RHEL 7.4

How reproducible:

Steps to Reproduce:
1. Deploy the logging project using the ansible playbook files.
2. Initially, do not set the variables 'openshift_logging_es_ops_host=logging-es-ops' and 'openshift_logging_use_ops=true' in the inventory file.
3. Once deployed, check the environment variables ES_HOST and OPS_HOST; both have the same value (a check sketch follows the attached configmap below).

Actual results:
Because the two values are the same, the fluentd-ops file is not created inside the fluentd pod.

$ oc rsh logging-fluentd-jvt5h
sh-4.2# ls -ltr /var/fluentd-out/
total 3584
-rw-r--r--. 1 root root 1710120 Nov 29 13:25 fluentd.20171129.b55f1e3152b0f0b44

Expected results:
Both files must be created inside the fluentd pod.

sh-4.2# ls -l /var/fluentd-out/
total 1792
-rw-r--r--. 1 root root 957428 Nov 29 03:43 fluentd-ops.20171129.b55f1b0aabe17a1c6
-rw-r--r--. 1 root root 235110 Nov 29 03:43 fluentd.20171129.b55f1b0aacb5bda36

Additional info:
The configmap is attached:

- apiVersion: v1
  data:
    fluent.conf: |
      # This file is the fluentd configuration entrypoint. Edit with care.

      @include configs.d/openshift/system.conf

      # In each section below, pre- and post- includes don't include anything initially;
      # they exist to enable future additions to openshift conf as needed.

      ## sources
      ## ordered so that syslog always runs last...
      @include configs.d/openshift/input-pre-*.conf
      @include configs.d/dynamic/input-docker-*.conf
      @include configs.d/dynamic/input-syslog-*.conf
      @include configs.d/openshift/input-post-*.conf
      ##

      <label @INGRESS>
      ## filters
        @include configs.d/openshift/filter-pre-*.conf
        @include configs.d/openshift/filter-retag-journal.conf
        @include configs.d/openshift/filter-k8s-meta.conf
        @include configs.d/openshift/filter-kibana-transform.conf
        @include configs.d/openshift/filter-k8s-flatten-hash.conf
        @include configs.d/openshift/filter-k8s-record-transform.conf
        @include configs.d/openshift/filter-syslog-record-transform.conf
        @include configs.d/openshift/filter-viaq-data-model.conf
        @include configs.d/openshift/filter-post-*.conf
      ##
      </label>

      <label @OUTPUT>
      ## matches
        @include configs.d/openshift/output-pre-*.conf
        @include configs.d/openshift/output-operations.conf
        @include configs.d/openshift/output-applications.conf
        # no post - applications.conf matches everything left
      ##
      </label>
    output-extra-localfile.conf: |
      <store>
        @type file
        path /var/fluentd-out/fluentd
        format json
        time_slice_format %Y%m%d
        time_slice_wait 1m
        buffer_chunk_limit 256m
        time_format %Y%m%dT%H:%M:%S%z
        compress gzip
        utc
      </store>
    output-ops-extra-localfile.conf: |
      <store>
        @type file
        path /var/fluentd-out/fluentd-ops
        format json
        time_slice_format %Y%m%d
        time_slice_wait 1m
        buffer_chunk_limit 256m
        time_format %Y%m%dT%H:%M:%S%z
        compress gzip
        utc
      </store>
    secure-forward.conf: |
      # @type secure_forward
      # self_hostname ${HOSTNAME}
      # shared_key <SECRET_STRING>
      # secure yes
      # enable_strict_verification yes
      # ca_cert_path /etc/fluent/keys/your_ca_cert
      # ca_private_key_path /etc/fluent/keys/your_private_key
      # for private CA secret key
      # ca_private_key_passphrase passphrase
      # <server>
      #   # or IP
      #   host server.fqdn.example.com
      #   port 24284
      # </server>
      # <server>
      #   # ip address to connect
      #   host xxx.xx.xx.x
      #   # specify hostlabel for FQDN verification if ipaddress is used for host
      #   hostlabel server.fqdn.example.com
      # </server>
    throttle-config.yaml: |
      # Logging example fluentd throttling config file
      #example-project:
      #  read_lines_limit: 10
      #
      #.operations:
      #  read_lines_limit: 100
  kind: ConfigMap
  metadata:
    creationTimestamp: null
    name: logging-fluentd
kind: List
metadata: {}
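For step 3 above, a quick way to compare the two variables (a minimal sketch; the pod name is the one from this report and will differ per cluster):

$ oc set env daemonset/logging-fluentd --list | grep -E 'ES_HOST|OPS_HOST'
$ oc exec logging-fluentd-jvt5h -- env | grep -E 'ES_HOST|OPS_HOST'

On an affected cluster both commands are expected to show ES_HOST and OPS_HOST pointing at the same service (logging-es).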
The problem lies in the fact that once daemonset/logging-fluentd exists, it is not updated or replaced (not even the env variables), as seen here:

https://github.com/openshift/openshift-ansible/blob/release-3.6/roles/openshift_logging_fluentd/tasks/main.yaml#L154-L186

Therefore, if aggregated logging is deployed without the OPS cluster and later redeployed with the OPS cluster (i.e. `openshift_logging_use_ops=true`), the OPS_HOST env variable will keep the value `logging-es`. That causes the fluentd start script to consider OPS not deployed, so it uses filter-post-z-retag-one.conf instead of filter-post-z-retag-two.conf. The consequence is that all logs (ops and non-ops) go to the non-ops outputs, ignoring the ops ones.
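Until the playbook handles the update, a manual workaround along these lines should be possible (an untested sketch; it assumes the standard ops service name logging-es-ops on port 9200, and that the fluentd pods carry the usual component=fluentd label):

$ oc set env daemonset/logging-fluentd OPS_HOST=logging-es-ops OPS_PORT=9200
$ oc delete pods -l component=fluentd

Deleting the pods makes the daemonset respawn them with the updated environment, after which the start script should pick filter-post-z-retag-two.conf.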
Verified with openshift3/logging-fluentd/images/v3.6.173.0.83-2. After adding the ops stack:

1) The fluentd environment has OPS_HOST=logging-es-ops.

2) filter-post-z-retag-one.conf was replaced with filter-post-z-retag-two.conf. The following block is added to route system-level logs to the ops ES stack:

<match journal.** system.var.log** **_default_** **_openshift_** **_openshift-infra_**>
  @type rewrite_tag_filter
  @label @OUTPUT
  rewriterule1 message .+ output_ops_tag
  rewriterule2 message !.+ output_ops_tag
</match>

3) Kibana can view the project logs and the operations logs from before the change; kibana-ops can view the operations logs from after the change.
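To double-check item 2 from outside the pod, something like the following should work (a sketch; the pod name is illustrative, and /etc/fluent is the base directory implied by the @include paths in the configmap above). On a pod with the ops stack wired up, only the -two variant should turn up:

$ oc exec logging-fluentd-jvt5h -- find /etc/fluent/configs.d -name 'filter-post-z-retag*'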
(In reply to Ruben Romero Montes from comment #2)
> The problem lies in the fact that once daemonset/logging-fluentd exists
> it is not updated or replaced (not even the env variables) as seen here:
>
> https://github.com/openshift/openshift-ansible/blob/release-3.6/roles/
> openshift_logging_fluentd/tasks/main.yaml#L154-L186
>
> Therefore, if aggregated logging is deployed without the OPS cluster and
> later redeployed with the OPS cluster (i.e. `openshift_logging_use_ops=true`)
> the OPS_HOST env variable will keep the value `logging-es`.

Hi Ruben, I revisited your comment #c2 and am worried that the customer's case may not be addressed by PR #774. The customer's system is configured with OPS, but the application logs and the system logs are both sent to the same Elasticsearch, logging-es. Right? Now I wonder: is the problem that after 1) deploying with no ops, then 2) redeploying with ops via ansible with `openshift_logging_use_ops=true`, the OPS_HOST value remains `logging-es`? The customer expects it to be updated to `logging-es-ops`, but that did not happen? Thanks.
(In reply to Anping Li from comment #5) Thank you, Anping, for the verification. I'd assume the behaviour is acceptable for the customer.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0113