Description of problem:

I patched our 3.6 cluster to the latest errata (RHBA-2017:3049 - OpenShift Container Platform 3.6.173.0.49) and something seems off with logging. We set up a secure forwarder that forwards our logs to Splunk, which had been working fine, but after this upgrade a couple of things happened.

The configmaps I modified seem to have been wiped out, which has never happened before during an upgrade.

Fluentd also seems to be writing a lot of logs to stdout, and the metadata around container logs, such as kubernetes_namespace, pod name, and container name, appears to be gone.

Please let us know if any further details are required.

Version-Release number of selected component (if applicable):
OCP 3.6

How reproducible:
Always

Steps to Reproduce:
1. As mentioned in description

Actual results:

Expected results:

Additional info:
(In reply to Miheer Salunke from comment #0)
> Description of problem:
>
> I patched our 3.6 to the latest errata(RHBA-2017:3049 - OpenShift Container
> Platform 3.6.173.0.49) and something seems off with logging. We setup a
> secure forwarder which forwards our logs to splunk which has been working
> fine but after this upgrade a couple things happened.
>
> The configmaps I modified seemed to have been wiped out which has never
> happened before during and upgrade.

Which configmaps did you modify, and what were those modifications?

>
> It seems like fluentd is writing a lot of logs to stdout

Like what?

> and seems like the
> metadata around container logs is gone like kubernetes_namespace, pod name ,
> container name, etc.....

Can you provide example Elasticsearch searches to demonstrate this e.g.
https://docs.google.com/document/d/1MHvHwVSkkO5ohus2Pl3aFcvxXfSAY7qVEblIM1xgcXk/edit#heading=h.c0kdwi7yimxo

>
> Please let us know if any further details are required.

https://github.com/openshift/origin-aggregated-logging/blob/master/hack/logging-dump.sh

>
> Version-Release number of selected component (if applicable):
> OCP 3.6
>
> How reproducible:
> Always
>
> Steps to Reproduce:
> 1.As mentioned in description
> 2.
> 3.
>
> Actual results:
>
>
> Expected results:
>
>
> Additional info:
@Eric are we preserving configmaps at all now?
That is planned as a 3.8 feature...

For now a customer can set the contents of their specific config files to one of the following variables to maintain it:

fluentd_config_contents
fluentd_throttle_contents
fluentd_secureforward_contents
We should now be preserving the configmap changes in 3.9
(In reply to ewolinet from comment #3)
> That is planned as a 3.8 feature...
>
> For now a customer can set the contents of their specific config files to
> one of the following variables to maintain it:
>
> fluentd_config_contents
> fluentd_throttle_contents
> fluentd_secureforward_contents

Can you explain what you mean by this? These variables are not documented anywhere I can find, and customer is looking for a way to preserve configmaps in their upgrade to 3.7
(In reply to Steven Walter from comment #5)
> Can you explain what you mean by this? These variables are not documented
> anywhere I can find, and customer is looking for a way to preserve
> configmaps in their upgrade to 3.7

Sure. We don't document those variables because they could cause a misconfiguration within your cluster in situations where we provide necessary configmap changes. They are commented out at the bottom of defaults/main.yml for the openshift_logging role [1].

Each of those variables corresponds to a file within the data section of the fluentd configmap. The intent is that you set the value of the variable equal to the contents of that configmap section, and the installer then uses the variable contents instead of the files provided within the role when building the configmap that it applies with `oc apply`.

We phased those out for 3.9 in favor of patching the configmap changes from an existing system onto the files we provide. At this time there isn't a plan to backport that, but it doesn't seem unreasonable to do so, so that the variables linked below are not needed.

[1] https://github.com/openshift/openshift-ansible/blob/release-3.7/roles/openshift_logging/defaults/main.yml#L179
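To make the override behaviour described above concrete, here is a minimal sketch in Python. It is not the openshift-ansible implementation. The three variable names come from this thread and `throttle-config.yaml` is mentioned later in it, but the `fluent.conf` and `secure-forward.conf` key names, the variable-to-key mapping, and the function name `build_configmap_data` are assumptions made for illustration.

```python
# Hypothetical sketch of the 3.7 override logic, not the installer's code.
# Maps each override variable to the configmap data key it is ASSUMED to
# correspond to (key names other than throttle-config.yaml are assumptions).
OVERRIDE_VARS = {
    "fluentd_config_contents": "fluent.conf",
    "fluentd_throttle_contents": "throttle-config.yaml",
    "fluentd_secureforward_contents": "secure-forward.conf",
}


def build_configmap_data(role_defaults, inventory_vars):
    """Return configmap data, preferring inventory overrides over role files.

    role_defaults:  dict of data key -> contents shipped with the role
    inventory_vars: dict of variable name -> contents set by the customer
    """
    data = dict(role_defaults)  # copy so the role defaults are not mutated
    for var_name, key in OVERRIDE_VARS.items():
        contents = inventory_vars.get(var_name)
        if contents is not None:
            # The variable wins over the file provided within the role.
            data[key] = contents
    return data
```

Under these assumptions, setting only `fluentd_throttle_contents` would replace the throttle file and leave the other two files at whatever the role provides, which is also why a stale override could mask a change that a newer role version needs.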
The configmaps are overwritten when redeploying logging with openshift3/ose-ansible/images/v3.7.40-1.
@anli, can you please provide the process you used to test this for 3.7.40?

I am unable to reproduce this locally with the following steps:

1) Deploy logging
2) oc edit configmap/logging-fluentd
3) Rerun openshift-logging.yml playbook
4) Check contents of configmap/logging-fluentd

After rerunning the playbook, I see that the contents I manually edited are preserved.
@ewolinet, the root cause may be the logging namespace. I am using a different namespace, openshift-logging.

1. Deploy logging with:

openshift_logging_es_pvc_dynamic=true
openshift_logging_es_number_of_shards=1
openshift_logging_es_number_of_replicas=0
openshift_logging_es_memory_limit=2Gi
openshift_logging_es_cluster_size=1
openshift_logging_purge_logging=true
openshift_logging_namespace=openshift-logging
openshift_logging_install_logging=true

2. Enable throttle-config.yaml in the logging-fluentd configmap.
3. Redeploy logging with the same inventory file.
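The check at the end of these steps amounts to comparing the configmap data before and after the redeploy. A minimal sketch in Python follows, assuming the two snapshots were saved as JSON, for example with `oc get configmap logging-fluentd -n openshift-logging -o json`. The helper names `configmap_data` and `changed_keys` are hypothetical and are not part of any tool mentioned in this bug.

```python
import json


def configmap_data(json_text):
    """Extract the data section from a configmap dumped as JSON."""
    return json.loads(json_text).get("data", {})


def changed_keys(before, after):
    """Return the sorted keys whose contents differ between two snapshots.

    A key that exists in only one snapshot also counts as changed.
    """
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))
```

An empty result from `changed_keys` would mean the manual edits survived the playbook run. A result that lists `throttle-config.yaml` would match the overwrite reported here.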
Thanks @anli, I can recreate this when using a different namespace. I will have a PR today to resolve this.
https://github.com/openshift/openshift-ansible/pull/7703
The fix has not been merged into ose-ansible/images/v3.7.42-2.
Verified on:
logging-curator-v3.7.61-1
logging-elasticsearch-v3.7.61-1
logging-fluentd-v3.7.61-1
(In reply to Qiaoling Tang from comment #23)
> Verified on logging-curator-v3.7.61-1
> logging-elasticsearch-v3.7.61-1
> logging-fluentd-v3.7.61-1

Verified on ose-ansible v3.7.61
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2337