Description of problem:
During an upgrade from 4.4 to 4.5, if CLO upgrades before EO does, fluentd is updated to write to a '*-write' endpoint, which should be an alias. If EO has not yet created the alias, fluentd causes a write index to be created instead. When EO then upgrades, this can cause issues with index management or loss of the data that fluentd had already written.

Version-Release number of selected component (if applicable):
4.4 -> 4.5

How reproducible:
Always

Steps to Reproduce:
1. Upgrade CLO
2. Check ES indices

Actual results:
Fluentd incorrectly causes a write index to be created.

Expected results:
Fluentd should wait until the alias is in place and then proceed to push its logs.

Additional info:
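The expected behaviour (wait for the alias, then push logs) could be sketched roughly as below. This is only an illustration, not the fix that shipped: the function names, the alias name app-write, the ES_URL variable, and the retry settings are all assumptions, and the TLS and client-certificate options a real cluster would need are left out. It relies on Elasticsearch answering GET /_alias/<name> with HTTP 200 when the alias exists and 404 when it does not.

```bash
#!/bin/bash
# Sketch only: all names and settings here are hypothetical.

# alias_exists URL ALIAS
# Succeeds when GET URL/_alias/ALIAS answers with HTTP 200.
alias_exists() {
  local status
  status=$(curl -s -o /dev/null -w '%{http_code}' "$1/_alias/$2")
  [ "$status" = "200" ]
}

# wait_for RETRIES DELAY COMMAND [ARGS...]
# Reruns COMMAND until it succeeds or RETRIES attempts have been made.
wait_for() {
  local retries="$1" delay="$2" attempt=0
  shift 2
  while [ "$attempt" -lt "$retries" ]; do
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}
```

An init container could then call something like `wait_for 60 5 alias_exists "$ES_URL" app-write` and exit non-zero on failure, so that fluentd never starts writing before the alias is present.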
1) The fluentd-init container should export PATH so that ruby is found, and the library path so that libruby.so.2.5 is found.

$ oc logs fluentd-k6t8c -c fluentd-init
./wait_for_es_version.sh: line 3: ruby: command not found

$ docker run -it --entrypoint /opt/rh/rh-ruby25/root/usr/bin/ruby ose-logging-fluentd:v4.5.0 --help
/opt/rh/rh-ruby25/root/usr/bin/ruby: error while loading shared libraries: libruby.so.2.5: cannot open shared object file: No such file or directory

2) wait_for_es_version.sh shouldn't be executed when deploying fluentd only.
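For point 1, one possible approach is to export the software-collection paths before the init script runs. This is a sketch under assumptions: the bin directory comes from the docker command above, but the lib64 directory is only the usual location of shared libraries in an SCL package and is not confirmed anywhere in this report.

```bash
# Assumed SCL root for rh-ruby25. The bin path appears in the report;
# the lib64 path is an assumption based on the usual SCL layout.
RUBY_SCL_ROOT=/opt/rh/rh-ruby25/root/usr

# Put the SCL ruby first on PATH.
export PATH="${RUBY_SCL_ROOT}/bin:${PATH}"

# Prepend the SCL library directory, keeping any existing value.
export LD_LIBRARY_PATH="${RUBY_SCL_ROOT}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```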
Verified on the CI images.

1) Upgrade CLO to 4.6. One fluentd pod is in Init:CrashLoopBackOff.

$ oc get pods
fluentd-2cxs9   1/1   Running                 0   7m42s
fluentd-2mkwn   1/1   Running                 0   7m42s
fluentd-c6vs2   1/1   Running                 0   7m42s
fluentd-fkdmv   1/1   Running                 0   7m42s
fluentd-qcvgv   1/1   Running                 0   7m42s
fluentd-rtcsn   1/1   Running                 0   7m42s
fluentd-vn8fd   0/1   Init:CrashLoopBackOff   5   4m33s

$ oc logs fluentd-vn8fd -c fluentd-init
Elasticsearch is currently version: 5.6.16 - Expecting it to be at least: 6

2) Upgrade EO to 4.6. The ES pods are not Ready during the upgrade. No data is received and no -write index is created.

3) After the EO upgrade, infra-000001 and app-000001 are created in the ES cluster, and the fluentd pods start to upgrade. The doc.count increases in both the old indices (.operations.xxx and project.xxx) and the new indices (infra-000001 and app-000001).
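The gate seen in the fluentd-init log above can be illustrated with a small version check. The actual wait_for_es_version.sh is not shown in this report (it appears to call ruby), so the following is a hypothetical shell sketch of the same idea, not its implementation. The function names and the wording of the success message are assumptions; only the failure message mirrors the log line above.

```bash
#!/bin/bash
# Sketch only: hypothetical helpers, not the contents of wait_for_es_version.sh.

# es_major_at_least VERSION REQUIRED_MAJOR
# Succeeds when the major component of VERSION is >= REQUIRED_MAJOR.
es_major_at_least() {
  local major="${1%%.*}"   # strip everything after the first dot
  [ "$major" -ge "$2" ]
}

# check_es_version VERSION REQUIRED_MAJOR
# Prints a status line and returns non-zero while the version is too old,
# which keeps an init container failing until Elasticsearch is upgraded.
check_es_version() {
  local current="$1" required="$2"
  if es_major_at_least "$current" "$required"; then
    echo "Elasticsearch is currently version: $current - OK"
    return 0
  fi
  echo "Elasticsearch is currently version: $current - Expecting it to be at least: $required"
  return 1
}
```

With these definitions, `check_es_version 5.6.16 6` fails, matching the Init:CrashLoopBackOff pod above, while a 6.x version would let the init container finish.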
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196