Description of problem:
EO currently sets shard allocation to "none" when doing a cert restart, unlike the other restart types, which use "primaries".

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Always

Steps to Reproduce:
1. Trigger cert redeployment
2. Check EO status while it is restarting (see the allocation-check sketch after the logs below)

Actual results:
Shard allocation is set to "none"

Expected results:
Shard allocation should be set to "primaries"

Additional info:
The EO reports the message 'Unable to set shard allocation to primaries' [1]. It also reports 'Timed out waiting for elasticsearch-cdm-g56b2tbr-xxx to leave the cluster' [2]. Has the logic for ES cluster upgrades changed? Do we still need the "set shard allocation to primaries" step?

[1]
time="2020-05-30T12:58:02Z" level=info msg="Beginning full cluster restart for cert redeploy on elasticsearch"
time="2020-05-30T12:58:02Z" level=warning msg="Unable to set shard allocation to primaries: Put https://elasticsearch.openshift-logging.svc:9200/_cluster/settings: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"openshift-cluster-logging-signer\")"
time="2020-05-30T12:58:02Z" level=warning msg="Unable to perform synchronized flush: Post https://elasticsearch.openshift-logging.svc:9200/_flush/synced: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"openshift-cluster-logging-signer\")"
time="2020-05-30T12:58:02Z" level=warning msg="Unable to get cluster size prior to restart for elasticsearch-cdm-g56b2tbr-1"
time="2020-05-30T12:58:02Z" level=warning msg="Unable to get cluster size prior to restart for elasticsearch-cdm-g56b2tbr-2"
time="2020-05-30T12:58:02Z" level=warning msg="Unable to get cluster size prior to restart for elasticsearch-cdm-g56b2tbr-3"
time="2020-05-30T12:58:02Z" level=warning msg="Unable to list existing templates in order to reconcile stale ones: Get https://elasticsearch.openshift-logging.svc:9200/_template: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"openshift-cluster-logging-signer\")"
time="2020-05-30T12:58:52Z" level=info msg="Kibana status successfully updated"

[2]
time="2020-05-30T12:59:04Z" level=info msg="Timed out waiting for elasticsearch-cdm-g56b2tbr-1 to leave the cluster"
time="2020-05-30T12:59:22Z" level=info msg="skipping deleting kibana 5 image because kibana 6 installed"
time="2020-05-30T12:59:52Z" level=info msg="skipping kibana migrations: no index \".kibana\" available"
time="2020-05-30T12:59:52Z" level=info msg="Kibana status successfully updated"
time="2020-05-30T13:00:22Z" level=info msg="skipping deleting kibana 5 image because kibana 6 installed"
time="2020-05-30T13:00:48Z" level=info msg="Timed out waiting for elasticsearch-cdm-g56b2tbr-2 to leave the cluster"
time="2020-05-30T13:00:52Z" level=info msg="skipping kibana migrations: no index \".kibana\" available"
time="2020-05-30T13:00:52Z" level=info msg="Kibana status successfully updated"
time="2020-05-30T13:01:22Z" level=info msg="skipping deleting kibana 5 image because kibana 6 installed"
time="2020-05-30T13:01:52Z" level=info msg="skipping kibana migrations: no index \".kibana\" available"
time="2020-05-30T13:01:52Z" level=info msg="Kibana status successfully updated"
time="2020-05-30T13:02:20Z" level=info msg="Timed out waiting for elasticsearch-cdm-g56b2tbr-3 to leave the cluster"
time="2020-05-30T13:02:23Z" level=info msg="skipping deleting kibana 5 image because kibana 6 installed"
time="2020-05-30T13:02:53Z" level=info msg="skipping kibana migrations: no index \".kibana\" available"
time="2020-05-30T13:02:53Z" level=info msg="Kibana status successfully updated"
time="2020-05-30T13:05:24Z" level=info msg="Waiting for cluster to complete recovery: yellow / green"
time="2020-05-30T13:05:25Z" level=info msg="Waiting for cluster to complete recovery: yellow / green"
time="2020-05-30T13:05:53Z" level=info msg="Waiting for cluster to complete recovery: yellow / green"
time="2020-05-30T13:05:54Z" level=info msg="skipping deleting kibana 5 image because kibana 6 installed"
time="2020-05-30T13:05:54Z" level=info msg="Waiting for cluster to complete recovery: yellow / green"
Moving to verified, as shard allocation was not set to "none" during cert regeneration.

# oc get csv
NAME                                        DISPLAY                  VERSION              REPLACES                                    PHASE
clusterlogging.4.5.0-202006032057           Cluster Logging          4.5.0-202006032057   clusterlogging.4.4.0-202006011837           Succeeded
elasticsearch-operator.4.5.0-202006031723   Elasticsearch Operator   4.5.0-202006031723   elasticsearch-operator.4.4.0-202006011837   Succeeded
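One way to re-check this on a live cluster is to poll the allocation setting while the cert redeploy restart is in progress; the value should only ever read "primaries" (or be unset), never "none". A rough sketch (the 5-second interval is arbitrary, and the component=elasticsearch label and the es_util query helper in the elasticsearch container are assumed to match this deployment):

  POD=$(oc -n openshift-logging get pods -l component=elasticsearch -o jsonpath='{.items[0].metadata.name}')
  while true; do
    # Print just the allocation "enable" value from the cluster settings
    oc -n openshift-logging exec -c elasticsearch "$POD" -- \
      es_util --query=_cluster/settings | grep -o '"enable":"[a-z]*"'
    sleep 5
  done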
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409