Description of problem:
After upgrading cluster logging from 4.3 to 4.4, the ES pod is stuck in CrashLoopBackOff status.

$ oc get pod -owide
NAME                                            READY   STATUS             RESTARTS   AGE     IP            NODE                                        NOMINATED NODE   READINESS GATES
cluster-logging-operator-5d44b567cf-m57pf       1/1     Running            0          8m11s   10.131.0.31   ip-10-0-60-119.us-east-2.compute.internal   <none>           <none>
elasticsearch-cdm-o0ylrfxl-1-6d79ffc687-f2fn7   1/2     CrashLoopBackOff   6          7m59s   10.131.0.32   ip-10-0-60-119.us-east-2.compute.internal   <none>           <none>
fluentd-5z9tr                                   1/1     Running            0          6m1s    10.130.0.36   ip-10-0-60-248.us-east-2.compute.internal   <none>           <none>
fluentd-8pk7h                                   1/1     Running            0          7m29s   10.131.0.33   ip-10-0-60-119.us-east-2.compute.internal   <none>           <none>
fluentd-f4sg8                                   1/1     Running            0          6m59s   10.128.2.34   ip-10-0-78-116.us-east-2.compute.internal   <none>           <none>
fluentd-r5vmp                                   1/1     Running            0          6m39s   10.129.0.46   ip-10-0-52-176.us-east-2.compute.internal   <none>           <none>
fluentd-rr65n                                   1/1     Running            0          6m18s   10.128.0.16   ip-10-0-68-97.us-east-2.compute.internal    <none>           <none>
fluentd-tgnlj                                   1/1     Running            0          5m37s   10.129.2.18   ip-10-0-54-24.us-east-2.compute.internal    <none>           <none>
kibana-666b9ff487-blhgf                         2/2     Running            0          7m50s   10.128.2.33   ip-10-0-78-116.us-east-2.compute.internal   <none>           <none>

$ oc logs -c proxy elasticsearch-cdm-o0ylrfxl-1-6d79ffc687-f2fn7
2020/04/15 02:45:00 provider.go:118: Defaulting client-id to system:serviceaccount:openshift-logging:elasticsearch
2020/04/15 02:45:00 provider.go:123: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token
2020/04/15 02:45:00 provider.go:311: Delegation of authentication and authorization to OpenShift is enabled for bearer tokens and client certificates.
2020/04/15 02:45:00 oauthproxy.go:200: mapping path "/" => upstream "https://localhost:9200/"
2020/04/15 02:45:00 oauthproxy.go:227: OAuthProxy configured for Client ID: system:serviceaccount:openshift-logging:elasticsearch
2020/04/15 02:45:00 oauthproxy.go:237: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain:<default> refresh:disabled
2020/04/15 02:45:00 oauthproxy.go:249: WARN: Configured to pass client specified bearer token upstream. Only use this option if you're sure that the upstream will not leak or abuse the given token. Bear in mind that the token could be a long lived token or hard to revoke.
2020/04/15 02:45:00 http.go:61: HTTP: listening on :4180
2020/04/15 02:45:00 http.go:107: HTTPS: listening on [::]:60000
I0415 02:45:00.640136 1 dynamic_serving_content.go:129] Starting serving::/etc/proxy/secrets/tls.crt::/etc/proxy/secrets/tls.key
2020/04/15 02:45:19 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:45:26 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:45:49 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:45:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:46:19 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:46:26 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:46:49 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:46:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:47:19 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:47:26 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:47:49 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:47:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:48:19 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:48:26 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:48:49 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:48:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:49:19 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:49:26 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:49:49 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:49:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:50:19 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:50:26 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:50:49 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:50:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:51:19 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:51:26 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:51:49 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:51:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:52:19 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:52:26 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:52:49 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused
2020/04/15 02:52:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9200: connect: connection refused

$ oc logs -c elasticsearch elasticsearch-cdm-o0ylrfxl-1-6d79ffc687-f2fn7
[2020-04-15 02:50:58,412][INFO ][container.run ] Begin Elasticsearch startup script
[2020-04-15 02:50:58,415][INFO ][container.run ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2020-04-15 02:50:58,416][INFO ][container.run ] Inspecting the maximum RAM available...
[2020-04-15 02:50:58,418][INFO ][container.run ] ES_JAVA_OPTS: ' -Xms1024m -Xmx1024m'
[2020-04-15 02:50:58,419][INFO ][container.run ] Copying certs from /etc/openshift/elasticsearch/secret to /etc/elasticsearch/secret
[2020-04-15 02:50:58,423][INFO ][container.run ] Building required jks files and truststore
Importing keystore /etc/elasticsearch/secret/admin.p12 to /etc/elasticsearch/secret/admin.jks...
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/admin.jks -destkeystore /etc/elasticsearch/secret/admin.jks -deststoretype pkcs12".
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/admin.jks -destkeystore /etc/elasticsearch/secret/admin.jks -deststoretype pkcs12".
Certificate was added to keystore
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/admin.jks -destkeystore /etc/elasticsearch/secret/admin.jks -deststoretype pkcs12".
Importing keystore /etc/elasticsearch/secret/elasticsearch.p12 to /etc/elasticsearch/secret/elasticsearch.jks...
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/elasticsearch.jks -destkeystore /etc/elasticsearch/secret/elasticsearch.jks -deststoretype pkcs12".
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/elasticsearch.jks -destkeystore /etc/elasticsearch/secret/elasticsearch.jks -deststoretype pkcs12".
Certificate was added to keystore
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/elasticsearch.jks -destkeystore /etc/elasticsearch/secret/elasticsearch.jks -deststoretype pkcs12".
Importing keystore /etc/elasticsearch/secret/logging-es.p12 to /etc/elasticsearch/secret/logging-es.jks...
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Certificate was added to keystore
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Certificate was added to keystore
Certificate was added to keystore
[2020-04-15 02:51:00,248][INFO ][container.run ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2020-04-15 02:51:00,249][INFO ][container.run ] ES_JAVA_OPTS: ' -Xms1024m -Xmx1024m -XX:HeapDumpPath=/elasticsearch/persistent/heapdump.hprof -Dsg.display_lic_none=false -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.type=unpooled'
[2020-04-15 02:51:00,249][INFO ][container.run ] Checking if Elasticsearch is ready
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
Exception in thread "main" java.lang.IllegalArgumentException: Could not resolve placeholder 'POD_IP'
	at org.elasticsearch.common.settings.PropertyPlaceholder.parseStringValue(PropertyPlaceholder.java:116)
	at org.elasticsearch.common.settings.PropertyPlaceholder.replacePlaceholders(PropertyPlaceholder.java:69)
	at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1109)
	at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1069)
	at org.elasticsearch.node.InternalSettingsPreparer.initializeSettings(InternalSettingsPreparer.java:137)
	at org.elasticsearch.node.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:115)
	at org.elasticsearch.cli.EnvironmentAwareCommand.createEnv(EnvironmentAwareCommand.java:75)
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:70)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134)
	at org.elasticsearch.cli.Command.main(Command.java:90)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:91)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:84)

$ oc logs elasticsearch-operator-6f4994bc74-7rss9
time="2020-04-15T02:44:25Z" level=warning msg="Unable to parse loglevel \"\""
{"level":"info","ts":1586918665.8147256,"logger":"cmd","msg":"Go Version: go1.13.4"}
{"level":"info","ts":1586918665.81475,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1586918665.8147547,"logger":"cmd","msg":"Version of operator-sdk: v0.8.2"}
{"level":"info","ts":1586918665.8150382,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1586918665.9269526,"logger":"leader","msg":"Found existing lock","LockOwner":"elasticsearch-operator-5d85b86dbd-hwvgr"}
{"level":"info","ts":1586918665.9327946,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1586918667.072788,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1586918669.4545178,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1586918673.993244,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1586918682.6981413,"logger":"leader","msg":"Became the leader."}
{"level":"info","ts":1586918682.7784624,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1586918682.7786903,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"elasticsearch-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1586918682.884639,"logger":"cmd","msg":"failed to create or get service for metrics: services \"elasticsearch-operator\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}
{"level":"info","ts":1586918682.8846705,"logger":"cmd","msg":"Starting the Cmd."}
{"level":"info","ts":1586918682.9849257,"logger":"kubebuilder.controller","msg":"Starting Controller","controller":"elasticsearch-controller"}
{"level":"info","ts":1586918683.0851054,"logger":"kubebuilder.controller","msg":"Starting workers","controller":"elasticsearch-controller","worker count":1}
time="2020-04-15T02:44:46Z" level=warning msg="Unable to perform synchronized flush: Failed to flush 1 shards in preparation for cluster restart"
time="2020-04-15T02:45:57Z" level=info msg="Timed out waiting for elasticsearch-cdm-o0ylrfxl-1 to rejoin cluster"
time="2020-04-15T02:45:57Z" level=warning msg="Error occurred while updating node elasticsearch-cdm-o0ylrfxl-1: Node elasticsearch-cdm-o0ylrfxl-1 has not rejoined cluster elasticsearch yet"
time="2020-04-15T02:48:59Z" level=info msg="Timed out waiting for elasticsearch-cdm-o0ylrfxl-1 to rejoin cluster"
time="2020-04-15T02:52:01Z" level=info msg="Timed out waiting for elasticsearch-cdm-o0ylrfxl-1 to rejoin cluster"
time="2020-04-15T02:55:03Z" level=info msg="Timed out waiting for elasticsearch-cdm-o0ylrfxl-1 to rejoin cluster"
time="2020-04-15T02:58:05Z" level=info msg="Timed out waiting for elasticsearch-cdm-o0ylrfxl-1 to rejoin cluster"
time="2020-04-15T03:01:07Z" level=info msg="Timed out waiting for elasticsearch-cdm-o0ylrfxl-1 to rejoin cluster"
time="2020-04-15T03:04:10Z" level=info msg="Timed out waiting for elasticsearch-cdm-o0ylrfxl-1 to rejoin cluster"
time="2020-04-15T03:07:12Z" level=info msg="Timed out waiting for elasticsearch-cdm-o0ylrfxl-1 to rejoin cluster"
time="2020-04-15T03:10:14Z" level=info msg="Timed out waiting for elasticsearch-cdm-o0ylrfxl-1 to rejoin cluster"

$ oc get csv
NAME                                        DISPLAY                  VERSION              REPLACES                                     PHASE
clusterlogging.4.4.0-202004091036           Cluster Logging          4.4.0-202004091036   clusterlogging.4.3.13-202004131016           Succeeded
elasticsearch-operator.4.4.0-202004141917   Elasticsearch Operator   4.4.0-202004141917   elasticsearch-operator.4.3.13-202004131016   Succeeded

Version-Release number of selected component (if applicable):
ose-elasticsearch-operator-v4.4.0-202004141917

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging 4.3
2. Upgrade it to 4.4
3.

Actual results:

Expected results:

Additional info:
Please attach:
1. Elasticsearch CR
2. Elasticsearch deployment yaml
3. Elasticsearch pod yaml
4. Elasticsearch configmap
5. Elasticsearch operator logs

This ENV VAR should exist for this release [1][2].

[1] https://github.com/openshift/elasticsearch-operator/blob/release-4.4/pkg/k8shandler/common.go#L292
[2] https://github.com/openshift/elasticsearch-operator/blob/release-4.4/pkg/k8shandler/configuration_tmpl.go#L18
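For reference, on a fresh 4.4 install the operator injects `POD_IP` into the ES container via the Kubernetes downward API; the rendered env entry should look roughly like the sketch below (based on the release-4.4 code linked above; the exact rendering may differ). The elasticsearch.yml template references it as `${POD_IP}`, which is why startup fails with `Could not resolve placeholder 'POD_IP'` when the env var is absent.

```yaml
# Sketch of the downward-API env entry a fresh 4.4 deployment gets,
# and the one missing after an in-place 4.3 -> 4.4 upgrade.
- name: POD_IP
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: status.podIP
```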
When deploying logging 4.4 from scratch, the `POD_IP` env var is present in the ES pod; when upgrading from 4.3 to 4.4, this env var is missing.
The issue is that the operator does not compare the env vars of the containers, and thus does not roll out a new deployment.
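The missing comparison can be sketched as follows. This is hypothetical illustration code (the actual elasticsearch-operator is written in Go and compares corev1.EnvVar structs); env vars are modeled here as (name, value) pairs:

```python
def env_vars_differ(current, desired):
    """Return True when a container's current env vars differ from the
    desired ones, i.e. when a new deployment should be rolled out.

    Order is ignored, since env var ordering is irrelevant to the container.
    """
    return sorted(current) != sorted(desired)


# An upgraded 4.3 deployment lacks POD_IP, so the desired 4.4 spec differs
# and the operator should trigger a rollout:
current = [("DC_NAME", "elasticsearch-cdm-1"), ("CLUSTER_NAME", "elasticsearch")]
desired = current + [("POD_IP", "10.131.0.32")]
print(env_vars_differ(current, desired))  # True -> roll out a new deployment
```

Without this check the operator sees no difference in the fields it does compare, leaves the old pod template in place, and the ES container keeps crashing on the unresolved `${POD_IP}` placeholder.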
*** Bug 1825056 has been marked as a duplicate of this bug. ***
If an Elasticsearch cluster has already been upgraded to 4.4, the ES pods look like the listing below. You can fix it by deleting the Elasticsearch deployments in sequence.

NAME                                            READY   STATUS             RESTARTS   AGE
cluster-logging-operator-cd9746755-w94h5        1/1     Running            0          54m
curator-1587731400-rnmwl                        0/1     Completed          0          5m8s
elasticsearch-cdm-6o5ww5ha-1-7b77d8bb5-9dh9t    1/2     CrashLoopBackOff   15         54m
elasticsearch-cdm-6o5ww5ha-2-87bf69d87-4wsbf    2/2     Running            0          59m
elasticsearch-cdm-6o5ww5ha-3-746d6b8f66-g57mp   2/2     Running            0          58m
fluentd-2ccqc                                   1/1     Running            0          53m
fluentd-4h2qt                                   1/1     Running            0          52m
fluentd-9jq2l                                   1/1     Running            1          54m
fluentd-g4hzk                                   1/1     Running            0          54m
fluentd-g9cs6                                   1/1     Running            0          53m
fluentd-l4j8j                                   1/1     Running            0          53m
kibana-5cd48f8bcd-xjlj5                         2/2     Running            0          54m

The workaround:
1) Delete the first CrashLoopBackOff deployment:
$ oc delete deployment elasticsearch-cdm-6o5ww5ha-1
2) Wait until the next ES pod enters CrashLoopBackOff, then delete its deployment:
$ oc delete deployment elasticsearch-cdm-6o5ww5ha-2
Is the plan to fix this issue for GA? If not, should we add this information to the official update docs? I added a Known Issue to the release notes.
*** Bug 1829064 has been marked as a duplicate of this bug. ***
Setting to urgent; this needs to go out in the first 4.4.z release.
This PR resolves the bug: https://github.com/openshift/elasticsearch-operator/pull/336
The upgrade failed again: there were no logs indicating that the env vars had changed in the ES deployment, and the POD_IP env var was not added to the ES deployment.

$ oc logs -n openshift-operators-redhat elasticsearch-operator-75c868554d-kwjns
time="2020-05-06T01:37:36Z" level=warning msg="Unable to parse loglevel \"\""
{"level":"info","ts":1588729056.00425,"logger":"cmd","msg":"Go Version: go1.13.4"}
{"level":"info","ts":1588729056.0042713,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1588729056.0042765,"logger":"cmd","msg":"Version of operator-sdk: v0.8.2"}
{"level":"info","ts":1588729056.0045552,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1588729056.1929462,"logger":"leader","msg":"Found existing lock","LockOwner":"elasticsearch-operator-6bfb774467-fms2f"}
{"level":"info","ts":1588729056.200799,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1588729057.3292577,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1588729059.7134275,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1588729064.2511082,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1588729072.9628963,"logger":"leader","msg":"Became the leader."}
{"level":"info","ts":1588729073.122422,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1588729073.1226575,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"elasticsearch-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1588729073.2924252,"logger":"cmd","msg":"failed to create or get service for metrics: services \"elasticsearch-operator\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}
{"level":"info","ts":1588729073.2924507,"logger":"cmd","msg":"Starting the Cmd."}
{"level":"info","ts":1588729073.3927507,"logger":"kubebuilder.controller","msg":"Starting Controller","controller":"elasticsearch-controller"}
{"level":"info","ts":1588729073.4928908,"logger":"kubebuilder.controller","msg":"Starting workers","controller":"elasticsearch-controller","worker count":1}
time="2020-05-06T01:37:56Z" level=warning msg="Unable to perform synchronized flush: Failed to flush 5 shards in preparation for cluster restart"
time="2020-05-06T01:39:01Z" level=info msg="Timed out waiting for elasticsearch-cdm-ldnvhe0o-1 to rejoin cluster"
time="2020-05-06T01:39:01Z" level=warning msg="Error occurred while updating node elasticsearch-cdm-ldnvhe0o-1: Node elasticsearch-cdm-ldnvhe0o-1 has not rejoined cluster elasticsearch yet"
time="2020-05-06T01:41:34Z" level=info msg="Timed out waiting for elasticsearch-cdm-ldnvhe0o-1 to rejoin cluster"

$ oc get pod
NAME                                            READY   STATUS             RESTARTS   AGE
cluster-logging-operator-5d7cf5b97d-s2cts       1/1     Running            0          2m36s
elasticsearch-cdm-ldnvhe0o-1-56f689488f-c2xqn   1/2     CrashLoopBackOff   4          2m29s
fluentd-5cwc9                                   1/1     Running            0          47s
fluentd-6zx2d                                   1/1     Running            0          97s
fluentd-cqln8                                   1/1     Running            0          72s
fluentd-l45qt                                   1/1     Running            0          3s
fluentd-n6fdn                                   1/1     Running            0          2m5s
fluentd-rkbrn                                   1/1     Running            0          26s
kibana-578d7ddd66-n87jx                         2/2     Running            0          2m21s

$ oc logs -c elasticsearch elasticsearch-cdm-ldnvhe0o-1-56f689488f-c2xqn
[2020-05-06 01:39:41,586][INFO ][container.run ] Begin Elasticsearch startup script
[2020-05-06 01:39:41,599][INFO ][container.run ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2020-05-06 01:39:41,601][INFO ][container.run ] Inspecting the maximum RAM available...
[2020-05-06 01:39:41,603][INFO ][container.run ] ES_JAVA_OPTS: ' -Xms2048m -Xmx2048m'
[2020-05-06 01:39:41,605][INFO ][container.run ] Copying certs from /etc/openshift/elasticsearch/secret to /etc/elasticsearch/secret
[2020-05-06 01:39:41,609][INFO ][container.run ] Building required jks files and truststore
Importing keystore /etc/elasticsearch/secret/admin.p12 to /etc/elasticsearch/secret/admin.jks...
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/admin.jks -destkeystore /etc/elasticsearch/secret/admin.jks -deststoretype pkcs12".
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/admin.jks -destkeystore /etc/elasticsearch/secret/admin.jks -deststoretype pkcs12".
Certificate was added to keystore
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/admin.jks -destkeystore /etc/elasticsearch/secret/admin.jks -deststoretype pkcs12".
Importing keystore /etc/elasticsearch/secret/elasticsearch.p12 to /etc/elasticsearch/secret/elasticsearch.jks...
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/elasticsearch.jks -destkeystore /etc/elasticsearch/secret/elasticsearch.jks -deststoretype pkcs12".
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/elasticsearch.jks -destkeystore /etc/elasticsearch/secret/elasticsearch.jks -deststoretype pkcs12".
Certificate was added to keystore
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/elasticsearch.jks -destkeystore /etc/elasticsearch/secret/elasticsearch.jks -deststoretype pkcs12".
Importing keystore /etc/elasticsearch/secret/logging-es.p12 to /etc/elasticsearch/secret/logging-es.jks...
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Certificate was added to keystore
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Certificate was added to keystore
Certificate was added to keystore
[2020-05-06 01:39:43,492][INFO ][container.run ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2020-05-06 01:39:43,493][INFO ][container.run ] ES_JAVA_OPTS: ' -Xms2048m -Xmx2048m -XX:HeapDumpPath=/elasticsearch/persistent/heapdump.hprof -Dsg.display_lic_none=false -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.type=unpooled'
[2020-05-06 01:39:43,493][INFO ][container.run ] Checking if Elasticsearch is ready
Exception in thread "main" java.lang.IllegalArgumentException: Could not resolve placeholder 'POD_IP'
	at org.elasticsearch.common.settings.PropertyPlaceholder.parseStringValue(PropertyPlaceholder.java:116)
	at org.elasticsearch.common.settings.PropertyPlaceholder.replacePlaceholders(PropertyPlaceholder.java:69)
	at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1109)
	at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1069)
	at org.elasticsearch.node.InternalSettingsPreparer.initializeSettings(InternalSettingsPreparer.java:137)
	at org.elasticsearch.node.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:115)
	at org.elasticsearch.cli.EnvironmentAwareCommand.createEnv(EnvironmentAwareCommand.java:75)
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:70)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134)
	at org.elasticsearch.cli.Command.main(Command.java:90)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:91)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:84)

$ oc get deploy elasticsearch-cdm-ldnvhe0o-1 -oyaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  creationTimestamp: "2020-05-06T01:36:15Z"
  generation: 5
  labels:
    cluster-name: elasticsearch
    component: elasticsearch
    es-node-client: "true"
    es-node-data: "true"
    es-node-master: "true"
    node-name: elasticsearch-cdm-ldnvhe0o-1
    tuned.openshift.io/elasticsearch: "true"
  name: elasticsearch-cdm-ldnvhe0o-1
  namespace: openshift-logging
  ownerReferences:
  - apiVersion: logging.openshift.io/v1
    controller: true
    kind: Elasticsearch
    name: elasticsearch
    uid: 9f6de074-dd12-4422-aea2-98125214d1b8
  resourceVersion: "61471"
  selfLink: /apis/apps/v1/namespaces/openshift-logging/deployments/elasticsearch-cdm-ldnvhe0o-1
  uid: 0bc91513-6cbf-4a56-ad1a-35c30f170183
spec:
  paused: true
  progressDeadlineSeconds: 1800
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      cluster-name: elasticsearch
      es-node-client: "true"
      es-node-data: "true"
      es-node-master: "true"
      node-name: elasticsearch-cdm-ldnvhe0o-1
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        cluster-name: elasticsearch
        component: elasticsearch
        es-node-client: "true"
        es-node-data: "true"
        es-node-master: "true"
        node-name: elasticsearch-cdm-ldnvhe0o-1
        tuned.openshift.io/elasticsearch: "true"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: es-node-client
                  operator: In
                  values:
                  - "true"
                - key: es-node-data
                  operator: In
                  values:
                  - "true"
                - key: es-node-master
                  operator: In
                  values:
                  - "true"
              topologyKey: kubernetes.io/hostname
            weight: 100
      containers:
      - env:
        - name: DC_NAME
          value: elasticsearch-cdm-ldnvhe0o-1
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: KUBERNETES_TRUST_CERT
          value: "true"
        - name: SERVICE_DNS
          value: elasticsearch-cluster
        - name: CLUSTER_NAME
          value: elasticsearch
        - name: INSTANCE_RAM
          value: 4Gi
        - name: HEAP_DUMP_LOCATION
          value: /elasticsearch/persistent/heapdump.hprof
        - name: RECOVER_AFTER_TIME
          value: 5m
        - name: READINESS_PROBE_TIMEOUT
          value: "30"
        - name: POD_LABEL
          value: cluster=elasticsearch
        - name: IS_MASTER
          value: "true"
        - name: HAS_DATA
          value: "true"
        image: quay.io/openshift-qe-optional-operators/ose-logging-elasticsearch5@sha256:deeb2c8d91dd8e0f3050db5be111230515b5d06184ed99ba69441e3ae30fb866
        imagePullPolicy: IfNotPresent
        name: elasticsearch
        ports:
        - containerPort: 9300
          name: cluster
          protocol: TCP
        - containerPort: 9200
          name: restapi
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /usr/share/elasticsearch/probe/readiness.sh
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 30
        resources:
          limits:
            memory: 4Gi
          requests:
            cpu: "1"
            memory: 4Gi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /elasticsearch/persistent
          name: elasticsearch-storage
        - mountPath: /usr/share/java/elasticsearch/config
          name: elasticsearch-config
        - mountPath: /etc/openshift/elasticsearch/secret
          name: certificates
      - args:
        - --http-address=:4180
        - --https-address=:60000
        - --provider=openshift
        - --upstream=https://localhost:9200
        - --tls-cert=/etc/proxy/secrets/tls.crt
        - --tls-key=/etc/proxy/secrets/tls.key
        - --upstream-ca=/etc/proxy/elasticsearch/admin-ca
        - --openshift-service-account=elasticsearch
        - '-openshift-sar={"resource": "namespaces", "verb": "get"}'
        - '-openshift-delegate-urls={"/": {"resource": "namespaces", "verb": "get"}}'
        - --pass-user-bearer-token
        - --cookie-secret=ZHu6FwjOM/5TP62AiiwZDQ==
        image: quay.io/openshift-qe-optional-operators/ose-oauth-proxy@sha256:5fe478210770b21c1eb26c1570bcbda40bc5a79011580ff5ebd4c701a5b04eb2
        imagePullPolicy: IfNotPresent
        name: proxy
        ports:
        - containerPort: 60000
          name: metrics
          protocol: TCP
        resources:
          limits:
            memory: 64Mi
          requests:
            cpu: 100m
            memory: 64Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/proxy/secrets
          name: elasticsearch-metrics
        - mountPath: /etc/proxy/elasticsearch
          name: certificates
      dnsPolicy: ClusterFirst
      nodeSelector:
        kubernetes.io/os: linux
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: elasticsearch
      serviceAccountName: elasticsearch
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node.kubernetes.io/disk-pressure
        operator: Exists
      volumes:
      - configMap:
          defaultMode: 420
          name: elasticsearch
        name: elasticsearch-config
      - name: elasticsearch-storage
        persistentVolumeClaim:
          claimName: elasticsearch-elasticsearch-cdm-ldnvhe0o-1
      - name: certificates
        secret:
          defaultMode: 420
          secretName: elasticsearch
      - name: elasticsearch-metrics
        secret:
          defaultMode: 420
          secretName: elasticsearch-metrics
status:
  conditions:
  - lastTransitionTime: "2020-05-06T01:37:56Z"
    lastUpdateTime: "2020-05-06T01:37:56Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2020-05-06T01:38:00Z"
    lastUpdateTime: "2020-05-06T01:38:00Z"
    message: Deployment is paused
    reason: DeploymentPaused
    status: Unknown
    type: Progressing
  observedGeneration: 5
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas: 1

$ oc get csv
NAME                                        DISPLAY                  VERSION              REPLACES                                     PHASE
clusterlogging.4.4.0-202005041337           Cluster Logging          4.4.0-202005041337   clusterlogging.4.3.19-202005041055           Succeeded
elasticsearch-operator.4.4.0-202005051217   Elasticsearch Operator   4.4.0-202005051217   elasticsearch-operator.4.3.19-202005041055   Succeeded

EO image: openshift-ose-elasticsearch-operator-v4.4.0-202005051217
The fix is merged, and QE passed testing with the new elasticsearch-operator image.
Tested with elasticsearch-operator.4.4.0-202005131452; the upgrade succeeded.

$ oc get csv
NAME                                        DISPLAY                  VERSION              REPLACES                                     PHASE
clusterlogging.4.4.0-202005121717           Cluster Logging          4.4.0-202005121717   clusterlogging.4.3.19-202005041055           Succeeded
elasticsearch-operator.4.4.0-202005131452   Elasticsearch Operator   4.4.0-202005131452   elasticsearch-operator.4.3.19-202005041055   Succeeded

$ oc get pod
NAME                                            READY   STATUS      RESTARTS   AGE
cluster-logging-operator-d6468c6f7-jq4mc        1/1     Running     0          18m
curator-1589505600-j2zc6                        0/1     Completed   0          62s
elasticsearch-cdm-enke4rg3-1-774f7c8f4b-vdsnx   2/2     Running     0          16m
elasticsearch-cdm-enke4rg3-2-66c6685987-hmdfr   2/2     Running     0          14m
elasticsearch-cdm-enke4rg3-3-6777f76b6b-rbrl7   2/2     Running     0          12m
fluentd-4pdrn                                   1/1     Running     0          17m
fluentd-lxqg2                                   1/1     Running     0          16m
fluentd-qmxsf                                   1/1     Running     0          17m
fluentd-rrs7z                                   1/1     Running     0          16m
fluentd-sp4cf                                   1/1     Running     0          17m
fluentd-wxpnb                                   1/1     Running     0          17m
fluentd-xdr92                                   1/1     Running     0          16m
kibana-849c98d867-kgrzc                         2/2     Running     0          18m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2133
Hit this issue again when upgrading from clusterlogging.4.3.24-202006012115 to clusterlogging.4.4.0-202006011837.

# oc get csv
NAME                                        DISPLAY                  VERSION              REPLACES                                     PHASE
clusterlogging.4.4.0-202006011837           Cluster Logging          4.4.0-202006011837   clusterlogging.4.3.24-202006012115           Succeeded
elasticsearch-operator.4.4.0-202006011837   Elasticsearch Operator   4.4.0-202006011837   elasticsearch-operator.4.3.24-202006011815   Succeeded

Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Certificate was added to keystore
Certificate was added to keystore
[2020-06-05 07:53:54,611][INFO ][container.run            ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2020-06-05 07:53:54,613][INFO ][container.run            ] ES_JAVA_OPTS: ' -Xms1024m -Xmx1024m -XX:HeapDumpPath=/elasticsearch/persistent/heapdump.hprof -Dsg.display_lic_none=false -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.type=unpooled'
[2020-06-05 07:53:54,613][INFO ][container.run            ] Checking if Elasticsearch is ready
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
Exception in thread "main" java.lang.IllegalArgumentException: Could not resolve placeholder 'POD_IP'
	at org.elasticsearch.common.settings.PropertyPlaceholder.parseStringValue(PropertyPlaceholder.java:116)
	at org.elasticsearch.common.settings.PropertyPlaceholder.replacePlaceholders(PropertyPlaceholder.java:69)
	at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1109)
	at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1069)
	at org.elasticsearch.node.InternalSettingsPreparer.initializeSettings(InternalSettingsPreparer.java:137)
	at org.elasticsearch.node.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:115)
	at org.elasticsearch.cli.EnvironmentAwareCommand.createEnv(EnvironmentAwareCommand.java:75)
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:70)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134)
	at org.elasticsearch.cli.Command.main(Command.java:90)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:91)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:84)
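The `Could not resolve placeholder 'POD_IP'` failure indicates that the rendered elasticsearch.yml references an environment variable the container no longer sets: Elasticsearch substitutes `${...}` placeholders from the environment at startup and throws `IllegalArgumentException` when one is missing. A minimal sketch of the two pieces that must stay in sync; the exact keys and structure are illustrative assumptions, not taken from this cluster's actual configmap:

```yaml
# elasticsearch.yml fragment (rendered from the elasticsearch configmap).
# ${POD_IP} is resolved from the container environment at startup; if the
# variable is absent, Elasticsearch fails before binding port 9200.
network:
  host: ${POD_IP}
```

```yaml
# Corresponding env entry the operator must render into the pod spec
# (Kubernetes Downward API), alongside RECOVER_AFTER_TIME etc. above:
env:
- name: POD_IP
  valueFrom:
    fieldRef:
      fieldPath: status.podIP
```

If an operator upgrade regenerates the configmap with the placeholder but rolls out a pod spec without the matching `env` entry (or vice versa), the node crash-loops exactly as shown in the log.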
What is different from [1] during testing that we have regressed? Please provide:
* the starting operator image version
* the starting elasticsearch configmap
* the ending elasticsearch configmap
* the starting elasticsearch yaml
* the ending elasticsearch yaml

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1824006#c17
Couldn't reproduce it now.

# oc get csv
NAME                                        DISPLAY                  VERSION              REPLACES                                     PHASE
clusterlogging.4.4.0-202006061254           Cluster Logging          4.4.0-202006061254   clusterlogging.4.3.25-202006060952           Succeeded
elasticsearch-operator.4.4.0-202006061254   Elasticsearch Operator   4.4.0-202006061254   elasticsearch-operator.4.3.25-202006060952   Succeeded
Please note:

> If the solution does not work for you, OPEN A NEW BUG REPORT.

By design, we cannot re-ship a fix for a bug that has already been shipped.