Description of problem:
Delete an ES deployment that is stuck, and then update elasticsearches.logging.openshift.io. The EO may get stuck, and the new ES deployment cannot be created.

Version-Release number of selected component (if applicable):
v4.1

How reproducible:
Always

Steps to Reproduce:
1. Deploy Elasticsearch with a memory request larger than the node memory. For example: request 16G of memory in elasticsearches.logging.openshift.io while the node has only 8G of RAM (see the example CR excerpt at the end of this comment).
2. Delete the resources entry in elasticsearches.logging.openshift.io.
3. Check the elasticsearch-operator.

Actual results:
The EO is stuck.

[anli@preserve-anli-slave 41]$ oc logs elasticsearch-operator-796f97d775-6czdq
time="2019-05-08T13:34:37Z" level=info msg="Go Version: go1.10.8"
time="2019-05-08T13:34:37Z" level=info msg="Go OS/Arch: linux/amd64"
time="2019-05-08T13:34:37Z" level=info msg="operator-sdk Version: 0.0.7"
time="2019-05-08T13:34:37Z" level=info msg="Watching logging.openshift.io/v1, Elasticsearch, , 5000000000"
time="2019-05-08T13:35:10Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1: / green"
time="2019-05-08T13:35:29Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1: / green"
time="2019-05-08T13:35:47Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1: / green"
time="2019-05-08T13:36:06Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1: / green"
time="2019-05-08T13:36:24Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1: / green"
time="2019-05-08T13:36:43Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1: / green"
time="2019-05-08T13:37:01Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1: / green"
time="2019-05-08T13:37:20Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1: / green"

Expected results:
The ES pod can be deployed.

Additional info:
Set Severity to Low as this is a rare case.
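For step 1 above, the over-sized memory request looks roughly like the excerpt below. This is only a sketch: the instance name and field layout are taken from a default cluster-logging install of that era and may not match the exact CR used here.

apiVersion: logging.openshift.io/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
  namespace: openshift-logging
spec:
  managementState: Managed
  nodeSpec:
    resources:
      requests:
        memory: 16Gi      # deliberately larger than the 8G of RAM on the node, so the pod stays Pending
  nodes:
  - nodeCount: 1
    roles:
    - client
    - data
    - master
  redundancyPolicy: ZeroRedundancy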
Sometimes the EO also prints error messages like the following:

time="2019-05-08T08:48:23Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:32Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:41Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:49Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:58Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:49:07Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:49:16Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
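When the operator is looping on these errors, the stuck state can be inspected with commands along these lines. This is a sketch: the namespace, CR instance name, and label selector are assumptions based on a default cluster-logging install.

oc -n openshift-logging get elasticsearch elasticsearch -o yaml     # inspect the Elasticsearch CR spec and status
oc -n openshift-logging get deployments -l component=elasticsearch  # the elasticsearch-cdm-* deployment is missing and not recreated
oc logs <elasticsearch-operator-pod>                                # in the operator's namespace; repeats the errors shown above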
Tested in ose-elasticsearch-operator-v4.2.0-201907021819, the issue could be reproduced by:
1. Deploy logging with the ES request memory >= node memory; the ES pods are in Pending status.
2. Delete the ES resources entry in the clusterlogging instance; CLO then updates the elasticsearch instance automatically.
3. Delete the ES deployment.
(A sketch of the commands used for steps 2 and 3 follows the log below.)

The ES deployment could not be recreated, and the EO pod log shows:

{"level":"info","ts":1562218976.652798,"logger":"cmd","msg":"Go Version: go1.11.6"}
{"level":"info","ts":1562218976.6530623,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1562218976.6530707,"logger":"cmd","msg":"Version of operator-sdk: v0.7.0"}
{"level":"info","ts":1562218976.6534047,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1562218976.8516989,"logger":"leader","msg":"No pre-existing lock was found."}
{"level":"info","ts":1562218976.8649614,"logger":"leader","msg":"Became the leader."}
{"level":"info","ts":1562218977.0188684,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1562218977.019178,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"elasticsearch-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1562218977.2075162,"logger":"cmd","msg":"failed to create or get service for metrics: services \"elasticsearch-operator\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}
{"level":"info","ts":1562218977.2075431,"logger":"cmd","msg":"Starting the Cmd."}
{"level":"info","ts":1562218977.3078299,"logger":"kubebuilder.controller","msg":"Starting Controller","controller":"elasticsearch-controller"}
{"level":"info","ts":1562218977.408088,"logger":"kubebuilder.controller","msg":"Starting workers","controller":"elasticsearch-controller","worker count":1}
{"level":"error","ts":1562219581.7813885,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"elasticsearch-controller","request":"openshift-logging/elasticsearch","error":"Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-rijx8l0o-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
{"level":"error","ts":1562219583.3307629,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"elasticsearch-controller","request":"openshift-logging/elasticsearch","error":"Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-rijx8l0o-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
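For reference, steps 2 and 3 of this retest were done with commands roughly like the ones below. Treat this as a sketch rather than the exact commands used; the instance name is the default and the deployment name is the one shown in the logs above.

oc -n openshift-logging edit clusterlogging instance
    # remove the spec.logStore.elasticsearch.resources block; CLO then pushes an updated Elasticsearch CR
oc -n openshift-logging delete deployment elasticsearch-cdm-rijx8l0o-1
oc -n openshift-logging get deployments             # elasticsearch-cdm-rijx8l0o-1 is not recreated
oc logs <elasticsearch-operator-pod>                # shows the Reconciler errors pasted above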
I'm not able to reproduce this issue now; the ES deployment could always be recreated. Tested with ose-elasticsearch-operator-v4.2.0-201907232219.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922