Bug 1707875

Summary: The ES deployment cannot be recreated if the original one is deleted while stuck
Product: OpenShift Container Platform
Component: Logging
Version: 4.1.0
Target Milestone: ---
Target Release: 4.2.0
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Anping Li <anli>
Assignee: ewolinet
QA Contact: Qiaoling Tang <qitang>
Docs Contact:
CC: aos-bugs, ewolinet, qitang, rmeggins
Doc Type: No Doc Update
Type: Bug
Last Closed: 2019-10-16 06:28:32 UTC
Bug Blocks: 1711044

Description Anping Li 2019-05-08 15:26:14 UTC
Description of problem:
If an ES deployment is deleted while it is stuck and elasticsearches.logging.openshift.io is then updated, the EO (elasticsearch-operator) may get stuck, and the new ES deployment cannot be created.

Version-Release number of selected component (if applicable):
v4.1

How reproducible:
Always

Steps to Reproduce:
1. Deploy Elasticsearch with a memory request larger than the node memory.
   For example: request 16G of memory in elasticsearches.logging.openshift.io while the node has only 8G of RAM.
2. Delete the resource entry in elasticsearches.logging.openshift.io.
3. Check the elasticsearch-operator.
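
The condition in step 1 can be sketched as a simple comparison of the memory request against the node's memory. This is only an illustration: the helper names `parse_quantity` and `fits_on_node` are hypothetical, the suffix table covers only a subset of Kubernetes quantity suffixes, and the real scheduler considers much more than this single check.

```python
# Hypothetical helpers for illustration; only a subset of Kubernetes
# quantity suffixes is handled, and this is not the scheduler's logic.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(q: str) -> int:
    """Convert a quantity string such as '16G' or '8Gi' to bytes."""
    # Try two-character suffixes first so 'Gi' is not mistaken for a bare number.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if q.endswith(suffix):
            return int(float(q[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(q)  # plain byte count, no suffix


def fits_on_node(request: str, node_memory: str) -> bool:
    """Return True if the memory request does not exceed the node memory."""
    return parse_quantity(request) <= parse_quantity(node_memory)


if __name__ == "__main__":
    # The values from step 1: a 16G request on a node with 8G of RAM.
    print(fits_on_node("16G", "8G"))
```

When the request does not fit on any node, the pod cannot be scheduled, which is the starting state this reproduction relies on.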

Actual results:
The EO is stuck. 

[anli@preserve-anli-slave 41]$ oc logs elasticsearch-operator-796f97d775-6czdq
time="2019-05-08T13:34:37Z" level=info msg="Go Version: go1.10.8"
time="2019-05-08T13:34:37Z" level=info msg="Go OS/Arch: linux/amd64"
time="2019-05-08T13:34:37Z" level=info msg="operator-sdk Version: 0.0.7"
time="2019-05-08T13:34:37Z" level=info msg="Watching logging.openshift.io/v1, Elasticsearch, , 5000000000"
time="2019-05-08T13:35:10Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:35:29Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:35:47Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:36:06Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:36:24Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:36:43Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:37:01Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:37:20Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"


Expected results:

The ES pod can be deployed.

Additional info:
Set severity to low, as this is a rare case.

Comment 1 Anping Li 2019-05-08 15:37:13 UTC
Sometimes it also prints this message:
time="2019-05-08T08:48:23Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:32Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:41Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:49Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:58Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:49:07Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:49:16Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
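
The error above lists three validation failures at once: a missing selector, nil template labels, and missing containers. That combination is what one would see if the submitted Deployment spec were essentially empty, although that reading is an inference and is not confirmed in this report. The sketch below mimics those three checks on a plain dictionary; the function name `validate_deployment_spec` is hypothetical, and this is a simplification, not the API server's actual validation code.

```python
def validate_deployment_spec(spec: dict) -> list:
    """Return problems mirroring the three errors in the log above.

    Simplified illustration only; not the Kubernetes API server's validation.
    """
    errors = []
    selector = (spec.get("selector") or {}).get("matchLabels")
    template = spec.get("template") or {}
    labels = (template.get("metadata") or {}).get("labels")
    containers = (template.get("spec") or {}).get("containers")

    if not selector:
        errors.append("spec.selector: Required value")
    # Every selector label must be present with the same value in the template labels.
    if not labels or any(labels.get(k) != v for k, v in (selector or {}).items()):
        errors.append(
            "spec.template.metadata.labels: `selector` does not match template `labels`"
        )
    if not containers:
        errors.append("spec.template.spec.containers: Required value")
    return errors


if __name__ == "__main__":
    # An empty spec trips all three checks, like the logged error.
    for problem in validate_deployment_spec({}):
        print(problem)
```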

Comment 3 Qiaoling Tang 2019-07-04 06:30:21 UTC
Tested with ose-elasticsearch-operator-v4.2.0-201907021819; the issue can still be reproduced as follows:

1. Deploy logging and set the ES memory request >= node memory; the ES pods stay in Pending status.
2. Delete the ES resources entry in the clusterlogging instance; the CLO then updates the elasticsearch instance automatically.
3. Delete the ES deployment.

The ES deployment is not recreated. The EO pod log shows:
{"level":"info","ts":1562218976.652798,"logger":"cmd","msg":"Go Version: go1.11.6"}
{"level":"info","ts":1562218976.6530623,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1562218976.6530707,"logger":"cmd","msg":"Version of operator-sdk: v0.7.0"}
{"level":"info","ts":1562218976.6534047,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1562218976.8516989,"logger":"leader","msg":"No pre-existing lock was found."}
{"level":"info","ts":1562218976.8649614,"logger":"leader","msg":"Became the leader."}
{"level":"info","ts":1562218977.0188684,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1562218977.019178,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"elasticsearch-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1562218977.2075162,"logger":"cmd","msg":"failed to create or get service for metrics: services \"elasticsearch-operator\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}
{"level":"info","ts":1562218977.2075431,"logger":"cmd","msg":"Starting the Cmd."}
{"level":"info","ts":1562218977.3078299,"logger":"kubebuilder.controller","msg":"Starting Controller","controller":"elasticsearch-controller"}
{"level":"info","ts":1562218977.408088,"logger":"kubebuilder.controller","msg":"Starting workers","controller":"elasticsearch-controller","worker count":1}
{"level":"error","ts":1562219581.7813885,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"elasticsearch-controller","request":"openshift-logging/elasticsearch","error":"Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-rijx8l0o-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
{"level":"error","ts":1562219583.3307629,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"elasticsearch-controller","request":"openshift-logging/elasticsearch","error":"Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-rijx8l0o-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/openshift/elasticsearch-operator/_output/src/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
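
Because the 4.2 operator logs one JSON object per line, repeated reconcile failures like the ones above can be summarized with a short script. This is a minimal sketch: the function name `count_reconcile_errors` is hypothetical, and it assumes only the `level` and `error` fields visible in the log lines above.

```python
import json
from collections import Counter


def count_reconcile_errors(log_text: str) -> Counter:
    """Count distinct 'error' values among level=error JSON log lines."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON log line
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if entry.get("level") == "error" and "error" in entry:
            counts[entry["error"]] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: pipe the saved operator pod log into this script on stdin.
    for message, n in count_reconcile_errors(sys.stdin.read()).most_common():
        print(n, message)
```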

Comment 9 Qiaoling Tang 2019-07-24 07:37:56 UTC
I'm not able to reproduce this issue now. The ES deployment can always be recreated.

Tested with ose-elasticsearch-operator-v4.2.0-201907232219

Comment 10 errata-xmlrpc 2019-10-16 06:28:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922