Bug 1711044

Summary: The ES deployment can't be created if the original one is deleted while stuck
Product: OpenShift Container Platform Reporter: ewolinet
Component: Logging Assignee: ewolinet
Status: CLOSED ERRATA QA Contact: Anping Li <anli>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.1.z CC: anli, aos-bugs, ewolinet, jcantril, pweil, rmeggins, sponnaga, vlaad
Target Milestone: ---   
Target Release: 4.1.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: 4.1.5
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1707875 Environment:
Last Closed: 2019-08-28 19:54:45 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1707875    
Bug Blocks:    

Description ewolinet 2019-05-16 19:15:18 UTC
+++ This bug was initially created as a clone of Bug #1707875 +++

Description of problem:
Delete an ES deployment that is in a stuck state, and then update elasticsearches.logging.openshift.io. The EO (elasticsearch-operator) may get stuck, and the new ES deployment cannot be created.

Version-Release number of selected component (if applicable):
v4.1

How reproducible:
Always

Steps to Reproduce:
1. Deploy Elasticsearch with a memory request larger than the node's memory.
For example: 16G memory in elasticsearches.logging.openshift.io, but the node RAM is 8G.
2. Delete the resource entry in elasticsearches.logging.openshift.io.
3. Check the elasticsearch-operator.
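
The condition in step 1 can be checked with a few lines of Python. This is only a minimal sketch: parse_memory and fits_on_node are hypothetical helpers, only a subset of the Kubernetes quantity suffixes is handled, and the real scheduler compares against the node's allocatable memory, which is somewhat lower than its installed RAM.

```python
# Subset of the binary and decimal suffixes used in Kubernetes memory quantities.
_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '16Gi' or '8G' to a number of bytes."""
    # Try two-character suffixes first so 'Gi' is not mistaken for a bare number.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[:-len(suffix)]) * _SUFFIXES[suffix])
    return int(quantity)  # no suffix: plain byte count


def fits_on_node(request: str, node_memory: str) -> bool:
    """True if the requested memory does not exceed the node's memory."""
    return parse_memory(request) <= parse_memory(node_memory)


# The case from step 1: a 16G request on an 8G node can never be scheduled.
print(fits_on_node("16Gi", "8Gi"))
```

When this returns False for every node, the ES pod stays Pending, which is the stuck state this bug starts from.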

Actual results:
The EO is stuck. 

[anli@preserve-anli-slave 41]$ oc logs elasticsearch-operator-796f97d775-6czdq
time="2019-05-08T13:34:37Z" level=info msg="Go Version: go1.10.8"
time="2019-05-08T13:34:37Z" level=info msg="Go OS/Arch: linux/amd64"
time="2019-05-08T13:34:37Z" level=info msg="operator-sdk Version: 0.0.7"
time="2019-05-08T13:34:37Z" level=info msg="Watching logging.openshift.io/v1, Elasticsearch, , 5000000000"
time="2019-05-08T13:35:10Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:35:29Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:35:47Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:36:06Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:36:24Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:36:43Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:37:01Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"
time="2019-05-08T13:37:20Z" level=info msg="Waiting for cluster to be fully recovered before restarting elasticsearch-cdm-ab9eosgt-1:  / green"


Expected results:

The ES pod can be deployed.

Additional info:
Set severity to Low as this is a rare case.

--- Additional comment from Anping Li on 2019-05-08 15:37:13 UTC ---

Sometimes it also prints this message:
time="2019-05-08T08:48:23Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:32Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:41Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:49Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:58Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:49:07Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:49:16Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
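
The three field errors in this message are consistent with a Deployment being submitted with an essentially empty spec, although the log does not say why the spec was empty. The sketch below is a simplified imitation of those three checks, not the actual API server validation code; validate_deployment is a hypothetical helper and only matchLabels selectors are considered.

```python
def validate_deployment(dep: dict) -> list:
    """Return field errors resembling those in the operator log."""
    errors = []
    spec = dep.get("spec") or {}
    selector = (spec.get("selector") or {}).get("matchLabels") or {}
    template = spec.get("template") or {}
    labels = (template.get("metadata") or {}).get("labels") or {}
    containers = (template.get("spec") or {}).get("containers") or []

    if not selector:
        errors.append("spec.selector: Required value")
    # An empty selector matches nothing, so it also fails the label check.
    if not selector or any(labels.get(k) != v for k, v in selector.items()):
        errors.append("spec.template.metadata.labels: Invalid value: "
                      "`selector` does not match template `labels`")
    if not containers:
        errors.append("spec.template.spec.containers: Required value")
    return errors


# A Deployment that carries a name but no spec content.
empty = {"metadata": {"name": "elasticsearch-cdm-1izqe0zq-1"}, "spec": {}}
for err in validate_deployment(empty):
    print(err)
```

An object with only a name set fails all three checks at once, which matches the shape of the error repeated in the log.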

Comment 4 Anping Li 2019-08-02 09:40:58 UTC
Verified using v4.1.9-201907311355
1. Deploy Elasticsearch using the default values (16Gi). It fails because of the resource limitation.
2. Configure the resources to use a smaller memory size:
  logStore:
    elasticsearch:
      nodeCount: 1
      resources:
        limits:
          memory: 4Gi
        requests:
          memory: 4Gi
    type: elasticsearch
3. Wait for a while. ES is deployed using memory=4Gi.
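
The result of step 3 can be confirmed from the Deployment object itself, for example from the output of `oc get deployment <name> -o json`. A minimal sketch, assuming that JSON has already been loaded into a dict; container_memory is a hypothetical helper, and the sample object (including the container name) is made up for illustration rather than taken from the cluster.

```python
def container_memory(deployment: dict) -> list:
    """Return (container name, memory request, memory limit) per container."""
    containers = deployment["spec"]["template"]["spec"]["containers"]
    result = []
    for c in containers:
        res = c.get("resources", {})
        result.append((
            c["name"],
            res.get("requests", {}).get("memory"),
            res.get("limits", {}).get("memory"),
        ))
    return result


# Illustrative stand-in for the JSON of the redeployed ES Deployment.
sample = {
    "spec": {"template": {"spec": {"containers": [{
        "name": "elasticsearch",
        "resources": {
            "limits": {"memory": "4Gi"},
            "requests": {"memory": "4Gi"},
        },
    }]}}}
}

print(container_memory(sample))
```

If the fix works as described, the values reported for the ES container should match the 4Gi configured in step 2 rather than the 16Gi default.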

Comment 9 Anping Li 2019-08-16 05:52:45 UTC
Test blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1741753

Comment 10 Anping Li 2019-08-19 06:37:50 UTC
Verified in 4.1.12

Comment 14 Anping Li 2019-08-26 03:44:31 UTC
Verified in 4.1.13

Comment 16 errata-xmlrpc 2019-08-28 19:54:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2547