Bug 1892002

Summary: Elasticsearch Operator is unable to update ES pods when they are in crashloopbackoff state
Product: OpenShift Container Platform
Reporter: ewolinet
Component: Logging
Assignee: ewolinet
Status: CLOSED ERRATA
QA Contact: Qiaoling Tang <qitang>
Severity: low
Docs Contact:
Priority: unspecified
Version: 4.5
CC: aos-bugs, periklis, qitang
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: logging-exploration
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1892005
Environment:
Last Closed: 2021-02-24 11:21:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1892005    

Description ewolinet 2020-10-27 20:45:12 UTC
Description of problem:
In EO 4.5, if ES pods are in a crashloopbackoff state, the operator does not treat this as an 'unschedulable' node. It therefore tries to communicate with the ES cluster before making changes, which is not possible while the pods are crash-looping, so the cluster gets into a 'wedged' state.


Version-Release number of selected component (if applicable):
4.5

How reproducible:
Always


Steps to Reproduce:
1. Force the ES pods into a 'crashloopbackoff' state (with a 4.5+ EO, update the CSV to use an elasticsearch 5 image).
2. Make a change for the EO to perform (update the CSV to use an elasticsearch6 image).
3. Observe that the EO is unable to apply these changes to the ES deployments.

Actual results:
EO does not update the deployments.


Expected results:
EO will update the deployments so the pods can correctly start.


Additional info:
This should already be fixed in EO 4.6+; the logic to consider a crashloopbackoff an unschedulable condition was added in 4.6 as part of a feature.
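
For illustration only, the check described above could be sketched as follows. This is not the operator's actual code: the function names and the `treat_crashloop_as_unschedulable` flag are hypothetical, and the sketch assumes pods are plain dicts shaped like the JSON of a Kubernetes Pod object (`status.containerStatuses[].state.waiting.reason` and `status.conditions[]`).

```python
from typing import Any, Dict


def is_unschedulable(pod: Dict[str, Any]) -> bool:
    """True if the pod reports PodScheduled=False with reason Unschedulable."""
    for cond in pod.get("status", {}).get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return True
    return False


def is_crash_looping(pod: Dict[str, Any]) -> bool:
    """True if any container in the pod is waiting with reason CrashLoopBackOff."""
    for cs in pod.get("status", {}).get("containerStatuses", []):
        waiting = cs.get("state", {}).get("waiting")
        if waiting and waiting.get("reason") == "CrashLoopBackOff":
            return True
    return False


def should_bypass_cluster_check(
    pod: Dict[str, Any], treat_crashloop_as_unschedulable: bool = True
) -> bool:
    """Hypothetical policy: decide whether to update a deployment without
    first talking to the ES cluster.

    With the flag off (modelling the 4.5 behaviour described in this bug),
    only unschedulable pods qualify. With the flag on (modelling the 4.6+
    behaviour), crash-looping pods qualify as well.
    """
    if is_unschedulable(pod):
        return True
    return treat_crashloop_as_unschedulable and is_crash_looping(pod)


if __name__ == "__main__":
    crashing_pod = {
        "status": {
            "containerStatuses": [
                {"state": {"waiting": {"reason": "CrashLoopBackOff"}}}
            ]
        }
    }
    # 4.5-style check: the crash-looping pod is not bypassed, so the update stalls.
    print(should_bypass_cluster_check(crashing_pod, treat_crashloop_as_unschedulable=False))
    # 4.6+-style check: the pod is bypassed and the deployment can be updated.
    print(should_bypass_cluster_check(crashing_pod))
```

With the flag off, a crash-looping pod falls through to the normal path that needs a reachable cluster, which matches the 'wedged' state in the description; with it on, the update can proceed.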

Comment 1 Qiaoling Tang 2020-11-04 06:34:12 UTC
Verified with elasticsearch-operator.4.7.0-202011030448.p0

Comment 6 errata-xmlrpc 2021-02-24 11:21:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Errata Advisory for Openshift Logging 5.0.0), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0652