Bug 1836452 - Elasticsearch Operator does not continue upgrade on 'yellow', does not allow primaries to be created
Summary: Elasticsearch Operator does not continue upgrade on 'yellow', does not allow ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: ewolinet
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-05-15 23:06 UTC by ewolinet
Modified: 2020-07-13 17:39 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:39:31 UTC
Target Upstream Version:
Embargoed:



Links
System ID | Private | Priority | Status | Summary | Last Updated
Github openshift elasticsearch-operator pull 355 | 0 | None | closed | Bug 1836452: Updating upgrade policy to allow yellow and set only primaries | 2020-10-05 13:23:30 UTC
Red Hat Product Errata RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:39:51 UTC

Description ewolinet 2020-05-15 23:06:25 UTC
Description of problem:
During an upgrade, the EO (Elasticsearch Operator) waits for the cluster to go 'green' at each step; however, waiting for 'yellow' is sufficient.

Also, per the Elasticsearch documentation, we should set shard allocation to "primaries", not "none".

How reproducible:
Always
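
The policy change described above can be sketched as follows. This is a minimal illustration in Python, not the operator's actual code: the helper names `can_continue_upgrade` and `allocation_settings` are hypothetical, and placing the setting under "transient" is an assumption. `cluster.routing.allocation.enable` is the Elasticsearch cluster setting that accepts "all", "primaries", "new_primaries" and "none".

```python
import json

# Cluster health statuses ordered from worst to best.
_HEALTH_RANK = {"red": 0, "yellow": 1, "green": 2}

# Values accepted by the cluster.routing.allocation.enable setting.
_ALLOCATION_MODES = {"all", "primaries", "new_primaries", "none"}


def can_continue_upgrade(status: str, minimum: str = "yellow") -> bool:
    """Return True when the cluster health is at least `minimum`.

    Hypothetical helper: with minimum="yellow", both 'yellow' and 'green'
    let the upgrade proceed, which is the behaviour this bug asks for.
    """
    return _HEALTH_RANK[status] >= _HEALTH_RANK[minimum]


def allocation_settings(mode: str = "primaries") -> str:
    """Build a JSON body for a PUT _cluster/settings request.

    Assumption: the setting is applied as a transient setting.
    """
    if mode not in _ALLOCATION_MODES:
        raise ValueError(f"unsupported allocation mode: {mode!r}")
    return json.dumps({"transient": {"cluster.routing.allocation.enable": mode}})


if __name__ == "__main__":
    print(can_continue_upgrade("yellow"))
    print(allocation_settings())
```

With "primaries", primary shards can still be allocated while a node is restarted; with "none", no shard of any kind is allocated.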

Comment 3 Anping Li 2020-05-23 09:12:50 UTC
Moving to VERIFIED.
1) Put ES into 'yellow' status (this is easy because of https://bugzilla.redhat.com/show_bug.cgi?id=1838153).
1.1) Install 4.4 Cluster Logging with nodeCount=1 and ZeroRedundancy.
1.2) Upgrade CLO to 4.5.
1.3) Check the ES status:

# oc get csv
NAME                                        DISPLAY                  VERSION              REPLACES                            PHASE
clusterlogging.4.5.0-202005221517           Cluster Logging          4.5.0-202005221517   clusterlogging.4.4.0-202005221357   Succeeded
elasticsearch-operator.4.4.0-202005220258   Elasticsearch Operator   4.4.0-202005220258                                       Succeeded


# oc exec -c elasticsearch elasticsearch-cdm-a9h73w86-1-758c54d858-b9zb6 -- es_cluster_health
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 16,
  "active_shards" : 16,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 10,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 1,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 61.53846153846154
}

2) Upgrade EO to 4.5.
3) Check the ES pods. The ES deployment has been upgraded to 4.5 (it now uses ose-elasticsearch-proxy). (Note: the ES pod cannot run because of https://bugzilla.redhat.com/show_bug.cgi?id=1838929. I don't think that blocks this bug.)
$ oc get pods
NAME                                            READY   STATUS             RESTARTS   AGE
cluster-logging-operator-558d8f8f7-4w6nm        1/1     Running            0          11m
curator-1590223200-pplnv                        0/1     Completed          0          17m
curator-1590223800-rtd2f                        0/1     Error              0          7m18s
elasticsearch-cdm-a9h73w86-1-64fdf5c8f5-pvbxv   1/2     CrashLoopBackOff   5          5m23s


$ oc get pods elasticsearch-cdm-a9h73w86-1-64fdf5c8f5-pvbxv -o yaml | grep 'image:'
    image: quay.io/openshift-qe-optional-operators/ose-logging-elasticsearch6@sha256:1d2c67ad5a6bbebfc4d44c6e943b3c1727cb33731f67c35e69d4436ff8b46774
    image: quay.io/openshift-qe-optional-operators/ose-elasticsearch-proxy@sha256:cc93bc0d0e7a5c92f6380fde91b6bade54994b1d03949441d49a719fbfd55e23
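
The health check in step 1.3 can also be done programmatically. The sketch below parses a subset of the es_cluster_health output shown above and recomputes the active shard percentage. The function name `summarize_health` is hypothetical, and the formula active / (active + initializing + unassigned) is an assumption that happens to reproduce the reported value for this output, where nothing is initializing.

```python
import json

# Subset of the es_cluster_health output pasted in this comment.
HEALTH_JSON = """
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "number_of_nodes" : 1,
  "active_primary_shards" : 16,
  "active_shards" : 16,
  "initializing_shards" : 0,
  "unassigned_shards" : 10
}
"""


def summarize_health(raw: str) -> dict:
    """Extract the fields relevant to the upgrade check from a health document."""
    health = json.loads(raw)
    active = health["active_shards"]
    # Assumed denominator: every shard that is active, initializing or unassigned.
    total = active + health["initializing_shards"] + health["unassigned_shards"]
    return {
        "status": health["status"],
        "unassigned_shards": health["unassigned_shards"],
        "active_percent": 100.0 * active / total if total else 100.0,
    }


if __name__ == "__main__":
    print(summarize_health(HEALTH_JSON))
```

Here 16 of 26 shards are active, which corresponds to the roughly 61.5 percent reported by the cluster while it is 'yellow'.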

Comment 4 Anping Li 2020-05-23 12:19:37 UTC
As a workaround for BZ1838929, I used registry.svc.ci.openshift.org/origin/4.5:logging-elasticsearch6 instead of the downstream image. With that image, the ES pod runs.

Comment 5 errata-xmlrpc 2020-07-13 17:39:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

