Bug 1865777 - [IBM Power] Elasticsearch pods are not ready
Summary: [IBM Power] Elasticsearch pods are not ready
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Jeff Cantrill
QA Contact: Anping Li
URL:
Whiteboard: logging-exploration
Depends On: 1873034
Blocks:
 
Reported: 2020-08-04 06:59 UTC by Saurabh Sadhale
Modified: 2023-12-15 18:41 UTC
CC List: 9 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 15:09:34 UTC
Target Upstream Version:
Embargoed:


Attachments: Logging-Dump (attachment 1710265)


Links
- GitHub openshift/origin-aggregated-logging pull 1959 (closed): Bug 1865777: Update readiness probe to handle stderr and connect log (last updated 2021-01-11 01:17:34 UTC)
- Red Hat Knowledge Base (Solution) 5332581 (last updated 2020-08-20 04:00:35 UTC)
- Red Hat Product Errata RHBA-2020:4198 (last updated 2020-10-27 15:10:14 UTC)

Comment 2 Jeff Cantrill 2020-08-05 13:50:20 UTC
So what is the problem being reported? Is it that Elasticsearch isn't ready, or is there an issue with Curator?

Comment 3 Jeff Cantrill 2020-08-05 13:51:07 UTC
The latter is reported in https://bugzilla.redhat.com/show_bug.cgi?id=1860793

Comment 5 Jeff Cantrill 2020-08-05 16:04:43 UTC
(In reply to Saurabh Sadhale from comment #0)
> Created attachment 1710265 [details]
> Logging-Dump
> 
> Description of problem:
> After configuring the ES by following documentation 

Please link specifically to the documentation in question

> the pods of ES are not
> fully up and running on vSphere environment. 
> 
> NOTE: There is a BUG https://bugzilla.redhat.com/show_bug.cgi?id=1847365 but
> this is for IBM power clusters which is why I have opened this Bug. 
> 

Are you running on an IBM Power cluster? I see an error, but I am not certain this is the cause.

Comment 9 Anping Li 2020-08-07 10:12:51 UTC
With the fix, the connect log is printed as below.

[2020-08-07 09:13:42,221][ERROR][container.run            ] Timed out waiting for Elasticsearch to be ready
HTTP/1.1 503 Service Unavailable
content-type: application/json; charset=UTF-8
content-length: 471
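
As an illustration of what such a change could look like (a rough sketch only, not the actual logic merged in origin-aggregated-logging pull 1959), a start-up wait loop might append each connection attempt to the connect log and dump it on timeout; the endpoint, TLS flags, and timeout below are assumptions:

# Rough sketch only; not the actual change in PR 1959.
# Probe the local Elasticsearch endpoint (assumed endpoint/flags), append each
# attempt to the connect log, and dump that log if the wait times out.
CONNECT_LOG=/opt/app-root/src/elasticsearch_connect_log.txt   # path from comment 18
TIMEOUT=600
start=$(date +%s)
touch "$CONNECT_LOG"
until curl -skf https://localhost:9200/ >>"$CONNECT_LOG" 2>&1; do
  if [ $(( $(date +%s) - start )) -ge "$TIMEOUT" ]; then
    echo "Timed out waiting for Elasticsearch to be ready"
    cat "$CONNECT_LOG"   # surface the connection failures (e.g. the 503 above)
    exit 1
  fi
  sleep 5
done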

Comment 13 Barry Donahue 2020-08-18 12:59:36 UTC
Jeremy, could you assign this BZ?

Comment 18 Jeff Cantrill 2020-08-24 13:27:39 UTC
We may be able to work around this issue until the change lands by doing the following when the pod is booting:

"oc exec -c elasticsearch $pod -- touch /opt/app-root/src/elasticsearch_connect_log.txt"

Comment 19 Steven Walter 2020-08-24 19:53:33 UTC
Thanks Jeff, I've suggested it to the customer. The workaround seems sensible enough to me.

Comment 20 Anping Li 2020-08-27 07:26:26 UTC
Blocked because no ppc64 metadata images are available.

Comment 21 Steven Walter 2020-08-27 17:26:57 UTC
As a note, the customer tried the workaround of creating elasticsearch_connect_log.txt manually at pod startup, but the pod still does not progress past:
[2020-08-26 17:16:00,444][ERROR][container.run            ] Timed out waiting for Elasticsearch to be ready

The Elasticsearch process itself also never finishes initializing (as expected):

sh-4.2$ cat elasticsearch.log
[2020-08-26T16:29:03,355][INFO ][o.e.n.Node               ] [elasticsearch-cdm-extsb882-1] initializing ...
[2020-08-26T17:10:59,437][INFO ][o.e.n.Node               ] [elasticsearch-cdm-extsb882-1] initializing ...
sh-4.2$ cat elasticsearch-2020-08-24.log
[2020-08-24T21:47:05,973][INFO ][o.e.n.Node               ] [elasticsearch-cdm-extsb882-1] initializing ...
sh-4.2$ cat elasticsearch-2020-08-20.log
[2020-08-20T23:07:31,287][INFO ][o.e.n.Node               ] [elasticsearch-cdm-extsb882-1] initializing ...

Comment 22 Anping Li 2020-09-24 15:12:03 UTC
Verified on:
4.6.0-0.nightly-ppc64le-2020-09-22-130722
clusterlogging.4.6.0-202009230045.p0
elasticsearch-operator.4.6.0-202009230045.p0


[ppc64]$ oc get pods
NAME                                            READY   STATUS                  RESTARTS   AGE
cluster-logging-operator-57f4b7d696-wlm2c       1/1     Running                 0          14m
curator-1600958400-2tclb                        0/1     Completed               0          9m18s
elasticsearch-cdm-n7mbu2j8-1-7d464664fd-nzz5k   2/2     Running                 0          9m41s
elasticsearch-cdm-n7mbu2j8-2-8489b7748b-ft472   2/2     Running                 0          2m55s
elasticsearch-cdm-n7mbu2j8-3-89cdbbfdb-lq2hk    2/2     Running                 0          2m47s
elasticsearch-delete-app-1600958700-xzhz2       0/1     Completed               0          4m10s
elasticsearch-delete-audit-1600958700-ww5bs     0/1     Completed               0          4m4s
elasticsearch-delete-infra-1600958700-vsjnd     0/1     Completed               0          4m1s
elasticsearch-rollover-app-1600958700-bgxn2     0/1     Completed               0          3m57s
elasticsearch-rollover-audit-1600958700-7hcgs   0/1     Completed               0          3m55s
elasticsearch-rollover-infra-1600958700-7bpv8   0/1     Completed               0          4m29s
fluentd-8zfsm                                   1/1     Running                 0          9m29s
fluentd-n2lbv                                   1/1     Running                 0          8m56s
fluentd-tj7bk                                   1/1     Running                 0          9m38s
kibana-586f67c6b6-gjhkr                         2/2     Running                 0          8m54s
[ppc64]$ oc exec -c elasticsearch elasticsearch-cdm-n7mbu2j8-1-7d464664fd-nzz5k -- es_util --query=_cat/shards
audit-000001 0 p STARTED      0    230b 10.131.0.12 elasticsearch-cdm-n7mbu2j8-3
.security    0 p STARTED      5  59.2kb 10.131.0.11 elasticsearch-cdm-n7mbu2j8-2
app-000001   0 p STARTED      0    261b 10.128.2.26 elasticsearch-cdm-n7mbu2j8-1
.kibana_1    0 r STARTED      0    230b 10.131.0.11 elasticsearch-cdm-n7mbu2j8-2
.kibana_1    0 p STARTED      0    230b 10.131.0.12 elasticsearch-cdm-n7mbu2j8-3
infra-000001 0 p STARTED 458211 191.5mb 10.128.2.26 elasticsearch-cdm-n7mbu2j8-1
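
As an additional check beyond the original verification, overall cluster health can be queried from the same pod; es_util is the utility used above and _cat/health is a standard Elasticsearch API:

# Green (or yellow) health together with all pods 2/2 Running indicates the
# readiness issue is resolved.
oc exec -c elasticsearch elasticsearch-cdm-n7mbu2j8-1-7d464664fd-nzz5k -- es_util --query=_cat/health?v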

Comment 24 errata-xmlrpc 2020-10-27 15:09:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.1 extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4198

