Bug 1748478

Summary:	readiness probe could show an invalid message in some conditions.
Product:	OpenShift Container Platform	Reporter:	German Parente <gparente>
Component:	Logging	Assignee:	Jeff Cantrill <jcantril>
Status:	CLOSED ERRATA	QA Contact:	Anping Li <anli>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	3.11.0	CC:	agawand, anisal, aos-bugs, ddo, grodrigu, jcantril, mirollin, ocasalsa, rmeggins, stwalter
Target Milestone:	---
Target Release:	4.3.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2020-01-23 11:05:32 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1795393

Description German Parente 2019-09-03 16:40:57 UTC

Description of problem:

We sometimes see this error:

I0902 10:31:51.476431   22783 prober.go:111] Readiness probe for "logging-es-xxxxxxxx-NN-yyyyy_logging(......):elasticsearch" failed (failure): cat: /opt/app-root/src/init_failures: No such file or directory

The readiness probe script ends with this check:

check_for_init_complete || cat ${HOME}/init_failures

However, init.sh may not have created "${HOME}/init_complete" yet even though no errors have occurred. In that case ${HOME}/init_failures is empty or, as here, does not exist, so the cat fails and its error becomes the probe message.

We should handle this situation so that the probe does not report the misleading message:

cat: /opt/app-root/src/init_failures: No such file or directory
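
One way the check could be guarded is sketched below. This is only an illustration under stated assumptions, not necessarily the change that shipped in the errata: check_for_init_complete is replaced by a stand-in that tests for the "${HOME}/init_complete" marker, and report_init_state, readiness_check and the message text are hypothetical names invented for this sketch.

```shell
#!/bin/bash
# Hypothetical sketch of a guarded readiness check; not the shipped fix.

# Stand-in for the real helper (assumption): succeeds once init.sh has
# written the completion marker.
check_for_init_complete() {
    [ -f "${HOME}/init_complete" ]
}

# Explain why the pod is not ready without failing on a missing file.
report_init_state() {
    local failures="${HOME}/init_failures"
    if [ -s "${failures}" ]; then
        # init.sh recorded at least one failure: pass it on to the kubelet.
        cat "${failures}"
    else
        # File is missing or empty: initialization is simply still running.
        echo "Elasticsearch initialization has not completed yet"
    fi
    # The pod is not ready in either case.
    return 1
}

# Replacement for: check_for_init_complete || cat ${HOME}/init_failures
readiness_check() {
    check_for_init_complete || report_init_state
}
```

The probe script would end by calling readiness_check. The probe still fails until init.sh finishes, but the message now distinguishes "still initializing" from a recorded failure.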


Version-Release number of selected component (if applicable): atomic-openshift-3.11.98-1.git.0.0cbaff3.el7.x86_64


How reproducible: at customer site. 


Steps to Reproduce:
1. Not verified, but adding a "sleep X" to init.sh should force this message.
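
The race can also be imitated outside a cluster. The sketch below is an assumption-laden illustration rather than the real probe: simulate_early_probe is a hypothetical function that applies the same "marker file or cat" logic to a directory in which init.sh has written neither file yet.

```shell
#!/bin/bash
# Hypothetical reproduction of the probe running before init.sh has
# written either init_complete or init_failures.

simulate_early_probe() {
    local home_dir="$1"
    # Same shape as the original check: test the marker, else cat the
    # failures file. LC_ALL=C keeps the error text in English.
    [ -f "${home_dir}/init_complete" ] || LC_ALL=C cat "${home_dir}/init_failures"
}
```

Calling simulate_early_probe on an empty temporary directory (for example one created with mktemp -d) returns non-zero and writes the cat "No such file or directory" error to stderr, which is the text the kubelet logged.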

Comment 3 Greg Rodriguez II 2019-10-28 19:47:23 UTC

Added another customer experiencing this issue.  Is there a known workaround that has been developed?

Comment 5 Jeff Cantrill 2019-10-31 19:11:43 UTC

(In reply to Greg Rodriguez II from comment #4)
> Customer states this affecting production and would like any type of
> workaround or resolution

Delete the readiness probe from the Deployment.  You may have to manually seed the permissions with 'oc exec -c elasticsearch -- es_seed_acl'.

Comment 8 Greg Rodriguez II 2019-11-08 20:21:36 UTC

Customer is requesting update on this ticket.  Has there been any progress?

Comment 10 Anping Li 2019-11-14 10:47:50 UTC

Waiting for another image.

Comment 12 Anping Li 2019-11-24 10:01:27 UTC

Verified with openshift/ose-logging-elasticsearch5:v4.3.0-201911220712

Comment 13 Greg Rodriguez II 2019-11-29 13:50:56 UTC

Are there any plans to port this fix to 4.2 in the near future?

Comment 15 errata-xmlrpc 2020-01-23 11:05:32 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062