Bug 1838630 - etcd operator shows all 3 master nodes as unhealthy members
Summary: etcd operator shows all 3 master nodes as unhealthy members
Keywords:
Status: CLOSED DUPLICATE of bug 1838781
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd Operator
Version: 4.4
Hardware: x86_64
OS: Linux
high
medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Sam Batschelet
QA Contact: ge liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-05-21 13:18 UTC by Sam Yangsao
Modified: 2020-06-18 20:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-18 20:20:28 UTC
Target Upstream Version:
Embargoed:


Attachments
Screenshot of error on OCP console (125.82 KB, image/png)
2020-05-21 13:18 UTC, Sam Yangsao

Description Sam Yangsao 2020-05-21 13:18:53 UTC
Created attachment 1690643
Screenshot of error on OCP console

Description of problem:

etcd operator shows all 3 master nodes as unhealthy members on initial installation

Version-Release number of selected component (if applicable):

OCP v4.4.3 and v4.4.4
vSphere 6.7U3

How reproducible:

Unsure

Steps to Reproduce:

1.  Install OCP v4.4.3 on vSphere 6.7U3 with default sizings
2.  Error occurs in console
3.  Upgrade to OCP v4.4.4
4.  Error reappears in console

Actual results:

etcd operator shows all 3 master nodes as unhealthy members without specifying what the actual issue is. 

unhealthy members: master01.devocp4.lab.msp.redhat.com,master03.devocp4.lab.msp.redhat.com,master02.devocp4.lab.msp.redhat.com
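
For anyone scripting against this status text, here is a minimal sketch of pulling the member names out of a message of the shape shown above. The function name `parse_unhealthy_members` is hypothetical and not part of the operator, and the format is assumed only from the single message quoted in this report.

```python
from typing import List

# Assumed prefix, taken from the one message quoted in this report.
PREFIX = "unhealthy members:"


def parse_unhealthy_members(message: str) -> List[str]:
    """Return the sorted member names from an 'unhealthy members: a,b,c' message.

    Returns an empty list if the message does not start with the assumed prefix.
    """
    text = message.strip()
    if not text.startswith(PREFIX):
        return []
    names = text[len(PREFIX):].split(",")
    # Drop empty entries and surrounding whitespace, then sort for stable output.
    return sorted(name.strip() for name in names if name.strip())


if __name__ == "__main__":
    msg = (
        "unhealthy members: master01.devocp4.lab.msp.redhat.com,"
        "master03.devocp4.lab.msp.redhat.com,"
        "master02.devocp4.lab.msp.redhat.com"
    )
    for member in parse_unhealthy_members(msg):
        print(member)
```

Sorting makes it easier to compare the list across repeated reports, since the operator does not appear to emit the names in order.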

In the past, we used to see errors if the disk sync was suspect: the console would report that the disk did not meet a certain threshold for synchronization.  This error doesn't give us any details on what may be occurring.

Expected results:

etcd operator should give a better description of why the members are unhealthy.

Provide additional troubleshooting suggestions on where to look if this error occurs.

Additional info:

Screenshot of error attached, must-gather output will also be provided.

Comment 2 Sam Yangsao 2020-06-16 17:04:17 UTC
OK, after upgrading to OCP 4.4.8, I don't see these messages anymore.  

Not sure what changed after OCP 4.4.3 and 4.4.4 to remove these messages...

Comment 3 Dan Mace 2020-06-18 20:20:28 UTC
Sam,

I suspect this is a duplicate of the health check bug in https://bugzilla.redhat.com/show_bug.cgi?id=1838781, the fix for which was released in 4.4.7. If you observe the problem again in 4.4.7+ or have some other reason to believe there's a different issue, please re-open this bug and we can investigate.
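
As a rough triage aid, here is a minimal sketch of the "4.4.7+" check described above. The helper names are hypothetical. It only reasons about the 4.4.z stream, because this bug does not say which other releases carry the fix, and it assumes plain `X.Y.Z` version strings.

```python
from typing import Optional, Tuple

# Per this bug's comments, the health check fix shipped in 4.4.7.
FIXED_IN = (4, 4, 7)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.strip().split("."))


def includes_health_check_fix(version: str) -> Optional[bool]:
    """Return True/False for 4.4.z versions, or None for any other stream.

    None means "not covered by this bug report", not "unfixed".
    """
    parts = parse_version(version)
    if parts[:2] != FIXED_IN[:2]:
        return None
    return parts >= FIXED_IN


if __name__ == "__main__":
    for v in ("4.4.3", "4.4.4", "4.4.7", "4.4.8"):
        print(v, includes_health_check_fix(v))
```

Comparing integer tuples rather than strings keeps a version such as 4.4.10 ordered after 4.4.7.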

Thanks!

*** This bug has been marked as a duplicate of bug 183878 ***

Comment 4 Chris Lumens 2020-06-18 20:25:23 UTC
Heh, I think you typo'd the duplicate - unless you actually meant to reference an FC5 anaconda bug.  Changing.

*** This bug has been marked as a duplicate of bug 1838781 ***

