Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1614479

Summary:

Stopping master host causes another master controller to restart

Product:

OpenShift Container Platform

Reporter:

Vikas Laad <vlaad>

Component:

kube-apiserver

Assignee:

Stefan Schimanski <sttts>

Status:

CLOSED WONTFIX

QA Contact:

Xingxing Xia <xxia>

Severity:

low

Docs Contact:

Priority:

low

Version:

3.11.0

CC:

aos-bugs, jokerman, mfojtik, mifiedle, mmccomas, yinzhou

Target Milestone:

---

Keywords:

NeedsTestCase

Target Release:

3.11.z

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2020-05-05 15:29:13 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
master controller logs	none

Description Vikas Laad 2018-08-09 17:20:52 UTC

Description of problem:
On an HA cluster, stopping one of the master instances sometimes causes the controller pod on another master to restart. I think this happens because the master API on the stopped node also goes down, and the other controller restarts as a result. I think the controller should retry so that it connects to another API server instead of restarting. Here is the error in the controller log:

E0809 15:38:23.099658       1 controllermanager.go:480] Error starting "resourcequota"
F0809 15:38:23.099677       1 controllermanager.go:185] error starting controllers: failed to discover resources: unable to retrieve the complete list of server APIs: servicecatalog.k8s.io/v1beta1: the server is currently unable to handle the request
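
For illustration only, the retry behaviour suggested above could look roughly like the sketch below. It is not the controller manager's actual code: `discover_with_retry`, `ServiceUnavailable`, and the caller-supplied `discover` function are hypothetical names, and the endpoint list is assumed to hold the API servers of the remaining masters.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for a 503 response from an API server."""


def discover_with_retry(endpoints, discover, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Try each endpoint in turn, backing off between rounds.

    `discover` is a caller-supplied function (an assumption of this sketch)
    that takes an endpoint and returns a list of resources, or raises
    ServiceUnavailable / ConnectionError when that endpoint cannot answer.
    """
    last_error = None
    for attempt in range(attempts):
        for endpoint in endpoints:
            try:
                return discover(endpoint)
            except (ServiceUnavailable, ConnectionError) as err:
                last_error = err  # remember the failure and try the next endpoint
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff between rounds
    # Only give up after every endpoint failed in every round.
    raise RuntimeError("failed to discover resources on any endpoint") from last_error
```

With this shape, a single unavailable API server only delays discovery; the process exits only after every endpoint has failed in every round.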


Version-Release number of selected component (if applicable):
openshift v3.11.0-0.11.0

Steps to Reproduce:
1. Create an HA cluster with 3 masters
2. Shut down or stop one of the master hosts
3. Watch the other pods in the kube-system project
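
One way to make the last step less manual is to compare restart counts taken before and after stopping the host. The sketch below assumes the counts have already been collected into plain dictionaries (for example from the RESTARTS column of `oc get pods -n kube-system`); the function name and the pod names are hypothetical.

```python
def restarted_pods(before, after):
    """Return the names of pods whose restart count rose between two snapshots.

    Each snapshot maps a pod name to its restart count. Pods that appear in
    only one snapshot are ignored, since no comparison is possible for them.
    """
    return sorted(
        name
        for name, count in after.items()
        if name in before and count > before[name]
    )


# Hypothetical snapshots taken before and after stopping one master host.
before = {"master-controllers-a": 0, "master-controllers-b": 0, "master-api-a": 1}
after = {"master-controllers-a": 1, "master-controllers-b": 0, "master-api-a": 1}
print(restarted_pods(before, after))
```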

Actual results:
One of the other 2 masters also gets restarted.

Expected results:
No other master should restart.

Additional info:
Attaching full logs from the master controller container which restarted.

Comment 1 Vikas Laad 2018-08-09 17:23:01 UTC

Created attachment 1474781 [details]
master controller logs

Comment 2 zhou ying 2018-08-10 09:23:55 UTC

I could not reproduce the issue with my HA environment on AWS.

However, we hit the same error when testing the destructive case on a low-memory instance. After changing to a large instance, restarting the master works.