Description of problem:
Restarting or stopping etcd causes the master API pod to restart multiple times.
NAME                                                            READY   STATUS    RESTARTS   AGE
master-api-ip-172-31-49-98.us-west-2.compute.internal           1/1     Running   13         1h
master-controllers-ip-172-31-49-98.us-west-2.compute.internal   1/1     Running   4          1h
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create an OCP cluster with 3 etcd nodes (not co-located), 1 master, 1 infra and 2 compute nodes
2. Create a few pods, imagestreams, builds, etc.
3. Stop the etcd leader node from the AWS console
4. Watch the master API and controllers pods
5. oc commands sometimes fail while the API pod is restarting
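
A minimal sketch of how step 4 could be scripted: take a snapshot of the pod restart counts before stopping the etcd leader, take another afterwards, and compare. The helper names (parse_restarts, restart_deltas, snapshot) are hypothetical, and the kube-system namespace is an assumption about where the master pods run; adjust it for the cluster under test.

```python
import subprocess


def parse_restarts(listing):
    """Map pod name -> restart count from `oc get pods` style text."""
    restarts = {}
    for line in listing.splitlines():
        fields = line.split()
        # Expected columns: NAME READY STATUS RESTARTS AGE.
        # Skip the header row and anything too short to be a pod row.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        if not fields[3].isdigit():
            continue
        restarts[fields[0]] = int(fields[3])
    return restarts


def restart_deltas(before, after):
    """Return the pods whose restart count grew between two snapshots."""
    return {
        name: count - before.get(name, 0)
        for name, count in after.items()
        if count > before.get(name, 0)
    }


def snapshot(namespace="kube-system"):
    # Assumption: the master pods live in this namespace and `oc` is
    # already logged in to the cluster.
    out = subprocess.run(
        ["oc", "get", "pods", "-n", namespace, "--no-headers"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_restarts(out)
```

Calling snapshot() before step 3 and again a few minutes later, then passing both results to restart_deltas, lists the pods that restarted in between. Note that snapshot() can itself fail while the API pod is restarting (step 5), so a caller may need to retry it.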
Actual results:
Many restarts of master api and controllers pods
Expected results:
No restarts of master api and controllers pods
Attaching journal logs from the master node, master API and controller pod logs, and exited container logs.
Created attachment 1436508 [details]
master journal log
Created attachment 1436509 [details]
exited api container log
Created attachment 1436510 [details]
master api pod log
I will attach controller manager logs today.
Created attachment 1436801 [details]
controllers exited container log
Created attachment 1436802 [details]
controller manager exited container log
Created attachment 1436803 [details]
api exited container log
This seems like it might be related to the issue fixed by https://github.com/openshift/origin/pull/19638
Definitely, moving to ON_QA to test that fix.
Vikas, can you try with the latest build?
I did not see this problem in the following version; I tried restarting the etcd leader multiple times.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.