Description of problem:
Restarting or stopping etcd causes the master API and controllers pods to restart multiple times.

NAME                                                            READY   STATUS    RESTARTS   AGE
master-api-ip-172-31-49-98.us-west-2.compute.internal           1/1     Running   13         1h
master-controllers-ip-172-31-49-98.us-west-2.compute.internal   1/1     Running   4          1h

Version-Release number of selected component (if applicable):
openshift v3.10.0-0.41.0
kubernetes v1.10.0+b81c8f8
etcd 3.2.16

How reproducible:
Always

Steps to Reproduce:
1. Create an OCP cluster with 3 etcd nodes (not co-located), 1 master, 1 infra node, and 2 compute nodes.
2. Create a few pods, imagestreams, builds, etc.
3. Stop the etcd leader node from the AWS console.
4. Watch the master API and controllers pods.
5. oc commands sometimes fail while the API pod is restarting.

Actual results:
Many restarts of the master API and controllers pods.

Expected results:
No restarts of the master API and controllers pods.

Additional info:
Attaching journal logs from the master node, master API and controllers pod logs, and exited container logs.
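For step 4, the restart counts can be compared before and after stopping the etcd leader with a small helper. This is only a sketch: `parse_restarts` is a hypothetical name, and it assumes the five-column `NAME READY STATUS RESTARTS AGE` layout shown in the description above, fed with saved `oc get pods` output.

```python
def parse_restarts(table: str) -> dict[str, int]:
    """Map pod name -> restart count from `oc get pods`-style text.

    Assumes the five-column layout shown in this report
    (NAME READY STATUS RESTARTS AGE); other layouts are skipped.
    """
    restarts = {}
    for line in table.strip().splitlines():
        fields = line.split()
        # Skip the header row and any row that is not a plain 5-column pod row.
        if len(fields) != 5 or fields[0] == "NAME" or not fields[3].isdigit():
            continue
        name, _ready, _status, count, _age = fields
        restarts[name] = int(count)
    return restarts


# Sample taken from the pod listing in the description.
sample = """
NAME READY STATUS RESTARTS AGE
master-api-ip-172-31-49-98.us-west-2.compute.internal 1/1 Running 13 1h
master-controllers-ip-172-31-49-98.us-west-2.compute.internal 1/1 Running 4 1h
"""

for pod, count in parse_restarts(sample).items():
    print(f"{pod}: {count} restarts")
```

Running it once before step 3 and again afterwards, then comparing the two dictionaries, shows which pods restarted during the etcd outage.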
Created attachment 1436508 [details] master journal log
Created attachment 1436509 [details] exited api container log
Created attachment 1436510 [details] master api pod log
I will attach controller manager logs today.
Created attachment 1436801 [details] controllers exited container log
Created attachment 1436802 [details] controller manager exited container log
Created attachment 1436803 [details] api exited container log
This seems like it might be related to the issue fixed by https://github.com/openshift/origin/pull/19638
Definitely. Moving this to ON_QA to test that fix. Vikas, can you try with the latest build?
I did not see this problem in the following version; I tried restarting the etcd leader multiple times.

openshift v3.10.0-0.50.0
kubernetes v1.10.0+b81c8f8
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1816