Description of problem:
kube-apiserver pods got restarted during the statefulset test.

Version-Release number of selected component (if applicable):
# oc get clusterversion version -o json | jq -r .status.desired
{
  "image": "registry.svc.ci.openshift.org/ocp/release@sha256:be7574f653f533f17cdae078c514432756b3bcb0af7a592e8348c9833770db2a",
  "version": "4.0.0-0.ci-2019-02-19-055341"
}
3 masters (m5.xlarge) and 6 computes (m5.4xlarge)

How reproducible:
2/2

Steps to Reproduce:
See https://bugzilla.redhat.com/show_bug.cgi?id=1670241
# in a terminal: oc get pod -n openshift-kube-apiserver -w
1. Create 75 STS, each with 2 replicas
2. STS pods use gp2 PVCs
3. Delete the test projects
Repeat 1 - 3 two or three times (a rough sketch of this loop is included at the end of this comment).

Actual results:
kube-apiserver pods got restarted, and some of the project deletions failed (see Additional info).

Expected results:
kube-apiserver pods keep running without restarts, and the test projects are cleaned up completely.

Additional info:
# oc get pod -n openshift-kube-apiserver -w
NAME   READY   STATUS   RESTARTS   AGE
installer-10-ip-10-0-133-100.us-east-2.compute.internal   0/1   Completed   0   126m
installer-10-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   127m
installer-10-ip-10-0-175-205.us-east-2.compute.internal   0/1   Completed   0   125m
installer-11-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   19m
installer-12-ip-10-0-133-100.us-east-2.compute.internal   0/1   Completed   0   18m
installer-12-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   18m
installer-12-ip-10-0-175-205.us-east-2.compute.internal   0/1   Completed   0   17m
installer-13-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   8m38s
installer-14-ip-10-0-133-100.us-east-2.compute.internal   0/1   Completed   0   7m35s
installer-14-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   8m12s
installer-15-ip-10-0-133-100.us-east-2.compute.internal   0/1   Completed   0   7m4s
installer-15-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   5m32s
installer-15-ip-10-0-175-205.us-east-2.compute.internal   0/1   Completed   0   6m21s
installer-6-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   4h
installer-7-ip-10-0-133-100.us-east-2.compute.internal   0/1   Completed   0   3h59m
installer-7-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   4h
installer-7-ip-10-0-175-205.us-east-2.compute.internal   0/1   Completed   0   3h59m
installer-8-ip-10-0-133-100.us-east-2.compute.internal   0/1   Completed   0   3h57m
installer-8-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   3h57m
installer-8-ip-10-0-175-205.us-east-2.compute.internal   0/1   Completed   0   3h58m
installer-9-ip-10-0-133-100.us-east-2.compute.internal   0/1   Completed   0   138m
installer-9-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   139m
installer-9-ip-10-0-175-205.us-east-2.compute.internal   0/1   Completed   0   137m
kube-apiserver-ip-10-0-133-100.us-east-2.compute.internal   1/1   Running   0   6m51s
kube-apiserver-ip-10-0-148-181.us-east-2.compute.internal   1/1   Running   0   5m20s
kube-apiserver-ip-10-0-175-205.us-east-2.compute.internal   1/1   Running   0   6m9s
revision-pruner-10-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   127m
revision-pruner-11-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   19m
revision-pruner-12-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   18m
revision-pruner-13-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   8m38s
revision-pruner-14-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   8m13s
revision-pruner-15-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   5m32s
revision-pruner-6-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   4h
revision-pruner-7-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   4h
revision-pruner-8-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   3h57m
revision-pruner-9-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   139m
revision-pruner-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   Pending   0   0s
revision-pruner-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   ContainerCreating   0   0s
installer-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   Pending   0   0s
installer-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   ContainerCreating   0   0s
revision-pruner-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   ContainerCreating   0   8s
installer-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   ContainerCreating   0   8s
revision-pruner-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   9s
installer-16-ip-10-0-148-181.us-east-2.compute.internal   1/1   Running   0   9s
revision-pruner-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   9s
kube-apiserver-ip-10-0-148-181.us-east-2.compute.internal   0/1   ContainerCreating   0   10m
kube-apiserver-ip-10-0-148-181.us-east-2.compute.internal   0/1   Terminating   0   10m
kube-apiserver-ip-10-0-148-181.us-east-2.compute.internal   0/1   Terminating   0   10m
kube-apiserver-ip-10-0-148-181.us-east-2.compute.internal   0/1   Pending   0   1s
installer-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   13s
installer-16-ip-10-0-148-181.us-east-2.compute.internal   0/1   Completed   0   13s
kube-apiserver-ip-10-0-148-181.us-east-2.compute.internal   0/1   ContainerCreating   0   2s
kube-apiserver-ip-10-0-148-181.us-east-2.compute.internal   0/1   Running   0   2s
kube-apiserver-ip-10-0-148-181.us-east-2.compute.internal   1/1   Running   0   19s
installer-16-ip-10-0-133-100.us-east-2.compute.internal   0/1   Pending   0   0s
installer-16-ip-10-0-133-100.us-east-2.compute.internal   0/1   ContainerCreating   0   0s
installer-16-ip-10-0-133-100.us-east-2.compute.internal   0/1   ContainerCreating   0   8s
installer-16-ip-10-0-133-100.us-east-2.compute.internal   1/1   Running   0   13s

Saw this yesterday when cleaning up the test projects:

for i in {0..74}; do oc delete project "clusterproject$i" --wait=false; done
project.project.openshift.io "clusterproject23" deleted
Error from server (ServiceUnavailable): the server is currently unable to handle the request (delete projects.project.openshift.io clusterproject24)
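To make the reproduction steps above concrete, here is a minimal sketch of the create/delete loop they describe. It assumes a StatefulSet manifest named sts.yaml with 2 replicas and a gp2-backed volumeClaimTemplate; the manifest name and its contents are assumptions, while the project names match the delete loop shown above.

# Create 75 test projects, each with one 2-replica StatefulSet using gp2 PVCs
# (sts.yaml is an assumed manifest, not attached to this bug).
for i in {0..74}; do
  oc new-project "clusterproject$i"
  oc create -n "clusterproject$i" -f sts.yaml
done

# ...wait until the StatefulSet pods are Running and the PVCs are Bound, then clean up:
for i in {0..74}; do
  oc delete project "clusterproject$i" --wait=false
done

Repeating this create/delete cycle two or three times is what triggered the restarts described above.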
The ""apiserver is shutting down" was fixed recently by improving the retry logic. I would try to verify this is no longer a problem using latest cluster-kube-apiserver-operator image.
Did the test again with:

# oc get clusterversion version -o json | jq -r .status.desired
{
  "image": "registry.svc.ci.openshift.org/ocp/release@sha256:336f5a15441352a5b8c53bd039f22b96048ae40781d13bb4e4e08b3d60073af1",
  "version": "4.0.0-0.nightly-2019-03-05-065158"
}

It happened again while cleaning up during the 2nd round of the statefulset test (the first round was fine).

This is from `oc get pod -n openshift-kube-apiserver -w`:

installer-7-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   34m
installer-7-ip-10-0-152-32.us-east-2.compute.internal   0/1   Completed   0   33m
installer-7-ip-10-0-172-67.us-east-2.compute.internal   0/1   Completed   0   32m
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   1/1   Running   2   34m
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   1/1   Running   2   33m
kube-apiserver-ip-10-0-172-67.us-east-2.compute.internal   1/1   Running   2   32m
revision-pruner-2-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   45m
revision-pruner-3-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   44m
revision-pruner-4-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   41m
revision-pruner-5-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   39m
revision-pruner-6-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   35m
revision-pruner-7-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   34m
installer-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   Pending   0   0s
installer-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   ContainerCreating   0   0s
revision-pruner-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   Pending   0   1s
revision-pruner-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   ContainerCreating   0   1s
installer-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   ContainerCreating   0   8s
revision-pruner-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   ContainerCreating   0   9s
revision-pruner-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   9s
installer-8-ip-10-0-130-108.us-east-2.compute.internal   1/1   Running   0   9s
revision-pruner-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   9s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   ContainerCreating   0   103m
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Terminating   0   103m
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Terminating   0   103m
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Pending   0   0s
installer-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   12s
installer-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   12s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Error   0   1s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Error   1   2s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   CrashLoopBackOff   1   3s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Running   2   20s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   1/1   Running   2   30s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   0/1   Pending   0   0s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   0/1   ContainerCreating   0   0s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   0/1   ContainerCreating   0   8s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   1/1   Running   0   8s
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   ContainerCreating   0   103m
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Terminating   0   103m
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Terminating   0   103m
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Pending   0   0s
installer-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   12s
installer-8-ip-10-0-130-108.us-east-2.compute.internal   0/1   Completed   0   12s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Error   0   1s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Error   1   2s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   CrashLoopBackOff   1   3s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   0/1   Running   2   20s
kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   1/1   Running   2   30s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   0/1   Pending   0   0s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   0/1   ContainerCreating   0   0s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   0/1   ContainerCreating   0   8s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   1/1   Running   0   8s
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   ContainerCreating   0   103m
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   Terminating   0   103m
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   Terminating   0   103m
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   Pending   0   0s
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   ContainerCreating   0   1s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   0/1   Completed   0   12s
installer-8-ip-10-0-152-32.us-east-2.compute.internal   0/1   Completed   0   12s
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   Error   0   2s
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   Error   1   3s
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   CrashLoopBackOff   1   4s
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   0/1   Running   2   24s
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   1/1   Running   2   40s
installer-8-ip-10-0-172-67.us-east-2.compute.internal   0/1   Pending   0   0s
installer-8-ip-10-0-172-67.us-east-2.compute.internal   0/1   ContainerCreating   0   0s
installer-8-ip-10-0-172-67.us-east-2.compute.internal   0/1   ContainerCreating   0   7s
installer-8-ip-10-0-172-67.us-east-2.compute.internal   1/1   Running   0   8s
kube-apiserver-ip-10-0-172-67.us-east-2.compute.internal   0/1   ContainerCreating   0   103m
kube-apiserver-ip-10-0-172-67.us-east-2.compute.internal   0/1   Terminating   0   103m
kube-apiserver-ip-10-0-172-67.us-east-2.compute.internal   0/1   Terminating   0   103m
kube-apiserver-ip-10-0-172-67.us-east-2.compute.internal   0/1   Pending   0   0s
installer-8-ip-10-0-172-67.us-east-2.compute.internal   0/1   Completed   0   11s
installer-8-ip-10-0-172-67.us-east-2.compute.internal   0/1   Completed   0   11s
kube-apiserver-ip-10-0-172-67.us-east-2.compute.internal   0/1   Error   0   1s
kube-apiserver-ip-10-0-172-67.us-east-2.compute.internal   0/1   Error   1   2s

This is from `for i in {0..149}; do oc delete project "clusterproject$i" --wait=false; done`:

# for i in {0..149}; do oc delete project "clusterproject$i" --wait=false; done
project.project.openshift.io "clusterproject0" deleted
project.project.openshift.io "clusterproject1" deleted
project.project.openshift.io "clusterproject2" deleted
project.project.openshift.io "clusterproject3" deleted
project.project.openshift.io "clusterproject4" deleted
project.project.openshift.io "clusterproject5" deleted
project.project.openshift.io "clusterproject6" deleted
The connection to the server api.hongkliu1.qe.devcluster.openshift.com:6443 was refused - did you specify the right host or port?
project.project.openshift.io "clusterproject8" deleted
project.project.openshift.io "clusterproject9" deleted

And then some of the projects and PVs are still there.
# oc get project
NAME               DISPLAY NAME   STATUS
clusterproject26                  Active
clusterproject7                   Active

# oc get pv --no-headers | wc -l
96

Normally they should all be gone by now.

Hi Michal, is this expected?
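If it helps, here is a sketch of how the leftovers could be listed and the deletion re-driven; the clusterproject grep pattern is only an assumption based on the naming above:

# List what is left from the test and re-issue the deletes.
oc get projects -o name | grep clusterproject
for p in $(oc get projects -o name | grep clusterproject); do
  oc delete "$p" --wait=false
done

# The PV count should eventually drop back toward zero once the claims are gone.
oc get pv --no-headers | wc -l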
When this happens, can we get logs from the failed containers in:

kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal   1/1   Running   2   34m
kube-apiserver-ip-10-0-152-32.us-east-2.compute.internal   1/1   Running   2   33m
kube-apiserver-ip-10-0-172-67.us-east-2.compute.internal   1/1   Running   2   32m

(each seems to have restarted twice; we need to know why). Also, just dumping the YAML of those pods should reveal their last words.
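For reference, a minimal sketch of how both could be collected for one of the masters, assuming the main container is named kube-apiserver (repeat for the other two pod names above; the output file names are only illustrative):

POD=kube-apiserver-ip-10-0-130-108.us-east-2.compute.internal
# Log of the previously crashed kube-apiserver container:
oc logs --previous -n openshift-kube-apiserver "$POD" -c kube-apiserver > "${POD}.previous.log"
# Full pod YAML, including lastState.terminated (the "last words"):
oc get pod -n openshift-kube-apiserver "$POD" -o yaml > "${POD}.yaml"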
I will give it a shot today.
Tested again.

1. The first time the pods got restarted, it was not detected right away by the client side this time. The objects got deleted successfully as if nothing had happened. This is better than the result in Comment 13. However, when I started to collect logs for the pods, the oc CLI did not work for several minutes.

2. Pods in openshift-kube-controller-manager and openshift-kube-scheduler got restarted too. Actually, they restarted before the pods in openshift-kube-apiserver. I did not watch pods in other namespaces in Comment 13 two days ago.
http://file.rdu.redhat.com/~hongkliu/test_result/bz1678847/bz1678847.20190307/output.txt
Other files are in the same folder.

3. I kept creating/deleting ... PVs eventually started to be stuck when the pods got restarted the 2nd time. Then I collected logs again (files with *r2* in the name). If `previous` is part of the file name, it came from `oc logs --previous ...` (a rough sketch of the collection loop is shown below).
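For completeness, a rough sketch of the kind of collection loop used here, assuming the three control-plane namespaces mentioned above; the r2 suffix and the file-name pattern are only illustrative:

# Save current and previous logs for every container of every pod in the
# three control-plane namespaces that were seen restarting.
for ns in openshift-kube-apiserver openshift-kube-controller-manager openshift-kube-scheduler; do
  for pod in $(oc get pods -n "$ns" -o jsonpath='{.items[*].metadata.name}'); do
    for c in $(oc get pod -n "$ns" "$pod" -o jsonpath='{.spec.containers[*].name}'); do
      oc logs -n "$ns" "$pod" -c "$c" > "${pod}_${c}.r2.log" 2>&1 || true
      oc logs -n "$ns" "$pod" -c "$c" --previous > "${pod}_${c}.r2.previous.log" 2>&1 || true
    done
  done
done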
(In reply to Hongkai Liu from comment #16)
> Tested again.
> 1. The first time the pods got restarted, it was not detected right away by
> the client side this time. The objects got deleted successfully as if
> nothing had happened. This is better than the result in Comment 13. However,
> when I started to collect logs for the pods, the oc CLI did not work for
> several minutes.

Did not work as in getting an error, or just super slow? Super slow means it can't get the discovery from the server that is down, and it will eventually recover.

> 2. Pods in openshift-kube-controller-manager and openshift-kube-scheduler
> got restarted too. Actually, they restarted before the pods in
> openshift-kube-apiserver. I did not watch pods in other namespaces in
> Comment 13 two days ago.

The containers in the pods were restarted? Can we see logs from those containers?

> http://file.rdu.redhat.com/~hongkliu/test_result/bz1678847/bz1678847.20190307/output.txt
> Other files are in the same folder.
>
> 3. I kept creating/deleting ... PVs eventually started to be stuck when the
> pods got restarted the 2nd time.
> Then I collected logs again (files with *r2* in the name).

Sounds like a new bug for the storage team.

> If `previous` is part of the file name, it came from `oc logs --previous ...`
> Did not work as in getting an error, or just super slow? Super slow means it
> can't get the discovery from the server that is down, and it will eventually
> recover.

If I recall correctly, it got an error. It recovered not long afterwards (maybe a minute).

> The containers in the pods were restarted? Can we see logs from those containers?

Yes. I had some of them saved:
http://file.rdu.redhat.com/~hongkliu/test_result/bz1678847/bz1678847.20190307/

> Sounds like a new bug for the storage team.

From what I saw, it is the same as Comment 13. I do not know the cause though.

@Michal, are we still expecting the restarts of the api-server pods? From the api-server pod logs, can we tell what the cause is? Thanks.
F0307 16:43:41.377576 1 cmd.go:71] failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use

This seems to be the culprit. I think that when the API server is started, it crashes because it can't bind to that port (probably the old host port was not cleaned up?). It should not be disruptive during the rollout of a new API server: the other 2 pods should hold the API up, so you might experience some discovery lag, but the API should still be operational.
Yup, it looks like the old apiserver process hasn't given up the socket yet. There are a couple of state machines here (kubelet, sockets, etc.) that could be interacting in confusing ways. This may be related to Stefan's various tweaks around how apiservers are cleanly shut down. If the apiserver is unhealthy but not yet terminated, the kubelet could be starting a new one while the old one still holds the socket. Just conjecture though...
The next time this happens, issue the command "ss -ltnp" on the host. That should tell you which pid is holding the socket.
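For example (narrowing to the apiserver port and mapping the PID back to a container are assumptions about how one might follow up, not part of the original request):

# On the affected master node: show which PID/program holds the listener on 6443.
ss -ltnp | grep ':6443'
# Optionally map that PID back to a container via the CRI-O CLI:
crictl ps -a --name kube-apiserver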
@Michal,

F0307 16:43:41.377576 1 cmd.go:71] failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use

The above line is already from a restart (NOT from the first time the API-server pod was started). It took some time for me to notice the restart from `oc get pod --all-namespaces -w`. The api-server log would have been rather long by that point, instead of a short one like
http://file.rdu.redhat.com/~hongkliu/test_result/bz1678847/bz1678847.20190307/kube-apiserver-ip-10-0-142-215.us-east-2.compute.internal.previous.log

The port was in use for some reason, which caused another restart, and that one became `previous`.

If the above sounds correct, we still do not know what caused the restart in the first place. So in this case, even though we know which process is using that port, it would not help much in figuring out why the pod got restarted to begin with. Let me know if my understanding is wrong. Thanks.
Those "bind: address already in use" are expected. We have increases the termination duration to be at least 30 seconds to cope with slowly converging SDN. When the new static pod for kube-apiserver is copied over to the kubelet manifest directory, the kubelet will gracefully shutdown the old pod. This happens through a method call to the Container Runtime Interface (CRI) implemented by CRI-O. Kubelet assumes that the pod is gone right away and starts the new one. CRI-O though gracefully terminates the old pod and that takes 30 seconds+, leading to the bind error message.
It still does not explain why the api pods got restarted in the first place (even though the 2nd restart is expected). More importantly, the cluster eventually ended up with objects that oc was asked to delete being left there forever. Let me know if I missed some points here.
Based on comment 25, there seems to be a timing issue in the scenario described in the description of this bz that can leave resources that oc delete was issued against stuck Terminating forever.

@mfojtik @sttts Would it be better to open a new bz for this issue? If so, go ahead and close this one again as NOTABUG based on the explanation in comment 24.
Closing based on comment 24.
About the other part of the issue: there are installer pods running seconds before the issue occurs. That suggests that the operator intentionally did a rollout. Please attach the kube-apiserver operator logs and events to find out the reason.
Here is a PR which avoids connection errors for external API consumers during a rolling upgrade like the one seen in this issue: https://github.com/openshift/installer/pull/1421

@Hongkai Liu: are you able to test against the above PR (once the prerequisite merges)?
I am trying to reproduce it with the following build.

# oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-03-15-063749   True        False         103m    Cluster version is 4.0.0-0.nightly-2019-03-15-063749

I will update the result here by the end of the day.
@Stefan,

I used the version in Comment 30 to replicate the problem.

* I still saw the restarts of the api-server pods. The operator log:
http://file.rdu.redhat.com/~hongkliu/test_result/bz1678847/bz1678847.20190315/

* The difference this time is that it was not detected by the client side. All the pods and PVs got deleted. I tried over 10 times; with previous builds, I could see leftover PVs 2-3 times out of that many runs.

As long as the problem is not detectable from the client side, I am fine with the api-server restarts.

This STS case is included in most of our test runs, so it is very likely we will test the PR mentioned in Comment 29 in the next run, once it merges. Thanks.
It sounds like we may be interested in fixing these to reduce error noise in the cluster? If so, we hit this again in CI today [1]:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/604/pull-ci-openshift-machine-config-operator-master-e2e-aws/2978/artifacts/e2e-aws/pods.json | jq -r '[.items[] | .status.containerStatuses[] | .restartCount = (.restartCount | tonumber)] | sort_by(-.restartCount)[] | select(.restartCount > 0) | .previousMessage = (.lastState.terminated.message // (.lastState.terminated.exitCode | tostring) // "?") | (.restartCount | tostring) + " " + .name + " " + .previousMessage' | head -n14
3 kube-apiserver-5 1 flags.go:33] FLAG: --tls-cipher-suites="[TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA]"
I0409 21:50:50.143920 1 flags.go:33] FLAG: --tls-min-version="VersionTLS12"
I0409 21:50:50.143929 1 flags.go:33] FLAG: --tls-private-key-file="/etc/kubernetes/static-pod-certs/secrets/serving-cert/tls.key"
I0409 21:50:50.143937 1 flags.go:33] FLAG: --tls-sni-cert-key="[/etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.crt,/etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.key;/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.crt,/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.key]"
I0409 21:50:50.143960 1 flags.go:33] FLAG: --token-auth-file=""
I0409 21:50:50.143974 1 flags.go:33] FLAG: --v="2"
I0409 21:50:50.143981 1 flags.go:33] FLAG: --version="false"
I0409 21:50:50.143991 1 flags.go:33] FLAG: --vmodule=""
I0409 21:50:50.143998 1 flags.go:33] FLAG: --watch-cache="true"
I0409 21:50:50.144005 1 flags.go:33] FLAG: --watch-cache-sizes="[]"
I0409 21:50:50.144806 1 server.go:568] external host was not specified, using 10.0.155.178
I0409 21:50:50.145125 1 server.go:611] Initializing cache sizes based on 0MB limit
I0409 21:50:50.145253 1 server.go:146] Version: v1.13.4+c68d654
F0409 21:50:50.145440 1 cmd.go:71] failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use

It's not clear why that pod was starting up and tripping over its predecessor.

[1]: https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/604/pull-ci-openshift-machine-config-operator-master-e2e-aws/2978
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days