Description of problem:

Hit a failed installation of an OCP cluster, with kube-apiserver in a crash loop on "timed out waiting for port :6443 and :6080 to be released". After manually removing the check on port 6080 for the kube-apiserver container in the static pod yaml files on the master, the crash is gone. There seems to be a race between the start order of the kube-apiserver and kube-apiserver-insecure-readyz containers.

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-06-03-045340

Here is the actual env in action:

[root@wj45ios603e-xnj76-master-0 core]# crictl logs 7b4aa4234361c
Copying system trust bundle
Waiting for port :6443 and :6080 to be released.............................................timed out waiting for port :6443 and :6080 to be released

[root@wj45ios603e-xnj76-master-0 core]# fuser -v 6080/tcp
                     USER        PID ACCESS COMMAND
6080/tcp:            root      39516 F.... cluster-kube-ap

[root@wj45ios603e-xnj76-master-0 core]# ps aux|grep -i cluster-kube-ap
root       39401  0.2  0.3 927832 60740 ?        Ssl  08:44   0:06 cluster-kube-apiserver-operator cert-syncer --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/kube-apiserver-cert-syncer-kubeconfig/kubeconfig --namespace=openshift-kube-apiserver --destination-dir=/etc/kubernetes/static-pod-certs
root       39516  0.0  0.2 452688 48720 ?        Ssl  08:44   0:00 cluster-kube-apiserver-operator insecure-readyz --insecure-port=6080 --delegate-url=https://localhost:6443/readyz
root       43119  0.0  0.3 862296 55616 ?        Ssl  08:45   0:00 cluster-kube-apiserver-operator cert-regeneration-controller --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/kube-apiserver-cert-syncer-kubeconfig/kubeconfig --namespace=openshift-kube-apiserver -v=2
root      180734  0.0  0.0  12920  2528 pts/0    S+   09:34   0:00 grep --color=auto -i cluster-kube-ap

[root@wj45ios603e-xnj76-master-0 core]# crictl ps |grep -i insecure-readyz
3593dbe3245fd   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   50 minutes ago   Running   kube-apiserver-insecure-readyz   0   2dae574048620

Check the 6080 container:

[root@wj45ios603e-xnj76-master-0 core]# crictl ps -a | grep insecure-readyz
3593dbe3245fd   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   3 hours ago   Running   kube-apiserver-insecure-readyz   0   2dae574048620

[root@wj45ios603e-xnj76-master-0 core]# crictl ps -a | grep 2dae
c24d9d214d40a   cb6f865a8becaf5e71d9837bebfacd186f0f013e4f22c12347712a118d020657   2 minutes ago   Exited    kube-apiserver                                 41   2dae574048620
73640aaa27d1d   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   8 minutes ago   Running   kube-apiserver-insecure-readyz                 1    2dae574048620
1d6055f5b90ea   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   3 hours ago     Running   kube-apiserver-cert-regeneration-controller   1    2dae574048620
3593dbe3245fd   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   3 hours ago     Exited    kube-apiserver-insecure-readyz                 0    2dae574048620
b941a78bd05f5   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   3 hours ago     Exited    kube-apiserver-cert-regeneration-controller   0    2dae574048620
87b8f77defb3c   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   3 hours ago     Running   kube-apiserver-cert-syncer                     0    2dae574048620
45cc67221f493   cb6f865a8becaf5e71d9837bebfacd186f0f013e4f22c12347712a118d020657   3 hours ago     Exited    setup                                          0    2dae574048620

Bug 1837992 is involved in introducing this problem; the PR for that bug, https://github.com/openshift/cluster-kube-apiserver-operator/pull/864/files#diff-79f70c8858100d23aa0da941b6136509R47-R56, should not include 'or sport = 6080'. After removing the port 6080 detection for the kube-apiserver container in /etc/kubernetes/manifests/kube-apiserver-pod.yaml and /etc/kubernetes/static-pod-resources/kube-apiserver-pod-7/kube-apiserver-pod.yaml (7 is the latest revision), the container can be Running:

[root@wj45ios603e-xnj76-master-0 core]# crictl ps -a | grep kube-apiserver
8d472d6325667   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   20 minutes ago   Running   kube-apiserver-insecure-readyz                 0   0008cc9b90a44
0c3c70f8dd1ad   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   20 minutes ago   Running   kube-apiserver-cert-regeneration-controller   0   0008cc9b90a44
d0f9f3d9cf1e9   81b500f67ec4ee958cbb7760aae0b25f077a779f491641ee91e631a50f8393ab   20 minutes ago   Running   kube-apiserver-cert-syncer                     0   0008cc9b90a44
8bee9ff6b36dc   cb6f865a8becaf5e71d9837bebfacd186f0f013e4f22c12347712a118d020657   20 minutes ago   Running   kube-apiserver                                 0   0008cc9b90a44

Expected Results:
A restarted kube-apiserver container should not need to wait for port 6080 to be released.

Additional info:
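For context, here is a minimal sketch of the kind of pre-start wait loop described above. This is an assumed shape only: the exact script, timeout value, and ss filter used in kube-apiserver-pod.yaml and in PR 864 may differ.

#!/bin/bash
# Sketch of a pre-start "wait for ports to be released" loop (not the exact
# script from the manifest). The problem: with 'or sport = :6080' in the
# filter, a restarted kube-apiserver container also waits on port 6080, which
# is held by the long-running kube-apiserver-insecure-readyz sidecar in the
# same pod, so it is never released and the wait times out.
echo -n "Waiting for port :6443 and :6080 to be released."
tries=135   # assumed timeout in seconds; the real value may differ
while [ "$tries" -gt 0 ]; do
  # Listening TCP sockets bound to either port; empty output means both are
  # free. Dropping 'or sport = :6080' (waiting only on the port this container
  # itself binds) avoids the deadlock with the insecure-readyz sidecar.
  if [ -z "$(ss -Htln '( sport = :6443 or sport = :6080 )')" ]; then
    echo
    echo "ports released"   # the real script would go on to start kube-apiserver here
    exit 0
  fi
  echo -n "."
  sleep 1
  tries=$((tries - 1))
done
echo "timed out waiting for port :6443 and :6080 to be released"
exit 1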
*** Bug 1844288 has been marked as a duplicate of this bug. ***
OCP 4.5 with Red Hat OpenStack Platform 16.0 ran into this bug, which blocked related tests. OCP 4.5 on Google Cloud Platform hit it as well; see https://bugzilla.redhat.com/show_bug.cgi?id=1838421#c19, which encountered the same error:

    State:       Waiting
      Reason:    CrashLoopBackOff
    Last State:  Terminated
      Reason:    Error
      Message:   ...............................................................................timed out waiting for port :6443 and :6080 to be released

A fix for this bug is required on OCP 4.5.
Blocked by https://github.com/openshift/cluster-kube-apiserver-operator/pull/870.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196