Description of problem:

IPI install on OSP 16 with Kuryr fails because kube-controller-manager never initializes. The kubelet logs on the failing master are full of CreateContainerError messages stating that the container name is already in use:

Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: E0306 16:14:51.101709 2052 remote_runtime.go:200] CreateContainer in sandbox "b0686d2c9d1d73e7e28ef341c39dee94a213e799ab704dc995dbbee6753490d7" from runtime service failed: rpc error: code = Unknown desc = the container name "k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0" is already in use by "0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d". You have to remove that container to be able to reuse that name.: that name is already in use

Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: E0306 16:14:51.101897 2052 kuberuntime_manager.go:803] container start failed: CreateContainerError: the container name "k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0" is already in use by "0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d". You have to remove that container to be able to reuse that name.: that name is already in use

Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: E0306 16:14:51.101987 2052 pod_workers.go:191] Error syncing pod 6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e ("installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager(6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e)"), skipping: failed to "StartContainer" for "installer" with CreateContainerError: "the container name \"k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0\" is already in use by \"0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d\". You have to remove that container to be able to reuse that name.: that name is already in use"

Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: I0306 16:14:51.102540 2052 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-controller-manager", Name:"installer-7-wsun-4405-dg9lq-master-1", UID:"6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e", APIVersion:"v1", ResourceVersion:"19282", FieldPath:"spec.containers{installer}"}): type: 'Warning' reason: 'Failed' Error: the container name "k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0" is already in use by "0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d". You have to remove that container to be able to reuse that name.: that name is already in use

Version-Release number of selected component (if applicable):
release:4.4.0-0.nightly-2020-03-06-073549

How reproducible:
3 times so far

Additional info:
I will upload the full journal and pod logs from the failing master, along with the install log.
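For anyone triaging a similar journal, the repeated "already in use" errors can be condensed into one entry per conflicting name. The following is only a rough sketch, not something the reporter ran: the function name summarize_conflicts and the regular expression are assumptions, written against the message format in the log lines above (including the backslash-escaped quotes in the pod_workers.go line).

```python
import re
from collections import Counter

# Assumed pattern, modeled on the kubelet messages quoted in this report.
# The optional backslash (\\?) tolerates escaped quotes such as \"name\".
CONFLICT = re.compile(
    r'the container name \\?"(?P<name>[^"\\]+)\\?" '
    r'is already in use by \\?"(?P<holder>[0-9a-f]+)\\?"'
)


def summarize_conflicts(lines):
    """Count (container name, ID of the container holding it) pairs.

    `lines` is any iterable of journal lines, for example an open file.
    """
    counts = Counter()
    for line in lines:
        for match in CONFLICT.finditer(line):
            counts[(match.group("name"), match.group("holder"))] += 1
    return counts
```

Passing the lines of a saved journal file to summarize_conflicts gives a Counter keyed by (name, holder), which makes it easy to see whether every failure points at the same leftover container, as it does in the excerpt above.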
Created attachment 1668156 [details] journal and container logs from failed master + openshift install log
OCP 4.4 IPI on OSP works with release 4.4.0-0.nightly-2020-03-12-082023. There was a bug in OCP 4.4 [1] and another in OSP 16 [2], either of which could have caused the original issue described in this BZ.

I installed the cluster twice and it finished successfully both times.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-03-12-082023   True        False         4m13s   Cluster version is 4.4.0-0.nightly-2020-03-12-082023

Closing this BZ as requested by OpenShift QE. Feel free to reopen it if you hit the issue again.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1811530
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1812009