Bug 1811131

Summary: IPI install on OSP 16 with Kuryr failing with CreateContainerError for kube-controller-manager: container name already in use
Product: OpenShift Container Platform
Reporter: Mike Fiedler <mifiedle>
Component: Node
Assignee: Urvashi Mohnani <umohnani>
Status: CLOSED UPSTREAM
QA Contact: Sunil Choudhary <schoudha>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.4
CC: aos-bugs, jfan, jokerman, juriarte, rphillips, wsun, xtian
Target Milestone: ---
Keywords: TestBlocker
Target Release: 4.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-12 17:33:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1786037, 1786217
Attachments:
Description: journal and container logs from failed master + openshift install log
Flags: none

Description Mike Fiedler 2020-03-06 16:36:27 UTC
Description of problem:

IPI install on OSP 16 with Kuryr fails because kube-controller-manager never initializes. The kubelet logs on the failing master are full of CreateContainerError messages saying that the container name is already in use:


Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: E0306 16:14:51.101709    2052 remote_runtime.go:200] CreateContainer in sandbox "b0686d2c9d1d73e7e28ef341c39dee94a213e799ab704dc995dbbee6753490d7" from runtime service failed: rpc error: code = Unknown desc = the container name "k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0" is already in use by "0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d". You have to remove that container to be able to reuse that name.: that name is already in use
Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: E0306 16:14:51.101897    2052 kuberuntime_manager.go:803] container start failed: CreateContainerError: the container name "k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0" is already in use by "0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d". You have to remove that container to be able to reuse that name.: that name is already in use
Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: E0306 16:14:51.101987    2052 pod_workers.go:191] Error syncing pod 6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e ("installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager(6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e)"), skipping: failed to "StartContainer" for "installer" with CreateContainerError: "the container name \"k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0\" is already in use by \"0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d\". You have to remove that container to be able to reuse that name.: that name is already in use"
Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: I0306 16:14:51.102540    2052 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-controller-manager", Name:"installer-7-wsun-4405-dg9lq-master-1", UID:"6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e", APIVersion:"v1", ResourceVersion:"19282", FieldPath:"spec.containers{installer}"}): type: 'Warning' reason: 'Failed' Error: the container name "k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0" is already in use by "0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d". You have to remove that container to be able to reuse that name.: that name is already in use
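
To make the repeated errors above easier to triage, here is a minimal sketch that pulls the conflicting container name and the ID of the container holding it out of journal lines. The function name `find_name_conflicts` and the regular expression are illustrative assumptions based only on the message format shown in this report; they are not part of the kubelet or of any Red Hat tooling.

```python
import re

# Assumed message shape, taken from the log lines quoted in this report:
#   the container name "<name>" is already in use by "<container id>"
# The optional backslash covers the escaped form in the pod_workers.go line.
NAME_IN_USE = re.compile(
    r'the container name \\?"(?P<name>[^"\\]+)\\?" '
    r'is already in use by \\?"(?P<cid>[0-9a-f]+)\\?"'
)


def find_name_conflicts(lines):
    """Map each conflicting container name to the set of container IDs holding it."""
    conflicts = {}
    for line in lines:
        for match in NAME_IN_USE.finditer(line):
            conflicts.setdefault(match.group("name"), set()).add(match.group("cid"))
    return conflicts


if __name__ == "__main__":
    import sys

    # Reads journal text from stdin and prints one line per conflicting name.
    for name, ids in find_name_conflicts(sys.stdin).items():
        print(name, "held by", ", ".join(sorted(ids)))
```

Fed the four lines above, this would report a single container name held by a single container ID, which suggests one stale container rather than many distinct conflicts.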



Version-Release number of selected component (if applicable): release:4.4.0-0.nightly-2020-03-06-073549


How reproducible: 3 times so far


Additional info:

I will upload the full journal and pod logs from the failing master, along with the install log.

Comment 1 Mike Fiedler 2020-03-06 16:40:33 UTC
Created attachment 1668156
journal and container logs from failed master + openshift install log

Comment 4 Jon Uriarte 2020-03-12 17:33:05 UTC
OCP 4.4 IPI on OSP works OK with release 4.4.0-0.nightly-2020-03-12-082023.

There was a bug on OCP 4.4 [1] and another bug on OSP 16 [2] that could have caused the original issue described in this BZ.

I tried installing the cluster twice and it finished successfully both times.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-03-12-082023   True        False         4m13s   Cluster version is 4.4.0-0.nightly-2020-03-12-082023

Closing this BZ as requested by OpenShift QE. Feel free to reopen it if you hit the issue again.



[1] https://bugzilla.redhat.com/show_bug.cgi?id=1811530
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1812009