Bug 1811131 - IPI install on OSP 16 with Kuryr failing with CreateContainerError for kube-controller-manager: container name already in use
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Urvashi Mohnani
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks: 1786037 1786217
Reported: 2020-03-06 16:36 UTC by Mike Fiedler
Modified: 2020-04-16 13:53 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-12 17:33:05 UTC
Target Upstream Version:
Embargoed:


Attachments
journal and container logs from failed master + openshift install log (15.39 MB, application/gzip)
2020-03-06 16:40 UTC, Mike Fiedler

Description Mike Fiedler 2020-03-06 16:36:27 UTC
Description of problem:

IPI install on OSP 16 with Kuryr fails because kube-controller-manager never initializes. The kubelet logs on the failing master are full of CreateContainerError messages reporting that the container name is already in use:


Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: E0306 16:14:51.101709    2052 remote_runtime.go:200] CreateContainer in sandbox "b0686d2c9d1d73e7e28ef341c39dee94a213e799ab704dc995dbbee6753490d7" from runtime service failed: rpc error: code = Unknown desc = the container name "k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0" is already in use by "0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d". You have to remove that container to be able to reuse that name.: that name is already in use
Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: E0306 16:14:51.101897    2052 kuberuntime_manager.go:803] container start failed: CreateContainerError: the container name "k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0" is already in use by "0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d". You have to remove that container to be able to reuse that name.: that name is already in use
Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: E0306 16:14:51.101987    2052 pod_workers.go:191] Error syncing pod 6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e ("installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager(6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e)"), skipping: failed to "StartContainer" for "installer" with CreateContainerError: "the container name \"k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0\" is already in use by \"0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d\". You have to remove that container to be able to reuse that name.: that name is already in use"
Mar 06 16:14:51 wsun-4405-dg9lq-master-1 hyperkube[2052]: I0306 16:14:51.102540    2052 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-controller-manager", Name:"installer-7-wsun-4405-dg9lq-master-1", UID:"6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e", APIVersion:"v1", ResourceVersion:"19282", FieldPath:"spec.containers{installer}"}): type: 'Warning' reason: 'Failed' Error: the container name "k8s_installer_installer-7-wsun-4405-dg9lq-master-1_openshift-kube-controller-manager_6ed6b3fc-dd37-4a8a-812e-5b98e4c9f71e_0" is already in use by "0bf2439073d6031b002c48417c3d529cd60677d0ae6eef981204ec65449cda3d". You have to remove that container to be able to reuse that name.: that name is already in use
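
For triage, the stuck container name and the ID of the container that still holds it can be pulled out of journal lines like the ones above. The following is a minimal sketch, not part of the original report: the function name parse_conflict and the returned keys are hypothetical, and it assumes the name follows the k8s_<container>_<pod>_<namespace>_<pod UID>_<attempt> layout visible in these log lines.

```python
import re

# Matches the "name already in use" error, with or without the
# backslash-escaped quotes that appear in the pod_workers.go line.
NAME_IN_USE = re.compile(
    r'the container name \\?"(?P<name>k8s_[^"\\]+)\\?" '
    r'is already in use by \\?"(?P<holder>[0-9a-f]+)\\?"'
)


def parse_conflict(line):
    """Return details of a container-name conflict, or None if the line has none.

    Hypothetical helper: assumes the name layout
    k8s_<container>_<pod>_<namespace>_<pod UID>_<attempt>.
    """
    match = NAME_IN_USE.search(line)
    if match is None:
        return None
    parts = match.group("name").split("_")
    if len(parts) != 6:
        # Unexpected layout; report only what was matched literally.
        return {"name": match.group("name"), "holder": match.group("holder")}
    _, container, pod, namespace, pod_uid, attempt = parts
    return {
        "name": match.group("name"),
        "holder": match.group("holder"),  # ID of the container already using the name
        "container": container,
        "pod": pod,
        "namespace": namespace,
        "pod_uid": pod_uid,
        "attempt": int(attempt),
    }
```

Feeding each line of the journal through parse_conflict and collecting the distinct holder values would show whether a single leftover container accounts for all of the failures.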



Version-Release number of selected component (if applicable): release:4.4.0-0.nightly-2020-03-06-073549


How reproducible: 3 times so far


Additional info:

I will upload the full journal and pod logs from the failing master, along with the install log.

Comment 1 Mike Fiedler 2020-03-06 16:40:33 UTC
Created attachment 1668156
journal and container logs from failed master + openshift install log

Comment 4 Jon Uriarte 2020-03-12 17:33:05 UTC
OCP 4.4 IPI on OSP installs successfully with release 4.4.0-0.nightly-2020-03-12-082023.

There was a bug on OCP 4.4 [1] and another bug on OSP 16 [2] that could have caused the original issue described in this BZ.

Tried installing the cluster two times and it finished successfully both times.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-03-12-082023   True        False         4m13s   Cluster version is 4.4.0-0.nightly-2020-03-12-082023
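
The same check can be scripted when an install is repeated several times. This is a minimal sketch under stated assumptions: the helper names parse_clusterversion and install_settled are hypothetical, and the input is assumed to be the plain table printed by oc get clusterversion, with the column headers shown above.

```python
def parse_clusterversion(output):
    """Parse the table printed by `oc get clusterversion` into a list of dicts.

    Hypothetical helper: assumes a header row followed by one row per object,
    with STATUS as the last column (it may contain spaces).
    """
    lines = [line for line in output.strip().splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Split only len(header) - 1 times so the free-text STATUS stays whole.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


def install_settled(row):
    """True when the cluster version is available and no longer progressing."""
    return row.get("AVAILABLE") == "True" and row.get("PROGRESSING") == "False"
```

For the output above, install_settled would return True for the single "version" row.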

Closing this BZ as requested by OpenShift QE. Feel free to reopen it if you hit the issue again.



[1] https://bugzilla.redhat.com/show_bug.cgi?id=1811530
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1812009

