Bug 1708663 - OCP4.1 UPI installation fails to create bootstrap-machine-config-operator
Summary: OCP4.1 UPI installation fails to create bootstrap-machine-config-operator
Keywords:
Status: CLOSED DUPLICATE of bug 1695516
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.1.0
Assignee: Urvashi Mohnani
QA Contact: weiwei jiang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-10 14:00 UTC by Lukas Bednar
Modified: 2019-05-10 15:48 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-05-10 15:41:20 UTC
Target Upstream Version:


Attachments
log-bundle.tar.gz generated by openshift-install gather bootstrap (1.28 MB, application/gzip)
2019-05-10 14:00 UTC, Lukas Bednar

Description Lukas Bednar 2019-05-10 14:00:27 UTC
Created attachment 1566672
log-bundle.tar.gz generated by openshift-install gather bootstrap

Description of problem:

I am trying to install OCP-4.1 in UPI mode.

I am using the latest installer, openshift-install-linux-4.1.0-rc.1, and the rhcos-410.8.20190502.0 image.
I have set up DNS, an HTTP server that serves the bootstrap Ignition config, and HAProxy as described here: https://github.com/openshift/installer/blob/master/docs/user/metal/install_upi.md

Then I bring up the bootstrap, master, and worker nodes.
The bootstrap node fetches its Ignition config from the HTTP server and then waits for the etcd cluster.
The master and worker nodes fail to fetch their machine configs:

[ *** ] A start job is running for Ignition (disks) (13h 59min 11s / no limit)
[50353.819223] ignition[607]: GET https://api-int.working.oc4:22623/config/master: attempt #10053
[50353.824470] ignition[607]: GET error: Get https://api-int.working.oc4:22623/config/master: EOF

api-int.working.oc4 points to HAProxy, which load-balances between the bootstrap and master nodes.
When I try to access https://bootstrap.working.oc4:22623/config/master directly, I get "connection refused".
On the bootstrap machine itself, nothing is listening on port 22623.
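
The reachability check described above can be sketched in Python. This is only an illustration, not part of the installer: the helper name probe_port and the 3-second timeout are assumptions, and the hostnames in the usage comment are the ones from this report.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt as 'open', 'refused', or 'unreachable'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process is listening on that port.
        return "refused"
    except OSError:
        # Covers timeouts, DNS resolution failures, and routing errors.
        return "unreachable"


# Example usage (hostnames taken from this report):
#   probe_port("bootstrap.working.oc4", 22623)
#   probe_port("api-int.working.oc4", 22623)
```

A "refused" result for the bootstrap host while the load balancer address still accepts connections would match the symptoms here: HAProxy accepts the connection, finds no healthy backend, and the client sees EOF.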

The bootstrap log shows that the pod sandbox for bootstrap-machine-config-operator-host could not be created:

May 09 16:52:39 host-172-16-0-23 hyperkube[1181]: E0509 16:52:39.141412    1181 pod_workers.go:190] Error syncing pod 50348b3c4c0a3abff8cb6c0c802ea28e ("bootstrap-machine-config-operator-host-172-16-0-23_default(50348b3c4c0a3abff8cb6c0c802ea28e)"), skipping: failed to "CreatePodSandbox" for "bootstrap-machine-config-operator-host-172-16-0-23_default(50348b3c4c0a3abff8cb6c0c802ea28e)" with CreatePodSandboxError: "CreatePodSandbox for pod \"bootstrap-machine-config-operator-host-172-16-0-23_default(50348b3c4c0a3abff8cb6c0c802ea28e)\" failed: rpc error: code = Unknown desc = error creating pod sandbox with name \"k8s_bootstrap-machine-config-operator-host-172-16-0-23_default_50348b3c4c0a3abff8cb6c0c802ea28e_0\": layer not known"
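
Because the kubelet wraps the real cause in several layers of quoting, it can help to pull out just the pod and the final error. The following is a minimal sketch that assumes log lines shaped like the one above; the function name summarize_sandbox_error is hypothetical, and the regular expression is derived only from this single sample.

```python
import re


def summarize_sandbox_error(line: str):
    """Return pod UID, pod name, and innermost cause from a kubelet
    'Error syncing pod' line, or None if the line does not match."""
    match = re.search(r'Error syncing pod (\w+) \("([^"(]+)\(', line)
    if match is None or "CreatePodSandbox" not in line:
        return None
    # The innermost cause is whatever follows the last ': ' in the line,
    # once the trailing quote is stripped.
    cause = line.rstrip().rstrip('"').rsplit(": ", 1)[-1]
    return {"uid": match.group(1), "pod": match.group(2), "cause": cause}


# Example usage: feed it lines from the bootstrap node's journal, e.g.
#   for line in open("journal.log"):   # hypothetical file name
#       info = summarize_sandbox_error(line)
```

For the line above, the extracted cause is "layer not known", which points at container storage rather than at the machine-config-operator itself.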

Version-Release number of the following components:
* openshift-install-linux-4.1.0-rc.1 installer
* rhcos-410.8.20190502.0 image.

How reproducible: 100%

Steps to Reproduce:
1. Follow the steps in https://github.com/openshift/installer/blob/master/docs/user/metal/install_upi.md

Actual results:
Installation times out because the bootstrap-machine-config-operator pod cannot be created.

Expected results:
Installation succeeds.

Additional info:
The attached logs were collected with the openshift-install gather bootstrap command.

Comment 1 Scott Dodson 2019-05-10 14:49:12 UTC
Looks identical to https://bugzilla.redhat.com/show_bug.cgi?id=1695516; however, the image referenced should already have that fix.

Comment 3 Urvashi Mohnani 2019-05-10 15:34:04 UTC
The issue is fixed in cri-o 1.13.9
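
To check whether a given node already carries the fix, the version can be compared against 1.13.9. This is a rough sketch: the helper name has_fix is hypothetical, and it assumes the version string (for example, as printed by crio --version) contains a dotted major.minor.patch number.

```python
import re

FIXED_IN = (1, 13, 9)  # cri-o version named in the comment above


def has_fix(version_string: str) -> bool:
    """Return True if the first x.y.z number in the string is >= 1.13.9."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"no version number found in {version_string!r}")
    return tuple(int(part) for part in match.groups()) >= FIXED_IN
```

Comparing integer tuples avoids the pitfall of plain string comparison, where "1.13.10" would sort before "1.13.9".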

