Bug 2095153 - Bootstrap Failure during Baremetal IPI Deployment of 4.10.0-0.nightly-2022-06-08-043906
Summary: Bootstrap Failure during Baremetal IPI Deployment of 4.10.0-0.nightly-2022-06...
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.10.z
Assignee: Riccardo Pittau
QA Contact: Amit Ugol
URL:
Whiteboard:
Depends On:
Blocks: 2089312
 
Reported: 2022-06-09 07:03 UTC by Adina Wolff
Modified: 2022-06-09 13:32 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-09 13:32:10 UTC
Target Upstream Version:
Embargoed:



Description Adina Wolff 2022-06-09 07:03:22 UTC
Description of problem:
During an IPI deployment of 4.10.0-0.nightly-2022-06-08-043906 on physical bare-metal hardware (HP), Ironic fails to connect to the nodes.

The installer log shows:

time="2022-06-08T14:46:25+03:00" level=debug msg="time=\"2022-06-08T11:46:25Z\" level=fatal msg=\"connect: connect endpoint 'unix:///var/run/crio/crio.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded\""

The kubelet logs show:

E0608 11:17:45.099937       1 apiaccess_count_controller.go:168] the server could not find the requested resource (post apirequestcounts.apiserver.openshift.io)

E0608 11:20:14.754156       1 dynamic_cafile_content.go:236] key failed with : open /etc/kubernetes/secrets/aggregator-signer.crt: no such file or directory

I0608 11:21:24.798020       1 tlsconfig.go:255] "Shutting down DynamicServingCertificateController"
I0608 11:21:24.797789       1 controller.go:89] Shutting down OpenAPI AggregationController
E0608 11:21:25.651122       1 apiaccess_count_controller.go:168] Get "https://[::1]:6443/apis/apiserver.openshift.io/v1/apirequestcounts/cronjobs.v1.batch": context canceled
E0608 11:21:26.009780       1 apiaccess_count_controller.go:168] Get "https://[::1]:6443/apis/apiserver.openshift.io/v1/apirequestcounts/catalogsources.v1alpha1.operators.coreos.com": context canceled
E0608 11:21:26.107340       1 apiaccess_count_controller.go:168] Get "https://[::1]:6443/apis/apiserver.openshift.io/v1/apirequestcounts/runtimeclasses.v1.node.k8s.io": context canceled
E0608 11:21:26.372961       1 apiaccess_count_controller.go:168] Get "https://[::1]:6443/apis/apiserver.openshift.io/v1/apirequestcounts/podsecuritypolicies.v1beta1.policy": context canceled
E0608 11:21:26.426498       1 apiaccess_count_controller.go:168] Get "https://[::1]:6443/apis/apiserver.openshift.io/v1/apirequestcounts/certificatesigningrequests.v1.certificates.k8s.io": context canceled
I0608 11:21:26.798162       1 genericapiserver.go:510] [graceful-termination] apiserver is exiting
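
For triage, klog-formatted lines like the ones above can be grouped by severity and source location. The following is a minimal Python sketch and not part of the original report: the regular expression, the function names `parse_klog` and `errors_by_source`, and the `SAMPLE` list are illustrative assumptions based on the usual klog header layout (severity letter, MMDD date, time, thread ID, file:line).

```python
import re
from collections import Counter

# Assumed klog header layout: <severity><MMDD> <HH:MM:SS.micros> <thread id> <file:line>] <message>
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[^:\s]+:\d+)\] (?P<msg>.*)$"
)

def parse_klog(line):
    """Return the header fields of one klog line as a dict, or None if it does not match."""
    m = KLOG_RE.match(line.strip())
    return m.groupdict() if m else None

def errors_by_source(lines):
    """Count error-severity (E) lines per source file:line."""
    counts = Counter()
    for line in lines:
        rec = parse_klog(line)
        if rec and rec["sev"] == "E":
            counts[rec["src"]] += 1
    return counts

# A few lines copied from the excerpt above (hypothetical sample for illustration).
SAMPLE = [
    "E0608 11:17:45.099937       1 apiaccess_count_controller.go:168] the server could not find the requested resource (post apirequestcounts.apiserver.openshift.io)",
    "E0608 11:20:14.754156       1 dynamic_cafile_content.go:236] key failed with : open /etc/kubernetes/secrets/aggregator-signer.crt: no such file or directory",
    'I0608 11:21:24.798020       1 tlsconfig.go:255] "Shutting down DynamicServingCertificateController"',
    "I0608 11:21:26.798162       1 genericapiserver.go:510] [graceful-termination] apiserver is exiting",
]

if __name__ == "__main__":
    for src, n in errors_by_source(SAMPLE).most_common():
        print(f"{n:4d}  {src}")
```

Run against a full log file, a summary like this makes it easier to see whether the errors cluster around one component before opening the complete log bundle.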


Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2022-06-08-043906

How reproducible:
2/2


Steps to Reproduce:
1. Run an IPI deployment.


Actual results:
Deployment fails on Ironic.

Expected results:
Deployment ends successfully.

Additional info:
bootstrap logs: http://rhos-compute-node-10.lab.eng.rdu2.redhat.com/logs/log-bundle-20220608144604.tar.gz

Comment 1 Adina Wolff 2022-06-09 13:28:45 UTC
Re-ran the deployment twice, and both runs completed successfully.

