Description of problem:
e2e-metal-ipi-ovn-ipv6 is failing on the latest cri-o / RHCOS builds. Bootstrapping failures started after we bumped the RHEL images to 8.4 beta.
Kubelet is reporting:
./journals/kubelet.log:Apr 30 17:39:26 localhost kubelet.sh: E0430 17:39:26.406727 3183 kuberuntime_sandbox.go:68] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = error creating pod sandbox with name \"k8s_etcd-bootstrap-member-localhost_openshift-etcd_505683813579901bb5fed1eab2bf616d_0\": Error initializing source docker://k8s.gcr.io/pause:3.5: error pinging docker registry k8s.gcr.io: Get \"https://k8s.gcr.io/v2/\": dial tcp [2607:f8b0:400e:c07::52]:443: i/o timeout" pod="openshift-etcd/etcd-bootstrap-member-localhost"
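To confirm which image the sandbox creation is trying to pull, the journal can be scanned for the failing image reference. This is only a rough sketch: the helper name `failed_image` and the regular expression are made up for illustration, and it assumes the error text always has the "Error initializing source docker://<image>: " shape seen in the line above.

```python
import re
from typing import Optional

# Assumed message shape (taken from the kubelet log line in this report):
#   Error initializing source docker://<image>: <reason>
# The image reference ends at the first ": " (colon followed by a space),
# so the tag separator in "pause:3.5" is not mistaken for the end.
IMAGE_RE = re.compile(r"Error initializing source docker://(\S+?): ")


def failed_image(line: str) -> Optional[str]:
    """Return the image reference from a sandbox pull failure, or None."""
    match = IMAGE_RE.search(line)
    return match.group(1) if match else None


# Shortened excerpt of the kubelet log line from this report.
sample = (
    'err="rpc error: code = Unknown desc = error creating pod sandbox: '
    "Error initializing source docker://k8s.gcr.io/pause:3.5: "
    'error pinging docker registry k8s.gcr.io"'
)

if __name__ == "__main__":
    print(failed_image(sample))
```

If this reports k8s.gcr.io/pause:3.5 rather than the 'pod' image from the release payload, the pause image override is not being picked up.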
We noticed https://github.com/cri-o/cri-o/pull/4550, which seems related. How does this typically work for disconnected environments? Do you ship the pause image with cri-o somehow? I don't see it in the release payload or in the output of crictl images.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install disconnected environment
Bootstrapping fails because the kubelet is unable to fetch the k8s pause image.
This is blocking new nightlies.
So bootstrap overrides this with /etc/kubernetes/kubelet-pause-image-override, which uses the 'pod' image from the release payload. I believe that to make this work you need to carry the 3.5 changes from https://github.com/kubernetes/kubernetes/pull/100292 in openshift/kubernetes.
I think this will be fixed by the attached PR, which updates the bootstrap process to handle the new crio config behavior.