Description of problem:

`oc adm must-gather` fails on disconnected IPv6 environments because it is unable to reach quay.io:

```
[kni@provisionhost-0 ~]$ oc adm must-gather
[must-gather      ] OUT unable to resolve the imagestream tag openshift/must-gather:latest
[must-gather      ] OUT
[must-gather      ] OUT Using must-gather plugin-in image: quay.io/openshift/origin-must-gather:latest
[must-gather      ] OUT namespace/openshift-must-gather-k45bw created
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-57tnq created
[must-gather      ] OUT pod for plug-in image quay.io/openshift/origin-must-gather:latest created
[must-gather-5nht8] OUT gather did not start: unable to pull image: ErrImagePull: rpc error: code = Unknown desc = error pinging docker registry quay.io: Get https://quay.io/v2/: dial tcp 34.195.60.239:443: connect: network is unreachable
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-57tnq deleted
[must-gather      ] OUT namespace/openshift-must-gather-k45bw deleted
error: gather did not start for pod must-gather-5nht8: unable to pull image: ErrImagePull: rpc error: code = Unknown desc = error pinging docker registry quay.io: Get https://quay.io/v2/: dial tcp 34.195.60.239:443: connect: network is unreachable
```

Version-Release number of selected component (if applicable):

```
version   4.3.0-0.nightly-2020-03-01-194304   True   False   22h   Cluster version is 4.3.0-0.nightly-2020-03-01-194304
```

How reproducible:
100%

Steps to Reproduce:
1. Deploy a bare metal IPv6 environment
2. Run `oc adm must-gather`

Actual results:
must-gather fails because it tries to reach quay.io via its public IPv4 address, which cannot work: the environment is single-stack IPv6 and therefore has no IPv4 connectivity.

Expected results:
`oc adm must-gather` does not try to reach public quay.io. Is there any way we can mirror the image to the disconnected registry at install time?
Additional info (install-config.yaml):

```yaml
apiVersion: v1
baseDomain: qe.lab.redhat.com
networking:
  networkType: OVNKubernetes
  machineCIDR: fd2e:6f44:5dd8:c956::/64
  clusterNetwork:
  - cidr: fd01::/48
    hostPrefix: 64
  serviceNetwork:
  - fd02::/112
metadata:
  name: ocp-edge-cluster
compute:
- name: worker
  replicas: 2
controlPlane:
  name: master
  replicas: 3
  platform:
    baremetal: {}
platform:
  baremetal:
    libvirtURI: qemu+ssh://root.qe.lab.redhat.com/system
    bootstrapOSImage: http://registry.ocp-edge-cluster.qe.lab.redhat.com:8080/images/rhcos-43.81.202001142154.0-qemu.x86_64.qcow2.gz?sha256=891c93d4ac0a0ed5ea4e3867dd5ecefd77674f4a6c1f9ca9218a176e1695e156
    clusterOSImage: http://registry.ocp-edge-cluster.qe.lab.redhat.com:8080/images/rhcos-43.81.202001142154.0-openstack.x86_64.qcow2.gz?sha256=75de2a60078408d237ff20b88145831f152188d04dc705ab2ea086f169b520ba
    apiVIP: fd2e:6f44:5dd8:c956::5
    dnsVIP: fd2e:6f44:5dd8:c956:0:0:0:2
    ingressVIP: fd2e:6f44:5dd8:c956::10
    hosts:
    - name: openshift-master-0
      role: master
      bmc:
        address: ipmi://[fd2e:6f44:5dd8:c956::1]:6230
        username: admin
        password: password
      bootMACAddress: 52:54:00:f7:4b:18
      hardwareProfile: default
    - name: openshift-master-1
      role: master
      bmc:
        address: ipmi://[fd2e:6f44:5dd8:c956::1]:6231
        username: admin
        password: password
      bootMACAddress: 52:54:00:4e:98:a4
      hardwareProfile: default
    - name: openshift-master-2
      role: master
      bmc:
        address: ipmi://[fd2e:6f44:5dd8:c956::1]:6232
        username: admin
        password: password
      bootMACAddress: 52:54:00:42:b9:a3
      hardwareProfile: default
    - name: openshift-worker-0
      role: worker
      bmc:
        address: ipmi://[fd2e:6f44:5dd8:c956::1]:6233
        username: admin
        password: password
      bootMACAddress: 52:54:00:23:5c:3f
      hardwareProfile: unknown
    - name: openshift-worker-1
      role: worker
      bmc:
        address: ipmi://[fd2e:6f44:5dd8:c956::1]:6234
        username: admin
        password: password
      bootMACAddress: 52:54:00:ef:d6:1d
      hardwareProfile: unknown
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
pullSecret: |
  { "auths": { "registry.ocp-edge-cluster.qe.lab.redhat.com:5000": { "auth": "" } }}
fips: false
sshKey: |
  ssh-rsa kni.qe.lab.redhat.com
imageContentSources:
- mirrors:
  - registry.ocp-edge-cluster.qe.lab.redhat.com:5000/localimages/local-release-image
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
- mirrors:
  - registry.ocp-edge-cluster.qe.lab.redhat.com:5000/localimages/local-release-image
  source: registry.svc.ci.openshift.org/ocp/release
```
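For reference, the `imageContentSources` stanza above only covers the release payload repositories. A hedged sketch of what an equivalent post-install policy covering the must-gather repository could look like (the policy name and mirror path are illustrative; note also that `repositoryDigestMirrors` is only consulted for by-digest pulls, so a tag pull like must-gather's default `:latest` reference would not be redirected by it):

```yaml
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: must-gather-mirror   # illustrative name
spec:
  repositoryDigestMirrors:
  - mirrors:
    - registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather
    source: quay.io/openshift/origin-must-gather
```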
This is a generic problem with `oc adm must-gather` in disconnected environments. See this docs bug: https://bugzilla.redhat.com/show_bug.cgi?id=1771435. If anything were to change in the product, it would be on the `oc adm mirror` or `oc adm must-gather` side of things, definitely not the installer, so I'm moving this to the oc component.
I mirrored the must-gather image to the disconnected registry that I used for the initial deployment:

```
oc image mirror quay.io/openshift/origin-must-gather:latest registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather:latest
```

but when I run `oc adm must-gather` it gets stuck:

```
oc adm must-gather --image registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather:latest
[must-gather      ] OUT Using must-gather plugin-in image: registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather:latest
[must-gather      ] OUT namespace/openshift-must-gather-8hq7j created
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-lldv8 created
[must-gather      ] OUT pod for plug-in image registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather:latest created
```

The pod is stuck in Init state:

```
openshift-must-gather-8hq7j   must-gather-k626f   0/1   Init:0/1   0   13s
```

```
[kni@provisionhost-0 ~]$ oc -n openshift-must-gather-8hq7j get pods must-gather-k626f -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks-status: ""
  creationTimestamp: "2020-03-10T18:50:34Z"
  generateName: must-gather-
  labels:
    app: must-gather
  name: must-gather-k626f
  namespace: openshift-must-gather-8hq7j
  resourceVersion: "242963"
  selfLink: /api/v1/namespaces/openshift-must-gather-8hq7j/pods/must-gather-k626f
  uid: a8c09f54-72cd-423e-949e-f16b1da35b56
spec:
  containers:
  - command:
    - /bin/bash
    - -c
    - 'trap : TERM INT; sleep infinity & wait'
    image: registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather:latest
    imagePullPolicy: Always
    name: copy
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /must-gather
      name: must-gather-output
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-gl9wz
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: default-dockercfg-sjk9h
  initContainers:
  - command:
    - /usr/bin/gather
    image: registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather:latest
    imagePullPolicy: Always
    name: gather
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /must-gather
      name: must-gather-output
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-gl9wz
      readOnly: true
  nodeName: master-2.ocp-edge-cluster.qe.lab.redhat.com
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 0
  tolerations:
  - operator: Exists
  volumes:
  - emptyDir: {}
    name: must-gather-output
  - name: default-token-gl9wz
    secret:
      defaultMode: 420
      secretName: default-token-gl9wz
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-03-10T18:50:34Z"
    message: 'containers with incomplete status: [gather]'
    reason: ContainersNotInitialized
    status: "False"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-03-10T18:50:34Z"
    message: 'containers with unready status: [copy]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-03-10T18:50:34Z"
    message: 'containers with unready status: [copy]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-03-10T18:50:34Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather:latest
    imageID: ""
    lastState: {}
    name: copy
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        reason: PodInitializing
  hostIP: fd2e:6f44:5dd8:c956::107
  initContainerStatuses:
  - image: registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather:latest
    imageID: ""
    lastState: {}
    name: gather
    ready: false
    restartCount: 0
    state:
      waiting:
        reason: PodInitializing
  phase: Pending
  qosClass: BestEffort
  startTime: "2020-03-10T18:50:34Z"
```
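A pod stuck in `Init:0/1` here means the `gather` init container never completed. A few read-only commands (standard `oc` subcommands; the namespace and pod names are taken from the output above) that can narrow down whether the init container is failing to pull, crashing, or hanging:

```
NS=openshift-must-gather-8hq7j          # namespace printed by must-gather
POD=must-gather-k626f
oc -n "$NS" describe pod "$POD"         # check Events for pull or mount errors
oc -n "$NS" logs "$POD" -c gather       # init container output, if it started
oc -n "$NS" get events --sort-by=.lastTimestamp
```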
The issue in the previous comment seems to have been caused by another BZ that left the cluster in a broken state. I was able to run `oc adm must-gather --image registry.ocp-edge-cluster.qe.lab.redhat.com:5000/openshift/origin-must-gather:latest` successfully against a healthy cluster.