Bug 1862979
Summary: | [OCP 4.6] failed to provision master node due to cannot get image from mirror registry | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Yunfei Jiang <yunjiang>
Component: | Machine Config Operator | Assignee: | Sinny Kumari <skumari>
Status: | CLOSED ERRATA | QA Contact: | Yunfei Jiang <yunjiang>
Severity: | urgent | Docs Contact: |
Priority: | urgent | |
Version: | 4.6 | CC: | amurdaca, dhellmann, kgarriso, miabbott, skumari, somalley, stbenjam, vrutkovs, walters, wking, xtian, xxia
Target Milestone: | --- | Keywords: | TestBlocker
Target Release: | 4.6.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-10-27 16:22:34 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |

Description
Yunfei Jiang
2020-08-03 12:00:07 UTC
Created attachment 1703284: machine-config-daemon-firstboot.service
Created attachment 1703285: bootstrap log
This bug blocks all tests against disconnected environments.

We see the same on all baremetal IPv6 jobs, which must be disconnected due to quay not supporting IPv6:

Aug 03 12:25:32 master-0.ostest.test.metalkube.org machine-config-daemon[2451]: error: unable to connect to image repository quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:70cfcdee7fa0eac2578f32f197b410d0f50d5bb10ac56ba402eb758e50e76d04: Get "https://quay.io/v2/": dial tcp 34.198.42.182:443: connect: network is unreachable

I think this belongs against MCO, which is the source of the problem. Either MCD is running an `oc` command that's not accounting for disconnected installs, or `oc` itself has a problem.

Is https://github.com/openshift/enhancements/pull/334 another option to avoid _not_ using oc?

Adding upgrade blocker tag as well, as we don't have any indication it doesn't block as of now. Feel free to remove if we find otherwise.

*** Bug 1862948 has been marked as a duplicate of this bug. ***

*** Bug 1863335 has been marked as a duplicate of this bug. ***

After having a brainstorming session with Antonio today, we came up with another solution to fix the problem, and it involves minimal changes:
- We keep the current implementation (i.e. keep using `oc image extract`) of CoreOS extensions support.
- Until the `oc` fix to support mirror registries gets in, when `oc image extract` fails we fall back to copying machine-os-content onto nodes using `podman pull osImageURL && podman create osImageURL && podman cp container_ID:/ /run/machine-os-content/os-content-XXXX`.

The fallback solution is applied only when `oc image extract` has failed.

*** Bug 1862948 has been marked as a duplicate of this bug. ***

This bug seems to affect proxy environments too (which are similar to disconnected - image cannot be downloaded directly from quay and `oc image extract` doesn't take mirrors/proxies into account)

(In reply to Vadim Rutkovsky from comment #12)
> This bug seems to affect proxy environments too (which are similar to
> disconnected - image cannot be downloaded directly from quay and `oc image
> extract` doesn't take mirrors/proxies into account)

thanks Vadim, we're tackling that separately

(In reply to Antonio Murdaca from comment #16)
> (In reply to Vadim Rutkovsky from comment #12)
> > This bug seems to affect proxy environments too (which are similar to
> > disconnected - image cannot be downloaded directly from quay and `oc image
> > extract` doesn't take mirrors/proxies into account)
>
> thanks Vadim, we're tackling that separately

Perhaps we should reopen the bug https://bugzilla.redhat.com/show_bug.cgi?id=1862948 which has a proxy setup.

@yunjiang Mike N. is on paternity leave and additionally, we do not have the infrastructure to test disconnected installs. Would it be possible for you to retest this and indicate whether the BZ is verified?

(In reply to Sinny Kumari from comment #17)
> (In reply to Antonio Murdaca from comment #16)
> > (In reply to Vadim Rutkovsky from comment #12)
> > > This bug seems to affect proxy environments too (which are similar to
> > > disconnected - image cannot be downloaded directly from quay and `oc image
> > > extract` doesn't take mirrors/proxies into account)
> >
> > thanks Vadim, we're tackling that separately
>
> Perhaps we should reopen the bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1862948 which has proxy setup.

No, the proxy issue is caused by the very same root cause (so I closed the proxy bug as a dupe)

*** Bug 1862948 has been marked as a duplicate of this bug. ***

verified. PASS.
version: 4.6.0-0.nightly-2020-08-10-180431

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:4196

Removing UpgradeBlocker from this older bug, to remove it from the suspect queue described in [1]. If you feel like this bug still needs to be a suspect, please add the keyword again.

[1]: https://github.com/openshift/enhancements/pull/475
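
For readers who want to see the shape of the fallback proposed in the thread (try `oc image extract` first, and only on failure use `podman pull`, `podman create`, `podman cp`), here is a minimal Python sketch. It is not the MCO implementation, which is written in Go. The function names `run` and `extract_os_content`, the injectable `runner` parameter, the exact `--path` argument passed to `oc image extract`, and the final `podman rm` cleanup are all assumptions made for illustration; the thread itself only names the three podman commands.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the command's stdout as text.
Runner = Callable[[Sequence[str]], str]


def run(cmd: Sequence[str]) -> str:
    """Run a command and return its stripped stdout.

    Raises subprocess.CalledProcessError on a non-zero exit status.
    """
    result = subprocess.run(list(cmd), check=True, capture_output=True, text=True)
    return result.stdout.strip()


def extract_os_content(os_image_url: str, dest_dir: str, runner: Runner = run) -> str:
    """Copy the image's root filesystem into dest_dir.

    Returns "oc" if `oc image extract` worked, or "podman" if the
    fallback path was used. Hypothetical helper, for illustration only.
    """
    try:
        # Primary path: extract directly with oc (the flag usage here is an assumption).
        runner(["oc", "image", "extract", "--path", f"/:{dest_dir}", os_image_url])
        return "oc"
    except (subprocess.CalledProcessError, OSError):
        # oc failed (for example, it could not reach the registry): fall back.
        pass

    # Fallback path from the thread: podman pull, podman create, podman cp.
    runner(["podman", "pull", os_image_url])
    container_id = runner(["podman", "create", os_image_url])
    try:
        runner(["podman", "cp", f"{container_id}:/", dest_dir])
    finally:
        # Cleanup step added in this sketch; it is not mentioned in the thread.
        runner(["podman", "rm", container_id])
    return "podman"
```

Passing the runner as a parameter keeps the fallback logic testable on a machine without `oc` or `podman`; in real use the default `run` would be called with the node's `osImageURL` and a directory such as the `/run/machine-os-content/os-content-XXXX` path quoted in the thread.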