Description of problem:
All release-*-e2e-metal*4.6 jobs (all three variants: normal, serial, and compact) have recently been failing with messages like this:
```
Error: Ipxe script url http://http-matchbox.svc.ci.openshift.org/ipxe?cluster_id=ci-op-jgf5n4pv-18a82&role=worker is not accessible
```
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-compact-4.6/1285407011529297920#1:build-log.txt%3A96
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-4.6/1285407011453800448#1:build-log.txt%3A98
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-serial-4.6/1285214460008468480#1:build-log.txt%3A133

How reproducible:
Sippy says 0.00% (100.00%) (35 runs)

Additional information:
Slack conversation about the issue: https://coreos.slack.com/archives/C0NFH1Y1X/p1595288717004400
This appears to be caused by a change to Packet's cloud APIs. I'll figure out exactly when it started happening so they can pinpoint it, but Packet is not even creating hosts and I don't see any requests coming in to matchbox.
*** Bug 1859270 has been marked as a duplicate of this bug. ***
Last success was 2020/07/14 21:00 UTC, first failure was 2020/07/15 09:39:41. So something changed between those two timestamps.
It looks like Packet's iPXE URL validator grew IPv6 support around the time that things broke. They're asking if we made any DNS changes. I suspect the problem is on their side, but it may still necessitate changes on our side.
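For anyone who wants to check the DNS side of this independently, here is a minimal Python sketch. It is not Packet's validator, whose behaviour we can't see. It only lists which IPv4 and IPv6 addresses the matchbox hostname resolves to and tests whether each one accepts a TCP connection. The function names (`url_host_port`, `group_by_family`, `resolve_url_host`, `can_connect`) are made up for illustration, and the premise that an unreachable IPv6 address would trip the validator is an assumption.

```python
import socket
from urllib.parse import urlsplit


def url_host_port(url):
    """Return (hostname, port) for a URL, defaulting the port from the scheme."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def group_by_family(addrinfo):
    """Split socket.getaddrinfo() results into de-duplicated IPv4/IPv6 lists."""
    families = {"ipv4": [], "ipv6": []}
    for family, _type, _proto, _canon, sockaddr in addrinfo:
        if family == socket.AF_INET:
            key = "ipv4"
        elif family == socket.AF_INET6:
            key = "ipv6"
        else:
            continue  # ignore any other address family
        addr = sockaddr[0]
        if addr not in families[key]:
            families[key].append(addr)
    return families


def resolve_url_host(url):
    """Resolve the URL's host and group the answers by address family."""
    host, port = url_host_port(url)
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return group_by_family(infos)


def can_connect(addr, port, timeout=5.0):
    """Return True if a TCP connection to addr:port succeeds within timeout."""
    family = socket.AF_INET6 if ":" in addr else socket.AF_INET
    with socket.socket(family, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((addr, port))
            return True
        except OSError:
            return False
```

To use it, call `resolve_url_host()` with the iPXE script URL from the error message, then call `can_connect(addr, 80)` for each returned address. If the hostname has an IPv6 address that refuses connections while the IPv4 one works, that would be consistent with the IPv6 theory above, though it would not prove it.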
We need our cloud provider to make some changes to their PXE management to fix things here. We should see fixes from them in the next few days.
Moving this matchbox service to a cluster that has IPv4 ingress for the time being, until Packet fixes their IPv6 support.
Checked the latest e2e-metal CI jobs; no such error occurred during the terraform creation steps, so moving this bug to verified.

e2e-metal-compact-4.6:
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-compact-4.6/1291030492501512192
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-compact-4.6/1290849189617471488

e2e-metal-serial-4.6:
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-serial-4.6/1291182676027379712
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-serial-4.6/1291112931575992320

e2e-metal-4.6:
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-4.6/1291182673779232768
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-4.6/1291112933316628481
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196