Description of problem:
When provisioning an IPI on Bare Metal cluster with a provisioning network, after the bootstrap phase completes, the workers are powered on via IPMI, and the DHCP+PXE service should be handled by the metal3 operator scheduled on one of the masters. If the MAC address of the provisioning interface (bootMACAddress) is not given in lowercase in the install-config.yaml file, the DHCP service is unavailable. Moreover, the container is reported as "ready" even though it isn't.

Version-Release number of selected component (if applicable):
Tested on 4.11.0-0.nightly-arm64-2022-05-04-091042 with OVN Network

How reproducible:
IPI on Bare Metal with Managed Provisioning Network

Steps to Reproduce:
1. Instantiate an IPI on Bare Metal with Managed Provisioning Network and set the bootMACAddress to NOT be lowercase
2. Wait for the bootstrap to conclude
3. The workers won't boot
4. The installation fails

Actual results:
The workers won't get installed as they cannot boot by PXE, and the installation fails

oc get co
NAME        VERSION                                     AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
baremetal   4.11.0-0.nightly-arm64-2022-05-04-091042    True        False         False      59m

oc project openshift-machine-api
oc get pods
metal3-fffcfb6bd-255n9   7/7   Running   0   60m

oc logs metal3-fffcfb6bd-255n9 -c metal3-dnsmasq
+++ get_provisioning_interface
+++ '[' -n '' ']'
+++ local interface=provisioning
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -q 00:1b:21:E4:37:EC
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -q 00:1B:21:E4:3A:B1
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -q A0:36:9F:30:03:EE
+++ echo provisioning
++ export PROVISIONING_INTERFACE=provisioning
++ PROVISIONING_INTERFACE=provisioning
++ export LISTEN_ALL_INTERFACES=true
++ LISTEN_ALL_INTERFACES=true
++ export IRONIC_PRIVATE_PORT=6388
++ IRONIC_PRIVATE_PORT=6388
++ export IRONIC_INSPECTOR_PRIVATE_PORT=5049
++ IRONIC_INSPECTOR_PRIVATE_PORT=5049
+ export HTTP_PORT=6180
+ HTTP_PORT=6180
+ export DNSMASQ_EXCEPT_INTERFACE=lo
+ DNSMASQ_EXCEPT_INTERFACE=lo
+ wait_for_interface_or_ip
+ '[' '!' -z '' ']'
+ '[' '!' -z '' ']'
+ echo 'Waiting for provisioning interface to be configured'
Waiting for provisioning interface to be configured
++ ip -br add show scope global up dev provisioning
++ awk '{print $3}'
++ sed -e 's%/.*%%'
++ head -n 1
Device "provisioning" does not exist.
+ export IRONIC_IP=
... Repeated forever ...

ip -br link show up
lo               UNKNOWN   00:00:00:00:00:00   <LOOPBACK,UP,LOWER_UP>
enP1p4s0         UP        00:1b:21:e4:3a:b1   <BROADCAST,MULTICAST,UP,LOWER_UP>
enP4p3s0u2u3c2   UNKNOWN   46:6e:f9:36:e3:12   <BROADCAST,MULTICAST,UP,LOWER_UP>
enP2p2s0f0       UP        28:c1:3c:8a:a2:8f   <BROADCAST,MULTICAST,UP,LOWER_UP>
enP2p2s0f1       UP        28:c1:3c:8a:a2:90   <BROADCAST,MULTICAST,UP,LOWER_UP>

The provisioning interface is enP1p4s0, which has the IP set to 172.22.0.3. Its MAC is listed in lowercase (00:1b:21:e4:3a:b1), so the case-sensitive grep for 00:1B:21:E4:3A:B1 above does not match it.

Expected results:
0. The workers are provisioned
1. The interface lookup by MAC should be case-insensitive
2. The cluster-baremetal-operator should not be reported as ready/available, and neither should the metal3-dnsmasq container in the metal3 pod.

Additional info:
This seems to happen because /bin/ironic-common.sh in the metal3-dnsmasq container looks up the provisioning interface by MAC address using a case-sensitive match.

(within the dnsmasq container)
cat /bin/ironic-common.sh
#!/usr/bin/bash

set -euxo pipefail

function get_provisioning_interface() {
  if [ -n "${PROVISIONING_INTERFACE:-}" ]; then
    # don't override the PROVISIONING_INTERFACE if one is provided
    echo ${PROVISIONING_INTERFACE}
    return
  fi

  local interface="provisioning"
  for mac in ${PROVISIONING_MACS//,/ } ; do
    if ip -br link show up | grep -q "$mac"; then
      interface=$(ip -br link show up | grep "$mac" | cut -f 1 -d ' ')
      break
    fi
  done
  echo $interface
}
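For illustration, a case-insensitive lookup could look roughly like the sketch below. This is not the merged patch, only a minimal sketch of the idea under stated assumptions: the names list_links and get_provisioning_interface_ci are hypothetical and do not exist in ironic-common.sh, and list_links is only a thin wrapper around `ip -br link show up` so the lookup can be exercised without real interfaces.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in ironic-common.sh): prints the brief link table.
# Kept separate so it can be replaced with canned output when testing.
list_links() {
    ip -br link show up
}

# Hypothetical case-insensitive variant of get_provisioning_interface.
get_provisioning_interface_ci() {
    if [ -n "${PROVISIONING_INTERFACE:-}" ]; then
        # Do not override PROVISIONING_INTERFACE if one is provided.
        echo "${PROVISIONING_INTERFACE}"
        return
    fi

    local interface="provisioning"
    local macs="${PROVISIONING_MACS:-}"
    local links mac match
    links="$(list_links)"

    for mac in ${macs//,/ }; do
        # -i ignores case, -F treats the MAC as a fixed string (no regex).
        # "|| true" keeps a non-match from aborting under "set -e".
        match="$(grep -i -F -- "$mac" <<<"$links" | head -n 1 || true)"
        if [ -n "$match" ]; then
            # The interface name is the first space-separated field.
            interface="${match%% *}"
            break
        fi
    done
    echo "$interface"
}

# Example call (requires the ip command and a populated PROVISIONING_MACS):
#   PROVISIONING_MACS="00:1B:21:E4:3A:B1" get_provisioning_interface_ci
```

If no MAC matches, the function still falls back to the name "provisioning", so the readiness problem described in expected result 2 is not addressed by this sketch.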
Marking as not a blocker, given that a workaround exists: set the bootMACAddress value to all lowercase.
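One way to apply the workaround is to normalize each MAC before writing it into install-config.yaml. A minimal sketch, assuming a hypothetical helper named normalize_mac (not part of any installer tooling):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the given MAC address in lowercase.
normalize_mac() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# Example call:
#   normalize_mac "A0:36:9F:30:03:EE"
```

The helper only changes letter case; it does not validate that the argument is a well-formed MAC address.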
A fix was merged upstream: https://github.com/metal3-io/ironic-image/pull/374
https://github.com/openshift/ironic-image/pull/283 created for downstream sync
Verified on payload registry.ci.openshift.org/ocp-arm64/release-arm64:4.12.0-0.nightly-arm64-2022-08-09-214103

Steps:
1. Set bootMACAddress in mixed case, for example:

- name: worker-00
  role: worker
  bmc:
    address: ipmi://openshift-qe-039.mgmt.arm.eng.rdu2.redhat.com/
    disableCertificateVerification: true
    username: *** HIDDEN ***
    password: *** HIDDEN ***
  bootMACAddress: a0:36:9F:30:04:B4
  rootDeviceHints:
    deviceName: "/dev/nvme0n1"
  networkConfig:
    interfaces:
    ......
- name: worker-01
  role: worker
  bmc:
    address: ipmi://openshift-qe-040.mgmt.arm.eng.rdu2.redhat.com/
    disableCertificateVerification: true
    username: *** HIDDEN ***
    password: *** HIDDEN ***
  bootMACAddress: A0:36:9F:30:04:C4
  rootDeviceHints:
    deviceName: "/dev/nvme0n1"
  networkConfig:
    interfaces:

2. Launch an IPI on Bare Metal with Managed Provisioning Network

3. Wait and check that the worker nodes are created

oc get nodes
NAME                                                           STATUS   ROLES    AGE   VERSION
worker-00.lwanbug2081734.qeclusters.arm.eng.rdu2.redhat.com   Ready    worker   36m   v1.24.0+a9d6306
worker-01.lwanbug2081734.qeclusters.arm.eng.rdu2.redhat.com   Ready    worker   36m   v1.24.0+a9d6306

4. Check the logs of the metal3-dnsmasq container; the lookup now matches an upper-case MAC address (grep is invoked with -i)

oc logs metal3-6bb77d9df6-xgjjc -c metal3-dnsmasq
+++ get_provisioning_interface
+++ '[' -n '' ']'
+++ local interface=provisioning
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -qi 00:1B:21:E4:63:30
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -qi 00:1B:21:E4:37:A7
++++ ip -br link show up
++++ grep -i 00:1B:21:E4:37:A7
++++ cut -f 1 -d ' '
+++ interface=enP1p4s0
+++ break
+++ echo enP1p4s0
++ export PROVISIONING_INTERFACE=enP1p4s0
++ PROVISIONING_INTERFACE=enP1p4s0
++ export LISTEN_ALL_INTERFACES=true
++ LISTEN_ALL_INTERFACES=true
++ export IRONIC_PRIVATE_PORT=6388
++ IRONIC_PRIVATE_PORT=6388
++ export IRONIC_INSPECTOR_PRIVATE_PORT=5049
++ IRONIC_INSPECTOR_PRIVATE_PORT=5049
+ export HTTP_PORT=6180
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399