Bug 2081734
| Summary: | metal3-dnsmasq: workers are not provisioned during the cluster installation when BootMacAddress is not provided lower-case | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | aleskandro <adistefa> | |
| Component: | Bare Metal Hardware Provisioning | Assignee: | Tudor Domnescu <tdomnesc> | |
| Bare Metal Hardware Provisioning sub component: | ironic | QA Contact: | wang lin <lwan> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | medium | |||
| Priority: | medium | CC: | adistefa, aos-bugs, derekh, lwan, mifiedle, rpittau, tsedovic | |
| Version: | 4.11 | Keywords: | Triaged | |
| Target Milestone: | --- | |||
| Target Release: | 4.12.0 | |||
| Hardware: | All | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2110407 (view as bug list) | Environment: | ||
| Last Closed: | 2023-01-17 19:48:18 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2110407 | |||
Marking as not a blocker given the existence of a workaround: setting the bootMacAddress value to all lower case.

A fix was merged upstream: https://github.com/metal3-io/ironic-image/pull/374

https://github.com/openshift/ironic-image/pull/283 was created for the downstream sync.

Verified on payload registry.ci.openshift.org/ocp-arm64/release-arm64:4.12.0-0.nightly-arm64-2022-08-09-214103
Steps:
1. Provide bootMACAddress values in mixed or upper case, for example:
- name: worker-00
  role: worker
  bmc:
    address: ipmi://openshift-qe-039.mgmt.arm.eng.rdu2.redhat.com/
    disableCertificateVerification: true
    username: *** HIDDEN ***
    password: *** HIDDEN ***
  bootMACAddress: a0:36:9F:30:04:B4
  rootDeviceHints:
    deviceName: "/dev/nvme0n1"
  networkConfig:
    interfaces:
    ......
- name: worker-01
  role: worker
  bmc:
    address: ipmi://openshift-qe-040.mgmt.arm.eng.rdu2.redhat.com/
    disableCertificateVerification: true
    username: *** HIDDEN ***
    password: *** HIDDEN ***
  bootMACAddress: A0:36:9F:30:04:C4
  rootDeviceHints:
    deviceName: "/dev/nvme0n1"
  networkConfig:
    interfaces:
2. Launch an IPI installation on Bare Metal with Managed Provisioning Network
3. Wait and check that the worker nodes are created:
oc get nodes
NAME STATUS ROLES AGE VERSION
worker-00.lwanbug2081734.qeclusters.arm.eng.rdu2.redhat.com Ready worker 36m v1.24.0+a9d6306
worker-01.lwanbug2081734.qeclusters.arm.eng.rdu2.redhat.com Ready worker 36m v1.24.0+a9d6306
4. Check the logs of the metal3-dnsmasq container; the lookup now matches an upper-case bootMACAddress:
oc logs metal3-6bb77d9df6-xgjjc -c metal3-dnsmasq
+++ get_provisioning_interface
+++ '[' -n '' ']'
+++ local interface=provisioning
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -qi 00:1B:21:E4:63:30
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -qi 00:1B:21:E4:37:A7
++++ ip -br link show up
++++ grep -i 00:1B:21:E4:37:A7
++++ cut -f 1 -d ' '
+++ interface=enP1p4s0
+++ break
+++ echo enP1p4s0
++ export PROVISIONING_INTERFACE=enP1p4s0
++ PROVISIONING_INTERFACE=enP1p4s0
++ export LISTEN_ALL_INTERFACES=true
++ LISTEN_ALL_INTERFACES=true
++ export IRONIC_PRIVATE_PORT=6388
++ IRONIC_PRIVATE_PORT=6388
++ export IRONIC_INSPECTOR_PRIVATE_PORT=5049
++ IRONIC_INSPECTOR_PRIVATE_PORT=5049
+ export HTTP_PORT=6180
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399
Description of problem:
When provisioning an IPI on Bare Metal cluster with a Provisioning Network, after the bootstrap phase is completed, the workers are powered on via IPMI, and the DHCP+PXE service should be handled by the metal3 operator scheduled on one of the masters. If the MAC address of the provisioning interface (bootMacAddress) is not provided in lower case in the install-config.yaml file, the DHCP service is unavailable. Moreover, the container is reported as "ready" even though it is not.

Version-Release number of selected component (if applicable):
Tested on 4.11.0-0.nightly-arm64-2022-05-04-091042 with OVN Network

How reproducible:
IPI on Bare Metal with Managed Provisioning Network

Steps to Reproduce:
1. Instantiate an IPI on Bare Metal with Managed Provisioning Network and set the bootMacAddress to NOT be lower-case
2. Wait for the bootstrap to conclude
3. The workers won't boot
4. The installation fails

Actual results:
The workers are not installed because they cannot boot via PXE, and the installation fails.

oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
baremetal 4.11.0-0.nightly-arm64-2022-05-04-091042 True False False 59m

oc project openshift-machine-api
oc get pods
metal3-fffcfb6bd-255n9 7/7 Running 0 60m

oc logs metal3-fffcfb6bd-255n9 -c metal3-dnsmasq
+++ get_provisioning_interface
+++ '[' -n '' ']'
+++ local interface=provisioning
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -q 00:1b:21:E4:37:EC
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -q 00:1B:21:E4:3A:B1
+++ for mac in ${PROVISIONING_MACS//,/ }
+++ ip -br link show up
+++ grep -q A0:36:9F:30:03:EE
+++ echo provisioning
++ export PROVISIONING_INTERFACE=provisioning
++ PROVISIONING_INTERFACE=provisioning
++ export LISTEN_ALL_INTERFACES=true
++ LISTEN_ALL_INTERFACES=true
++ export IRONIC_PRIVATE_PORT=6388
++ IRONIC_PRIVATE_PORT=6388
++ export IRONIC_INSPECTOR_PRIVATE_PORT=5049
++ IRONIC_INSPECTOR_PRIVATE_PORT=5049
+ export HTTP_PORT=6180
+ HTTP_PORT=6180
+ export DNSMASQ_EXCEPT_INTERFACE=lo
+ DNSMASQ_EXCEPT_INTERFACE=lo
+ wait_for_interface_or_ip
+ '[' '!' -z '' ']'
+ '[' '!' -z '' ']'
+ echo 'Waiting for provisioning interface to be configured'
Waiting for provisioning interface to be configured
++ ip -br add show scope global up dev provisioning
++ awk '{print $3}'
++ sed -e 's%/.*%%'
++ head -n 1
Device "provisioning" does not exist.
+ export IRONIC_IP=
... Repeated forever ...

ip -br link show up
lo UNKNOWN 00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
enP1p4s0 UP 00:1b:21:e4:3a:b1 <BROADCAST,MULTICAST,UP,LOWER_UP>
enP4p3s0u2u3c2 UNKNOWN 46:6e:f9:36:e3:12 <BROADCAST,MULTICAST,UP,LOWER_UP>
enP2p2s0f0 UP 28:c1:3c:8a:a2:8f <BROADCAST,MULTICAST,UP,LOWER_UP>
enP2p2s0f1 UP 28:c1:3c:8a:a2:90 <BROADCAST,MULTICAST,UP,LOWER_UP>

The provisioning interface is enP1p4s0, which has the IP set to 172.22.0.3. The log searches for 00:1B:21:E4:3A:B1 while the interface reports 00:1b:21:e4:3a:b1, so the case-sensitive grep finds no match.

Expected results:
0. The workers are provisioned
1. The interface lookup by MAC should be case-insensitive
2. Neither the cluster-baremetal-operator nor the metal3-dnsmasq container in the metal3 pod should be reported as ready/available.

Additional info:
This seems to happen because /bin/ironic-common.sh in the metal3-dnsmasq container looks up the provisioning interface by MAC address with a case-sensitive match.

(within the dnsmasq container)
cat /bin/ironic-common.sh
#!/usr/bin/bash
set -euxo pipefail

function get_provisioning_interface() {
    if [ -n "${PROVISIONING_INTERFACE:-}" ]; then
        # don't override the PROVISIONING_INTERFACE if one is provided
        echo ${PROVISIONING_INTERFACE}
        return
    fi

    local interface="provisioning"
    for mac in ${PROVISIONING_MACS//,/ } ; do
        if ip -br link show up | grep -q "$mac"; then
            interface=$(ip -br link show up | grep "$mac" | cut -f 1 -d ' ')
            break
        fi
    done
    echo $interface
}
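
For illustration, a case-insensitive variant of the lookup in /bin/ironic-common.sh could look roughly like the sketch below. It is not the merged patch; it only follows the `grep -qi` behaviour visible in the verification log of this report. The `list_links` wrapper is a hypothetical helper introduced here so the interface listing can be stubbed when trying the function outside the container.

```shell
#!/usr/bin/bash
# Sketch only, not the upstream fix. list_links is a hypothetical helper.

# Wraps the same command the original script runs, so it can be stubbed.
list_links() {
    ip -br link show up
}

get_provisioning_interface() {
    if [ -n "${PROVISIONING_INTERFACE:-}" ]; then
        # don't override the PROVISIONING_INTERFACE if one is provided
        echo "${PROVISIONING_INTERFACE}"
        return
    fi

    local interface="provisioning"
    local macs="${PROVISIONING_MACS:-}"
    local mac match
    for mac in ${macs//,/ }; do
        # -i ignores case, -F treats the MAC as a fixed string (no regex)
        match=$(list_links | grep -iF -- "$mac" | head -n 1 | cut -f 1 -d ' ') || true
        if [ -n "$match" ]; then
            interface="$match"
            break
        fi
    done
    echo "$interface"
}
```

With PROVISIONING_MACS set to the three addresses from the failing log, this variant would resolve to the interface whose MAC is 00:1b:21:e4:3a:b1 instead of falling back to the non-existent "provisioning" device.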