Description of problem:
Deploying OCP Baremetal IPI inside an OpenStack environment requires the RootDeviceHint to be /dev/vdb when a PXE boot is performed, because OSP uses nova rescue with a pxeboot.img attached as /dev/vda. Adding an additional "openstack" hardware profile that specifies the vdb device will enable us to deploy OCP Baremetal IPI inside OpenStack for internal RH training.

Version-Release number of selected component (if applicable):
4.5

How reproducible:
N/A

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Upstream PR: https://github.com/metal3-io/baremetal-operator/pull/486
Downstream PR: https://github.com/openshift/baremetal-operator/pull/62
We were able to validate the BMO change with: 4.5.0-0.nightly-2020-04-29-064431

[cloud-user@provision scripts]$ oc get nodes
NAME       STATUS   ROLES    AGE     VERSION
master-0   Ready    master   54m     v1.18.0-rc.1
master-1   Ready    master   54m     v1.18.0-rc.1
master-2   Ready    master   54m     v1.18.0-rc.1
worker-0   Ready    worker   9m31s   v1.18.0-rc.1
worker-1   Ready    worker   12m     v1.18.0-rc.1

[cloud-user@provision scripts]$ oc get bmh -n openshift-machine-api
NAME       STATUS   PROVISIONING STATUS      CONSUMER              BMC                     HARDWARE PROFILE   ONLINE   ERROR
master-0   OK       externally provisioned   kni1-master-0         ipmi://10.20.0.3:6202                      true
master-1   OK       externally provisioned   kni1-master-1         ipmi://10.20.0.3:6201                      true
master-2   OK       externally provisioned   kni1-master-2         ipmi://10.20.0.3:6203                      true
worker-0   OK       provisioned              kni1-worker-0-542pq   ipmi://10.20.0.3:6200   openstack          true
worker-1   OK       provisioned              kni1-worker-0-d8qgf   ipmi://10.20.0.3:6204   openstack          true

[cloud-user@provision scripts]$ oc describe bmh worker-0 -n openshift-machine-api
Name: worker-0
Namespace: openshift-machine-api
Labels: <none>
Annotations: <none>
API Version: metal3.io/v1alpha1
Kind: BareMetalHost
Metadata:
  Creation Timestamp: 2020-05-01T14:09:54Z
  Finalizers:
    machine.machine.openshift.io
    baremetalhost.metal3.io
  Generation: 2
  Resource Version: 36724
  Self Link: /apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts/worker-0
  UID: 73023dbf-b300-4855-8cbe-253be625c0e8
Spec:
  Bmc:
    Address: ipmi://10.20.0.3:6200
    Credentials Name: worker-0-bmc-secret
  Boot MAC Address: de:ad:be:ef:00:50
  Consumer Ref:
    API Version: machine.openshift.io/v1beta1
    Kind: Machine
    Name: kni1-worker-0-542pq
    Namespace: openshift-machine-api
  Hardware Profile: openstack
  Image:
    Checksum: http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2.md5sum
    URL: http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2
  Online: true
  User Data:
    Name: worker-user-data
    Namespace: openshift-machine-api
Status:
  Error Message:
  Good Credentials:
    Credentials:
      Name: worker-0-bmc-secret
      Namespace: openshift-machine-api
    Credentials Version: 25339
  Hardware:
    Cpu:
      Arch: x86_64
      Clock Megahertz: 2599.996
      Count: 4
      Flags: 3dnowprefetch abm adx aes apic arat arch_perfmon avx avx2 bmi1 bmi2 clflush cmov constant_tsc cpuid cx16 cx8 de ept erms f16c flexpriority fma fpu fsgsbase fxsr hle hypervisor ibpb ibrs invpcid invpcid_single lahf_lm lm mca mce mmx movbe msr mtrr nopl nx pae pat pcid pclmulqdq pdpe1gb pge pni popcnt pse pse36 pti rdrand rdseed rdtscp rep_good rtm sep smap smep ss ssbd sse sse2 sse4_1 sse4_2 ssse3 stibp syscall tpr_shadow tsc tsc_adjust tsc_deadline_timer tsc_known_freq vme vmx vnmi vpid x2apic xsave xsaveopt xtopology
      Model: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
    Firmware:
      Bios:
        Date:
        Vendor:
        Version:
    Hostname: worker-0
    Nics:
      Ip: 172.22.0.46
      Mac: de:ad:be:ef:00:50
      Model: 0x1af4 0x0001
      Name: ens3
      Pxe: true
      Speed Gbps: 0
      Vlan Id: 0
      Ip: 10.20.0.200
      Mac: ba:dc:0f:fe:e0:50
      Model: 0x1af4 0x0001
      Name: ens4
      Pxe: false
      Speed Gbps: 0
      Vlan Id: 0
    Ram Mebibytes: 16384
    Storage:
      Name: /dev/vda
      Rotational: true
      Serial Number:
      Size Bytes: 1048576
      Vendor: 0x1af4
      Name: /dev/vdb
      Rotational: true
      Serial Number:
      Size Bytes: 107374182400
      Vendor: 0x1af4
    System Vendor:
      Manufacturer: Red Hat
      Product Name: OpenStack Compute
      Serial Number: 00000000-0000-0000-0000-0cc47ad2e2c6
  Hardware Profile: openstack
  Last Updated: 2020-05-01T15:12:57Z
  Operation History:
    Deprovision:
      End: <nil>
      Start: <nil>
    Inspect:
      End: 2020-05-01T14:53:19Z
      Start: 2020-05-01T14:51:52Z
    Provision:
      End: 2020-05-01T15:00:55Z
      Start: 2020-05-01T14:56:33Z
    Register:
      End: 2020-05-01T14:51:52Z
      Start: 2020-05-01T14:50:54Z
  Operational Status: OK
  Powered On: true
  Provisioning:
    ID: 48d87691-6ffd-4f0f-b779-ab3aec30c4c0
    Image:
      Checksum: http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2.md5sum
      URL: http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2
    State: provisioned
  Tried Credentials:
    Credentials:
      Name: worker-0-bmc-secret
      Namespace: openshift-machine-api
    Credentials Version: 25339
Events:
  Type    Reason                 Age  From                         Message
  ----    ------                 ---- ----                         -------
  Normal  Registered             21m  metal3-baremetal-controller  Registered new host
  Normal  BMCAccessValidated     21m  metal3-baremetal-controller  Verified access to BMC
  Normal  InspectionStarted      21m  metal3-baremetal-controller  Hardware inspection started
  Normal  InspectionComplete     20m  metal3-baremetal-controller  Hardware inspection completed
  Normal  ProfileSet             20m  metal3-baremetal-controller  Hardware profile set: openstack
  Normal  ProvisioningStarted    16m  metal3-baremetal-controller  Image provisioning started for http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2
  Normal  ProvisioningComplete   12m  metal3-baremetal-controller  Image provisioning completed for http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2

[cloud-user@provision scripts]$ oc describe bmh worker-1 -n openshift-machine-api
Name: worker-1
Namespace: openshift-machine-api
Labels: <none>
Annotations: <none>
API Version: metal3.io/v1alpha1
Kind: BareMetalHost
Metadata:
  Creation Timestamp: 2020-05-01T14:09:54Z
  Finalizers:
    machine.machine.openshift.io
    baremetalhost.metal3.io
  Generation: 2
  Resource Version: 36593
  Self Link: /apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts/worker-1
  UID: edd2315d-8f43-40ca-abf6-71453ca14982
Spec:
  Bmc:
    Address: ipmi://10.20.0.3:6204
    Credentials Name: worker-1-bmc-secret
  Boot MAC Address: de:ad:be:ef:00:51
  Consumer Ref:
    API Version: machine.openshift.io/v1beta1
    Kind: Machine
    Name: kni1-worker-0-d8qgf
    Namespace: openshift-machine-api
  Hardware Profile: openstack
  Image:
    Checksum: http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2.md5sum
    URL: http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2
  Online: true
  User Data:
    Name: worker-user-data
    Namespace: openshift-machine-api
Status:
  Error Message:
  Good Credentials:
    Credentials:
      Name: worker-1-bmc-secret
      Namespace: openshift-machine-api
    Credentials Version: 25344
  Hardware:
    Cpu:
      Arch: x86_64
      Clock Megahertz: 2599.996
      Count: 4
      Flags: 3dnowprefetch abm adx aes apic arat arch_perfmon avx avx2 bmi1 bmi2 clflush cmov constant_tsc cpuid cx16 cx8 de ept erms f16c flexpriority fma fpu fsgsbase fxsr hle hypervisor ibpb ibrs invpcid invpcid_single lahf_lm lm mca mce mmx movbe msr mtrr nopl nx pae pat pcid pclmulqdq pdpe1gb pge pni popcnt pse pse36 pti rdrand rdseed rdtscp rep_good rtm sep smap smep ss ssbd sse sse2 sse4_1 sse4_2 ssse3 stibp syscall tpr_shadow tsc tsc_adjust tsc_deadline_timer tsc_known_freq vme vmx vnmi vpid x2apic xsave xsaveopt xtopology
      Model: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
    Firmware:
      Bios:
        Date:
        Vendor:
        Version:
    Hostname: worker-1
    Nics:
      Ip: 172.22.0.47
      Mac: de:ad:be:ef:00:51
      Model: 0x1af4 0x0001
      Name: ens3
      Pxe: true
      Speed Gbps: 0
      Vlan Id: 0
      Ip: 10.20.0.201
      Mac: ba:dc:0f:fe:e0:51
      Model: 0x1af4 0x0001
      Name: ens4
      Pxe: false
      Speed Gbps: 0
      Vlan Id: 0
    Ram Mebibytes: 16384
    Storage:
      Name: /dev/vda
      Rotational: true
      Serial Number:
      Size Bytes: 1048576
      Vendor: 0x1af4
      Name: /dev/vdb
      Rotational: true
      Serial Number:
      Size Bytes: 107374182400
      Vendor: 0x1af4
    System Vendor:
      Manufacturer: Red Hat
      Product Name: OpenStack Compute
      Serial Number: 00000000-0000-0000-0000-0cc47ad2e2c6
  Hardware Profile: openstack
  Last Updated: 2020-05-01T15:12:33Z
  Operation History:
    Deprovision:
      End: <nil>
      Start: <nil>
    Inspect:
      End: 2020-05-01T14:52:59Z
      Start: 2020-05-01T14:51:27Z
    Provision:
      End: 2020-05-01T14:58:30Z
      Start: 2020-05-01T14:53:48Z
    Register:
      End: 2020-05-01T14:51:27Z
      Start: 2020-05-01T14:50:54Z
  Operational Status: OK
  Powered On: true
  Provisioning:
    ID: 318a6d28-4d54-4aab-82c6-742c941e2eae
    Image:
      Checksum: http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2.md5sum
      URL: http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2
    State: provisioned
  Tried Credentials:
    Credentials:
      Name: worker-1-bmc-secret
      Namespace: openshift-machine-api
    Credentials Version: 25344
Events:
  Type    Reason                 Age  From                         Message
  ----    ------                 ---- ----                         -------
  Normal  Registered             22m  metal3-baremetal-controller  Registered new host
  Normal  BMCAccessValidated     22m  metal3-baremetal-controller  Verified access to BMC
  Normal  InspectionStarted      21m  metal3-baremetal-controller  Hardware inspection started
  Normal  InspectionComplete     20m  metal3-baremetal-controller  Hardware inspection completed
  Normal  ProfileSet             20m  metal3-baremetal-controller  Hardware profile set: openstack
  Normal  ProvisioningStarted    19m  metal3-baremetal-controller  Image provisioning started for http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2
  Normal  ProvisioningComplete   15m  metal3-baremetal-controller  Image provisioning completed for http://172.22.0.3:6180/images/rhcos-44.81.202004250133-0-openstack.x86_64.qcow2/rhcos-44.81.202004250133-0-compressed.x86_64.qcow2
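
To illustrate why the device-name hint matters for these hosts, here is a minimal Go sketch that picks a root disk by name from the two disks reported under Storage above (/dev/vda at 1048576 bytes, /dev/vdb at 107374182400 bytes). The Disk type and selectByName function are hypothetical names made up for this example; this is not how ironic or the BMO actually evaluates root device hints, only a picture of the outcome.

```go
package main

import (
	"errors"
	"fmt"
)

// Disk is a hypothetical, trimmed-down view of one Storage entry
// from the BareMetalHost status shown above.
type Disk struct {
	Name      string
	SizeBytes int64
}

// selectByName returns the disk whose name equals the hint.
// It is an illustration only, not the real hint-matching logic.
func selectByName(disks []Disk, hint string) (Disk, error) {
	for _, d := range disks {
		if d.Name == hint {
			return d, nil
		}
	}
	return Disk{}, errors.New("no disk matches hint " + hint)
}

func main() {
	// Values copied from the inspection data of worker-0 and worker-1.
	disks := []Disk{
		{Name: "/dev/vda", SizeBytes: 1048576},      // the small pxeboot.img device
		{Name: "/dev/vdb", SizeBytes: 107374182400}, // the disk the image should land on
	}

	root, err := selectByName(disks, "/dev/vdb")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s %d GiB\n", root.Name, root.SizeBytes>>30)
}
```

Without a hint, the first disk in the list would be the 1 MiB /dev/vda, which is too small to hold the RHCOS image.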
Testing this out requires the BMO in the installer binary to also be patched, because the BMO vendored there is maintained separately. I have been using the following method on my provisioning node. Note that I grab an upstream Go because the version that comes with RHEL8 is not recent enough to compile the baremetal installer.

sudo yum install go
wget -c https://dl.google.com/go/go1.13.10.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go1.13.10.linux-amd64.tar.gz
export PATH=/usr/local/go/bin:$PATH
export PATH=$PATH:$HOME/scripts
export GOPATH=/home/cloud-user/go
mkdir -p $HOME/go/src/github.com/openshift
cd $HOME/go/src/github.com/openshift
git clone --single-branch --branch release-4.5 https://github.com/openshift/installer.git
cd installer/
vi vendor/github.com/metal3-io/baremetal-operator/pkg/hardware/profile.go
sudo yum -y install libvirt-devel
TAGS="baremetal libvirt" hack/build.sh
cp bin/openshift-install $HOME/scripts/openshift-baremetal-install
Add the following to the profiles section of vendor/github.com/metal3-io/baremetal-operator/pkg/hardware/profile.go:

    profiles["openstack"] = Profile{
        Name: "openstack",
        RootDeviceHints: RootDeviceHints{
            DeviceName: "/dev/vdb",
        },
        RootGB:  10,
        LocalGB: 50,
        CPUArch: "x86_64",
    }
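
For readers who want to see the snippet in a compilable context, the following is a self-contained Go sketch. The Profile and RootDeviceHints types are re-declared locally with only the fields used in the snippet above, and lookupProfile is a hypothetical helper written for this example; the real definitions in profile.go may carry more fields and differ in detail.

```go
package main

import "fmt"

// RootDeviceHints is a local stand-in holding only the field used above.
type RootDeviceHints struct {
	DeviceName string
}

// Profile is a local stand-in for the hardware profile type,
// limited to the fields that appear in the snippet.
type Profile struct {
	Name            string
	RootDeviceHints RootDeviceHints
	RootGB          int
	LocalGB         int
	CPUArch         string
}

// profiles maps a profile name to its settings.
var profiles = make(map[string]Profile)

func init() {
	// The entry proposed in this bug.
	profiles["openstack"] = Profile{
		Name: "openstack",
		RootDeviceHints: RootDeviceHints{
			DeviceName: "/dev/vdb",
		},
		RootGB:  10,
		LocalGB: 50,
		CPUArch: "x86_64",
	}
}

// lookupProfile is a hypothetical helper that returns the named
// profile or an error if it has not been registered.
func lookupProfile(name string) (Profile, error) {
	p, ok := profiles[name]
	if !ok {
		return Profile{}, fmt.Errorf("no hardware profile named %q", name)
	}
	return p, nil
}

func main() {
	p, err := lookupProfile("openstack")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("root device:", p.RootDeviceHints.DeviceName)
}
```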
The new profile `openstack` cannot be used during initial deployment (this is tracked in 1826475), but it is available to the baremetal operator when adding worker nodes as a day 2 operation (when the cluster is already up and running).

Testing:

1. Deployed OCP 4.5.0-0.nightly-2020-05-06-003431 with 3 masters + 2 nodes as VMs.

2. Worker VMs have sda, vda and vdb disks, but the initially deployed workers use /dev/sda.

3. Created a BMH resource for an extra worker node and set the profile to `openstack`:

---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  creationTimestamp: null
  name: openshift-worker-0-2
  namespace: openshift-machine-api
spec:
  bmc:
    address: ...
    credentialsName: openshift-worker-0-2-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: 52:54:00:31:02:e0
  hardwareProfile: openstack
  image:
    checksum: http://...
    url: http://...
  online: false
  userData:
    name: worker-user-data
    namespace: openshift-machine-api

oc get bmh openshift-worker-0-2 -n openshift-machine-api -o json | jq '.status.hardwareProfile'
"openstack"

4. After introspection finished, scaled up the machineset for workers.

5. Checked the node's properties in ironic:

properties:
  capabilities: cpu_vt:true,cpu_aes:true,cpu_hugepages:true,cpu_hugepages_1g:true
  cpu_arch: x86_64
  cpus: '8'
  local_gb: 50
  memory_mb: '16384'
  root_device:
    name: /dev/vdb

6. Launched an app to make sure pods can be scheduled to the new node (worker-0-2):

oc get po -o wide
NAME          READY   STATUS      RESTARTS   AGE    IP            NODE         NOMINATED NODE   READINESS GATES
yp-1-deploy   0/1     Completed   0          101s   10.129.2.4    worker-0-2   <none>           <none>
yp-1-k4kfp    1/1     Running     0          46s    10.128.2.4    worker-0-0   <none>           <none>
yp-1-st42n    1/1     Running     0          98s    10.129.2.5    worker-0-2   <none>           <none>
yp-1-zmzdx    1/1     Running     0          46s    10.131.0.11   worker-0-1   <none>           <none>

Ben, please provide a doc text for this bz stating that this profile cannot be used during initial cluster deployment.
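
The jq check in step 3 can also be done programmatically. Below is a minimal Go sketch that reads the same `.status.hardwareProfile` field from BareMetalHost JSON. The bmhStatus type and hardwareProfile function are hypothetical names for this example, and the sample document is a hand-written fragment containing only the field of interest, not real cluster output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bmhStatus captures only the field inspected in step 3;
// every other part of the BareMetalHost JSON is ignored.
type bmhStatus struct {
	Status struct {
		HardwareProfile string `json:"hardwareProfile"`
	} `json:"status"`
}

// hardwareProfile extracts .status.hardwareProfile from the JSON
// produced by "oc get bmh <name> -o json".
func hardwareProfile(raw []byte) (string, error) {
	var h bmhStatus
	if err := json.Unmarshal(raw, &h); err != nil {
		return "", err
	}
	return h.Status.HardwareProfile, nil
}

func main() {
	// Hand-written sample, reduced to the one field being checked.
	sample := []byte(`{"status": {"hardwareProfile": "openstack"}}`)

	profile, err := hardwareProfile(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(profile)
}
```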
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409