Description of problem:
-----------------------
During OCP deployment, the rootDeviceHints specified in install-config.yaml are not honoured.

Excerpt from install-config.yaml:

      - name: openshift-worker-0-1
        role: worker
        bmc:
          address: redfish://...
          disableCertificateVerification: True
          username: ***
          password: ***
        bootMACAddress: 52:54:00:82:cf:e7
        rootDeviceHints:
          minSizeGigabytes: 40

The node has 6 disks:
---------------------
  Storage:
    Hctl:            0:0:0:0
    Model:           QEMU HARDDISK
    Name:            /dev/sda
    Rotational:      true
    Serial Number:   drive-scsi0-0-0-0
    Size Bytes:      26843545600
    Vendor:          QEMU
    Hctl:            0:0:0:5
    Model:           QEMU HARDDISK
    Name:            /dev/sdb
    Rotational:      true
    Serial Number:   drive-scsi0-0-0-5
    Size Bytes:      10737418240
    Vendor:          QEMU
    Hctl:            0:0:0:4
    Model:           QEMU HARDDISK
    Name:            /dev/sdc
    Rotational:      true
    Serial Number:   drive-scsi0-0-0-4
    Size Bytes:      10737418240
    Vendor:          QEMU
    Hctl:            0:0:0:3
    Model:           QEMU HARDDISK
    Name:            /dev/sdd
    Rotational:      true
    Serial Number:   drive-scsi0-0-0-3
    Size Bytes:      10737418240
    Vendor:          QEMU
    Hctl:            0:0:0:2
    Model:           QEMU HARDDISK
    Name:            /dev/sde
    Rotational:      true
    Serial Number:   drive-scsi0-0-0-2
    Size Bytes:      10737418240
    Vendor:          QEMU
    Hctl:            0:0:0:1
    Model:           QEMU HARDDISK
    Name:            /dev/sdf
    Rotational:      true
    Serial Number:   drive-scsi0-0-0-1
    Size Bytes:      48318382080
    Vendor:          QEMU

Based on the hint, a disk of at least 40 GB has to be chosen.
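For reference, a minimal sketch of which disks satisfy the hint, using the Size Bytes values reported for this node. The helper name `disks_matching_min_size` is hypothetical, and treating a "gigabyte" as 2**30 bytes is an assumption, so the decimal interpretation (10**9 bytes) is checked as well. This is not the provisioning code itself, only the arithmetic behind the expectation.

```python
# Disk sizes in bytes, copied from the Storage listing in this report.
DISKS = {
    "/dev/sda": 26843545600,
    "/dev/sdb": 10737418240,
    "/dev/sdc": 10737418240,
    "/dev/sdd": 10737418240,
    "/dev/sde": 10737418240,
    "/dev/sdf": 48318382080,
}

GIB = 2**30  # assumption: the hint counts binary gigabytes


def disks_matching_min_size(disks, min_size_gigabytes, unit=GIB):
    """Return the sorted names of disks at least min_size_gigabytes large.

    Hypothetical helper for illustration; `unit` is the number of bytes
    counted as one gigabyte.
    """
    threshold = min_size_gigabytes * unit
    return sorted(name for name, size in disks.items() if size >= threshold)


# Binary interpretation (40 * 2**30 bytes).
print(disks_matching_min_size(DISKS, 40))
# Decimal interpretation (40 * 10**9 bytes).
print(disks_matching_min_size(DISKS, 40, unit=10**9))
```

Under either interpretation only /dev/sdf qualifies, so that is the disk the hint should select.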
But when checking the node:

[core@worker-0-0 ~]$ lsblk
NAME                         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                            8:0    0   25G  0 disk
├─sda1                         8:1    0  384M  0 part /boot
├─sda2                         8:2    0  127M  0 part /boot/efi
├─sda3                         8:3    0    1M  0 part
├─sda4                         8:4    0 24.4G  0 part
│ └─coreos-luks-root-nocrypt 253:0    0 24.4G  0 dm   /sysroot
└─sda5                         8:5    0   65M  0 part
sdb                            8:16   0   10G  0 disk
sdc                            8:32   0   10G  0 disk
sdd                            8:48   0   10G  0 disk
sde                            8:64   0   10G  0 disk
sdf                            8:80   0   45G  0 disk

Events for worker's BMH objects:
--------------------------------
Events:
  Type    Reason                  Age   From                         Message
  ----    ------                  ----  ----                         -------
  Normal  Registered              51m   metal3-baremetal-controller  Registered new host
  Normal  BMCAccessValidated      51m   metal3-baremetal-controller  Verified access to BMC
  Normal  InspectionStarted       51m   metal3-baremetal-controller  Hardware inspection started
  Normal  InspectionComplete      46m   metal3-baremetal-controller  Hardware inspection completed
  Normal  ProfileSet              46m   metal3-baremetal-controller  Hardware profile set: unknown
  Normal  ProvisioningStarted     46m   metal3-baremetal-controller  Image provisioning started for http://..../rhcos-46.82.202008260918-0-compressed.x86_64.qcow2
  Normal  ProvisioningError       46m   metal3-baremetal-controller  Image provisioning failed: node 513f836e-e3e6-4b8f-a824-ba7c9077aa84 command status errored: {'type': 'ImageWriteError', 'code': 500, 'message': 'Error writing image to device', 'details': 'Writing image to device /dev/sdb failed with exit code 1. stdout: write_image.sh: Erasing existing GPT and MBR data structures from /dev/sdb\nCreating new GPT entries.\nGPT data structures destroyed! You may now partition the disk using fdisk or\nother utilities.\nwrite_image.sh: Imaging /tmp/rhcos-46.82.202008260918-0-compressed.x86_64.qcow2 to /dev/sdb\n. stderr: 33+0 records in\n33+0 records out\n16896 bytes (17 kB, 16 KiB) copied, 0.00156113 s, 10.8 MB/s\n33+0 records in\n33+0 records out\n16896 bytes (17 kB, 16 KiB) copied, 0.00101555 s, 16.6 MB/s\nqemu-img: /dev/sdb: error while converting host_device: Device is too small\n'}
  Normal  DeprovisioningStarted   46m   metal3-baremetal-controller  Image deprovisioning started
  Normal  DeprovisioningComplete  45m   metal3-baremetal-controller  Image deprovisioning completed
  Normal  ProvisioningStarted     45m   metal3-baremetal-controller  Image provisioning started for http://.../rhcos-46.82.202008260918-0-compressed.x86_64.qcow2
  Normal  ProvisioningComplete    41m   metal3-baremetal-controller  Image provisioning completed for http://.../rhcos-46.82.202008260918-0-compressed.x86_64.qcow2

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
4.6.0-0.nightly-2020-08-31-220837

How reproducible:
-----------------
100%

Steps to Reproduce:
-------------------
1. Deploy OCP 4.6 with rootDeviceHints values set for workers

Actual results:
---------------
Deployment finished, but the image was deployed to the wrong disk

Expected results:
-----------------
Deployment finished and the image was deployed to the correct disk

Additional info:
----------------
Virtual deployment: 3 masters + 2 workers + provisionhost.
Masters with 2 disks (1st - 25 GB, 2nd - 45 GB)
Workers with 6 disks (1st - 25 GB, 2nd-5th - 10 GB, 6th - 45 GB)
According to http://file.emea.redhat.com/~yprokule/SOSReports/OCP/RHBZ-1875745/bmh-openshift-worker-0-0.yml it looks like the rootDeviceHints are not being copied to the worker host definition at all. That's the same problem being fixed for https://bugzilla.redhat.com/show_bug.cgi?id=1871653 so I am going to mark this ticket as a duplicate of the other.

*** This bug has been marked as a duplicate of bug 1871653 ***