Description of problem:

This customer is trying to deploy RHOSP 17.1 on diskless bare metal. Each node boots from a 1 TB LUN on a 3PAR storage array. Introspection completed successfully, but after provisioning the nodes did not boot.

In a remote session with the customer we mounted a RHEL 9.2 ISO and inspected the OS. The SAN device already contained LVM logical volumes, but multipath -ll showed no devices because multipath.conf was missing. After adding a standard multipath.conf the devices appeared, so we regenerated the initramfs with multipath support and rebooted the node, which came back up correctly.

We then copied the patched initramfs to the director, added it together with multipath.conf to the overcloud image, and re-uploaded the image to OpenStack. The customer unprovisioned the nodes and retried:

openstack overcloud node provision --stack overcloud --network-config /home/stack/templates/baremetal_deployment.yaml --output deployed_baremetal_deployment.yaml

The nodes now boot, but provisioning still fails in the /usr/local/sbin/growvols step:

~~~
2023-11-27 09:05:05,291 p=1011676 u=stack n=ansible | 2023-11-27 09:05:05.291000 | 00215a9b-df09-6b1b-6b8e-000000000013 | FATAL | Running /usr/local/sbin/growvols /=500GB /tmp=20GB /var/log=30GB /var/log/audit=5GB /home=400GB /var=100% | osp-ctrl03 | error={"changed": true, "cmd": "/usr/local/sbin/growvols --yes /=500GB /tmp=20GB /var/log=30GB /var/log/audit=5GB /home=400GB /var=100%", "delta": "0:00:00.097199", "end": "2023-11-27 02:06:20.704117", "msg": "non-zero return code", "rc": 1, "start": "2023-11-27 02:06:20.606918", "stderr": "Traceback (most recent call last):\n File \"/usr/local/sbin/growvols\", line 624, in <module>\n sys.exit(main(sys.argv))\n File \"/usr/local/sbin/growvols\", line 524, in main\n devname = find_next_device_name(devices, disk_name, partnum)\n File \"/usr/local/sbin/growvols\", line 338, in find_next_device_name\n raise Exception('Could not find partition naming scheme for %s'\nException: Could not find partition naming scheme for sda", "stderr_lines": ["Traceback (most recent call last):", " File \"/usr/local/sbin/growvols\", line 624, in <module>", " sys.exit(main(sys.argv))", " File \"/usr/local/sbin/growvols\", line 524, in main", " devname = find_next_device_name(devices, disk_name, partnum)", " File \"/usr/local/sbin/growvols\", line 338, in find_next_device_name", " raise Exception('Could not find partition naming scheme for %s'", "Exception: Could not find partition naming scheme for sda"], "stdout": "", "stdout_lines": []}
~~~

Version-Release number of selected component (if applicable):

How reproducible:
In customer environment

Steps to Reproduce:
1.
2.
3.

Actual results:
Deployment fails.

Expected results:
Deployment succeeds.

Additional info:
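For reference, the manual workaround described above was roughly the following (a minimal sketch; the initramfs file name, image path, and use of mpathconf instead of a hand-written multipath.conf are assumptions and must be adapted to the actual kernel version and undercloud layout):

~~~
# On the booted node: enable multipath and regenerate the initramfs with the
# multipath dracut module included
mpathconf --enable --with_multipathd y          # or drop in a standard /etc/multipath.conf by hand
dracut --force --add multipath /boot/initramfs-$(uname -r).img $(uname -r)

# On the director: inject the patched initramfs and multipath.conf into the
# overcloud image and re-upload it (file names and paths are placeholders)
virt-customize -a /home/stack/images/overcloud-hardened-uefi-full.qcow2 \
    --upload multipath.conf:/etc/multipath.conf \
    --upload initramfs-<kver>.img:/boot/initramfs-<kver>.img
openstack overcloud image upload --update-existing --image-path /home/stack/images
~~~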
Moving to diskimage-builder, where the growvols script is maintained
Actually, let's stick with testing changes via patch files for now. While we're debugging SAN support, growvols should be run manually while logged into the machine. This patch should get past the current error, but there may be other issues. Once growvols is patched, please run the following and attach the output:

/usr/local/sbin/growvols --debug --device mpatha /=500GB /tmp=20GB /var/log=30GB /var/log/audit=5GB /home=400GB /var=100%

Once we get a successful run, I can provide instructions to modify the overcloud-hardened-uefi-full.qcow2 image to include this fix.
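A sketch of how the patch could be applied and the debug run captured while logged into the node (the patch file name and log path are hypothetical):

~~~
# copy the patch file to the node, then apply it to the installed script
patch /usr/local/sbin/growvols /root/growvols-san.patch

# run growvols manually against the multipath device and capture the output
/usr/local/sbin/growvols --debug --device mpatha \
    /=500GB /tmp=20GB /var/log=30GB /var/log/audit=5GB /home=400GB /var=100% \
    2>&1 | tee /root/growvols-debug.log
~~~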
Looking at the growvols file in diskimage-builder-3.31.1-17.1.20231013080820.0576fad.el9ost.noarch, there are duplicated lines between L255-290, L338-413, and L593-657. Will this break how the tool works? @hjensas
Re-examining the RPM package (extracting it with rpm2cpio | cpio rather than through Archive Manager), growvols and test_growvols.py compare correctly against the gerrit versions, so the duplicated lines were an artifact of the earlier extraction. Marking as validated against diskimage-builder-3.31.1-17.1.20231013080820.0576fad.el9ost.noarch.
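For reference, the re-check was along these lines (the RPM and gerrit checkout paths are placeholders):

~~~
# extract the package contents without going through Archive Manager
mkdir /tmp/dib-rpm && cd /tmp/dib-rpm
rpm2cpio /path/to/diskimage-builder-3.31.1-17.1.20231013080820.0576fad.el9ost.noarch.rpm | cpio -idmv

# locate the packaged script and its tests, then diff them against the gerrit checkout
find . -name growvols -o -name test_growvols.py
diff -u ./path/to/packaged/growvols /path/to/gerrit/growvols
~~~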
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 17.1.3 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2024:2741
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days