Description of problem:

The client is trying to build a lab and wants to deploy Swift on a second disk of the controller nodes. During overcloud deployment, this sometimes fails with:

    2023-11-24 13:04:48.702263 | 566fdacf-01b6-add0-8052-000000001dbd | FATAL | Format SwiftRawDisks | controller-22 | item=sdb | error={"ansible_loop_var": "item", "changed": false, "cmd": "/sbin/mkfs.xfs -f -f -i size=1024 /dev/sdb", "item": "sdb", "msg": "mkfs.xfs: cannot open /dev/sdb: Device or resource busy", "rc": 1, "stderr": "mkfs.xfs: cannot open /dev/sdb: Device or resource busy\n", "stderr_lines": ["mkfs.xfs: cannot open /dev/sdb: Device or resource busy"], "stdout": "", "stdout_lines": []}

The cause is that, during introspection, sdb is sometimes the OS disk instead of sda:

    (undercloud) [stack@director-02 ~]$ openstack baremetal introspection data save controller-22 | jq .inventory.disks
    [
      {
        "name": "/dev/sda",
        "model": "QEMU HARDDISK",
        "size": 3220151730176,
        "rotational": true,
        "wwn": null,
        "serial": "165ddab4-390d-4711-98d7-dc71e08e8ca2",
        "vendor": "QEMU",
        "wwn_with_extension": null,
        "wwn_vendor_extension": null,
        "hctl": "0:0:0:1",
        "by_path": "/dev/disk/by-path/pci-0000:08:00.0-scsi-0:0:0:1"
      },
      {
        "name": "/dev/sdb",
        "model": "QEMU HARDDISK",
        "size": 1073741824000,
        "rotational": true,
        "wwn": null,
        "serial": "d285f75b-f180-4d1e-b35b-3d4f13e62c42",
        "vendor": "QEMU",
        "wwn_with_extension": null,
        "wwn_vendor_extension": null,
        "hctl": "0:0:0:0",
        "by_path": "/dev/disk/by-path/pci-0000:08:00.0-scsi-0:0:0:0"
      }
    ]

The issue is not with using root_serial at the baremetal level: the client is doing that, and the OS does get installed on the right disk each time. It is the device labeling (sda/sdb) that is causing issues:

    (undercloud) [stack@director-02 ~]$ openstack baremetal node show controller-22 -c properties -f value
    {'cpus': '24', 'memory_mb': '73728', 'local_gb': '999', 'cpu_arch': 'x86_64', 'capabilities': 'boot_option:local,node:controller-1,profile:control,cpu_aes:true,cpu_hugepages:true,cpu_hugepages_1g:true', 'root_device': {'serial': 'd285f75b-f180-4d1e-b35b-3d4f13e62c42'}}

In his templates he tries to deploy with this (as our doc says):

    parameter_defaults:
      SwiftMountCheck: true
      SwiftRawDisks: {"sdb": {}}
      SwiftUseLocalDir: false

I found a parameter that should be able to help us here, but I am not able to make it work: SwiftUseNodeDataLookup

    https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.1/html/overcloud_parameters/ref_object-storage-swift-parameters_overcloud_parameters#doc-wrapper

I believe this feature never worked as intended because of a typo in a sed command in /usr/share/openstack-tripleo-heat-templates/deployment/swift/swift-storage-container-puppet.yaml:

    hiera -c /etc/puppet/hiera.yaml swift::storage::disks::args | sed =e 's/=>/:/g'

('sed =e' doesn't work; a new patch has been submitted to fix this typo.)

So I built a lab to replicate this issue (where I manually fixed the sed command, by the way).
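For reference, here is a minimal Python sketch of what the corrected expression (sed -e 's/=>/:/g') is presumably meant to do: turn the Ruby-style hash printed by hiera into something that parses as JSON. The sample input is an assumption about what the hiera lookup prints, not captured output, and the function name is hypothetical.

```python
import json


def ruby_hash_to_json(text: str) -> dict:
    """Mirror sed -e 's/=>/:/g': replace every '=>' with ':' and parse the result as JSON."""
    # Naive by design, like the sed expression: a value containing '=>' would be altered too.
    return json.loads(text.replace("=>", ":"))


# Assumed shape of the hiera output for swift::storage::disks::args (not captured output).
sample = '{"pci-0000:08:00.0"=>{"base_dir"=>"/dev/disk/by-path/"}}'
disks = ruby_hash_to_json(sample)
print(disks)
```

With the '=e' typo, sed rejects the command line, so this substitution never happens and the lookup result is never converted.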
I put the following in my templates:

    resource_registry:
      OS::TripleO::ControllerExtraConfigPre: /usr/share/openstack-tripleo-heat-templates/puppet/extraconfig/pre_deploy/per_node.yaml

    parameter_defaults:
      SwiftMountCheck: true
      SwiftUseLocalDir: false
      SwiftUseNodeDataLookup: true
      # Use NodeDataLookup for disk devices due to non-persistent disk names
      NodeDataLookup: |
        {
          "3043894b-6a6f-48ed-b359-1275778779a7": {
            "swift::storage::disks::args": {
              "pci-0000:08:00.0": {
                "base_dir": "/dev/disk/by-path/"
              }
            },
            "tripleo::profile::base::swift::ringbuilder::raw_disks": [
              ":%PORT%/pci-0000:08:00.0"
            ]
          },
          "2b9b218f-2d74-4ff4-a8d0-8538c77431c0": {
            "swift::storage::disks::args": {
              "pci-0000:08:00.0": {
                "base_dir": "/dev/disk/by-path/"
              }
            },
            "tripleo::profile::base::swift::ringbuilder::raw_disks": [
              ":%PORT%/pci-0000:08:00.0"
            ]
          },
          "ada67792-e5d5-4380-b62c-57240cf53749": {
            "swift::storage::disks::args": {
              "pci-0000:08:00.0": {
                "base_dir": "/dev/disk/by-path/"
              }
            },
            "tripleo::profile::base::swift::ringbuilder::raw_disks": [
              ":%PORT%/pci-0000:08:00.0"
            ]
          }
        }

When I deploy with this I get:

    <13>Dec 4 16:46:51 puppet-user: Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Error while evaluating a Resource Statement, Duplicate declaration: Ring_object_device[0000:08:00.0] is already declared at (file: /etc/puppet/modules/tripleo/manifests/profile/base/swift/add_devices.pp, line: 47); cannot redeclare (file: /etc/puppet/modules/tripleo/manifests/profile/base/swift/add_devices.pp, line: 47) (file: /etc/puppet/modules/tripleo/manifests/profile/base/swift/add_devices.pp, line: 47, column: 3) (file: /etc/puppet/modules/tripleo/manifests/profile/base/swift/ringbuilder.pp, line: 130) on node controller-0.redhat.local

I tried different options, such as by-id with the serial of the disk instead, but the name gets truncated oddly (only the last part is kept, apparently because of the '-' characters) and it fails with the exact same error as above (duplicate declaration).
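Since the NodeDataLookup value repeats the same entry for every node, here is a minimal sketch of generating it rather than writing it by hand. The helper name and its parameters are hypothetical; the keys and values are copied from the templates in this report. This only reproduces the configuration used in the lab, which still hits the duplicate declaration error, so it is not a fix.

```python
import json


def build_node_data_lookup(node_ids, disk_name, base_dir="/dev/disk/by-path/"):
    """Build the per-node hieradata mapping used as the NodeDataLookup value."""
    lookup = {}
    for node_id in node_ids:
        lookup[node_id] = {
            # Disk definition passed to swift::storage::disks
            "swift::storage::disks::args": {disk_name: {"base_dir": base_dir}},
            # Device entry handed to the ring builder
            "tripleo::profile::base::swift::ringbuilder::raw_disks": [
                f":%PORT%/{disk_name}"
            ],
        }
    return lookup


# Node identifiers taken from the templates in this report.
node_ids = [
    "3043894b-6a6f-48ed-b359-1275778779a7",
    "2b9b218f-2d74-4ff4-a8d0-8538c77431c0",
    "ada67792-e5d5-4380-b62c-57240cf53749",
]
node_data = build_node_data_lookup(node_ids, "pci-0000:08:00.0")
print(json.dumps(node_data, indent=2))
```

The printed JSON can be pasted under "NodeDataLookup: |" with the appropriate indentation.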
I have my lab available if you want to test something out. Please reach out to me.

Version-Release number of selected component (if applicable):
OSP17.1

How reproducible:
Random

Steps to Reproduce:
1. Introspect the node; if you are unlucky, sdb will be the OS disk.
2. Deploy Swift on sdb; this fails because the disk is already in use.

Actual results:
Failure to deploy Swift on every deployment.

Expected results:
A way to specify to Swift which disk to deploy to.

Additional info:
I have a lab to test stuff out.
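To check step 1 quickly on a given node, here is a minimal sketch that compares the introspection disk list with the root_device serial and reports which device name the OS disk received. The data is a trimmed copy of the controller-22 output in the description; the function name is hypothetical.

```python
import json

# Trimmed copy of the introspection output for controller-22 from the description.
DISKS_JSON = """
[
  {"name": "/dev/sda", "size": 3220151730176,
   "serial": "165ddab4-390d-4711-98d7-dc71e08e8ca2",
   "by_path": "/dev/disk/by-path/pci-0000:08:00.0-scsi-0:0:0:1"},
  {"name": "/dev/sdb", "size": 1073741824000,
   "serial": "d285f75b-f180-4d1e-b35b-3d4f13e62c42",
   "by_path": "/dev/disk/by-path/pci-0000:08:00.0-scsi-0:0:0:0"}
]
"""

# Serial from the node's root_device property.
ROOT_SERIAL = "d285f75b-f180-4d1e-b35b-3d4f13e62c42"


def split_disks(disks, root_serial):
    """Return (root disks, other disks) based on the root_device serial."""
    root = [d for d in disks if d["serial"] == root_serial]
    others = [d for d in disks if d["serial"] != root_serial]
    return root, others


root, others = split_disks(json.loads(DISKS_JSON), ROOT_SERIAL)
for disk in root:
    print("OS disk:", disk["name"], disk["by_path"])
for disk in others:
    print("Not the OS disk:", disk["name"], disk["by_path"])
```

On this node the OS disk is /dev/sdb, which is why SwiftRawDisks: {"sdb": {}} ends up pointing at a disk that is already in use.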