Description of problem:
Deleting an overcloud node failed and caused loss of connectivity on 20 compute nodes.

* In the os-collect-config logs we noticed that nic1 is being mapped to br-XXXXX instead of the correct interface[1].

We ran os-net-config manually, but nic1 was again mapped to br-XXXXX.

Because the NIC was not being mapped correctly, we proposed a workaround of creating a mapping.yaml, which worked on one compute node. However, this is manual work for 20 compute nodes, so we looked for other possible workarounds.

We found that the udev rules differ between good and affected compute nodes, and that the affected compute nodes have an entry for "br-XXXXXX"[3].

We moved /etc/udev/rules.d/70-persistent-net.rules aside and rebooted one compute node, which worked: /etc/udev/rules.d/70-persistent-net.rules was recreated[3] and connectivity was restored after restarting the network and openvswitch services.

This environment was recently upgraded from 8 to 10.

Version-Release number of selected component (if applicable):
OSP10

How reproducible:
Unsure

Steps to Reproduce:
1. Delete an overcloud node
2.
3.

Actual results:
NIC mapping changed, connectivity lost, node delete failed

Expected results:
Node deleted only

Additional info:
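Since the check had to be repeated across 20 compute nodes, the udev comparison described above could be scripted. The following is a minimal sketch, assuming the usual udev rule syntax in 70-persistent-net.rules (`ATTR{address}=="..."` as the match, `NAME="..."` as the assignment). The function name, the `br-` prefix default, and the sample rules are hypothetical and are not taken from the affected nodes.

```python
import re

# NAME="..." is an assignment in udev syntax; NAME=="..." would be a match
# key and is deliberately not captured by this pattern.
NAME_RE = re.compile(r'\bNAME="([^"]*)"')
MAC_RE = re.compile(r'ATTR\{address\}=="([^"]*)"')

# Hypothetical sample content, for illustration only.
SAMPLE_RULES = '''\
# sample persistent-net rules (hypothetical)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:aa:bb:01", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:aa:bb:01", ATTR{type}=="1", KERNEL=="eth*", NAME="br-example"
'''


def find_bridge_entries(rules_text, prefix="br-"):
    """Return (line_number, mac, name) for each rule that assigns a
    device name starting with `prefix`."""
    hits = []
    for lineno, line in enumerate(rules_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        name = NAME_RE.search(stripped)
        if name and name.group(1).startswith(prefix):
            mac = MAC_RE.search(stripped)
            hits.append((lineno, mac.group(1) if mac else None, name.group(1)))
    return hits


if __name__ == "__main__":
    for lineno, mac, name in find_bridge_entries(SAMPLE_RULES):
        print(f"line {lineno}: MAC {mac} is pinned to {name}")
```

On a real node the text passed to `find_bridge_entries` would be the contents of /etc/udev/rules.d/70-persistent-net.rules. Any hit marks a node where a physical NIC's MAC address has been pinned to a bridge name.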
Angela - it's interesting that this workaround also came up with Andreas on another bug, see https://bugzilla.redhat.com/show_bug.cgi?id=1760806. We think it's a reasonable workaround to prevent cloud-init from overwriting the config.
It seems that a support exception isn't needed in this case - it's a workaround for cloud-init behaviour, but I'm not entirely clear in which cases we require an SE.
Marking this bug as closed based on the above workaround.