Description of problem:
After rebooting a compute node, the DPDK interfaces are broken because the driver is not loaded for those interfaces.

Before reboot
[
        Port dpdkbond0
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:06:00.1", n_rxq="1"}
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:06:00.0", n_rxq="1"}
        Port br-link0
--
    Bridge br-dpdk0
        fail_mode: standalone
--
        Port br-dpdk0
            Interface br-dpdk0
                type: internal
        Port dpdk2
            Interface dpdk2
                type: dpdk
                options: {dpdk-devargs="0000:82:00.2"}
    Bridge br-int
--
    Bridge br-dpdk1
        fail_mode: standalone
--
        Port dpdk3
            Interface dpdk3
                type: dpdk
                options: {dpdk-devargs="0000:82:00.3"}
        Port br-dpdk1
            Interface br-dpdk1
                type: internal

[root@computedpdksriov-r730-0 heat-admin]# driverctl list-overrides
0000:06:00.0 vfio-pci
0000:06:00.1 vfio-pci
0000:82:00.2 vfio-pci
0000:82:00.3 vfio-pci

After reboot
[root@computedpdksriov-r730-0 heat-admin]# ovs-vsctl show | grep -A1 dpdk
    Bridge br-dpdk0
        fail_mode: standalone
--
        Port dpdk2
            Interface dpdk2
                type: dpdk
                options: {dpdk-devargs="0000:82:00.2"}
                error: "Error attaching device '0000:82:00.2' to DPDK"
        Port br-dpdk0
            Interface br-dpdk0
                type: internal
--
    Bridge br-dpdk1
        fail_mode: standalone
--
        Port dpdk3
            Interface dpdk3
                type: dpdk
                options: {dpdk-devargs="0000:82:00.3"}
                error: "Error attaching device '0000:82:00.3' to DPDK"
        Port br-dpdk1
            Interface br-dpdk1
                type: internal
--
        Port dpdkbond0
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:06:00.1", n_rxq="1"}
                error: "Error attaching device '0000:06:00.1' to DPDK"
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:06:00.0", n_rxq="1"}
                error: "Error attaching device '0000:06:00.0' to DPDK"

[root@computedpdksriov-r730-0 heat-admin]# driverctl list-overrides
driverctl: No overridable devices found. Kernel too old?

If I execute os-net-config, the errors clear and the overrides are back:
[root@computedpdksriov-r730-0 heat-admin]# os-net-config -c /etc/os-net-config/config.json -v --detailed-exit-codes
[root@computedpdksriov-r730-0 heat-admin]# ovs-vsctl show | grep -A1 dpdk
    Bridge br-dpdk0
        fail_mode: standalone
--
        Port dpdk2
            Interface dpdk2
                type: dpdk
                options: {dpdk-devargs="0000:82:00.2"}
        Port br-dpdk0
            Interface br-dpdk0
                type: internal
--
    Bridge br-dpdk1
        fail_mode: standalone
--
        Port dpdk3
            Interface dpdk3
                type: dpdk
                options: {dpdk-devargs="0000:82:00.3"}
        Port br-dpdk1
            Interface br-dpdk1
                type: internal
--
        Port dpdkbond0
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:06:00.1", n_rxq="1"}
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:06:00.0", n_rxq="1"}
    ovs_version: "2.15.4"

[root@computedpdksriov-r730-0 heat-admin]# driverctl list-overrides
0000:06:00.0 vfio-pci
0000:06:00.1 vfio-pci
0000:82:00.2 vfio-pci
0000:82:00.3 vfio-pci

Version-Release number of selected component (if applicable):
RHOS-16.2-RHEL-8-20220210.n.1 (undercloud)

How reproducible:
1. Deploy OSP 16.2. The templates must configure DPDK on one or more NICs.
2. Reboot one compute node.
3. Check whether DPDK is properly configured.

Actual results:
os-net-config is not executed after the reboot, and the DPDK NICs are not properly configured.

Expected results:
os-net-config should be executed, and the DPDK NICs should be properly configured.

Additional info:
I will upload the sos report and the templates used.
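
For step 3 of the reproducer, one way to check the DPDK state is to scan the `ovs-vsctl show` output for the attach errors shown above. The snippet below is only a minimal sketch: the function name `failed_dpdk_devices` is hypothetical, and it assumes the error lines keep the exact wording seen in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OVS or os-net-config): reads
# `ovs-vsctl show` output on stdin and prints the PCI address of every
# DPDK interface that OVS failed to attach.
failed_dpdk_devices() {
    # Assumes error lines of the form seen in this report:
    #   error: "Error attaching device '0000:82:00.2' to DPDK"
    sed -n "s/.*Error attaching device '\([^']*\)' to DPDK.*/\1/p"
}

# Usage on a compute node (needs ovs-vsctl and suitable privileges):
#   ovs-vsctl show | failed_dpdk_devices
```

An empty result would suggest that no interface reported an attach error; it does not by itself prove that traffic is flowing.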
After the reboot, I executed the commands below.

[root@computedpdksriov-r740-0 heat-admin]# lsmod | grep vfio
vfio_iommu_type1       36864  0
vfio                   36864  2 vfio_iommu_type1
[root@computedpdksriov-r740-0 heat-admin]# modprobe vfio-pci
[root@computedpdksriov-r740-0 heat-admin]# lsmod | grep vfio
vfio_pci               61440  0
vfio_virqfd            16384  1 vfio_pci
irqbypass              16384  2 vfio_pci,kvm
vfio_iommu_type1       36864  0
vfio                   36864  3 vfio_iommu_type1,vfio_pci
[root@computedpdksriov-r740-0 heat-admin]# driverctl list-overrides
driverctl: No overridable devices found. Kernel too old?
[root@computedpdksriov-r740-0 heat-admin]# driverctl set-override 0000:af:00.2 vfio-pci
[root@computedpdksriov-r740-0 heat-admin]# sudo ovs-vsctl show
c9cf7aa8-60ef-46ef-9afc-0c322c12ebf3
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-link0
        fail_mode: standalone
        datapath_type: netdev
        Port dpdkbond0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:af:00.2", n_rxq="1"}
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:af:00.3", n_rxq="1"}
                error: "Error attaching device '0000:af:00.3' to DPDK"
        Port br-link0
            tag: 171
            Interface br-link0
                type: internal
    Bridge br-dpdk0
        fail_mode: standalone
        datapath_type: netdev
        Port dpdk2
            Interface dpdk2
                type: dpdk
                options: {dpdk-devargs="0000:3b:00.0"}
                error: "Error attaching device '0000:3b:00.0' to DPDK"
        Port br-dpdk0
            Interface br-dpdk0
                type: internal

**** Conclusion *****
So, when os-net-config is re-run, the commands below are what help fix the issue:
modprobe vfio-pci
driverctl set-override

We need to understand which change made these steps necessary.
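
As a stopgap while the root cause is investigated, the two commands from the conclusion could be wrapped as follows. This is a sketch of the manual workaround only, not the fix that os-net-config or the erratum delivers. The function name `rebind_vfio` and the `APPLY` switch are made up for illustration, and the PCI addresses in the usage line are the ones from this comment.

```shell
#!/usr/bin/env bash
# Hypothetical workaround helper: load vfio-pci and re-apply the driver
# override for each PCI address given as an argument.
# Dry run by default (commands are only printed); set APPLY=1 to run them.
rebind_vfio() {
    local run="echo"
    if [ "${APPLY:-0}" = "1" ]; then
        run=""   # empty prefix: the commands below are executed for real
    fi
    $run modprobe vfio-pci
    local dev
    for dev in "$@"; do
        $run driverctl set-override "$dev" vfio-pci
    done
}

# Usage (as root on the affected compute node):
#   rebind_vfio 0000:af:00.2 0000:af:00.3            # print the commands
#   APPLY=1 rebind_vfio 0000:af:00.2 0000:af:00.3    # execute them
```

The dry-run default is deliberate: rebinding a NIC to vfio-pci detaches it from its kernel driver, so it seems safer to review the command list first.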
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1001