Bug 1915299
| Summary: | os-net-config fails to re-provision networking config on compute node with DPDK interfaces mapped to numbered interfaces | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Alex Stupnikov <astupnik> |
| Component: | os-net-config | Assignee: | Dan Sneddon <dsneddon> |
| Status: | CLOSED ERRATA | QA Contact: | Paras Babbar <pbabbar> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 13.0 (Queens) | CC: | bfournie, dsneddon, fiezzi, hbrock, jslagle, mburns, pbabbar, pweeks, sbaker |
| Target Milestone: | z6 | Keywords: | Triaged |
| Target Release: | 16.1 (Train on RHEL 8.2) | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | os-net-config-11.3.2-1.20210406083710.f49ab16.el8 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-05-26 13:50:32 UTC | Type: | Bug |
Description
Alex Stupnikov
2021-01-12 12:12:26 UTC
Looking at the attached support case, I see that the NICs are not being detected correctly. The NICs p2p1 and p2p2 are being detected twice, so the numbered NIC ordering skips nic6 and nic8, which would be mapped to p2p1 and p2p2; however, those interfaces have already been assigned (to nic5 and nic7):

Jan 11 12:03:14 cpt0-dpdk-dell-tovb os-collect-config: [2021/01/11 11:58:07 AM] [INFO] Active nics are ['em1', 'em2', 'p1p1', 'p1p2', 'p2p1', 'p2p1', 'p2p2', 'p2p2', 'p3p1', 'p3p1', 'p3p2', 'p3p2']
Jan 11 12:03:14 cpt0-dpdk-dell-tovb os-collect-config: [2021/01/11 11:58:07 AM] [INFO] nic2 mapped to: em2
Jan 11 12:03:14 cpt0-dpdk-dell-tovb os-collect-config: [2021/01/11 11:58:07 AM] [INFO] nic3 mapped to: p1p1
Jan 11 12:03:14 cpt0-dpdk-dell-tovb os-collect-config: [2021/01/11 11:58:07 AM] [INFO] nic4 mapped to: p1p2
Jan 11 12:03:14 cpt0-dpdk-dell-tovb os-collect-config: [2021/01/11 11:58:07 AM] [INFO] nic7 mapped to: p2p2
Jan 11 12:03:14 cpt0-dpdk-dell-tovb os-collect-config: [2021/01/11 11:58:07 AM] [INFO] nic5 mapped to: p2p1
Jan 11 12:03:14 cpt0-dpdk-dell-tovb os-collect-config: [2021/01/11 11:58:07 AM] [INFO] nic1 mapped to: em1
Jan 11 12:03:14 cpt0-dpdk-dell-tovb os-collect-config: [2021/01/11 11:58:07 AM] [INFO] nic11 mapped to: p3p2
Jan 11 12:03:14 cpt0-dpdk-dell-tovb os-collect-config: [2021/01/11 11:58:07 AM] [INFO] nic9 mapped to: p3p1

In order to troubleshoot this, I need to see the NIC config templates that are being used in the stack update, as well as more information about what changes were made manually. What was the goal of the manual changes? What changes were made to the NIC config templates (or network environment files) before running a stack update with NetworkDeployActions set to ["CREATE","UPDATE"]?

I think I have discovered where the bug lies here. When os-net-config runs for the first time, the DPDK NICs have no entry in /sys/net. Since the NICs are not present there, we look at the DPDK mapping and add the NICs to the list of active NICs.
When you made the LACP change and updated the stack, the DPDK NICs would have been active and would have had an entry in /sys/net. The NICs were added to the list of active NICs, but the DPDK mapping then added those same NICs to the list a second time. To fix this we probably have to make sure we only add each DPDK NIC to the list once.

I can file an upstream bug and patch, but I don't know whether, or how quickly, the change would be made in OSP 13. It is probably best to use the following workaround instead.

My recommendation is to use real NIC names in the computeDPDK.yaml template. If the nodes do not all have the same NIC name configuration, then a mapping will have to be provided. See the file firstboot/os-net-config-mappings.yaml in the openstack-tripleo-heat-templates directory and the associated documentation for more information.

Thank you so much Dan! We will try to explain the available options to the customer.

Dan, can you fill in the Fixed In Version field and link to the patch? I'll set tags appropriate for 16.1.5.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2097
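
For readers following the analysis in this report, the numbering skew and the proposed "add each NIC only once" fix can be illustrated with a small Python sketch. This is not the os-net-config source or the upstream patch. The function names `number_nics` and `dedupe_preserving_order` are hypothetical, and the skipping behaviour is an assumption modelled only on the log lines quoted in this report (a repeated name consumes an alias number without being assigned to it).

```python
def number_nics(active_nics):
    """Assign nicN aliases by list position (hypothetical model, not os-net-config code).

    Assumption taken from the log in this report: when a name appears a
    second time, its position still consumes an alias number, but no
    alias is assigned for it.
    """
    mapping = {}
    seen = set()
    for position, name in enumerate(active_nics, start=1):
        if name in seen:
            continue  # alias number is used up, nothing is mapped
        seen.add(name)
        mapping[f"nic{position}"] = name
    return mapping


def dedupe_preserving_order(names):
    """Drop repeated names while keeping first-seen order."""
    return list(dict.fromkeys(names))


# Active NIC list copied from the log in this report.
logged = ['em1', 'em2', 'p1p1', 'p1p2', 'p2p1', 'p2p1',
          'p2p2', 'p2p2', 'p3p1', 'p3p1', 'p3p2', 'p3p2']

skewed = number_nics(logged)                          # behaviour seen in the log
fixed = number_nics(dedupe_preserving_order(logged))  # each NIC listed once

for alias in sorted(fixed, key=lambda a: int(a[3:])):
    print(alias, "skewed:", skewed.get(alias), "deduplicated:", fixed[alias])
```

With the duplicated list, the sketch leaves nic6 and nic8 unassigned and pushes p2p2 to nic7, matching the log. With the deduplicated list the aliases are contiguous, so a template that refers to nic6 resolves to p2p2. Using real NIC names in the template, as recommended above, sidesteps the numbering entirely.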