Bug 1914414
| Summary: | SRIOV enablement for Emulex Corporation OneConnect NIC (10df:0720) is not working anymore | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Denis Ollier <dollierp> |
| Component: | Networking | Assignee: | Peng Liu <pliu> |
| Networking sub component: | SR-IOV | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | low | | |
| Priority: | low | CC: | bbennett, dosmith, pliu, zshi |
| Version: | 4.7 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-18 17:29:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Denis Ollier
2021-01-08 18:52:15 UTC
Could you check the dmesg of the node? It looks like a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1875338

Created attachment 1747093
dmesg output from a worker node

I didn't see any obvious error in dmesg; I'm attaching it to the BZ so that you can have a look.

Thanks to Peng Liu, we found out that I had been confused in my analysis by the error messages in the logs.
The interface I'm trying to configure (ens2f0) is an Emulex Corporation OneConnect NIC (Skyhawk) (rev 11) with PCI id 10df:0720:
> Subsystem: Emulex Corporation Device e871
> Flags: bus master, fast devsel, latency 0, IRQ 39, NUMA node 0
> Memory at dec0c000 (64-bit, prefetchable) [size=16K]
> Memory at de7e0000 (64-bit, prefetchable) [size=128K]
> Memory at de7c0000 (64-bit, prefetchable) [size=128K]
> Expansion ROM at dee80000 [disabled] [size=512K]
> Capabilities: [40] Power Management version 3
> Capabilities: [48] MSI-X: Enable+ Count=32 Masked-
> Capabilities: [c0] Express Endpoint, MSI 00
> Capabilities: [b8] Vital Product Data
> Capabilities: [100] Advanced Error Reporting
> Capabilities: [180] Single Root I/O Virtualization (SR-IOV)
> Capabilities: [160] Alternative Routing-ID Interpretation (ARI)
> Capabilities: [168] Device Serial Number 00-10-9b-ff-fe-35-88-b0
> Capabilities: [210] Secondary PCI Express
> Kernel driver in use: be2net
> Kernel modules: be2net
The Intel NIC mentioned in the logs is not the one I am trying to configure; that Intel NIC is not even listed by the `ip link show` command.
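
The vendor and device IDs quoted above can also be read directly from sysfs, which helps confirm which NIC an interface name refers to. The following is a minimal sketch, assuming a Linux node where `/sys/class/net/<iface>/device/vendor` and `.../device` exist for the interface; the function name `pci_id` and the `sysfs_root` parameter are illustrative and not part of any operator tooling.

```python
from pathlib import Path


def pci_id(iface: str, sysfs_root: str = "/sys/class/net") -> str:
    """Return the 'vendor:device' PCI ID (e.g. '10df:0720') of an interface.

    sysfs_root is a parameter only so the function can be pointed at a
    different directory tree; on a real node the default is used.
    """
    dev = Path(sysfs_root) / iface / "device"
    # sysfs stores each ID as a hex string with a '0x' prefix, e.g. '0x10df'.
    vendor = int((dev / "vendor").read_text().strip(), 16)
    device = int((dev / "device").read_text().strip(), 16)
    return f"{vendor:04x}:{device:04x}"
```

For the interface in this report, `pci_id("ens2f0")` would be expected to yield the same ID as the `lspci` output, assuming the interface is present on the node.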
In the SR-IOV operator, we assume there is at least one NIC from the supported vendors. On this node there is one Intel NIC, but its driver was not loaded as expected. As a result, the config daemon discovered no NICs from Intel or Mellanox, and none of the vendor plugins was loaded. With the current logic, the generic plugin does not configure any NICs on the node in this case.

This should be fixed by https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/145

Verified this bug on 4.9.0-202107090514:
1. Add the unsupported adapter with `oc edit cm supported-nic-ids`.
2. Delete the config daemon pod and the webhook pods so that they are recreated.
3. Create a policy for the previously unsupported adapter; the policy is created successfully.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:3759
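
The plugin-selection behavior described in the analysis above can be illustrated with a toy model. This is only a sketch of the logic as the comment describes it, not the operator's actual code: the plugin labels and function names are hypothetical, and the only facts taken as given are the PCI vendor IDs (8086 for Intel, 15b3 for Mellanox, 10df for Emulex).

```python
# Hypothetical mapping from PCI vendor ID to a vendor plugin label.
VENDOR_PLUGINS = {"8086": "intel", "15b3": "mellanox"}


def loaded_vendor_plugins(discovered_vendor_ids):
    """Vendor plugins loaded for the NICs the config daemon discovered."""
    return sorted(
        {VENDOR_PLUGINS[v] for v in discovered_vendor_ids if v in VENDOR_PLUGINS}
    )


def generic_plugin_configures(discovered_vendor_ids):
    """Model of the behavior reported in this bug (before the fix).

    The generic plugin only acts when at least one vendor plugin was
    loaded, so a node whose only discovered NIC is an Emulex (10df)
    gets no SR-IOV configuration at all.
    """
    return bool(loaded_vendor_plugins(discovered_vendor_ids))
```

In this model, a node where only the Emulex NIC is discovered (`["10df"]`) is left unconfigured, while a node that also exposes a working Intel NIC (`["8086", "10df"]`) is configured, which matches the situation reported here where the Intel NIC's driver was not loaded.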