Bug 2056340
| Summary: | [4.8] SRIOV exclusive pooling | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | zenghui.shi <zshi> |
| Component: | Networking | Assignee: | Balazs Nemeth <bnemeth> |
| Networking sub component: | SR-IOV | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | ddelcian, dosmith, zshi, zzhao |
| Version: | 4.6 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.8.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 2056339 | | |
| Clones: | 2056342 (view as bug list) | Environment: | |
| Last Closed: | 2022-08-09 12:52:44 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 2056339 | | |
| Bug Blocks: | 2056342 | | |
Comment 1
Balazs Nemeth
2022-06-29 09:42:45 UTC
We need to disable the webhook first, as mentioned in the first step above: edit `sriovoperatorconfigs.sriovnetwork.openshift.io` and set `enableOperatorWebhook: false`.
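For reference, a minimal sketch of that edit, assuming the default `SriovOperatorConfig` instance is named `default` (the report does not show the exact command used):

```
# Sketch only: switch off the operator webhook on the assumed "default" instance.
oc patch sriovoperatorconfigs.sriovnetwork.openshift.io default \
  -n openshift-sriov-network-operator \
  --type=merge \
  -p '{"spec": {"enableOperatorWebhook": false}}'

# Confirm the field is now false.
oc get sriovoperatorconfigs.sriovnetwork.openshift.io default \
  -n openshift-sriov-network-operator \
  -o jsonpath='{.spec.enableOperatorWebhook}'
```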
Verified this bug on 4.8.0-202207180915:

```
# oc get csv -n openshift-sriov-network-operator
NAME                                        DISPLAY                   VERSION              REPLACES                                    PHASE
sriov-network-operator.4.8.0-202207180915   SR-IOV Network Operator   4.8.0-202207180915   sriov-network-operator.4.8.0-202207071636   Succeeded
```

with these steps:

1. Disable the webhook by editing `sriovoperatorconfigs.sriovnetwork.openshift.io` and setting `enableOperatorWebhook: false`.

2. Create two policies that select the same PF, e.g.:

```
# cat intel-dpdk.yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: intel-dpdk
  namespace: openshift-sriov-network-operator
spec:
  deviceType: vfio-pci
  mtu: 1700
  nicSelector:
    deviceID: "158b"
    pfNames:
    - ens1f1
    rootDevices:
    - '0000:3b:00.1'
    vendor: '8086'
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: 'true'
  numVfs: 2
  priority: 99
  resourceName: inteldpdk

# cat intel-dpdk.yaml_2
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: intel-dpdk3
  namespace: openshift-sriov-network-operator
spec:
  deviceType: vfio-pci
  nicSelector:
    deviceID: "158b"
    pfNames:
    - ens1f1
    rootDevices:
    - '0000:3b:00.1'
    vendor: '8086'
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: 'true'
  numVfs: 2
  priority: 99
  resourceName: inteldpdk3
```

3. Check the device plugin logs:

```
# oc logs sriov-device-plugin-8w7sm -n openshift-sriov-network-operator
I0719 07:09:40.565241 1 manager.go:112] number of config: 2
I0719 07:09:40.565247 1 manager.go:116]
I0719 07:09:40.565252 1 manager.go:117] Creating new ResourcePool: inteldpdk
I0719 07:09:40.565257 1 manager.go:118] DeviceType: netDevice
I0719 07:09:40.569687 1 factory.go:108] device added: [pciAddr: 0000:3b:0a.0, vendor: 8086, device: 154c, driver: vfio-pci]
I0719 07:09:40.569701 1 factory.go:108] device added: [pciAddr: 0000:3b:0a.1, vendor: 8086, device: 154c, driver: vfio-pci]
I0719 07:09:40.569720 1 manager.go:146] New resource server is created for inteldpdk ResourcePool
I0719 07:09:40.569725 1 manager.go:116]
I0719 07:09:40.569728 1 manager.go:117] Creating new ResourcePool: inteldpdk3
I0719 07:09:40.569732 1 manager.go:118] DeviceType: netDevice
W0719 07:09:40.574739 1 manager.go:159] Cannot add PCI Address [0000:3b:0a.0]. Already allocated.
W0719 07:09:40.574752 1 manager.go:159] Cannot add PCI Address [0000:3b:0a.1]. Already allocated.
I0719 07:09:40.574757 1 manager.go:132] no devices in device pool, skipping creating resource server for inteldpdk3
I0719 07:09:40.574763 1 main.go:72] Starting all servers...
I0719 07:09:40.575039 1 server.go:196] starting inteldpdk device plugin endpoint at: openshift.io_inteldpdk.sock
I0719 07:09:40.576932 1 server.go:222] inteldpdk device plugin endpoint started serving
I0719 07:09:40.577070 1 main.go:77] All servers started.
I0719 07:09:40.577079 1 main.go:78] Listening for term signals
I0719 07:09:41.153694 1 server.go:106] Plugin: openshift.io_inteldpdk.sock gets registered successfully at Kubelet
I0719 07:09:41.153730 1 server.go:131] ListAndWatch(inteldpdk) invoked
I0719 07:09:41.153791 1 server.go:139] ListAndWatch(inteldpdk): send devices &ListAndWatchResponse{Devices:[]*Device{&Device{ID:0000:3b:0a.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:3b:0a.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},},}
```

4. Confirm that no inteldpdk3 resource was created on the node:

```
# oc describe node dell-per740-14.rhts.eng.pek2.redhat.com | grep "openshift.io/inteldpdk"
  openshift.io/inteldpdk:  2
  openshift.io/inteldpdk:  2
  openshift.io/inteldpdk   0
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.47 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5889
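As an illustration of how the surviving pool is consumed (not part of the verification above; the pod name, namespace, and image are placeholders, and in practice the VF would normally be attached through a SriovNetwork and the `k8s.v1.cni.cncf.io/networks` annotation):

```
# Illustrative sketch: request one VF from the openshift.io/inteldpdk pool
# advertised by the node above. All names below are placeholders.
oc apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: dpdk-test
  namespace: default
spec:
  containers:
  - name: app
    image: registry.access.redhat.com/ubi8/ubi
    command: ["sleep", "infinity"]
    resources:
      requests:
        openshift.io/inteldpdk: "1"
      limits:
        openshift.io/inteldpdk: "1"
EOF
```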