Description of problem:
When IPs are configured via a pod annotation, the SR-IOV injector cannot parse a net-attach-def selection whose "ips" field is an array, so the SR-IOV resource request/limit is not injected into the pod spec. The pod is still created successfully, but no SR-IOV interface is attached.

Version-Release number of selected component (if applicable):
4.3.0

How reproducible:
Always

Steps to Reproduce:
Assuming SR-IOV devices have been exposed to kubelet with the resource name 'openshift.io/intelnics'.

1. Create the sriov-net1 net-attach-def via the SR-IOV Network Operator:

    apiVersion: sriovnetwork.openshift.io/v1
    kind: SriovNetwork
    metadata:
      name: sriov-net1
      namespace: openshift-sriov-network-operator
    spec:
      ipam: |
        {
          "type": "static"
        }
      vlan: 0
      resourceName: intelnics
      networkNamespace: default

2. Use the pod spec below:

    apiVersion: v1
    kind: Pod
    metadata:
      name: testpod1
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "sriov-net1",
            "mac": "CA:FE:C0:FF:EE:00",
            "ips": ["192.168.100.101/24", "2001::2/64"]
          }
        ]'
    spec:
      containers:
      - name: appcntr1
        image: zenghui/centos-dpdk
        imagePullPolicy: IfNotPresent
        command: [ "/bin/bash", "-c", "--" ]
        args: [ "while true; do sleep 300000; done;" ]
        resources:
          requests:
            cpu: '1'
            memory: 100Mi
          limits:
            cpu: '1'
            memory: 100Mi

3. Check the SR-IOV injector log.

Actual results:

Expected results:

Additional info:
SR-IOV injector log:

I1105 12:25:25.020446 1 main.go:36] starting mutating admission controller for network resources injection
I1105 12:34:49.678308 1 webhook.go:332] Received mutation request
I1105 12:34:49.685541 1 webhook.go:157] '[ { "name": "sriov-intel", "mac": "CA:FE:C0:FF:EE:01", "ips": ["192.168.100.102/24", "2001::1/64"] } ]' is not in JSON format: json: cannot unmarshal array into Go struct field NetworkSelectionElement.ips of type string... trying to parse as comma separated network selections list
I1105 12:34:49.685700 1 webhook.go:217] at least one of the network selection units is invalid: error found at '[ { "name": "sriov-intel"'
E1105 12:34:49.685721 1 webhook.go:163] error parsing network selection element: at least one of the network selection units is invalid: error found at '[ { "name": "sriov-intel"'
I1105 12:34:49.686894 1 webhook.go:391] pod doesn't need any custom network resources
I1105 12:34:49.686928 1 webhook.go:257] sending response to the Kubernetes API server
I1105 12:37:33.286741 1 webhook.go:332] Received mutation request
I1105 12:37:33.305374 1 webhook.go:371] network attachment definition 'default/sriov-intel' found
I1105 12:37:33.305413 1 webhook.go:377] resource 'openshift.io/intelnics' needs to be requested for network 'default/sriov-intel'
I1105 12:37:33.305441 1 webhook.go:422] patch after all mutations%!(EXTRA []webhook.jsonPatchOperation=[{add /spec/containers/0/resources/requests/openshift.io~1intelnics {{1 0} {<nil>} DecimalSI}} {add /spec/containers/0/resources/limits/openshift.io~1intelnics {{1 0} {<nil>} DecimalSI}} {add /spec/containers/0/volumeMounts/- {podnetinfo false /etc/podnetinfo <nil> }} {add /spec/volumes/- {podnetinfo {nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil &DownwardAPIVolumeSource{Items:[{labels ObjectFieldSelector{APIVersion:,FieldPath:metadata.labels,} nil <nil>} {annotations &ObjectFieldSelector{APIVersion:,FieldPath:metadata.annotations,} nil <nil>}],DefaultMode:nil,} nil nil nil nil nil nil nil nil nil nil nil nil}}}])
I1105 12:37:33.307473 1 webhook.go:257] sending response to the Kubernetes API server
OK, I have figured out the issue. The sriov admission controller is pinned to multus version 3.2, while the multus enhancements that allow multiple IPs are on the master branch. So, I will update the sriov admission controller to pick up the newer multus type definition (types.go, NetworkSelectionElement.IPRequest[]), which supports multiple IP addresses.

Here is a pod that I was able to spin up in my local k8s setup with the above changes, showing both IPv4 and IPv6 addresses:

[root@vpickard-k8s deployments]# kubectl exec -it pod-sriov-vf sh
sh-4.2# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if48: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether 3e:d1:62:87:ca:ef brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.1.37/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::3cd1:62ff:fe87:caef/64 scope link
       valid_lft forever preferred_lft forever
17: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 3a:7e:7f:48:05:6f brd ff:ff:ff:ff:ff:ff
    inet 100.100.100.100/24 brd 100.100.100.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 2001::2/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::387e:7fff:fe48:56f/64 scope link
       valid_lft forever preferred_lft forever
sh-4.2#

I will submit a PR for the sriov admission controller to fix this. I also discussed this with Doug, because I wasn't sure which version of multus would be running in OCP 4.3. My understanding is that OCP 4.3 will be running the tip of multus master, so we should be all good there.
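For anyone who wants to see the type mismatch in isolation, here is a minimal Go sketch of what the injector log reports. The oldElement and newElement structs are simplified, hypothetical stand-ins for the pinned and the newer NetworkSelectionElement definitions, not copies of the multus types; only the "ips" field type is taken from the log message and the comment above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldElement is a hypothetical stand-in for the pinned type definition:
// "ips" is declared as a single string, as the log error indicates.
type oldElement struct {
	Name       string `json:"name"`
	MacRequest string `json:"mac,omitempty"`
	IPRequest  string `json:"ips,omitempty"`
}

// newElement is a hypothetical stand-in for the newer definition:
// "ips" is a list, so several addresses can be requested.
type newElement struct {
	Name       string   `json:"name"`
	MacRequest string   `json:"mac,omitempty"`
	IPRequest  []string `json:"ips,omitempty"`
}

// annotation mirrors the pod annotation value from the reproduction steps.
const annotation = `[{"name": "sriov-net1", "mac": "CA:FE:C0:FF:EE:00", "ips": ["192.168.100.101/24", "2001::2/64"]}]`

// parseOld decodes the annotation with the string-typed "ips" field.
func parseOld(s string) ([]oldElement, error) {
	var elems []oldElement
	err := json.Unmarshal([]byte(s), &elems)
	return elems, err
}

// parseNew decodes the annotation with the list-typed "ips" field.
func parseNew(s string) ([]newElement, error) {
	var elems []newElement
	err := json.Unmarshal([]byte(s), &elems)
	return elems, err
}

func main() {
	// The string-typed field rejects the JSON array.
	if _, err := parseOld(annotation); err != nil {
		fmt.Println("old type definition:", err)
	}

	// The list-typed field accepts it.
	elems, err := parseNew(annotation)
	if err != nil {
		fmt.Println("new type definition:", err)
		return
	}
	fmt.Println("new type definition:", elems[0].Name, elems[0].IPRequest)
}
```

With the string-typed field the decode fails, which would explain why the webhook falls back to the comma-separated parser and then decides the pod needs no custom network resources.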
https://github.com/intel/network-resources-injector/pull/15
Moving this bug to POST since the above PR is still open.
Verified this bug on quay.io/openshift-release-dev/ocp-v4.0-art-dev:v4.3.0-201911132228-ose-sriov-network-operator

oc rsh -n z2 testpod16s86m
sh-4.2# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if31: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether 0a:58:0a:80:00:fd brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.128.0.253/23 brd 10.128.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a873:a6ff:fee0:6a3b/64 scope link
       valid_lft forever preferred_lft forever
26: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ca:fe:c0:ff:ee:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.206/24 brd 192.168.2.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 2001::2/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::c8fe:c0ff:feff:ee01/64 scope link
       valid_lft forever preferred_lft forever
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062