Bug 1769151 - SR-IOV network resources injector cannot parse ipam ips configuration in pod annotation
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.3.0
Assignee: Victor Pickard
QA Contact: zhaozhanqi
Reported: 2019-11-06 00:48 UTC by zenghui.shi
Modified: 2020-01-23 11:11 UTC (History)
Last Closed: 2020-01-23 11:11:03 UTC

Links
System ID | Private | Priority | Status | Summary | Last Updated
Github intel network-resources-injector issues 16 | 0 | None | closed | SR-IOV network resources injector cannot parse multiple ip addresses in ipam config | 2020-07-01 08:14:00 UTC
Github intel network-resources-injector pull 15 | 0 | None | closed | Update multus-cni from v3.2 to ecc474a | 2020-07-01 08:14:01 UTC
Github openshift sriov-dp-admission-controller pull 14 | 0 | None | closed | BUG 1769151: Update multus-cni from v3.2 to ecc474a (#15) | 2020-07-01 08:14:01 UTC
Red Hat Product Errata RHBA-2020:0062 | 0 | None | None | None | 2020-01-23 11:11:16 UTC

Description zenghui.shi 2019-11-06 00:48:04 UTC
Description of problem:

When IPs are configured via pod annotation, the SR-IOV injector cannot parse a net-attach-def reference whose "ips" field is an array of IP addresses. As a result, the SR-IOV resource request/limit is not injected into the pod spec. The pod is still created successfully, but no SR-IOV interface is attached.
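
For context, the injection mentioned above amounts to a JSON patch that adds the SR-IOV resource to the container's requests and limits, as the successful mutation in the injector log under "Additional info" suggests. The following is only a minimal sketch of that idea, not the injector's actual code: the names jsonPatchOperation fields, escapePointer and resourcePatch are hypothetical, and the quantity is simplified to a plain string. The "/" in the resource name has to be escaped as "~1" because patch paths are JSON Pointers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// jsonPatchOperation mirrors the general shape of an RFC 6902 patch entry.
// The field layout here is an assumption made for illustration.
type jsonPatchOperation struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// escapePointer escapes a key for use inside a JSON Pointer (RFC 6901):
// "~" becomes "~0" and "/" becomes "~1".
func escapePointer(key string) string {
	return strings.NewReplacer("~", "~0", "/", "~1").Replace(key)
}

// resourcePatch (hypothetical helper) builds "add" operations for the request
// and the limit of one extended resource on the container at index idx.
func resourcePatch(idx int, resource, quantity string) []jsonPatchOperation {
	base := fmt.Sprintf("/spec/containers/%d/resources", idx)
	key := escapePointer(resource)
	return []jsonPatchOperation{
		{Op: "add", Path: base + "/requests/" + key, Value: quantity},
		{Op: "add", Path: base + "/limits/" + key, Value: quantity},
	}
}

func main() {
	ops := resourcePatch(0, "openshift.io/intelnics", "1")
	out, err := json.MarshalIndent(ops, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

When annotation parsing fails, no such operations are produced, which is why the pod in this report starts without an SR-IOV interface.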


Version-Release number of selected component (if applicable):
4.3.0

How reproducible:
Always

Steps to Reproduce:

This assumes SR-IOV devices have been exposed to kubelet under the resource name 'openshift.io/intelnics'.

1. Create the sriov-net1 net-attach-def via the SR-IOV Network Operator:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: sriov-net1
  namespace: openshift-sriov-network-operator
spec:
  ipam: |
    {
      "type": "static"
    }
  vlan: 0
  resourceName: intelnics
  networkNamespace: default



2. Create a pod using the spec below:

apiVersion: v1
kind: Pod
metadata:
  name: testpod1
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
        {
          "name": "sriov-net1",
          "mac": "CA:FE:C0:FF:EE:00",
          "ips": ["192.168.100.101/24", "2001::2/64"]
        }
      ]'
spec:
  containers:
  - name: appcntr1
    image: zenghui/centos-dpdk
    imagePullPolicy: IfNotPresent
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 300000; done;" ]
    resources:
      requests:
        cpu: '1'
        memory: 100Mi
      limits:
        cpu: '1'
        memory: 100Mi

3. Check the SR-IOV injector log.

Actual results:


Expected results:


Additional info:

SR-IOV injector log:

I1105 12:25:25.020446       1 main.go:36] starting mutating admission controller for network resources injection
I1105 12:34:49.678308       1 webhook.go:332] Received mutation request
I1105 12:34:49.685541       1 webhook.go:157] '[ { "name": "sriov-intel", "mac": "CA:FE:C0:FF:EE:01", "ips": ["192.168.100.102/24", "2001::1/64"] } ]' is not in JSON format: json: cannot unmarshal array into Go struct field NetworkSelectionElement.ips of type string... trying to parse as comma separated network selections list
I1105 12:34:49.685700       1 webhook.go:217] at least one of the network selection units is invalid: error found at '[ { "name": "sriov-intel"'
E1105 12:34:49.685721       1 webhook.go:163] error parsing network selection element: at least one of the network selection units is invalid: error found at '[ { "name": "sriov-intel"'
I1105 12:34:49.686894       1 webhook.go:391] pod doesn't need any custom network resources
I1105 12:34:49.686928       1 webhook.go:257] sending response to the Kubernetes API server
I1105 12:37:33.286741       1 webhook.go:332] Received mutation request
I1105 12:37:33.305374       1 webhook.go:371] network attachment definition 'default/sriov-intel' found
I1105 12:37:33.305413       1 webhook.go:377] resource 'openshift.io/intelnics' needs to be requested for network 'default/sriov-intel'
I1105 12:37:33.305441       1 webhook.go:422] patch after all mutations%!(EXTRA []webhook.jsonPatchOperation=[{add /spec/containers/0/resources/requests/openshift.io~1intelnics {{1 0} {<nil>}  DecimalSI}} {add /spec/containers/0/resources/limits/openshift.io~1intelnics {{1 0} {<nil>}  DecimalSI}} {add /spec/containers/0/volumeMounts/- {podnetinfo false /etc/podnetinfo  <nil> }} {add /spec/volumes/- {podnetinfo {nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil &DownwardAPIVolumeSource{Items:[{labels ObjectFieldSelector{APIVersion:,FieldPath:metadata.labels,} nil <nil>} {annotations &ObjectFieldSelector{APIVersion:,FieldPath:metadata.annotations,} nil <nil>}],DefaultMode:nil,} nil nil nil nil nil nil nil nil nil nil nil nil}}}])
I1105 12:37:33.307473       1 webhook.go:257] sending response to the Kubernetes API server

Comment 1 Victor Pickard 2019-11-08 16:40:12 UTC
OK, I have figured out the issue.

Basically, the sriov admission controller is pinned to multus version 3.2, while the multus enhancement that allows multiple IPs is only on the master branch. So, I will update the sriov admission controller to pull in the newer multus type definition (types.go, NetworkSelectionElement.IPRequest[]) that supports multiple IP addresses.
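
To illustrate the type mismatch, here is a small sketch with two simplified structs. They are modelled on the error message in the injector log and on the description above, not copied from multus, so the names pinnedElement and updatedElement and the exact field set are assumptions. The only difference between them is whether "ips" decodes into a string or into a slice of strings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pinnedElement approximates the older definition: "ips" is one string.
type pinnedElement struct {
	Name       string `json:"name"`
	MacRequest string `json:"mac,omitempty"`
	IPRequest  string `json:"ips,omitempty"`
}

// updatedElement approximates the newer definition: "ips" is a list.
type updatedElement struct {
	Name       string   `json:"name"`
	MacRequest string   `json:"mac,omitempty"`
	IPRequest  []string `json:"ips,omitempty"`
}

// annotation is the network selection value from the reproducer pod spec.
const annotation = `[{"name": "sriov-net1", "mac": "CA:FE:C0:FF:EE:00", "ips": ["192.168.100.101/24", "2001::2/64"]}]`

// parsePinned decodes the annotation with the string-typed field.
func parsePinned(s string) ([]pinnedElement, error) {
	var els []pinnedElement
	err := json.Unmarshal([]byte(s), &els)
	return els, err
}

// parseUpdated decodes the annotation with the slice-typed field.
func parseUpdated(s string) ([]updatedElement, error) {
	var els []updatedElement
	err := json.Unmarshal([]byte(s), &els)
	return els, err
}

func main() {
	if _, err := parsePinned(annotation); err != nil {
		fmt.Println("pinned type:", err)
	}
	els, err := parseUpdated(annotation)
	if err != nil {
		panic(err)
	}
	fmt.Println("updated type:", els[0].IPRequest)
}
```

With the string-typed field, encoding/json rejects the array, which corresponds to the "cannot unmarshal array" message in the log. With the slice-typed field, both the IPv4 and the IPv6 address are decoded.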

Here is a pod that I was able to spin up in my local k8s setup with the above changes, showing both IPv4 and IPv6 addresses:

[root@vpickard-k8s deployments]# kubectl exec -it pod-sriov-vf sh
sh-4.2# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if48: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
    link/ether 3e:d1:62:87:ca:ef brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.1.37/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::3cd1:62ff:fe87:caef/64 scope link 
       valid_lft forever preferred_lft forever
17: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 3a:7e:7f:48:05:6f brd ff:ff:ff:ff:ff:ff
    inet 100.100.100.100/24 brd 100.100.100.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 2001::2/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::387e:7fff:fe48:56f/64 scope link 
       valid_lft forever preferred_lft forever
sh-4.2# 


I will submit a PR for sriov admission controller to fix this.

I also discussed this with Doug, because I wasn't sure which version of multus would be running in OCP 4.3. My understanding is that OCP 4.3 will be running the tip of multus master, so we should be all good there.

Comment 4 zhaozhanqi 2019-11-11 05:30:20 UTC
Moving this bug to POST since the above PR is still open.

Comment 6 zhaozhanqi 2019-11-15 09:34:52 UTC
Verified this bug on quay.io/openshift-release-dev/ocp-v4.0-art-dev:v4.3.0-201911132228-ose-sriov-network-operator

oc rsh -n z2 testpod16s86m
sh-4.2# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if31: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
    link/ether 0a:58:0a:80:00:fd brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.128.0.253/23 brd 10.128.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a873:a6ff:fee0:6a3b/64 scope link 
       valid_lft forever preferred_lft forever
26: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ca:fe:c0:ff:ee:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.206/24 brd 192.168.2.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 2001::2/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::c8fe:c0ff:feff:ee01/64 scope link 
       valid_lft forever preferred_lft forever

Comment 8 errata-xmlrpc 2020-01-23 11:11:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

