Bug 1846333 - [RHOCP4.4]: Not able to launch the SRIOV based pod using deployment in non-default project
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
: 4.6.0
Assignee: zenghui.shi
QA Contact: zhaozhanqi
URL:
Whiteboard:
: 1840962 (view as bug list)
Depends On:
Blocks: 1723620 1890939
 
Reported: 2020-06-11 12:06 UTC by weiguo fan
Modified: 2023-10-06 20:34 UTC (History)
9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1890939 (view as bug list)
Environment:
Last Closed: 2020-10-27 16:06:37 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github openshift sriov-dp-admission-controller pull 18 0 None closed Bug 1846333: Handle pods from other kinds properly (#28) 2020-12-30 01:23:01 UTC
Red Hat Knowledge Base (Solution) 5113011 0 None None None 2020-08-21 07:40:21 UTC
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:07:15 UTC

Description weiguo fan 2020-06-11 12:06:38 UTC
1. Bug Overview:

a) Description of bug report: 

  [RHOCP4.4]: Not able to launch the SRIOV based pod using deployment in non-default project


b) Bug Description:

We are not able to launch an SR-IOV based pod using kind "Deployment" in any project except the "default" project,
even when we specify a non-default namespace in the SriovNetwork configuration.
It fails with the following error.

     ==============================
     Events:
       Type     Reason        Age                             From                   Message
       ----     ------        ----                            ----                   -------
       Warning  FailedCreate  <invalid> (x12 over <invalid>)  replicaset-controller  Error creating: admission webhook "network-resources-injector-config.k8s.io" denied the request: could not find network attachment definition 'default/darnet': could not get Network Attachment Definition default/darnet: the server could not find the requested resource
     ==============================

Please fix this problem.

2. Bug Details:

a) Architectures: 64-bit Intel EM64T/AMD64
  x86_64

b) Bugzilla Dependencies:

c) Drivers or hardware dependencies:

d) Upstream acceptance information: 

e) External links:

f) Severity (H,M,L):
  M

g) How reproducible: 
  Always

h) Steps to Reproduce: 

   Step1: Enable SR-IOV in the BIOS of your machine.

   Step2: Install the SR-IOV Network Operator.

     eg.) current sriovoperatorconfig
     # oc get sriovoperatorconfig -n openshift-sriov-network-operator -o yaml
     apiVersion: v1
     items:
     - apiVersion: sriovnetwork.openshift.io/v1
       kind: SriovOperatorConfig
       metadata:
         creationTimestamp: "2020-06-08T09:22:33Z"
         generation: 1
         name: default
         namespace: openshift-sriov-network-operator
         resourceVersion: "44738740"
         selfLink: /apis/sriovnetwork.openshift.io/v1/namespaces/openshift-sriov-network-operator/sriovoperatorconfigs/default
         uid: 9ca79ef3-3e90-4eed-8ed9-7dfaa30ee420
       spec:
         enableInjector: true
         enableOperatorWebhook: true
     kind: List
     metadata:
       resourceVersion: ""
       selfLink: ""

   Step3: Create a SriovNetworkNodePolicy.

     eg.) 
     # oc describe sriovnetworknodepolicies.sriovnetwork.openshift.io -n openshift-sriov-network-operator gpu0-policy 
     Name:         gpu0-policy
     Namespace:    openshift-sriov-network-operator
     Labels:       <none>
     Annotations:  <none>
     API Version:  sriovnetwork.openshift.io/v1
     Kind:         SriovNetworkNodePolicy
     Metadata:
       Creation Timestamp:  2020-06-08T09:38:47Z
       Generation:          1
       Resource Version:    44744322
       Self Link:           /apis/sriovnetwork.openshift.io/v1/namespaces/openshift-sriov-network-operator/sriovnetworknodepolicies/gpu0-policy
       UID:                 777f7638-0dd8-43b0-8d6a-0b67dfad70e0
     Spec:
       Device Type:  netdevice
       Is Rdma:      false
       Nic Selector:
         Root Devices:
           0000:5d:00.1
         Vendor:  8086
       Node Selector:
         Mtest:        gpu0
       Num Vfs:        5
       Priority:       50
       Resource Name:  gpu
     Events:           <none>


   Step4: Create a SriovNetwork with target namespace "cde"
     
     eg.)
     # cat darnet.yaml
     apiVersion: sriovnetwork.openshift.io/v1
     kind: SriovNetwork
     metadata:
       name: darnet
       namespace: openshift-sriov-network-operator
     spec:
       networkNamespace: cde
       ipam: '{"type": "static", "addresses":[{"address": "192.168.250.0/24"}]}'
       resourceName: gpu
       capabilities: '{"mac": true, "ips": true}'
     
     # oc create -f darnet.yaml
     
     # oc describe network-attachment-definitions.k8s.cni.cncf.io -n cde
     Name:         darnet
     Namespace:    cde
     Labels:       <none>
     Annotations:  k8s.v1.cni.cncf.io/resourceName: openshift.io/gpu
     API Version:  k8s.cni.cncf.io/v1
     Kind:         NetworkAttachmentDefinition
     Metadata:
       Creation Timestamp:  2020-06-08T09:48:51Z
       Generation:          1
       Resource Version:    44747954
       Self Link:           /apis/k8s.cni.cncf.io/v1/namespaces/cde/network-attachment-definitions/darnet
       UID:                 b2398a8e-7146-4131-8e18-69fa9fa44651
     Spec:
       Config:  { "cniVersion":"0.3.1", "name":"sriov-net", "type":"sriov", "vlan":0,"capabilities":{"mac": true, "ips": true},"vlanQoS":0,"ipam":{"type":"static","addresses":[{"address":"192.168.250.0/24"}]} }
     Status:
     Events:  <none>


   Step5: Define a test deployment.
     
     eg.)
     # cat darnet-pod.yaml
     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: brdsamplepod1
       labels:
         version: v1
     spec:
       replicas: 1
       selector:
         matchLabels:
           app: brdsamplepod1
           version: v1
       template:
         metadata:
           annotations:
             k8s.v1.cni.cncf.io/networks: '[
           {
             "name": "darnet",
             "ips": [ "192.168.250.3/24" ]
           }
         ]'
           labels:
             app: brdsamplepod1
             version: v1
         spec:
           containers:
           - name: samplepod
             command: ["/bin/bash", "-c", "sleep 2000000000000"]
             image: 172.16.68.92:5000/centos/tools:latest
             imagePullPolicy: IfNotPresent #Always
           nodeSelector:
             mtest: gpu0
     
     # oc project cde
     # oc create -f darnet-pod.yaml

   Step6: Confirm that the pod is not deployed and that the ReplicaSet reports the following error.

     # oc describe rs brdsamplepod1-5f9687ccfb -n cde
     Name:           brdsamplepod1-5f9687ccfb
     Namespace:      cde
     Selector:       app=brdsamplepod1,pod-template-hash=5f9687ccfb,version=v1
     Labels:         app=brdsamplepod1
                     pod-template-hash=5f9687ccfb
                     version=v1
     Annotations:    deployment.kubernetes.io/desired-replicas: 1
                     deployment.kubernetes.io/max-replicas: 2
                     deployment.kubernetes.io/revision: 1
     Controlled By:  Deployment/brdsamplepod1
     Replicas:       0 current / 1 desired
     Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
     Pod Template:
       Labels:       app=brdsamplepod1
                     pod-template-hash=5f9687ccfb
                     version=v1
       Annotations:  k8s.v1.cni.cncf.io/networks: [ { "name": "darnet", "ips": [ "192.168.250.3/24" ] } ]
       Containers:
        samplepod:
         Image:      172.16.68.92:5000/centos/tools:latest
         Port:       <none>
         Host Port:  <none>
         Command:
           /bin/bash
           -c
           sleep 2000000000000
         Environment:  <none>
         Mounts:       <none>
       Volumes:        <none>
     Conditions:
       Type             Status  Reason
       ----             ------  ------
       ReplicaFailure   True    FailedCreate
     Events:
       Type     Reason        Age                             From                   Message
       ----     ------        ----                            ----                   -------
       Warning  FailedCreate  <invalid> (x12 over <invalid>)  replicaset-controller  Error creating: admission webhook "network-resources-injector-config.k8s.io" denied the request: could not find network attachment definition 'default/darnet': could not get Network Attachment Definition default/darnet: the server could not find the requested resource


 i) Actual results: 

 j) Expected results:

 k) Additional information

    - Version information:

      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.4.3     True        False         29d     Cluster version is 4.4.3

    - We are able to launch the SR-IOV based pod in other projects only if we use kind "Pod" instead of "Deployment".

    - We are able to launch the SR-IOV based pod using kind "Deployment" only if we set networkNamespace to "default" when creating the "SriovNetwork" object and launch the pod in the "default" project.

    - According to our investigation, this is a problem in the Network Resources Injector.

3. Business impact:

   In some cases, users would like their network to be namespace-isolated ("namespaceIsolation").
   But they cannot do that because of this problem.

4. Primary contact at Red Hat, email, phone (chat)
  mfuruta

5. Primary contact at Partner, email, phone (chat)
  mas-hatada.nec.com (Masaki Hatada)

Comment 1 Ben Bennett 2020-06-11 13:14:53 UTC
Assigning to master so we can identify the issue.  Once done, we will consider where we need to backport the fix.

Comment 2 zhaozhanqi 2020-06-12 02:55:13 UTC
Yes. It seems network-resources-injector will also block other Multus CNIs, not only SR-IOV. For example:

1. Create one macvlan bridge net-attach-def:

oc create -f https://raw.githubusercontent.com/openshift/verification-tests/master/testdata/networking/multus-cni/NetworkAttachmentDefinitions/macvlan-bridge.yaml -n z2

2. oc get net-attach-def  -n z2 macvlan-bridge -o yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  creationTimestamp: "2020-06-12T02:25:03Z"
  generation: 1
  name: macvlan-bridge
  namespace: z2
  resourceVersion: "539303"
  selfLink: /apis/k8s.cni.cncf.io/v1/namespaces/z2/network-attachment-definitions/macvlan-bridge
  uid: fc981231-3bc8-47ac-965d-0fb4b53b6c78
spec:
  config: '{ "cniVersion": "0.3.0", "type": "macvlan", "master": "eth0", "mode": "bridge",
    "ipam": { "type": "host-local", "subnet": "10.1.1.0/24", "rangeStart": "10.1.1.100",
    "rangeEnd": "10.1.1.200", "routes": [ { "dst": "0.0.0.0/0" } ], "gateway": "10.1.1.1"
    } }'

3. Create one deployment with the following:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: brdsam
  labels:
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: brdsamp
      version: v1
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: macvlan-bridge
      labels:
        app: brdsamp
        version: v1
    spec:
      containers:
      - name: samplepod
        image: aosqe/centos-network

4. Check that the pod cannot be deployed, due to:

  - lastTransitionTime: "2020-06-12T02:26:50Z"
    lastUpdateTime: "2020-06-12T02:26:50Z"
    message: 'admission webhook "network-resources-injector-config.k8s.io" denied
      the request: could not find network attachment definition ''default/macvlan-bridge'':
      could not get Network Attachment Definition default/macvlan-bridge: the server
      could not find the requested resource'
    reason: FailedCreate


There is a workaround: prefix the net-attach-def name with the project, i.e. change `"name": "darnet",` to `"name": "cde/darnet",` in the deployment yaml file. It works well.
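To make the workaround concrete, here is a minimal sketch of the only change needed in the deployment's pod template (the `cde/` prefix matches this reproducer's project; adjust to your own namespace):

```yaml
# Sketch of the workaround: prefix the net-attach-def name with its
# namespace in the pod-template annotation. Only this annotation changes;
# the rest of the deployment stays as in the reproducer above.
template:
  metadata:
    annotations:
      k8s.v1.cni.cncf.io/networks: '[
        {
          "name": "cde/darnet",
          "ips": [ "192.168.250.3/24" ]
        }
      ]'
```

Note that later comments in this bug report mixed results with this form in some environments.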

Comment 3 weiguo fan 2020-06-16 11:57:22 UTC
Hi, zhaozhanqi,

> there is workaround: added the `project`/$def-attach-def . eg. changed from `"name": "darnet",`  to `"name":"cde/darnet",` in deployment yaml file. works well.

This workaround doesn't work for us.
The ReplicaSet still tries to access the Network Attachment Definition under the default namespace,
even after we specify "name": "cde/darnet" in the deployment.

# oc get sriovnetwork -n openshift-sriov-network-operator darnet -o yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  annotations:
    operator.sriovnetwork.openshift.io/last-network-namespace: cde
  creationTimestamp: "2020-06-16T08:52:47Z" 
  finalizers:
  - netattdef.finalizers.sriovnetwork.openshift.io
  generation: 1
  name: darnet
  namespace: openshift-sriov-network-operator
  resourceVersion: "49595923" 
  selfLink: /apis/sriovnetwork.openshift.io/v1/namespaces/openshift-sriov-network-operator/sriovnetworks/darnet
  uid: 1b0dde41-6e30-4907-8105-f6f15d565052
spec:
  capabilities: '{"mac": true, "ips": true}'
  ipam: '{"type": "static", "addresses":[{"address": "192.168.250.0/24"}]}'
  networkNamespace: cde
  resourceName: gpu

# oc describe network-attachment-definitions.k8s.cni.cncf.io -n cde
Name:         darnet
Namespace:    cde
Labels:       <none>
Annotations:  k8s.v1.cni.cncf.io/resourceName: openshift.io/gpu
API Version:  k8s.cni.cncf.io/v1
Kind:         NetworkAttachmentDefinition
Metadata:
  Creation Timestamp:  2020-06-16T08:52:47Z
  Generation:          1
  Resource Version:    49595922
  Self Link:           /apis/k8s.cni.cncf.io/v1/namespaces/cde/network-attachment-definitions/darnet
  UID:                 f8c8224a-fa91-48b7-a3a0-51dd364803a3
Spec:
  Config:  { "cniVersion":"0.3.1", "name":"sriov-net", "type":"sriov", "vlan":0,"capabilities":{"mac": true, "ips": true},"vlanQoS":0,"ipam":{"type":"static","addresses":[{"address":"192.168.250.0/24"}]} }
Status:
Events:  <none>

# cat darnet-p1.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: brdsamplepod1
  labels:
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: brdsamplepod1
      version: v1
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "cde/darnet",
        "ips": [ "192.168.250.3/24" ]
      }
    ]'
      labels:
        app: brdsamplepod1
        version: v1
    spec:
      containers:
      - name: samplepod
        command: ["/bin/bash", "-c", "sleep 2000000000000"]
        image: 172.16.68.92:5000/centos/tools:latest
        imagePullPolicy: IfNotPresent #Always
      nodeSelector:
        mtest: gpu0

# oc project cde
# oc create -f darnet-p1.yaml

# oc describe rs brdsamplepod1-7d7f66fdf
Name:           brdsamplepod1-7d7f66fdf
Namespace:      cde
Selector:       app=brdsamplepod1,pod-template-hash=7d7f66fdf,version=v1
Labels:         app=brdsamplepod1
                pod-template-hash=7d7f66fdf
                version=v1
Annotations:    deployment.kubernetes.io/desired-replicas: 1
                deployment.kubernetes.io/max-replicas: 2
                deployment.kubernetes.io/revision: 1
Controlled By:  Deployment/brdsamplepod1
Replicas:       0 current / 1 desired
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:       app=brdsamplepod1
                pod-template-hash=7d7f66fdf
                version=v1
  Annotations:  k8s.v1.cni.cncf.io/networks: [ { "name": "cde/darnet", "ips": [ "192.168.250.3/24" ] } ]
  Containers:
   samplepod:
    Image:      172.16.68.92:5000/centos/tools:latest
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/bash
      -c
      sleep 2000000000000
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type             Status  Reason
  ----             ------  ------
  ReplicaFailure   True    FailedCreate
Events:
  Type     Reason        Age                From                   Message
  ----     ------        ----               ----                   -------
  Warning  FailedCreate  0s (x12 over 10s)  replicaset-controller  Error creating: admission webhook "network-resources-injector-config.k8s.io" denied the request: could not find network attachment definition 'default/cde/darnet': could not get Network Attachment Definition default/cde/darnet: unknown

Comment 4 zhaozhanqi 2020-06-16 13:06:55 UTC
Then it's weird. Just to confirm: did you use the 'admin' user or a 'normal' user to create the project `cde`?

Comment 5 zhaozhanqi 2020-06-16 13:07:42 UTC
It works well on my side using a normal user.

Comment 6 zenghui.shi 2020-06-17 12:43:28 UTC
Weiguo, would you please try adding a namespace prefix (e.g. cde/) to the net-attach-def name 'darnet'?

for example:

     # cat darnet-pod.yaml
     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: brdsamplepod1
       labels:
         version: v1
     spec:
       replicas: 1
       selector:
         matchLabels:
           app: brdsamplepod1
           version: v1
       template:
         metadata:
           annotations:
             k8s.v1.cni.cncf.io/networks: '[
           {
             "name": "cde/darnet",                      <=============== cde namespace prefix added here
             "ips": [ "192.168.250.3/24" ]
           }
         ]'
           labels:
             app: brdsamplepod1
             version: v1
         spec:
           containers:
           - name: samplepod
             command: ["/bin/bash", "-c", "sleep 2000000000000"]
             image: 172.16.68.92:5000/centos/tools:latest
             imagePullPolicy: IfNotPresent #Always
           nodeSelector:
             mtest: gpu0

Comment 7 zenghui.shi 2020-06-17 12:59:43 UTC
Weiguo, I tried the deployment template in comment#6 and hit the same issue as you did.
This might be a bug in network-resources-injector: it doesn't recognize the JSON format of the net-attach-def annotation.
The workaround is to use a single string to express the net-attach-def in the pod annotation, for example:

replace:
       template:
         metadata:
           annotations:
             k8s.v1.cni.cncf.io/networks: '[
           {
             "name": "cde/darnet",
             "ips": [ "192.168.250.3/24" ]
           }
         ]'

with:

       template:
         metadata:
           annotations:
             k8s.v1.cni.cncf.io/networks: cde/darnet

However, the workaround format doesn't allow additional IPs/MACs to be assigned.

I will check the code and see if that's a real bug in network-resources-injector.

Comment 8 weiguo fan 2020-06-18 11:17:07 UTC
hi, zenghui.shi,

Thanks for your workaround.
We will test it and let you know.

> However, the workaround format doesn't allow additional ips/mac be assigned.

I think this means that we cannot use static IPs.
If so, this does not help us much,
since we would like to use static IPs.

> I will check the code and see if that's a real bug in network-resources-injector.

We believe this is a bug in network-resources-injector.
We'll wait for your result.

Comment 9 zenghui.shi 2020-06-22 10:49:38 UTC
(In reply to weiguo fan from comment #8)
> hi, zenghui.shi,
> 
> Thanks for your workaround.
> We will test it and let you know.
> 
> > However, the workaround format doesn't allow additional ips/mac be assigned.
> 
> I think this means that we cannot use static IPs.
> If so, this does not help us much,
> since we would like to use static IPs.
> 
> > I will check the code and see if that's a real bug in network-resources-injector.
> 
> We believe this is a bug of network-resources-injector.
> We'll wait for your result.

@weiguo,

Please try the following template:

     # cat darnet-pod.yaml
     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: brdsamplepod1
       labels:
         version: v1
     spec:
       replicas: 1
       selector:
         matchLabels:
           app: brdsamplepod1
           version: v1
       template:
         metadata:
           annotations:
             k8s.v1.cni.cncf.io/networks: '[
           {
             "name": "darnet",
             "namespace": "cde",                            <======= add namespace in network annotation
             "ips": [ "192.168.250.3/24" ]
           }
         ]'
           labels:
             app: brdsamplepod1
             version: v1
         spec:
           containers:
           - name: samplepod
             command: ["/bin/bash", "-c", "sleep 2000000000000"]
             image: 172.16.68.92:5000/centos/tools:latest
             imagePullPolicy: IfNotPresent #Always
           nodeSelector:
             mtest: gpu0


Create the above deployment with: `oc create -f darnet-pod.yaml -n cde`

Comment 10 weiguo fan 2020-06-25 03:54:24 UTC
Hi,  zenghui.shi,

Thank you for the information.

We verified that the workaround you mentioned works well for us.

We have the following questions and requests:
- Is this just a workaround, or is this what OCP expects?

  If it's just a workaround, please fix it.

  If this is OCP's expected behavior, please include the workaround in the OCP documentation.

Comment 11 zenghui.shi 2020-06-28 02:40:56 UTC
(In reply to weiguo fan from comment #10)
> Hi,  zenghui.shi,
> 
> Thank you for the information.
> 
> We verified the workaround you mentioned works well for us.
> 
> We have the following questions and requests.
> - Is this just a workaround, or is this what OCP expects?

This is expected behavior of the network-resources-injector webhook when using a deployment.

> 
>   If it's just a workaround, please fix it.

This is not a workaround.
The actual workaround is to disable network-resources-injector by editing the SriovOperatorConfig default CR.
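As a sketch, disabling the injector would look like the following edit to the default CR (field names as shown in the sriovoperatorconfig output earlier in this bug; note that with the injector disabled, SR-IOV resource requests must be added to pod specs manually):

```yaml
# Sketch: SriovOperatorConfig with the network-resources-injector disabled.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovOperatorConfig
metadata:
  name: default
  namespace: openshift-sriov-network-operator
spec:
  enableInjector: false        # turn off the injector mutating webhook
  enableOperatorWebhook: true
```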

> 
>   If this is OCP's expected behavior, please include the workaround in the OCP documentation.

We will include a deployment example in the latest OpenShift docs and backport it to 4.4.

Comment 12 weiguo fan 2020-06-30 12:16:18 UTC
Hi, zenghui.shi,

> This is expected by network-resources-injector webhook when using deployment.
...
> We will include a deployment example in latest openshift doc and backport to 4.4

Okay. Please include this in the docs and backport it to 4.4.

Moreover, currently only the SR-IOV plugin requires network-resources-injector.
Other network plugins, such as macvlan and ipvlan, don't require it.
But when network-resources-injector is installed by the SR-IOV operator,
those network plugins are also affected.

We think at least this should also be covered in the docs,
or users will be confused.
Or users will be confused.

> This is not a workaround. 
> The workaround is to disable network-resource-injectors by editing SriovOperatorConfig default CR.

We verified that the SR-IOV network cannot work (we cannot launch pods with SR-IOV networks) if we disable network-resources-injector.
Please look at the details below.
Is this a bug? Should we file a new bug ticket for it?

====================
# oc patch sriovoperatorconfig default   --type=merge -n openshift-sriov-network-operator   --patch '{ "spec": { "enableInjector": false } }'         # disabling the injector
sriovoperatorconfig.sriovnetwork.openshift.io/default patched

# oc get sriovoperatorconfigs.sriovnetwork.openshift.io default  -n openshift-sriov-network-operator -o yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovOperatorConfig
metadata:
  creationTimestamp: "2020-06-08T09:22:33Z" 
  generation: 4
  name: default
  namespace: openshift-sriov-network-operator
  resourceVersion: "46094939" 
  selfLink: /apis/sriovnetwork.openshift.io/v1/namespaces/openshift-sriov-network-operator/sriovoperatorconfigs/default
  uid: 9ca79ef3-3e90-4eed-8ed9-7dfaa30ee420
spec:
  enableInjector: false
  enableOperatorWebhook: true

# oc get sriovnetwork -n openshift-sriov-network-operator darnet -o yaml   ## here networkNamespace is cde
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  annotations:
    operator.sriovnetwork.openshift.io/last-network-namespace: cde
  creationTimestamp: "2020-06-08T09:48:51Z" 
  finalizers:
  - netattdef.finalizers.sriovnetwork.openshift.io
  generation: 1
  name: darnet
  namespace: openshift-sriov-network-operator
  resourceVersion: "44747955" 
  selfLink: /apis/sriovnetwork.openshift.io/v1/namespaces/openshift-sriov-network-operator/sriovnetworks/darnet
  uid: 60d72a3b-de79-46d0-b291-3576148b774c
spec:
  capabilities: '{"mac": true, "ips": true}'
  ipam: '{"type": "static", "addresses":[{"address": "192.168.250.0/24"}]}'
  networkNamespace: cde
  resourceName: gpu
#

# oc get pods -o wide -n openshift-sriov-network-operator
NAME                                      READY   STATUS    RESTARTS   AGE    IP              NODE                NOMINATED NODE   READINESS GATES
operator-webhook-57f2k                    1/1     Running   0          43h    10.130.0.202    host-172-16-68-95   <none>           <none>
operator-webhook-m8skw                    1/1     Running   0          43h    10.128.0.119    host-172-16-68-96   <none>           <none>
operator-webhook-rqngg                    1/1     Running   0          43h    10.129.0.123    host-172-16-68-94   <none>           <none>
sriov-cni-vggsc                           1/1     Running   0          42h    10.130.2.193    gpu0.nec.test       <none>           <none>
sriov-device-plugin-rx2t4                 1/1     Running   0          133m   172.16.64.241   gpu0.nec.test       <none>           <none>
sriov-network-config-daemon-5746t         1/1     Running   23         43h    172.16.64.241   gpu0.nec.test       <none>           <none>
sriov-network-config-daemon-jjs5l         1/1     Running   17         43h    172.16.68.97    host-172-16-68-97   <none>           <none>
sriov-network-config-daemon-p95q7         1/1     Running   18         43h    172.16.68.98    host-172-16-68-98   <none>           <none>
sriov-network-config-daemon-wdjrz         1/1     Running   16         43h    172.16.64.242   gpu1.nec.test       <none>           <none>
sriov-network-operator-5456d4d99d-zrs6x   1/1     Running   31         18h    10.130.1.29     host-172-16-68-95   <none>           <none>
#

# oc describe network-attachment-definitions.k8s.cni.cncf.io darnet -n cde
Name:         darnet
Namespace:    cde
Labels:       <none>
Annotations:  k8s.v1.cni.cncf.io/resourceName: openshift.io/gpu
API Version:  k8s.cni.cncf.io/v1
Kind:         NetworkAttachmentDefinition
Metadata:
  Creation Timestamp:  2020-06-08T09:48:51Z
  Generation:          1
  Resource Version:    44747954
  Self Link:           /apis/k8s.cni.cncf.io/v1/namespaces/cde/network-attachment-definitions/darnet
  UID:                 b2398a8e-7146-4131-8e18-69fa9fa44651
Spec:
  Config:  { "cniVersion":"0.3.1", "name":"sriov-net", "type":"sriov", "vlan":0,"capabilities":{"mac": true, "ips": true},"vlanQoS":0,"ipam":{"type":"static","addresses":[{"address":"192.168.250.0/24"}]} }
Status:
Events:  <none>
#

# cat sriov-pod.yaml                     # creating pod with SR-IOV interface
apiVersion: apps/v1
kind: Deployment
metadata:
  name: brdsamplepod2
  labels:
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: brdsamplepod2
      version: v1
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "darnet",
        "ips": [ "192.168.250.3/24" ]
      }
    ]'
      labels:
        app: brdsamplepod2
        version: v1
    spec:
      containers:
      - name: samplepod
        command: ["/bin/bash", "-c", "sleep 2000000000000"]
        image: 172.16.68.92:5000/centos/tools:latest
        imagePullPolicy: IfNotPresent #Always

# oc project cde
# oc create -f sriov-pod.yaml

# oc get pods -n cde
NAME                             READY   STATUS              RESTARTS   AGE
brdsamplepod2-7b69bfb899-dbl7h   0/1     ContainerCreating   0          2s

# oc describe pod brdsamplepod2-7b69bfb899-dbl7h -n cde                   # pod cannot be created.
Name:           brdsamplepod2-7b69bfb899-dbl7h
Namespace:      cde
Priority:       0
Node:           gpu0.nec.test/172.16.64.241
Start Time:     Wed, 10 Jun 2020 00:37:44 -0400
Labels:         app=brdsamplepod2
                pod-template-hash=7b69bfb899
                version=v1
Annotations:    k8s.v1.cni.cncf.io/networks: [ { "name": "darnet", "ips": [ "192.168.250.3/24" ] } ]
                k8s.v1.cni.cncf.io/networks-status:
                openshift.io/scc: restricted
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/brdsamplepod2-7b69bfb899
Containers:
  samplepod:
    Container ID:
    Image:         172.16.68.92:5000/centos/tools:latest
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -c
      sleep 2000000000000
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-stn8w (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-stn8w:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-stn8w
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age        From                    Message
  ----     ------                  ----       ----                    -------
  Normal   Scheduled               <unknown>  default-scheduler       Successfully assigned cde/brdsamplepod2-7b69bfb899-dbl7h to gpu0.nec.test
  Warning  FailedCreatePodSandBox  <invalid>  kubelet, gpu0.nec.test  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_brdsamplepod2-7b69bfb899-dbl7h_cde_b22b7e04-5ec0-4e51-aec6-567a4ce3a3a0_0(1248ff1ba0c064b2eec95fca248cad77b1a3d26a0ee2e1769c3175a45a1e0397): Multus: [cde/brdsamplepod2-7b69bfb899-dbl7h]: error adding container to network "sriov-net": delegateAdd: error invoking DelegateAdd - "sriov": error in getting result from AddNetwork: SRIOV-CNI failed to load netconf: LoadConf(): VF pci addr is required
  Warning  FailedCreatePodSandBox  <invalid>  kubelet, gpu0.nec.test  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_brdsamplepod2-7b69bfb899-dbl7h_cde_b22b7e04-5ec0-4e51-aec6-567a4ce3a3a0_0(d4873aa0c81611026927816ce58e21c2df896c7cc1004aac742e7e571d7a30bf): Multus: [cde/brdsamplepod2-7b69bfb899-dbl7h]: error adding container to network "sriov-net": delegateAdd: error invoking DelegateAdd - "sriov": error in getting result from AddNetwork: SRIOV-CNI failed to load netconf: LoadConf(): VF pci addr is required


Regards.

Comment 13 zenghui.shi 2020-07-01 01:51:09 UTC
(In reply to weiguo fan from comment #12)
> Hi, zenghui.shi,
> 
> > This is expected by network-resources-injector webhook when using deployment.
> ...
> > We will include a deployment example in latest openshift doc and backport to 4.4
> 
> Okay. Please include this to doc and backport to 4.4.

https://github.com/openshift/openshift-docs/pull/23382/files

> 
> Moreover, currently only SRIOV plugin requires network-resources-injector.
> Other network plugins, such as macvlan, ipvlan, don't require
> network-resources-injector.

Correct.

> But when network-resources-injector is installed by SRIOV operator,
> those network plugins are also affected.

Could you elaborate on the issue?

> 
> We think at least those should be also introduced in doc.
> Or users will be confused.
> 
> > This is not a workaround. 
> > The workaround is to disable network-resource-injectors by editing SriovOperatorConfig default CR.
> 
> As we verified that SRIOV network cannot work(we cannot launch pods with
> SRIOV network)if we disabled network-resource-injectors.
> Please look at the details below.
> Is this bug? Should we file a new bug ticket for it?

When network-resources-injector is disabled, the resource request is no longer injected into the pod/deployment spec automatically, which results in the pod creation failure.

network-resources-injector looks for the net-attach-def in the pod annotation and inspects it to get the SR-IOV resource name, then injects that resource name into the pod resource request/limit automatically via a mutating webhook. kubelet then allocates an SR-IOV resource for the pod according to the request in the pod spec. When network-resources-injector is disabled, the user needs to manually add the SR-IOV resource name to the pod/deployment resource request/limit. This is the expected behavior.

For example:

1) the sriov resource name is: openshift.io/gpu
2) add the resource name in deployment spec:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: brdsamplepod2
  labels:
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: brdsamplepod2
      version: v1
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "darnet",
        "ips": [ "192.168.250.3/24" ]
      }
    ]'
      labels:
        app: brdsamplepod2
        version: v1
    spec:
      containers:
      - name: samplepod
        command: ["/bin/bash", "-c", "sleep 2000000000000"]
        image: 172.16.68.92:5000/centos/tools:latest
        imagePullPolicy: IfNotPresent #Always
        resources:
          requests:
            openshift.io/gpu: 1
          limits:
            openshift.io/gpu: 1

Comment 14 weiguo fan 2020-07-07 10:06:18 UTC
Hi,  zenghui.shi,

> > But when network-resources-injector is installed by SRIOV operator,
> > those network plugins are also affected.
> 
> Could you elaborate the issue?

At first, when we enabled only the Macvlan plugin by following the document below,
network-resources-injector was not installed.

  https://docs.openshift.com/container-platform/4.4/networking/multiple_networks/configuring-macvlan.html
  Configuring a macvlan network

The document does not require adding the namespace in the network annotation.
We can consume the Macvlan resource via a deployment without adding the namespace in the network annotation.

     # cat mac1-br.yaml
     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: brdsamplepod1
       labels:
         version: v1
     spec:
       replicas: 1
       selector:
         matchLabels:
           app: brdsamplepod1
           version: v1
       template:
         metadata:
           annotations:
             k8s.v1.cni.cncf.io/networks: '[
           {
             "name": "macvlan",
             "ips": [ "192.168.250.3/24" ]
           }
         ]'
           labels:
             app: brdsamplepod1
             version: v1
         spec:
           containers:
           - name: samplepod
             command: ["/bin/bash", "-c", "sleep 2000000000000"]
             image: 172.16.68.92:5000/centos/tools:latest
             imagePullPolicy: IfNotPresent #Always
           nodeSelector:
             mtest: gpu0

But after we installed the SRIOV operator on the same cluster, network-resources-injector was installed by default.
Since then, the deployment above has failed.

Of course, it can also be worked around by adding the namespace in the Macvlan network annotation.
But it is really confusing for users.

So, we think network-resources-injector and the workaround should also be mentioned in the
documentation of other network plugins such as Macvlan and Ipvlan.

> > > This is not a workaround. 
> > > The workaround is to disable network-resource-injectors by editing SriovOperatorConfig default CR.
> > 
> > As we verified that SRIOV network cannot work(we cannot launch pods with
> > SRIOV network)if we disabled network-resource-injectors.
> > Please look at the details below.
> > Is this bug? Should we file a new bug ticket for it?
> 
> By disabling network-resources-injector, resource request will no longer be injected to pod/deployment spec automatically, which result in the pod creation failure.
> 
> network-resources-injector looks for net-attach-def in pod annotation and inspects it to get the sriov resource name, 
> it injects the sriov resource name in pod resource request/limit automatically using mutating webhook. then kubelet allocates a sriov resource for the pod according to 
> the reqeust in pod spec. when network-resources-injector is disabled, user will need to manually add sriov resource name in pod/deployment resource request/limit. 
> This is the expected behavior.
> 
> For example:
> 
> 1) the sriov resource name is: openshift.io/gpu
> 2) add the resource name in deployment spec:

Thank you for the information.
We will verify it and let you know the result.

If this works, we think this information should be included in the documentation.
Otherwise, users will be as confused as we were.

Regards.
Weiguo Fan.

Comment 15 zhaozhanqi 2020-07-07 10:54:16 UTC
Yes, this seems to be a bug in network-resources-injector:

 1. With network-resources-injector disabled (enableInjector: false): there is no need to specify the `namespace` in k8s.v1.cni.cncf.io/networks when creating a pod with a deployment for the macvlan CNI.
 2. With network-resources-injector enabled (enableInjector: true): the `namespace` also needs to be added for the macvlan CNI; otherwise, see the logs:

Error creating: admission webhook "network-resources-injector-config.k8s.io" denied the request: could not find network attachment definition 'default/macvlan-bridge': could not get Network Attachment Definition default/macvlan-bridge: the server could not find the requested resource
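With the injector enabled, the pod-template annotation must name the namespace explicitly. A minimal sketch (the net-attach-def name `macvlan-bridge` and namespace `default` are taken from the error message above):

```yaml
# Pod-template annotation with an explicit namespace; name/namespace here
# mirror the net-attach-def referenced in the error above.
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      { "name": "macvlan-bridge", "namespace": "default" }
    ]'
```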

Comment 16 weiguo fan 2020-07-10 02:08:40 UTC
Hi,  zenghui.shi,

> network-resources-injector looks for net-attach-def in pod annotation and inspects it to get the sriov resource name, 
> it injects the sriov resource name in pod resource request/limit automatically using mutating webhook. then kubelet allocates a sriov resource for the pod according to 
> the reqeust in pod spec. when network-resources-injector is disabled, user will need to manually add sriov resource name in pod/deployment resource request/limit. 
> This is the expected behavior.

The SRIOV network is not inserted into my pod even after I specified the resource request/limit in its deployment,
with network-resources-injector disabled.

Here is what I did. Is there anything wrong with my configuration?

------------------------------------------
- Disable network-resources-injector
     # oc patch sriovoperatorconfig default   --type=merge -n openshift-sriov-network-operator   --patch '{ "spec": { "enableInjector": false } }'
         
     # oc get pods -n openshift-sriov-network-operator -o wide
     NAME                                      READY   STATUS    RESTARTS   AGE    IP              NODE                NOMINATED NODE   READINESS GATES
     operator-webhook-899cw                    1/1     Running   0          32h    10.128.0.211    host-172-16-68-96   <none>           <none>
     operator-webhook-c6gpt                    1/1     Running   0          32h    10.129.0.251    host-172-16-68-94   <none>           <none>
     operator-webhook-ltgzb                    1/1     Running   0          32h    10.130.1.62     host-172-16-68-95   <none>           <none>
     sriov-cni-gz9r8                           1/1     Running   0          32h    10.130.2.55     gpu0.nec.test       <none>           <none>
     sriov-device-plugin-m2f6h                 1/1     Running   0          151m   172.16.64.241   gpu0.nec.test       <none>           <none>
     sriov-network-config-daemon-55gr4         1/1     Running   1          32h    172.16.68.97    host-172-16-68-97   <none>           <none>
     sriov-network-config-daemon-cqgnh         1/1     Running   1          32h    172.16.68.98    host-172-16-68-98   <none>           <none>
     sriov-network-config-daemon-fkvls         1/1     Running   1          32h    172.16.64.242   gpu1.nec.test       <none>           <none>
     sriov-network-config-daemon-xdmt7         1/1     Running   1          32h    172.16.64.241   gpu0.nec.test       <none>           <none>
     sriov-network-operator-587d55579d-4sdvp   1/1     Running   0          32h    10.129.0.249    host-172-16-68-94   <none>           <none>

- The configuration of SRIOV network.

     # oc get sriovnetwork darnet -o yaml
     apiVersion: sriovnetwork.openshift.io/v1
     kind: SriovNetwork
     metadata:
       annotations:
         operator.sriovnetwork.openshift.io/last-network-namespace: cde
       creationTimestamp: "2020-06-16T08:52:47Z" 
       finalizers:
       - netattdef.finalizers.sriovnetwork.openshift.io
       generation: 8
       name: darnet
       namespace: openshift-sriov-network-operator
       resourceVersion: "55679617" 
       selfLink: /apis/sriovnetwork.openshift.io/v1/namespaces/openshift-sriov-network-operator/sriovnetworks/darnet
       uid: 1b0dde41-6e30-4907-8105-f6f15d565052
     spec:
       capabilities: '{"mac": true, "ips": true}'
       ipam: '{}'
       networkNamespace: cde
       resourceName: gpu

     
     # oc describe network-attachment-definitions.k8s.cni.cncf.io -n cde
     Name:         darnet
     Namespace:    cde
     Labels:       <none>
     Annotations:  k8s.v1.cni.cncf.io/resourceName: openshift.io/gpu
     API Version:  k8s.cni.cncf.io/v1
     Kind:         NetworkAttachmentDefinition
     Metadata:
       Creation Timestamp:  2020-06-23T06:11:58Z
       Generation:          2
       Resource Version:    55679618
       Self Link:           /apis/k8s.cni.cncf.io/v1/namespaces/cde/network-attachment-definitions/darnet
       UID:                 63d7a99a-8f4e-43fc-938d-bced4144d88d
     Spec:
       Config:  { "cniVersion":"0.3.1", "name":"sriov-net", "type":"sriov", "vlan":0,"capabilities":{"mac": true, "ips": true},"vlanQoS":0,"ipam":{} }
     Status:
     Events:  <none>

- The sample deployment we use. We specified request/limit as what you mentioned.

     # cat p.yaml
     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: brdsamplepod2
       labels:
         version: v1
     spec:
       replicas: 1
       selector:
         matchLabels:
           app: brdsamplepod2
           version: v1
       template:
         metadata:
           annotations:
             k8s.v1.cni.cncf.io/networks: '[
           {
             "name": "darnet",
             "namespace": "cde",
             "ips": [ "192.168.250.3/24" ]
           }
         ]'
           labels:
             app: brdsamplepod2
             version: v1
         spec:
           containers:
           - name: samplepod
             command: ["/bin/bash", "-c", "sleep 2000000000000"]
             image: 172.16.68.92:5000/centos/tools:latest
             imagePullPolicy: IfNotPresent #Always
             resources:
               requests:
                 openshift.io/gpu: 1                 ## using resource "gpu" 
               limits:
                 openshift.io/gpu: 1

- But SRIOV network is not inserted into the Pod.

     # oc get pods -o wide -n cde
     NAME                             READY   STATUS    RESTARTS   AGE     IP            NODE            NOMINATED NODE   READINESS GATES
     brdsamplepod2-7c97b66cdd-hdrv6   1/1     Running   0          4m26s   10.130.2.63   gpu0.nec.test   <none>           <none>
     #
     
     # oc describe pod brdsamplepod2-7c97b66cdd-hdrv6 -n cde
     Name:         brdsamplepod2-7c97b66cdd-hdrv6
     Namespace:    cde
     Priority:     0
     Node:         gpu0.nec.test/172.16.64.241
     Start Time:   Wed, 08 Jul 2020 03:23:17 -0400
     Labels:       app=brdsamplepod2
                   pod-template-hash=7c97b66cdd
                   version=v1
     Annotations:  k8s.v1.cni.cncf.io/networks: [ { "name": "darnet", "namespace": "cde", "ips": [ "192.168.250.3/24" ] } ]
                   k8s.v1.cni.cncf.io/networks-status:
                     [{
                         "name": "openshift-sdn",
                         "interface": "eth0",
                         "ips": [
                             "10.130.2.63" 
                         ],
                         "dns": {},
                         "default-route": [
                             "10.130.2.1" 
                         ]
                     },{
                         "name": "sriov-net",
                         "interface": "net1",
                         "dns": {}
                     }]
                   openshift.io/scc: restricted
     Status:       Running
     IP:           10.130.2.63
     IPs:
       IP:           10.130.2.63
     Controlled By:  ReplicaSet/brdsamplepod2-7c97b66cdd
     Containers:
       samplepod:
         Container ID:  cri-o://8958640e44b4a1d56e4ccea3fe2718765c00c8290cbb57f1cac1ece90ca8cca3
         Image:         172.16.68.92:5000/centos/tools:latest
         Image ID:      172.16.68.92:5000/centos/tools@sha256:81159542603c2349a276a30cc147045bc642bd84a62c1b427a8d243ef1893e2f
         Port:          <none>
         Host Port:     <none>
         Command:
           /bin/bash
           -c
           sleep 2000000000000
         State:          Running
           Started:      Wed, 08 Jul 2020 03:23:19 -0400
         Ready:          True
         Restart Count:  0
         Limits:
           openshift.io/gpu:  1
         Requests:
           openshift.io/gpu:  1
         Environment:         <none>
         Mounts:
           /var/run/secrets/kubernetes.io/serviceaccount from default-token-stn8w (ro)
     Conditions:
       Type              Status
       Initialized       True
       Ready             True
       ContainersReady   True
       PodScheduled      True
     Volumes:
       default-token-stn8w:
         Type:        Secret (a volume populated by a Secret)
         SecretName:  default-token-stn8w
         Optional:    false
     QoS Class:       BestEffort
     Node-Selectors:  <none>
     Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                      node.kubernetes.io/unreachable:NoExecute for 300s
     Events:
       Type    Reason     Age        From                    Message
       ----    ------     ----       ----                    -------
       Normal  Scheduled  <unknown>  default-scheduler       Successfully assigned cde/brdsamplepod2-7c97b66cdd-hdrv6 to gpu0.nec.test
       Normal  Pulled     4m6s       kubelet, gpu0.nec.test  Container image "172.16.68.92:5000/centos/tools:latest" already present on machine
       Normal  Created    4m6s       kubelet, gpu0.nec.test  Created container samplepod
       Normal  Started    4m6s       kubelet, gpu0.nec.test  Started container samplepod
     
     # oc -n cde rsh brdsamplepod2-7c97b66cdd-hdrv6 ip -4 a                 ## No sriov interface gets attached
     1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
         inet 127.0.0.1/8 scope host lo
            valid_lft forever preferred_lft forever
     3: eth0@if77: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default  link-netnsid 0
         inet 10.130.2.63/23 brd 10.130.3.255 scope global eth0
            valid_lft forever preferred_lft forever

Comment 17 zhaozhanqi 2020-07-10 07:11:25 UTC
(In reply to weiguo fan from comment #16)
> Hi,  zenghui.shi,
> [...]
>      Spec:
>        Config:  { "cniVersion":"0.3.1", "name":"sriov-net", "type":"sriov",
> "vlan":0,"capabilities":{"mac": true, "ips": true},"vlanQoS":0,"ipam":{} }

Could you check with "ipam":{"type":"static"}? I see it is empty here — is that expected?

I did a test with "ipam":{"type":"static"}; it works well on my side.
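For reference, the SriovNetwork from this report with a non-empty static IPAM config would look roughly like this (a sketch based on the `darnet` spec shown earlier; only the ipam field differs):

```yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: darnet
  namespace: openshift-sriov-network-operator
spec:
  capabilities: '{"mac": true, "ips": true}'
  # Non-empty IPAM, so the static IPs given in the pod annotation are applied
  ipam: '{"type": "static"}'
  networkNamespace: cde
  resourceName: gpu
```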

Comment 18 zenghui.shi 2020-07-10 12:25:23 UTC
> >        Self Link:          
> > /apis/k8s.cni.cncf.io/v1/namespaces/cde/network-attachment-definitions/darnet
> >        UID:                 63d7a99a-8f4e-43fc-938d-bced4144d88d
> >      Spec:
> >        Config:  { "cniVersion":"0.3.1", "name":"sriov-net", "type":"sriov",
> > "vlan":0,"capabilities":{"mac": true, "ips": true},"vlanQoS":0,"ipam":{} }
> 
> could you help check with ipam":{"type":"static"},  saw here is empty, is
> this expected?
> 
> I did a test with ipam":{"type":"static"}, it works well in my side.

Yes, try with static ipam. By the way, what is the deviceType used in the SR-IOV network node policy? Is it vfio-pci or netdevice?

Could you please also check the environment variables inside the container to see if there is an env var with the prefix PCIDEVICE_?

Comment 19 zenghui.shi 2020-07-14 03:34:02 UTC
(In reply to zhaozhanqi from comment #15)
> yes, seems this is bug for network-resources-injector
> 
>  1. disable the network-resources-injector ( enableInjector: false) : no
> need specified the `namespace` in k8s.v1.cni.cncf.io/networks when create
> pod with deployment for macvlan CNI
>  2. enable network-resources-injector ( enableInjector: true): the
> `namespace` is also need to added for macvlan CNI. otherwise: see logs: 
> 
> Error creating: admission webhook "network-resources-injector-config.k8s.io"
> denied the request: could not find network attachment definition
> 'default/macvlan-bridge': could not get Network Attachment Definition
> default/macvlan-bridge: the server could not find the requested resource

Weiguo, Zhanqi:

Yes, it looks like this is indeed an issue with network-resources-injector; some explanation below:

network-resources-injector inspects the net-attach-def and injects the SR-IOV resource request into the pod resource requests/limits.
It works perfectly with a plain pod, but has issues handling a deployment pod in a non-default namespace.

The workflow of network resources injector is:

1) watch the creation/update of pods
2) receive the AdmissionReview request
3) deserialize the pod object from the AdmissionReview request
4) set the pod namespace to 'default' if the pod object doesn't contain a valid namespace (e.g. an empty namespace)
5) retrieve the net-attach-def using the above namespace + the net-attach-def name (from the pod object annotation)
6) if a resource is found in the net-attach-def, inject the resource name into the pod object; else do nothing

The problem for a deployment pod happens in step 5): the deserialized pod object of a deployment contains an empty namespace field, so it is set to the default namespace, which in turn means the actual namespace of the deployment is not respected. So when network-resources-injector tries to get the net-attach-def in the default namespace, it fails.

There may be two ways to fix this:

1) find a way to get the pod namespace from the deployment
2) watch the creation/update of the deployment object and mutate the deployment spec (this implies we should ignore pod creation/update events coming from the same deployment)

I'll explore the above options and see what's the best way to fix it.

Meanwhile, the workaround when network-resources-injector is enabled is to always specify the namespace field for the net-attach-def in the pod annotation.
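That is, the pod-template annotation should carry the namespace explicitly, as in this sketch using the `darnet` net-attach-def and `cde` namespace from this report:

```yaml
# Workaround while the injector bug is unfixed: name the namespace explicitly
# so the injector does not fall back to 'default'.
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      { "name": "darnet", "namespace": "cde", "ips": [ "192.168.250.3/24" ] }
    ]'
```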

Comment 20 weiguo fan 2020-07-17 07:37:12 UTC
hi,  zenghui.shi,

> could you help check with ipam":{"type":"static"},  saw here is empty, is this expected?
>
> I did a test with ipam":{"type":"static"}, it works well in my side.

Thank you for checking.
We verified that after adding "ipam":{"type":"static"}, the workaround works well.

> There maybe two ways to fix this:
>
> 1) find a way to get pod namespace from deployment
> 2) watch the creation/update of deployment object and mutate the deployment spec (this implies that we should ignore the pod creation/update event that's from the same deployment)
> 
> I'll explore above options and see what's the best way to fix it.
> 
> Meanwhile the workaround when network-resources-injector is enabled, is to always specify the namespace field for net-attach-def in pod annotation.

Thanks.

Comment 24 zenghui.shi 2020-08-21 07:36:08 UTC
Note: the fix for this bug in network-resources-injector requires another change [1] in the SR-IOV Operator in order to function properly.
Make sure to use the right version of the SR-IOV Operator (one containing the fix for [1]) when verifying this bug.

[1]:https://bugzilla.redhat.com/show_bug.cgi?id=1870915

Comment 25 zenghui.shi 2020-08-21 07:40:22 UTC
*** Bug 1840962 has been marked as a duplicate of this bug. ***

Comment 30 errata-xmlrpc 2020-10-27 16:06:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

