Bug 2092839

Summary: Downward API (annotations) is missing PCI information when using the tuning metaPlugin on SR-IOV Networks
Product: OpenShift Container Platform
Reporter: Jitendra Pradhan <jpradhan>
Component: Networking
Assignee: Douglas Smith <dosmith>
Networking sub component: multus
QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA
Docs Contact:
Severity: urgent
Priority: urgent
CC: adubey, akashem, anbhat, cback, cgoncalves, dosmith, ealcaniz, eglottma, mfojtik, mmirecki, openshift-bugs-escalate, wlewis, zzhao
Version: 4.7
Target Milestone: ---
Target Release: 4.11.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-02 06:33:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 7 zhaozhanqi 2022-06-23 06:54:31 UTC
Hi dosmith, I'm trying to verify this bug following the test cases, but the pods never become ready. Do we support tuning now?

The pods cannot become ready with the following net-attach-def from the cases mentioned:

metaPlugins: |
    {
      "type": "tuning",
      "sysctl": {
        "net.ipv4.conf.all.accept_ra":"0",
        "net.ipv4.conf.all.autoconf":"0"
      }
    }

When creating pods with the above config, pod creation fails with:

  Warning  FailedCreatePodSandBox  7s    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_testpod15pz9c_z1_2b979196-3799-4056-b9c2-da2251b69faa_0(4daa03b4627dfe16553101dcaa7e24cf945c9f12edd070694dc7cf5e9efee380): error adding pod z1_testpod15pz9c to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [z1/testpod15pz9c/2b979196-3799-4056-b9c2-da2251b69faa:intel-netdevice-rhcos]: error adding container to network "intel-netdevice-rhcos": Sysctl net.ipv4.conf.all.accept_ra is not allowed. Only the following sysctls are allowed: [^net.ipv4.conf.IFNAME.accept_ra$ ^net.ipv4.conf.IFNAME.accept_redirects$ ^net.ipv4.conf.IFNAME.accept_source_route$ ^net.ipv4.conf.IFNAME.arp_accept$ ^net.ipv4.conf.IFNAME.arp_notify$ ^net.ipv4.conf.IFNAME.disable_policy$ ^net.ipv4.conf.IFNAME.secure_redirects$ ^net.ipv4.conf.IFNAME.send_redirects$ ^net.ipv6.conf.IFNAME.accept_ra$ ^net.ipv6.conf.IFNAME.accept_redirects$ ^net.ipv6.conf.IFNAME.accept_source_route$ ^net.ipv6.conf.IFNAME.arp_accept$ ^net.ipv6.conf.IFNAME.arp_notify$ ^net.ipv6.neigh.IFNAME.base_reachable_time_ms$ ^net.ipv6.neigh.IFNAME.retrans_time_ms$]
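
The rejection above can be reproduced offline with a minimal sketch. It assumes the allowlist entries are applied as regular expressions against the sysctl key exactly as written in the net-attach-def; the pattern list is copied from the error message, while the function name `is_allowed` and the matching logic are illustrative assumptions, not the plugin's actual code.

```python
import re

# Allowlist copied verbatim from the error message in this comment.
ALLOWED_PATTERNS = """
^net.ipv4.conf.IFNAME.accept_ra$ ^net.ipv4.conf.IFNAME.accept_redirects$
^net.ipv4.conf.IFNAME.accept_source_route$ ^net.ipv4.conf.IFNAME.arp_accept$
^net.ipv4.conf.IFNAME.arp_notify$ ^net.ipv4.conf.IFNAME.disable_policy$
^net.ipv4.conf.IFNAME.secure_redirects$ ^net.ipv4.conf.IFNAME.send_redirects$
^net.ipv6.conf.IFNAME.accept_ra$ ^net.ipv6.conf.IFNAME.accept_redirects$
^net.ipv6.conf.IFNAME.accept_source_route$ ^net.ipv6.conf.IFNAME.arp_accept$
^net.ipv6.conf.IFNAME.arp_notify$ ^net.ipv6.neigh.IFNAME.base_reachable_time_ms$
^net.ipv6.neigh.IFNAME.retrans_time_ms$
""".split()


def is_allowed(sysctl_key):
    """Return True if the key matches any allowlist pattern (assumed regex match)."""
    return any(re.match(pattern, sysctl_key) for pattern in ALLOWED_PATTERNS)


if __name__ == "__main__":
    for key in (
        "net.ipv4.conf.all.accept_ra",     # rejected: "all" is not the IFNAME placeholder
        "net.ipv4.conf.IFNAME.accept_ra",  # passes the allowlist check
        "net.ipv4.conf.IFNAME.autoconf",   # rejected: autoconf is not in the list
    ):
        print(key, "->", "allowed" if is_allowed(key) else "not allowed")
```

Under this assumption, only keys that use the literal IFNAME placeholder and name one of the listed settings pass; note that `autoconf` is absent from the list in either form.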


*****************************************

After I changed "all" to IFNAME:

metaPlugins: |
    {
      "type": "tuning",
      "sysctl": {
        "net.ipv4.conf.IFNAME.accept_ra":"0",
        "net.ipv4.conf.IFNAME.autoconf":"0"
      }
    }


it also failed with:


  Warning  FailedCreatePodSandBox  3m33s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_testpod1gtxdl_z1_7d8bb54d-fabf-403b-a921-1e904634ca28_0(c4ed130bd5a20aae82be7cdc62de57c045c009a0ed4942d6c11fb1dccb49520c): error adding pod z1_testpod1gtxdl to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [z1/testpod1gtxdl/7d8bb54d-fabf-403b-a921-1e904634ca28:intel-netdevice-rhcos]: error adding container to network "intel-netdevice-rhcos": open /proc/sys/net/ipv4/conf/net1/accept_ra: no such file or directory
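
The path in this error suggests that IFNAME is replaced with the pod interface name (net1 here) and the dotted key is turned into a file under /proc/sys. A minimal sketch of that mapping, assuming a plain string substitution (the helper name `sysctl_path` is hypothetical and not taken from the plugin):

```python
import os


def sysctl_path(key, ifname):
    """Map a dotted sysctl key containing the IFNAME placeholder to its /proc/sys path.

    Assumption: IFNAME is substituted first, then every "." becomes "/".
    Interface names that themselves contain dots are not handled here.
    """
    return "/proc/sys/" + key.replace("IFNAME", ifname).replace(".", "/")


if __name__ == "__main__":
    path = sysctl_path("net.ipv4.conf.IFNAME.accept_ra", "net1")
    print(path)
    # On a Linux host this reports whether the kernel exposes the entry at all.
    print("exists:", os.path.exists(path))
```

In the Linux kernel, accept_ra is an IPv6 setting exposed under /proc/sys/net/ipv6/conf/, so the IPv4 path cannot be opened even though the key passes the allowlist check. An IPv4 key such as net.ipv4.conf.IFNAME.arp_accept does not have this problem.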




1. Initialize the SR-IOV VFs with:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: intel-netdevice
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  nicSelector:
    pfNames:
      - ens1f0
    rootDevices:
      - '0000:3b:00.0'
    vendor: '8086'
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: 'true'
  numVfs: 2
  priority: 99
  resourceName: intelnetdevice

2. SriovNetwork CR:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: intel-netdevice-rhcos
  namespace: openshift-sriov-network-operator
spec:
  ipam: |
    {
      "type": "host-local",
      "subnet": "10.56.217.0/24",
      "rangeStart": "10.56.217.171",
      "rangeEnd": "10.56.217.181",
      "routes": [{
        "dst": "0.0.0.0/0"
      }],
      "gateway": "10.56.217.1"
    }
  vlan: 0
  resourceName: intelnetdevice
  networkNamespace: z1
  metaPlugins : |
    {
      "type": "tuning",
      "sysctl": {
        "net.ipv4.conf.all.accept_ra": "0"
      }
    }

3. Pod YAML file:

apiVersion: v1
kind: Pod
metadata:
  generateName: testpod1
  namespace: z1
  labels:
    env: test
  annotations:
    k8s.v1.cni.cncf.io/networks: intel-netdevice-rhcos
spec:
  containers:
  - name: test-pod
    image: quay.io/openshifttest/hello-sdn@sha256:d5785550cf77b7932b090fcd1a2625472912fb3189d5973f177a5a2c347a1f95

Comment 11 zhaozhanqi 2022-07-12 03:49:21 UTC
Hi dosmith, I moved this bug back to ASSIGNED per comment 7: https://bugzilla.redhat.com/show_bug.cgi?id=2092839#c7

Comment 14 Akash Dubey 2022-09-22 08:19:45 UTC
Hi Team,

Is there an update on the fix and the backport work that we can share with the customer?

Comment 17 Carlos Goncalves 2022-10-17 08:30:50 UTC
OCP 4.10.z backport: https://issues.redhat.com/browse/OCPBUGS-2448

Comment 19 zhaozhanqi 2022-10-19 09:13:28 UTC
Verified this bug.

Updated the YAML from comment 7 to:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: intel-netdevice-rhcos
  namespace: openshift-sriov-network-operator
spec:
  ipam: |
    {
      "type": "host-local",
      "subnet": "10.56.217.0/24",
      "rangeStart": "10.56.217.171",
      "rangeEnd": "10.56.217.181",
      "routes": [{
        "dst": "0.0.0.0/0"
      }],
      "gateway": "10.56.217.1"
    }
  vlan: 0
  resourceName: intelnetdevice
  networkNamespace: z1
  metaPlugins : |
    {
      "type": "tuning",
      "sysctl": {
        "net.ipv4.conf.IFNAME.arp_accept": "0"
      }
    }

After the pod is created, we can see that pci-address is added to the annotations:

bug_2092839]# oc rsh -n z1 testpod1vqkvv
/ # cat /etc/podnetinfo/
..2022_10_19_09_06_06.1052521583/  ..data/                            annotations                        labels
/ # cat /etc/podnetinfo/annotations 
k8s.ovn.org/pod-networks="{\"default\":{\"ip_addresses\":[\"10.131.0.109/23\"],\"mac_address\":\"0a:58:0a:83:00:6d\",\"gateway_ips\":[\"10.131.0.1\"],\"ip_address\":\"10.131.0.109/23\",\"gateway_ip\":\"10.131.0.1\"}}"
k8s.v1.cni.cncf.io/network-status="[{\n    \"name\": \"ovn-kubernetes\",\n    \"interface\": \"eth0\",\n    \"ips\": [\n        \"10.131.0.109\"\n    ],\n    \"mac\": \"0a:58:0a:83:00:6d\",\n    \"default\": true,\n    \"dns\": {}\n},{\n    \"name\": \"z1/intel-netdevice-rhcos\",\n    \"interface\": \"net1\",\n    \"ips\": [\n        \"10.56.217.180\"\n    ],\n    \"mac\": \"0a:d1:9b:17:1e:ca\",\n    \"dns\": {},\n    \"device-info\": {\n        \"type\": \"pci\",\n        \"version\": \"1.0.0\",\n        \"pci\": {\n            \"pci-address\": \"0000:3b:02.0\"\n        }\n    }\n}]"
k8s.v1.cni.cncf.io/networks="intel-netdevice-rhcos"
k8s.v1.cni.cncf.io/networks-status="[{\n    \"name\": \"ovn-kubernetes\",\n    \"interface\": \"eth0\",\n    \"ips\": [\n        \"10.131.0.109\"\n    ],\n    \"mac\": \"0a:58:0a:83:00:6d\",\n    \"default\": true,\n    \"dns\": {}\n},{\n    \"name\": \"z1/intel-netdevice-rhcos\",\n    \"interface\": \"net1\",\n    \"ips\": [\n        \"10.56.217.180\"\n    ],\n    \"mac\": \"0a:d1:9b:17:1e:ca\",\n    \"dns\": {},\n    \"device-info\": {\n        \"type\": \"pci\",\n        \"version\": \"1.0.0\",\n        \"pci\": {\n            \"pci-address\": \"0000:3b:02.0\"\n        }\n    }\n}]"
kubernetes.io/config.seen="2022-10-19T09:06:03.128907322Z"
kubernetes.io/config.source="api"
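
For automated verification, the pci-address can be pulled out of the annotations file with a short script. This is a minimal sketch, assuming each line has the form key="quoted value" as shown above and that the quoting can be decoded as a JSON string; the function names are hypothetical and the sample data is trimmed from the output in this comment.

```python
import json

# Trimmed sample in the same escaped form as /etc/podnetinfo/annotations above.
SAMPLE = (
    r'k8s.v1.cni.cncf.io/networks="intel-netdevice-rhcos"' "\n"
    r'k8s.v1.cni.cncf.io/network-status="[{\n    \"name\": \"ovn-kubernetes\",\n    \"interface\": \"eth0\"\n},'
    r'{\n    \"name\": \"z1/intel-netdevice-rhcos\",\n    \"interface\": \"net1\",\n'
    r'    \"device-info\": {\n        \"type\": \"pci\",\n        \"version\": \"1.0.0\",\n'
    r'        \"pci\": {\n            \"pci-address\": \"0000:3b:02.0\"\n        }\n    }\n}]"'
)


def parse_annotations(text):
    """Split each key="value" line and undo the quoting of the value."""
    annotations = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, quoted = line.partition("=")
        annotations[key] = json.loads(quoted)  # assumption: quoting is JSON-compatible
    return annotations


def pci_addresses(annotations, key="k8s.v1.cni.cncf.io/network-status"):
    """Return {interface: pci-address} for entries that carry PCI device-info."""
    status = json.loads(annotations[key])
    return {
        entry["interface"]: entry["device-info"]["pci"]["pci-address"]
        for entry in status
        if entry.get("device-info", {}).get("type") == "pci"
    }


if __name__ == "__main__":
    print(pci_addresses(parse_annotations(SAMPLE)))
```

Inside the pod, the same functions could be fed the contents of /etc/podnetinfo/annotations instead of SAMPLE. An empty result would indicate that the device-info block is missing, which is the symptom this bug describes.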

Comment 25 errata-xmlrpc 2022-11-02 06:33:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.12 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7201

Comment 29 Red Hat Bugzilla 2023-09-18 04:38:20 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days