Bug 1857440

Summary: [sriov] openshift.io/intelsriov resources are no longer available after worker node is restarted
Product: OpenShift Container Platform
Reporter: Walid A. <wabouham>
Component: Networking
Assignee: Peng Liu <pliu>
Networking sub component: SR-IOV
QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
CC: ddharwar, dosmith, mifiedle, zshi
Version: 4.6
Target Milestone: ---
Target Release: 4.6.0
Hardware: x86_64
OS: Linux
Type: Bug
Last Closed: 2020-10-27 16:14:43 UTC

Description Walid A. 2020-07-15 20:46:49 UTC
Description of problem:
Restarting a worker node on an OCP 4.5.0-rc.5 baremetal IPI cluster (3 master nodes and 2 worker nodes), with the openshift-sriov-network-operator deployed, causes the openshift.io/intelsriov VF resources to become unavailable, and pods requesting them can no longer be deployed.  In `oc describe node` output for the worker node, the 'openshift.io/intelsriov:' value drops from 8 to 0 under both the Capacity and Allocatable sections after the reboot:
.
.
.

Capacity:
  cpu:                      80
  ephemeral-storage:        1715748Mi
  hugepages-1Gi:            4Gi
  hugepages-2Mi:            0
  memory:                   196655440Ki
  openshift.io/intelsriov:  0   <========== previously was 8 before reboot
  pods:                     250
Allocatable:
  cpu:                      79500m
  ephemeral-storage:        1618109212859
  hugepages-1Gi:            4Gi
  hugepages-2Mi:            0
  memory:                   191310160Ki
  openshift.io/intelsriov:  0   <========== previously was 8 before reboot
  pods:                     250
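
For reference, the same counter can also be queried directly from the node object; a minimal sketch, with the node name as a placeholder:

# oc get node <workernode> -o jsonpath='{.status.allocatable.openshift\.io/intelsriov}'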


All the pods in the openshift-sriov-network-operator namespace are running after the worker node is restarted.

As a workaround, restarting the sriov-device-plugin pod in the openshift-sriov-network-operator namespace fixes this issue: the openshift.io/intelsriov resources become available again and show up correctly in `oc describe node` output.
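
A sketch of that workaround, assuming the device-plugin pods carry the app=sriov-device-plugin label (the pod is managed by a DaemonSet, so deleting it triggers automatic recreation):

# oc -n openshift-sriov-network-operator delete pod -l app=sriov-device-plugin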


Version-Release number of selected component (if applicable):
4.5.0-rc.5 

How reproducible:
Always

Steps to Reproduce:
1. Deploy an OCP 4.5.x baremetal IPI cluster
2. Deploy the SR-IOV operator, then create a SriovNetworkNodePolicy with 8 VFs and a SriovNetwork (an example policy is sketched after these steps)
3. Restart the worker node from the OpenShift Console (Compute -> Baremetal Hosts -> worker-0, Action: Restart)
4. After the rebooted worker node comes back up and is Ready, run `oc describe node <workernode>`
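
For step 2, a minimal sketch of such a policy. The resourceName matches the prefix used in this bug, but the node selector label and PF name are illustrative placeholders that must match the actual nodes and Intel NIC; the SriovNetwork referencing "intelsriov" is created the same way:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-intelsriov
  namespace: openshift-sriov-network-operator
spec:
  resourceName: intelsriov    # surfaces as openshift.io/intelsriov on the node
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 8                   # the 8 VFs expected under Capacity/Allocatable
  nicSelector:
    pfNames:
      - ens2f0                # placeholder PF name
  deviceType: netdevice

Applied with, for example:

# oc apply -f <policy-file>.yaml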

Actual results:
The openshift.io/intelsriov resource count is 0 in `oc describe node` output after the reboot.

Expected results:
oc describe node after reboot should show:

Capacity:
.
.
  openshift.io/intelsriov:  8
  pods:                     250
Allocatable:
.
.

  openshift.io/intelsriov:  8
  pods:                     250

Additional info:
Links to the must-gather logs, oc command outputs, and sriov operator logs are in the next comment.

Comment 2 zhaozhanqi 2020-07-16 07:14:26 UTC
This issue is not reproduced in my cluster.
After the node is rebooted, the sriov-device-plugin pod is re-created automatically, and once it is running the VF resources are restored correctly.

sriov-device-plugin-tm454                 1/1     Running   0          37s


oc get node -o yaml | grep "openshift.io/intelnetdevice"
         
    openshift.io/intelnetdevice: "5"
    openshift.io/intelnetdevice: "5"

Comment 3 Walid A. 2020-07-16 18:41:58 UTC
@zhaozhanqi, on which version of OCP do you not see this issue?  You have "openshift.io/intelnetdevice" resources and we have "openshift.io/intelsriov"; I'm not sure if that is the reason, or if our configurations are different.
We are also seeing this issue on another baremetal IPI cluster (same hardware) running OCP 4.5.0-rc.1 (@ddharwar's cluster).

Comment 4 Peng Liu 2020-07-20 08:48:08 UTC
Hi Walid,

Could you share the output of 'oc get -n openshift-sriov-network-operator SriovNetworkNodeState -o yaml' and the log of the 'sriov-network-config-daemon-xxxx' pod after the node reboot? Also, please check the node resources once the 'syncStatus' in the SriovNetworkNodeState CR has become 'Succeeded'.
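
For reference, a sketch of how to check both (the node and pod names are placeholders):

# oc -n openshift-sriov-network-operator get sriovnetworknodestates <workernode> -o jsonpath='{.status.syncStatus}'
# oc -n openshift-sriov-network-operator logs sriov-network-config-daemon-xxxx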

Comment 5 Peng Liu 2020-07-20 14:49:04 UTC
I cannot reproduce this issue in my environment either. As zhanqi mentioned, the sriov-network-device-plugin pod should be recreated after the node reboot.

Comment 7 Peng Liu 2020-07-21 03:08:40 UTC
Hi Walid,

According to the log you posted, you're using the 4.6 origin images instead of the downstream 4.5 images of the sriov network operator.
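
For reference, one way to confirm which operator image a cluster is actually running (a sketch, assuming the deployment is named sriov-network-operator, consistent with the operator pod shown later in this bug):

# oc -n openshift-sriov-network-operator get deployment sriov-network-operator -o jsonpath='{.spec.template.spec.containers[0].image}'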

Comment 10 Walid A. 2020-07-29 05:12:20 UTC
Hi Peng,
I was able to verify that openshift.io/intelsriov resources do not disappear after a worker node reboot on an OCP 4.5.3 baremetal cluster with v4.5 sriov-network-operator images (deployed from OperatorHub).

However, when I tried to deploy the sriov-network-operator from the GitHub master branch or from release-4.6 (both have the merged fix PR openshift/sriov-network-operator/pull/308), the sriov-cni pod goes into Init:CrashLoopBackOff after the sriov policy is deployed:

# oc get pods -n openshift-sriov-network-operator
NAME                                      READY   STATUS                  RESTARTS   AGE
network-resources-injector-5fghq          1/1     Running                 0          138m
network-resources-injector-5n4kt          1/1     Running                 0          138m
network-resources-injector-7rdlc          1/1     Running                 0          138m
operator-webhook-97wbd                    1/1     Running                 0          138m
operator-webhook-s5hv2                    1/1     Running                 0          138m
operator-webhook-sjrcd                    1/1     Running                 0          138m
sriov-cni-ft6r8                           0/1     Init:CrashLoopBackOff   28         124m
sriov-device-plugin-lmknz                 1/1     Running                 0          122m
sriov-network-config-daemon-bzclg         1/1     Running                 0          138m
sriov-network-config-daemon-ks2hw         1/1     Running                 0          138m
sriov-network-operator-785676dfcf-wsgjt   1/1     Running                 0          138m


The sriov-network-operator shows version 4.3.0:

# oc describe pod -n openshift-sriov-network-operator sriov-network-operator-785676dfcf-wsgjt
Name:         sriov-network-operator-785676dfcf-wsgjt
Namespace:    openshift-sriov-network-operator
Priority:     0
Node:         master-2/192.168.222.12
Start Time:   Wed, 29 Jul 2020 01:54:14 +0000
Labels:       name=sriov-network-operator
              pod-template-hash=785676dfcf
Annotations:  k8s.ovn.org/pod-networks:
                {"default":{"ip_addresses":["10.128.0.10/23"],"mac_address":"fa:28:ea:80:00:0b","gateway_ips":["10.128.0.1"],"ip_address":"10.128.0.10/23"...
              k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "ovn-kubernetes",
                    "interface": "eth0",
                    "ips": [
                        "10.128.0.10"
                    ],
                    "mac": "fa:28:ea:80:00:0b",
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "ovn-kubernetes",
                    "interface": "eth0",
                    "ips": [
                        "10.128.0.10"
                    ],
                    "mac": "fa:28:ea:80:00:0b",
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: restricted
Status:       Running
IP:           10.128.0.10
IPs:
  IP:           10.128.0.10
Controlled By:  ReplicaSet/sriov-network-operator-785676dfcf
Containers:
  sriov-network-operator:
    Container ID:  cri-o://0adae7c5713ddd4dc116f16742ceb31272aa97eb7d592687892b3781b8265288
    Image:         quay.io/openshift/origin-sriov-network-operator@sha256:3383f608660e0b153ddd8b70f33f295b028662ecfa0e732834cfaf97e5a3a34a
    Image ID:      quay.io/openshift/origin-sriov-network-operator@sha256:3383f608660e0b153ddd8b70f33f295b028662ecfa0e732834cfaf97e5a3a34a
    Port:          <none>
    Host Port:     <none>
    Command:
      sriov-network-operator
    State:          Running
      Started:      Wed, 29 Jul 2020 01:54:20 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      WATCH_NAMESPACE:                    openshift-sriov-network-operator (v1:metadata.namespace)
      SRIOV_CNI_IMAGE:                    quay.io/openshift/origin-sriov-cni@sha256:38ce1d1ab4d1e6508ea860fdce37e2746a26796a49eff2ec5c569e689459198b
      SRIOV_INFINIBAND_CNI_IMAGE:         quay.io/openshift/origin-sriov-infiniband-cni@sha256:26e1e88443e2f258dd06b196f549c346d1061c961c4216ac30a6ae0d8e413a57
      SRIOV_DEVICE_PLUGIN_IMAGE:          quay.io/openshift/origin-sriov-network-device-plugin@sha256:757c8f8659ed4702918e30efff604cb40aecc8424ec1cc43ad655bc1594d0739
      NETWORK_RESOURCES_INJECTOR_IMAGE:   quay.io/openshift/origin-sriov-dp-admission-controller@sha256:9128b6ec8c2b4895b9e927b1b54f01d2e51440d3d6232f9b3c08b17b861d8209
      OPERATOR_NAME:                      sriov-network-operator
      SRIOV_NETWORK_CONFIG_DAEMON_IMAGE:  quay.io/openshift/origin-sriov-network-config-daemon@sha256:f9135e2381d0986e421e4a4799615de2cbf0690e22c90470b4863d956164b67e
      SRIOV_NETWORK_WEBHOOK_IMAGE:        quay.io/openshift/origin-sriov-network-webhook@sha256:0f54d40344967eb052a0a90bdf6b9447dc68f93f77ff9d4e40955e6b217b7a9a
      RESOURCE_PREFIX:                    openshift.io
      ENABLE_ADMISSION_CONTROLLER:        true
      NAMESPACE:                          openshift-sriov-network-operator (v1:metadata.namespace)
      POD_NAME:                           sriov-network-operator-785676dfcf-wsgjt (v1:metadata.name)
      RELEASE_VERSION:                    4.3.0
      SRIOV_CNI_BIN_PATH:                 
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from sriov-network-operator-token-dnfkj (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  sriov-network-operator-token-dnfkj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  sriov-network-operator-token-dnfkj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason          Age        From               Message
  ----    ------          ----       ----               -------
  Normal  Scheduled       <unknown>  default-scheduler  Successfully assigned openshift-sriov-network-operator/sriov-network-operator-785676dfcf-wsgjt to master-2
  Normal  AddedInterface  137m       multus             Add eth0 [10.128.0.10/23]
  Normal  Pulling         137m       kubelet, master-2  Pulling image "quay.io/openshift/origin-sriov-network-operator@sha256:3383f608660e0b153ddd8b70f33f295b028662ecfa0e732834cfaf97e5a3a34a"
  Normal  Pulled          137m       kubelet, master-2  Successfully pulled image "quay.io/openshift/origin-sriov-network-operator@sha256:3383f608660e0b153ddd8b70f33f295b028662ecfa0e732834cfaf97e5a3a34a"
  Normal  Created         137m       kubelet, master-2  Created container sriov-network-operator
  Normal  Started         137m       kubelet, master-2  Started container sriov-network-operator

-----------


# oc describe pods/sriov-cni-ft6r8 -n openshift-sriov-network-operator
Name:         sriov-cni-ft6r8
Namespace:    openshift-sriov-network-operator
Priority:     0
Node:         worker000/192.168.222.13
Start Time:   Wed, 29 Jul 2020 02:09:48 +0000
Labels:       app=sriov-cni
              component=network
              controller-revision-hash=75c97d4f88
              openshift.io/component=network
              pod-template-generation=3
              type=infra
Annotations:  k8s.ovn.org/pod-networks:
                {"default":{"ip_addresses":["10.128.2.4/23"],"mac_address":"fa:28:ea:80:02:05","gateway_ips":["10.128.2.1"],"ip_address":"10.128.2.4/23","...
              k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "ovn-kubernetes",
                    "interface": "eth0",
                    "ips": [
                        "10.128.2.4"
                    ],
                    "mac": "fa:28:ea:80:02:05",
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "ovn-kubernetes",
                    "interface": "eth0",
                    "ips": [
                        "10.128.2.4"
                    ],
                    "mac": "fa:28:ea:80:02:05",
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: privileged
Status:       Pending
IP:           10.128.2.4
IPs:
  IP:           10.128.2.4
Controlled By:  DaemonSet/sriov-cni
Init Containers:
  sriov-infiniband-cni:
    Container ID:   cri-o://cc4a57eeab17bbb8c58e53c9e8a190082e0ad55925dbbbf0e701229c8cd48cbd
    Image:          quay.io/openshift/origin-sriov-infiniband-cni@sha256:26e1e88443e2f258dd06b196f549c346d1061c961c4216ac30a6ae0d8e413a57
    Image ID:       quay.io/openshift/origin-sriov-infiniband-cni@sha256:26e1e88443e2f258dd06b196f549c346d1061c961c4216ac30a6ae0d8e413a57
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 29 Jul 2020 02:15:48 +0000
      Finished:     Wed, 29 Jul 2020 02:15:48 +0000
    Ready:          False
    Restart Count:  6
    Environment:    <none>
    Mounts:
      /host/opt/cni/bin from cnibin (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from sriov-cni-token-6gxhb (ro)
Containers:
  sriov-cni:
    Container ID:   
    Image:          quay.io/openshift/origin-sriov-cni@sha256:38ce1d1ab4d1e6508ea860fdce37e2746a26796a49eff2ec5c569e689459198b
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /host/opt/cni/bin from cnibin (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from sriov-cni-token-6gxhb (ro)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  cnibin:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/cni/bin
    HostPathType:  
  sriov-cni-token-6gxhb:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  sriov-cni-token-6gxhb
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  beta.kubernetes.io/os=linux
Tolerations:     
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason          Age                         From                Message
  ----     ------          ----                        ----                -------
  Normal   Scheduled       <unknown>                   default-scheduler   Successfully assigned openshift-sriov-network-operator/sriov-cni-ft6r8 to worker000
  Normal   AddedInterface  8m1s                        multus              Add eth0 [10.128.2.4/23]
  Normal   Pulled          6m20s (x5 over 8m)          kubelet, worker000  Container image "quay.io/openshift/origin-sriov-infiniband-cni@sha256:26e1e88443e2f258dd06b196f549c346d1061c961c4216ac30a6ae0d8e413a57" already present on machine
  Normal   Created         6m20s (x5 over 8m)          kubelet, worker000  Created container sriov-infiniband-cni
  Normal   Started         6m20s (x5 over 8m)          kubelet, worker000  Started container sriov-infiniband-cni
  Warning  BackOff         <invalid> (x48 over 7m58s)  kubelet, worker000  Back-off restarting failed container

Comment 11 Peng Liu 2020-08-05 13:13:05 UTC
Walid, it's a different issue. The sriov-cni pod problem should be fixed by https://github.com/openshift/sriov-network-operator/pull/312. However, the sriov-network-operator origin images have recently not been built as expected, so you may need to wait until the origin image builds are back to normal.

Comment 13 zhaozhanqi 2020-08-13 02:40:28 UTC
Verified this bug on 4.6.0-202008121454.p0

After the node restarted, the sriov resources were restored to normal.

Comment 14 Walid A. 2020-10-12 04:42:04 UTC
Verified this BZ also on a baremetal OCP 4.6.0-0.nightly-2020-10-03-051134 cluster with SR-IOV deployed from the GitHub master branch.  sriov-network-operator image build-date=2020-09-05T01:13:15.933978, release=202009050041.5133.

Comment 16 errata-xmlrpc 2020-10-27 16:14:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196