Bug 1999079 - Creating pods before the SriovNetworkNodePolicy sync succeeds causes the node to stay unschedulable
Summary: Creating pods before the SriovNetworkNodePolicy sync succeeds causes the node to stay unschedulable
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Peng Liu
QA Contact: Ying Wang
URL:
Whiteboard:
Depends On:
Blocks: 2002508 2086415
 
Reported: 2021-08-30 11:31 UTC by Ying Wang
Modified: 2022-06-13 16:34 UTC (History)
1 user

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2002508 2086415 (view as bug list)
Environment:
Last Closed: 2022-03-10 16:05:54 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github openshift sriov-network-operator pull 561 0 None None None 2021-09-08 07:36:08 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-10 16:06:11 UTC

Description Ying Wang 2021-08-30 11:31:00 UTC
Description of problem:

Created a SriovNetworkNodePolicy on the cluster to enable VFs. Before the sriovnetworknodestates sync status changed to Succeeded, created a SriovNetwork and pods that use the policy and its VFs. The node status then changed to "Ready,SchedulingDisabled".
After waiting several hours, the node status did not recover.

#  oc describe sriovnetworknodestates.sriovnetwork.openshift.io dell-per740-14.rhts.eng.pek2.redhat.com -n openshift-sriov-network-operator | grep Sync
  Sync Status:      InProgress


# oc get nodes
NAME                                      STATUS                     ROLES    AGE   VERSION
dell-per740-13.rhts.eng.pek2.redhat.com   Ready                      master   12d   v1.22.0-rc.0+dc932e9
dell-per740-14.rhts.eng.pek2.redhat.com   Ready,SchedulingDisabled   worker   12d   v1.22.0-rc.0+dc932e9
dell-per740-31.rhts.eng.pek2.redhat.com   Ready                      master   12d   v1.22.0-rc.0+dc932e9
dell-per740-32.rhts.eng.pek2.redhat.com   Ready                      master   12d   v1.22.0-rc.0+dc932e9
dell-per740-35.rhts.eng.pek2.redhat.com   Ready                      worker   12d   v1.22.0-rc.0+dc932e9      


Version-Release number of selected component (if applicable):

Client Version: 4.8.0-0.nightly-2021-04-23-131610
Server Version: 4.9.0-0.nightly-2021-08-19-180334
Kubernetes Version: v1.22.0-rc.0+dc932e9

How reproducible:


Steps to Reproduce:
1. create sriovnetworknodepolicy
# oc create -f mlx277netpolicy.yaml -n openshift-sriov-network-operator 
sriovnetworknodepolicy.sriovnetwork.openshift.io/mlx277netpolicy created
# cat mlx277netpolicy.yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: mlx277netpolicy
  namespace: openshift-sriov-network-operator
spec:
  mtu: 1800
  nicSelector:
    deviceID: '1015'
    pfNames:
      - ens2f0
    vendor: '15b3'
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: 'true'
  numVfs: 10
  resourceName: mlx277netpolicy

2. create sriovnetwork

# oc create -f mlx277netdevice.yaml -n openshift-sriov-network-operator 

# cat mlx277netdevice.yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: mlx277netdevice
  namespace: openshift-sriov-network-operator
spec:
  ipam: '{ "type": "static" }'
  capabilities: '{ "ips": true }'
  vlan: 0
  spoofChk: "on"
  trust: "off"
  resourceName: mlx277netpolicy
  networkNamespace: sriov-testing


3. create pod

# oc create -f sriov-testpod1.yaml -n sriov-testing 
pod/sriov-testpod1 created

# cat sriov-testpod1.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: sriov-testpod1
  annotations:
    k8s.v1.cni.cncf.io/networks:  '[
        {
                "name": "mlx277netdevice",
                "ips": ["192.168.2.5/24", "2002::5/64"]
        }
]'
spec:
  containers:
  - name: samplecontainer
    imagePullPolicy: IfNotPresent
    image: quay.io/openshifttest/hello-sdn@sha256:d5785550cf77b7932b090fcd1a2625472912fb3189d5973f177a5a2c347a1f95

4. check pod and node status

[root@dell-per740-36 sriov]# oc describe  pods
Name:         sriov-testpod1
Namespace:    sriov-testing
Priority:     0
Node:         <none>
Labels:       <none>
Annotations:  k8s.v1.cni.cncf.io/networks: [ { "name": "mlx277netdevice", "ips": ["192.168.2.5/24", "2002::5/64"] } ]
              openshift.io/scc: anyuid
Status:       Pending
IP:           
IPs:          <none>
Containers:
  samplecontainer:
    Image:      quay.io/openshifttest/hello-sdn@sha256:d5785550cf77b7932b090fcd1a2625472912fb3189d5973f177a5a2c347a1f95
    Port:       <none>
    Host Port:  <none>
    Limits:
      openshift.io/mlx277netpolicy:  1
    Requests:
      openshift.io/mlx277netpolicy:  1
    Environment:                     <none>
    Mounts:
      /etc/podnetinfo from podnetinfo (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tr8xj (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  kube-api-access-tr8xj:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
  podnetinfo:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.annotations -> annotations
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  6s    default-scheduler  0/5 nodes are available: 1 Insufficient openshift.io/mlx277netpolicy, 1 node(s) were unschedulable, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
[root@dell-per740-36 sriov]# oc get pods
NAME             READY   STATUS    RESTARTS   AGE
sriov-testpod1   0/1     Pending   0          10s
[root@dell-per740-36 sriov]# oc get nodes
NAME                                      STATUS                     ROLES    AGE   VERSION
dell-per740-13.rhts.eng.pek2.redhat.com   Ready                      master   12d   v1.22.0-rc.0+dc932e9
dell-per740-14.rhts.eng.pek2.redhat.com   Ready,SchedulingDisabled   worker   12d   v1.22.0-rc.0+dc932e9
dell-per740-31.rhts.eng.pek2.redhat.com   Ready                      master   12d   v1.22.0-rc.0+dc932e9
dell-per740-32.rhts.eng.pek2.redhat.com   Ready                      master   12d   v1.22.0-rc.0+dc932e9
dell-per740-35.rhts.eng.pek2.redhat.com   Ready                      worker   12d   v1.22.0-rc.0+dc932e9

5. check sriovnetworknodestates
#  oc describe sriovnetworknodestates.sriovnetwork.openshift.io dell-per740-14.rhts.eng.pek2.redhat.com -n openshift-sriov-network-operator | grep Sync
  Sync Status:      InProgress

Actual results:

The node stays in "Ready,SchedulingDisabled" and the SriovNetworkNodeState sync status remains InProgress indefinitely.


Expected results:

The node should return to a schedulable Ready state once the policy is fully applied, and the pod should be scheduled.

Additional info:

Comment 1 Peng Liu 2021-09-03 09:44:23 UTC
This issue only happens if a policy change is applied to a node while the node is rebooting. The workaround is to not apply any configuration change to a node until the previous configuration is fully applied. I'm also working on a patch to solve this issue.
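The workaround above can be scripted by polling the sync status before applying the next change. A minimal sketch, assuming the node name and namespace from the reproduction steps; the `is_synced` helper is illustrative, and the `oc` query mirrors the "Sync Status" check used earlier in this report:

```shell
# is_synced: succeeds only when the reported sync status is "Succeeded"
is_synced() {
  [ "$1" = "Succeeded" ]
}

# Hypothetical wait loop before creating the SriovNetwork and pods:
# while ! is_synced "$(oc get sriovnetworknodestates.sriovnetwork.openshift.io \
#       dell-per740-14.rhts.eng.pek2.redhat.com \
#       -n openshift-sriov-network-operator \
#       -o jsonpath='{.status.syncStatus}')"; do
#   sleep 10
# done
```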

Comment 4 Ying Wang 2021-09-09 03:30:02 UTC
Verified with Peng Liu's private image; the bug is fixed.

Comment 7 errata-xmlrpc 2022-03-10 16:05:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

