Description of problem:

Version-Release number of selected component (if applicable):
4.6.13 to 4.7.0-fc.4

How reproducible:
Always

Steps to Reproduce:
1. Set up an OpenShift Container Platform cluster with version 4.6.13.
2. Apply the label 'node-workload=app' to the worker nodes, then update the scheduler.

$ oc get schedulers.config.openshift.io cluster -ojson | jq .spec
  "defaultNodeSelector": "node-workload=app",

3. Upgrade to 4.7.0-fc.4.

Actual Results:
The upgrade was blocked because pods in openshift-network-diagnostics were stuck in Pending status.

$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-fc.4   True        False         46m     Error while reconciling 4.7.0-fc.4: the cluster operator network is degraded

$ oc get -n openshift-network-diagnostics pod
NAME                                   READY   STATUS    RESTARTS   AGE
network-check-source-c587cc78f-29t6k   1/1     Running   0          14h
network-check-target-2zspv             1/1     Running   0          14h
network-check-target-5v52l             1/1     Running   0          14h
network-check-target-gcv2s             0/1     Pending   0          14h
network-check-target-jqqkd             0/1     Pending   0          14h
network-check-target-n42wt             0/1     Pending   0          14h
network-check-target-pxl29             1/1     Running   0          14h

$ oc describe pod network-check-target-5v52l -n openshift-network-diagnostics
Name:         network-check-target-5v52l
Namespace:    openshift-network-diagnostics
Priority:     0
Node:         ip-10-0-212-65.ap-northeast-1.compute.internal/10.0.212.65
Start Time:   Tue, 26 Jan 2021 18:51:27 +0800
Labels:       app=network-check-target
              controller-revision-hash=8f9b6469
              kubernetes.io/os=linux
              pod-template-generation=1
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.44"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.44"
                    ],
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: restricted
Status:       Running
IP:           10.129.2.44
IPs:
  IP:           10.129.2.44
Controlled By:  DaemonSet/network-check-target
Containers:
  network-check-target-container:
    Container ID:   cri-o://b46a73c896b9aff667fdda2d71f2589621cec80edc1126785f1900a3947def7d
    Image:          quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:52181a2fba40eb3ddf1d3ee953633be686898ec34be819f50a15688420895a93
    Image ID:       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:52181a2fba40eb3ddf1d3ee953633be686898ec34be819f50a15688420895a93
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 26 Jan 2021 18:51:29 +0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        10m
      memory:     150Mi
    Readiness:    http-get http://:8080/ delay=30s timeout=10s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dvrf9 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-dvrf9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-dvrf9
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/os=linux
                 node-workload=app
Tolerations:
Events:          <none>

$ oc describe pod network-check-target-gcv2s -n openshift-network-diagnostics
Name:           network-check-target-gcv2s
Namespace:      openshift-network-diagnostics
Priority:       0
Node:           <none>
Labels:         app=network-check-target
                controller-revision-hash=8f9b6469
                kubernetes.io/os=linux
                pod-template-generation=1
Annotations:    openshift.io/scc: restricted
Status:         Pending
IP:
IPs:            <none>
Controlled By:  DaemonSet/network-check-target
Containers:
  network-check-target-container:
    Image:        quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:52181a2fba40eb3ddf1d3ee953633be686898ec34be819f50a15688420895a93
    Port:         8080/TCP
    Host Port:    0/TCP
    Requests:
      cpu:        10m
      memory:     150Mi
    Readiness:    http-get http://:8080/ delay=30s timeout=10s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dvrf9 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-dvrf9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-dvrf9
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/os=linux
                 node-workload=app
Tolerations:
Events:
  Type     Reason            Age        From  Message
  ----     ------            ----       ----  -------
  Warning  FailedScheduling  <unknown>        0/6 nodes are available: 6 node(s) didn't match Pod's node affinity.

Expected Results:
The upgrade completes successfully.
With the following workaround, Pods are scheduled successfully, and networking co is upgraded successfully.

$ oc annotate ns openshift-network-diagnostics openshift.io/node-selector=

$ oc get ns openshift-network-diagnostics -ojson | jq .metadata.annotations
{
  "openshift.io/node-selector": "",
  "openshift.io/sa.scc.mcs": "s0:c25,c15",
  "openshift.io/sa.scc.supplemental-groups": "1000630000/10000",
  "openshift.io/sa.scc.uid-range": "1000630000/10000"
}

$ oc get -n openshift-network-diagnostics pod
NAME                                   READY   STATUS    RESTARTS   AGE
network-check-source-c587cc78f-29t6k   1/1     Running   0          15h
network-check-target-2zspv             1/1     Running   0          15h
network-check-target-5v52l             1/1     Running   0          15h
network-check-target-92q4n             0/1     Running   0          34s
network-check-target-jswwq             0/1     Running   0          35s
network-check-target-pxl29             1/1     Running   0          15h
network-check-target-tj4fh             0/1     Running   0          35s

$ oc get co network
NAME      VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
network   4.7.0-fc.4   True        False         False      3m35s
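
For anyone re-checking this on their own cluster, here is a rough sketch of a filter that lists only the pods still in Pending status, so the before and after states of the workaround are easy to compare. The function name pending_pods is hypothetical (it is not part of oc), and it assumes the default five-column table layout of `oc get pod` shown in this report (NAME READY STATUS RESTARTS AGE).

```shell
# Hypothetical helper, not part of oc: print the names of pods whose
# STATUS column reads "Pending".
# Assumes the default `oc get pod` table on stdin, with a header row and
# STATUS as the third whitespace-separated column.
pending_pods() {
  awk 'NR > 1 && $3 == "Pending" { print $1 }'
}
```

Possible usage, assuming oc is logged in to the affected cluster:

$ oc get -n openshift-network-diagnostics pod | pending_pods

Before the workaround this should print the unscheduled network-check-target pods; once the annotation has been applied and the pods are rescheduled, it should print nothing.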
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633