Bug 1920769
| Summary: | [Upgrade] OCP upgrade from 4.6.13 to 4.7.0-fc.4 for "network-check-target" failed when "defaultNodeSelector" is set | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | huirwang |
| Component: | Networking | Assignee: | Jacob Tanenbaum <jtanenba> |
| Networking sub component: | openshift-sdn | QA Contact: | huirwang |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | jtanenba, piqin |
| Version: | 4.7 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.7.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-02-24 15:56:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
With the following workaround, the pods are scheduled successfully and the network cluster operator (co) finishes upgrading.
$ oc annotate ns openshift-network-diagnostics openshift.io/node-selector=
$ oc get ns openshift-network-diagnostics -ojson|jq .metadata.annotations
{
"openshift.io/node-selector": "",
"openshift.io/sa.scc.mcs": "s0:c25,c15",
"openshift.io/sa.scc.supplemental-groups": "1000630000/10000",
"openshift.io/sa.scc.uid-range": "1000630000/10000"
}
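Setting the namespace's openshift.io/node-selector annotation to the empty string overrides the cluster-wide defaultNodeSelector for openshift-network-diagnostics, so the scheduler admission plugin stops injecting node-workload=app into new pods in this namespace. Pods that are already stuck in Pending still carry the injected selector, so a plausible follow-up step (an assumption on my part; the report only shows the freshly recreated pods below) is to delete them and let the DaemonSet controller recreate them:

# Assumed cleanup step: remove the stuck Pending pods so the DaemonSet
# controller recreates them without the injected node selector.
$ oc -n openshift-network-diagnostics delete pod --field-selector=status.phase=Pending

The 0/1 Running pods with ages of 34-35s in the listing below are consistent with replacements that have been rescheduled and are still passing their readiness probes.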
$ oc get -n openshift-network-diagnostics pod
NAME                                   READY   STATUS    RESTARTS   AGE
network-check-source-c587cc78f-29t6k   1/1     Running   0          15h
network-check-target-2zspv             1/1     Running   0          15h
network-check-target-5v52l             1/1     Running   0          15h
network-check-target-92q4n             0/1     Running   0          34s
network-check-target-jswwq             0/1     Running   0          35s
network-check-target-pxl29             1/1     Running   0          15h
network-check-target-tj4fh             0/1     Running   0          35s
$ oc get co network
NAME      VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
network   4.7.0-fc.4   True        False         False      3m35s
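Once the network cluster operator reports Available=True and Degraded=False, the ClusterVersion error ("the cluster operator network is degraded") should clear as well; a quick check (not shown in the original report) is:

$ oc get clusterversion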
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
Description of problem:
Upgrading OCP from 4.6.13 to 4.7.0-fc.4 fails for "network-check-target" when "defaultNodeSelector" is set.

Version-Release number of selected component (if applicable):
4.6.13 to 4.7.0-fc.4

How reproducible:
Always

Steps to Reproduce:
1. Set up an OpenShift Container Platform cluster with version 4.6.13.
2. Apply the label 'node-workload=app' to the worker nodes, then update the scheduler (a command sketch follows this section):
$ oc get schedulers.config.openshift.io cluster -ojson | jq .spec
"defaultNodeSelector": "node-workload=app",
3. Do the upgrade to 4.7.0-fc.4.

Actual Results:
The upgrade was blocked because the pods in openshift-network-diagnostics were stuck in Pending status.

$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-fc.4   True        False         46m     Error while reconciling 4.7.0-fc.4: the cluster operator network is degraded

$ oc get -n openshift-network-diagnostics pod
NAME                                   READY   STATUS    RESTARTS   AGE
network-check-source-c587cc78f-29t6k   1/1     Running   0          14h
network-check-target-2zspv             1/1     Running   0          14h
network-check-target-5v52l             1/1     Running   0          14h
network-check-target-gcv2s             0/1     Pending   0          14h
network-check-target-jqqkd             0/1     Pending   0          14h
network-check-target-n42wt             0/1     Pending   0          14h
network-check-target-pxl29             1/1     Running   0          14h

$ oc describe pod network-check-target-5v52l -n openshift-network-diagnostics
Name:         network-check-target-5v52l
Namespace:    openshift-network-diagnostics
Priority:     0
Node:         ip-10-0-212-65.ap-northeast-1.compute.internal/10.0.212.65
Start Time:   Tue, 26 Jan 2021 18:51:27 +0800
Labels:       app=network-check-target
              controller-revision-hash=8f9b6469
              kubernetes.io/os=linux
              pod-template-generation=1
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.44"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.44"
                    ],
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: restricted
Status:       Running
IP:           10.129.2.44
IPs:
  IP:  10.129.2.44
Controlled By:  DaemonSet/network-check-target
Containers:
  network-check-target-container:
    Container ID:   cri-o://b46a73c896b9aff667fdda2d71f2589621cec80edc1126785f1900a3947def7d
    Image:          quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:52181a2fba40eb3ddf1d3ee953633be686898ec34be819f50a15688420895a93
    Image ID:       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:52181a2fba40eb3ddf1d3ee953633be686898ec34be819f50a15688420895a93
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 26 Jan 2021 18:51:29 +0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     10m
      memory:  150Mi
    Readiness:    http-get http://:8080/ delay=30s timeout=10s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dvrf9 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-dvrf9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-dvrf9
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/os=linux
                 node-workload=app
Tolerations:
Events:          <none>

$ oc describe pod network-check-target-gcv2s -n openshift-network-diagnostics
Name:           network-check-target-gcv2s
Namespace:      openshift-network-diagnostics
Priority:       0
Node:           <none>
Labels:         app=network-check-target
                controller-revision-hash=8f9b6469
                kubernetes.io/os=linux
                pod-template-generation=1
Annotations:    openshift.io/scc: restricted
Status:         Pending
IP:
IPs:            <none>
Controlled By:  DaemonSet/network-check-target
Containers:
  network-check-target-container:
    Image:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:52181a2fba40eb3ddf1d3ee953633be686898ec34be819f50a15688420895a93
    Port:       8080/TCP
    Host Port:  0/TCP
    Requests:
      cpu:     10m
      memory:  150Mi
    Readiness:    http-get http://:8080/ delay=30s timeout=10s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dvrf9 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-dvrf9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-dvrf9
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/os=linux
                 node-workload=app
Tolerations:
Events:
  Type     Reason            Age        From  Message
  ----     ------            ----       ----  -------
  Warning  FailedScheduling  <unknown>        0/6 nodes are available: 6 node(s) didn't match Pod's node affinity.

Expected Results:
The upgrade completes successfully.
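For completeness, step 2 above can be performed as sketched here; the exact commands are an assumption (the report only shows the resulting scheduler spec), and <worker-node-name> is a placeholder:

# Assumed commands for step 2: label each worker node, then set the
# cluster-wide default node selector on the Scheduler resource.
$ oc label node <worker-node-name> node-workload=app
$ oc patch schedulers.config.openshift.io cluster --type=merge \
    -p '{"spec":{"defaultNodeSelector":"node-workload=app"}}'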