Bug 1915312 - Prevent scheduling Linux openshift-network-diagnostics pod on Windows node
Summary: Prevent scheduling Linux openshift-network-diagnostics pod on Windows node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Ricardo Carrillo Cruz
QA Contact: gaoshang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-01-12 12:29 UTC by gaoshang
Modified: 2021-02-24 15:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:52:05 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                            Private  Priority  Status  Summary                                                         Last Updated
Github                  openshift cluster-network-operator pull 947   0        None      closed  Bug 1915312: Schedule network diagnostics pods on Linux nodes   2021-02-03 06:31:37 UTC
Red Hat Product Errata  RHSA-2020:5633                                0        None      None    None                                                            2021-02-24 15:52:28 UTC

Description gaoshang 2021-01-12 12:29:50 UTC
Description of problem:
After installing an OCP cluster with a Windows node, the openshift-network-diagnostics DaemonSet schedules a network-check-target pod on the Windows node, and the pod fails to start.

Version-Release number of selected component (if applicable):
# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-01-10-070949   True        False         9h      Error while reconciling 4.7.0-0.nightly-2021-01-10-070949: the cluster operator network is degraded

How reproducible:
always

Steps to Reproduce:
1. Install an OCP cluster and bootstrap a Windows node.
2. Check the pods in project openshift-network-diagnostics: a network-check-target pod is scheduled on the Windows node and fails to start.

# oc get pod -n openshift-network-diagnostics
NAME                                   READY   STATUS             RESTARTS   AGE
network-check-source-f694fbf9d-lbhs9   1/1     Running            0          9h
network-check-target-6jst9             1/1     Running            0          9h
network-check-target-btmvt             0/1     ImagePullBackOff   0          161m
network-check-target-jgkdt             1/1     Running            0          9h
network-check-target-kfpf4             1/1     Running            0          9h
network-check-target-n2spd             1/1     Running            0          9h
network-check-target-qjp5r             1/1     Running            0          9h
network-check-target-rb4kx             1/1     Running            0          9h
network-check-target-zwm7p             0/1     ImagePullBackOff   0          166

# oc describe pod/network-check-target-btmvt -n openshift-network-diagnostics
Name:         network-check-target-btmvt
Namespace:    openshift-network-diagnostics
Priority:     0
Node:         ip-10-0-142-195.us-east-2.compute.internal/10.0.142.195
Start Time:   Tue, 12 Jan 2021 03:58:48 -0500
Labels:       app=network-check-target
              controller-revision-hash=84696666bb
              pod-template-generation=1
Annotations:  openshift.io/scc: restricted
Status:       Pending
IP:           10.132.6.5
IPs:
  IP:           10.132.6.5
Controlled By:  DaemonSet/network-check-target
Containers:
  network-check-target-container:
    Container ID:   
    Image:          quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a755920f6c5713d96fdd82d6a6270affdfbd4618e21ff20fff1d1444e0e3873c
    Image ID:       
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        10m
      memory:     150Mi
    Readiness:    http-get http://:8080/ delay=30s timeout=10s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dwb5z (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-dwb5z:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-dwb5z
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     op=Exists
Events:
  Type     Reason                  Age                         From                                                 Message
  ----     ------                  ----                        ----                                                 -------
  Normal   Scheduled               152m                                                                             Successfully assigned openshift-network-diagnostics/network-check-target-btmvt to ip-10-0-142-195.us-east-2.compute.internal
  Warning  FailedCreatePodSandBox  150m (x13 over 152m)        kubelet, ip-10-0-142-195.us-east-2.compute.internal  Failed to create pod sandbox: open c:\k\etc\resolv.conf: The system cannot find the file specified.
  Normal   SandboxChanged          149m (x3 over 149m)         kubelet, ip-10-0-142-195.us-east-2.compute.internal  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                 148m (x3 over 149m)         kubelet, ip-10-0-142-195.us-east-2.compute.internal  Pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a755920f6c5713d96fdd82d6a6270affdfbd4618e21ff20fff1d1444e0e3873c"
  Warning  Failed                  148m (x3 over 149m)         kubelet, ip-10-0-142-195.us-east-2.compute.internal  Failed to pull image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a755920f6c5713d96fdd82d6a6270affdfbd4618e21ff20fff1d1444e0e3873c": rpc error: code = Unknown desc = Error response from daemon: unauthorized: access to the requested resource is not authorized
  Warning  Failed                  <invalid> (x659 over 149m)  kubelet, ip-10-0-142-195.us-east-2.compute.internal  Error: ImagePullBackOff

Actual results:
A Linux pod is scheduled on the Windows node.

Expected results:
Linux pods should not be scheduled on Windows nodes.
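
The describe output above shows `Node-Selectors: <none>` together with `Tolerations: op=Exists`, so nothing stops the DaemonSet from placing a pod on a Windows node. A minimal sketch for confirming this on a cluster follows. The function names are hypothetical, and it assumes the affected node reports its operating system in `.status.nodeInfo.operatingSystem`. The linked pull request title says the pods are scheduled on Linux nodes; a common way to do that is a nodeSelector on the well-known `kubernetes.io/os: linux` label, but this report does not show the actual change.

```shell
# Print the nodeSelector of the network-check-target DaemonSet.
# Empty output means no selector is set, so pods may land on any node,
# Windows nodes included.
show_target_node_selector() {
    oc get daemonset network-check-target \
        -n openshift-network-diagnostics \
        -o jsonpath='{.spec.template.spec.nodeSelector}'
}

# Print the operating system reported by the node named in $1
# (for example "linux" or "windows").
node_os() {
    oc get node "$1" -o jsonpath='{.status.nodeInfo.operatingSystem}'
}
```

For the failing pod in this report, `node_os ip-10-0-142-195.us-east-2.compute.internal` would be the node to check.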

Additional info:

Comment 3 Anurag saxena 2021-01-15 18:38:11 UTC
@sgao Can you verify this bug? Thanks

Comment 4 gaoshang 2021-01-18 03:54:42 UTC
(In reply to Anurag saxena from comment #3)
> @sgao Can you verify this bug? Thanks

This bug has been verified on OCP 4.7.0-0.nightly-2021-01-17-211555 and passed, thanks.

Version:
# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-01-17-211555   True        False         81m     Cluster version is 4.7.0-0.nightly-2021-01-17-211555

Steps to Reproduce:
1. Install an OCP cluster and bootstrap a Windows node.
2. Check the pods in project openshift-network-diagnostics: network-check-target pods are no longer scheduled on the Windows node.

# oc get pod -n openshift-network-diagnostics
NAME                                  READY   STATUS    RESTARTS   AGE
network-check-source-8b577f64-g8tkr   1/1     Running   0          82m
network-check-target-2vz2r            1/1     Running   0          102m
network-check-target-46jh6            1/1     Running   0          102m
network-check-target-6cjrm            1/1     Running   0          109m
network-check-target-mh7gr            1/1     Running   0          109m
network-check-target-vj2g2            1/1     Running   0          100m
network-check-target-zzwbn            1/1     Running   0          109m
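
The same check can be scripted rather than read by eye. The sketch below assumes the default `oc get pod` column layout shown above (NAME, READY, STATUS, ...); `not_running_pods` is a hypothetical helper name. It prints nothing when every pod is Running.

```shell
# Read `oc get pod` output on stdin and print the name and status of
# every pod whose STATUS column is not "Running". The header line is skipped.
not_running_pods() {
    awk 'NR > 1 && NF >= 3 && $3 != "Running" { print $1, $3 }'
}
```

Usage: `oc get pod -n openshift-network-diagnostics | not_running_pods`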

Comment 7 errata-xmlrpc 2021-02-24 15:52:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

