Description of problem:
NodePort service is not accessible when the backend pod is on a Windows node.

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2022-01-13-061145

$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE     VERSION                       INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                 CONTAINER-RUNTIME
ip-10-0-132-139.us-east-2.compute.internal   Ready    master   4h7m    v1.23.0+50f645e               10.0.132.139   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201122058-0 (Ootpa)   4.18.0-305.30.1.el8_4.x86_64   cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
ip-10-0-143-249.us-east-2.compute.internal   Ready    worker   3h20m   v1.22.1-1747+bac83a5ac2d725   10.0.143.249   <none>        Windows Server 2019 Datacenter                                  10.0.17763.2452                docker://20.10.7
ip-10-0-146-159.us-east-2.compute.internal   Ready    worker   3h54m   v1.23.0+50f645e               10.0.146.159   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201122058-0 (Ootpa)   4.18.0-305.30.1.el8_4.x86_64   cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
ip-10-0-156-129.us-east-2.compute.internal   Ready    worker   3h27m   v1.22.1-1747+bac83a5ac2d725   10.0.156.129   <none>        Windows Server 2019 Datacenter                                  10.0.17763.2452                docker://20.10.7
ip-10-0-161-80.us-east-2.compute.internal    Ready    worker   3h54m   v1.23.0+50f645e               10.0.161.80    <none>        Red Hat Enterprise Linux CoreOS 410.84.202201122058-0 (Ootpa)   4.18.0-305.30.1.el8_4.x86_64   cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
ip-10-0-185-251.us-east-2.compute.internal   Ready    master   4h7m    v1.23.0+50f645e               10.0.185.251   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201122058-0 (Ootpa)   4.18.0-305.30.1.el8_4.x86_64   cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
ip-10-0-204-8.us-east-2.compute.internal     Ready    worker   3h54m   v1.23.0+50f645e               10.0.204.8     <none>        Red Hat Enterprise Linux CoreOS 410.84.202201122058-0 (Ootpa)   4.18.0-305.30.1.el8_4.x86_64   cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
ip-10-0-204-86.us-east-2.compute.internal    Ready    master   4h8m    v1.23.0+50f645e               10.0.204.86    <none>        Red Hat Enterprise Linux CoreOS 410.84.202201122058-0 (Ootpa)   4.18.0-305.30.1.el8_4.x86_64   cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8

[anusaxen@anusaxen verification-tests]$ oc get pods -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP           NODE                                         NOMINATED NODE   READINESS GATES
win-webserver-5db7f85d96-dthj6   1/1     Running   0          14m   10.132.0.7   ip-10-0-156-129.us-east-2.compute.internal   <none>           <none>

[anusaxen@anusaxen verification-tests]$ oc get svc
NAME                             TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)           AGE
win-webserver-5db7f85d96-dthj6   NodePort   172.30.51.36   <none>        27018:32766/TCP   10s

[anusaxen@anusaxen verification-tests]$ oc debug node/ip-10-0-161-80.us-east-2.compute.internal
Starting pod/ip-10-0-161-80us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.161.80
If you don't see a command prompt, try pressing enter.
sh-4.4# curl 10.0.156.129:32766
^C
sh-4.4# curl 10.0.146.159:32766
curl: (7) Failed to connect to 10.0.146.159 port 32766: Connection refused
sh-4.4# curl 10.0.156.129:32766
^C
sh-4.4# curl 10.0.132.139:32766
curl: (7) Failed to connect to 10.0.132.139 port 32766: Connection refused
sh-4.4# curl 10.0.156.129:32766
curl: (7) Failed to connect to 10.0.156.129 port 32766: Connection timed out
sh-4.4# exit

How reproducible:
Always

Steps to Reproduce:
1. Create a Windows pod: https://github.com/openshift/verification-tests/blob/master/testdata/networking/windows_pod_and_service.yaml
2. Create a NodePort service: https://github.com/openshift/verification-tests/blob/master/testdata/networking/nodeport_test_service.yaml
3. Run curl <any_node_ip>:<nodeport> from any node.

Actual results:
The NodePort service is not accessible from any node when the backend pod is on a Windows node.

Expected results:
The NodePort service should be accessible from any node regardless of the backend pod's location.

Additional info:
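The manual curl checks in the reproduction steps could be scripted roughly as follows. This is only a sketch: the function names `classify_curl_exit` and `probe_nodeport` and the timeout values are assumptions for illustration, not part of the original report or of any test suite. It relies on curl's documented exit codes (7 for a failed connection, 28 for an operation timeout) and is meant to be run from a shell that can reach the node IPs and has bash and curl available.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: probe a NodePort on a list of node IPs and
# print one summary line per node.

# Map a curl exit code to a short status label.
classify_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;       # request completed
    7)  echo "connect-failed" ;;  # refused, or TCP connect gave up
    28) echo "timed-out" ;;       # curl's own timeout expired
    *)  echo "other($1)" ;;
  esac
}

# Usage: probe_nodeport <nodeport> <node_ip> [<node_ip> ...]
probe_nodeport() {
  local port="$1"; shift
  local ip rc
  for ip in "$@"; do
    # Timeouts are arbitrary; they keep a hung connection from blocking the loop.
    curl --silent --output /dev/null \
         --connect-timeout 5 --max-time 10 "http://${ip}:${port}/"
    rc=$?
    printf '%s:%s %s\n' "$ip" "$port" "$(classify_curl_exit "$rc")"
  done
}

# Example with the addresses from this report:
#   probe_nodeport 32766 10.0.156.129 10.0.146.159 10.0.132.139
```

With the expected behavior, every node IP would report "reachable", including the nodes that do not host the backend pod.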
Tried to fix this issue by steering packets that are DNATed to non-ovnkube nodes to the cluster router and then to the int port of the node switch. Hit an OVN issue: https://bugzilla.redhat.com/show_bug.cgi?id=2060462
upstream PR https://github.com/ovn-org/ovn-kubernetes/pull/2862
downstream PR https://github.com/openshift/ovn-kubernetes/pull/1050 merged
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069