Created attachment 1863893 [details]
topology

Description of problem:

With routingViaHost: true and externalTrafficPolicy: Cluster, HTTP traffic sent from a test container (topology attached) towards an NGINX pod fails. A static route that was added on the node running the speaker pod does not take effect when the return traffic comes back.

Service definition:

apiVersion: v1
kind: Service
metadata:
  name: nginx-local
  namespace: default
  annotations:
    metallb.universe.tf/address-pool: addresspool3
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx-local
  type: LoadBalancer
  externalTrafficPolicy: Cluster

Version-Release number of selected component (if applicable):
4.10.0-rc.5
metallb-operator.4.10.0-202202160023

How reproducible:
100%

Steps to Reproduce:
1. Create the MetalLB layer-3 scenario according to the attached topology
2. test-container# wget -qO- 4.4.4.1

Actual results:
Traffic fails.

Expected results:
Traffic passes.

Additional info:
1) Works fine with "externalTrafficPolicy: Local".
2) The SYN-ACK is sent to the default gateway's MAC instead of back through the router the request arrived from:

11:34:59.104722 34:48:ed:f3:6b:7c > 00:00:5e:00:01:01, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto TCP (6), length 60) 4.4.4.1.80 > 10.100.100.254.54538: Flags [S.], cksum 0xb247 (incorrect -> 0xeae5), seq 414318286, ack 2174062796, win 26960, options [mss 1360,sackOK,TS val 3872640667 ecr 2179664749,nop,wscale 7], length 0

Default gateway neighbor entry:
10.46.55.254 dev br-ex lladdr 00:00:5e:00:01:01 REACHABLE
The topology is as follows: the node is connected to multiple routers, one of which is also the default gateway. The traffic comes in from a router that is not the default gateway. Routes were added to the node to steer the return traffic back to the client. What happens is that the traffic comes into the node, but the reply is sent to the default gateway. With a service with externalTrafficPolicy: Cluster, the routes are ignored because all the traffic happens inside br-ex / OVN. With a service with externalTrafficPolicy: Local, the traffic is dropped into the host networking stack, so the routes apply and it works. My understanding is that this configuration was working until 4.10.
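For reference, a sketch of the kind of host route involved. The client subnet 10.100.100.0/24 matches the client IP seen in the tcpdump above; the non-default router's address 192.168.10.1 is an assumption for illustration, not taken from this bug:

```shell
# Route added on the node so replies to the client return through the
# (non-default) router the request arrived from, instead of the default
# gateway on br-ex. 192.168.10.1 is a placeholder next-hop address.
ip route add 10.100.100.0/24 via 192.168.10.1

# With routingViaHost: true and externalTrafficPolicy: Cluster, the reply
# is switched inside br-ex / OVN and never consults this host route,
# which is the failure described above. To see which next hop the *host*
# routing table would pick for the client:
ip route get 10.100.100.254
```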
The idea is to see if we can steer the traffic into the host for services in local gateway mode (lgw), let it hit the routes, use the .2 IP to take it back to br-ex, and have the reply come back the same way (hopefully). Testing this.
Upstream PR merged; downstream PR is on its way to merge. This will be backported to 4.10.z.
PR merged.
Verified:

Client Version: 4.11.0-0.nightly-2022-04-26-030643
Kustomize Version: v4.5.4
Server Version: 4.11.0-0.nightly-2022-04-26-030643
Kubernetes Version: v1.23.3+d464c70
metallb-operator.4.11.0-202203281806

====================================================

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2022-04-27T15:27:09Z"
  name: hello-world
  namespace: arti-test
  resourceVersion: "1055625"
  uid: 11f8a8f6-42aa-45a9-9208-cb3081af95f7
spec:
  allocateLoadBalancerNodePorts: true
  clusterIP: 172.30.75.111
  clusterIPs:
  - 172.30.75.111
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - nodePort: 30640
    port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: hello-world
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 10.10.10.10

bash-5.1# wget -qO- 10.10.10.10
Hello Kubernetes!

13:06:13.985017 IP 172.16.0.1.37264 > 10.10.10.10.http: Flags [S], seq 300684042, win 29200, options [mss 1460,sackOK,TS val 3961739875 ecr 0,nop,wscale 7], length 0
13:06:13.988380 IP 10.10.10.10.http > 172.16.0.1.37264: Flags [S.], seq 96525721, ack 300684043, win 26960, options [mss 1360,sackOK,TS val 3371233124 ecr 3961739875,nop,wscale 7], length 0
13:06:13.988790 IP 172.16.0.1.37264 > 10.10.10.10.http: Flags [.], ack 1, win 229, options [nop,nop,TS val 3961739880 ecr 3371233124], length 0
13:06:13.988849 IP 172.16.0.1.37264 > 10.10.10.10.http: Flags [P.], seq 1:75, ack 1, win 229, options [nop,nop,TS val 3961739880 ecr 3371233124], length 74: HTTP: GET / HTTP/1.1
13:06:13.989943 IP 10.10.10.10.http > 172.16.0.1.37264: Flags [.], ack 75, win 211, options [nop,nop,TS val 3371233127 ecr 3961739880], length 0
13:06:13.990289 IP 10.10.10.10.http > 172.16.0.1.37264: Flags [P.], seq 1:132, ack 75, win 211, options [nop,nop,TS val 3371233127 ecr 3961739880], length 131: HTTP: HTTP/1.1 200 OK
13:06:13.990356 IP 10.10.10.10.http > 172.16.0.1.37264: Flags [F.], seq 132, ack 75, win 211, options [nop,nop,TS val 3371233127 ecr 3961739880], length 0
13:06:13.990389 IP 172.16.0.1.37264 > 10.10.10.10.http: Flags [.], ack 132, win 237, options [nop,nop,TS val 3961739881 ecr 3371233127], length 0
13:06:13.990436 IP 172.16.0.1.37264 > 10.10.10.10.http: Flags [F.], seq 75, ack 132, win 237, options [nop,nop,TS val 3961739881 ecr 3371233127], length 0
13:06:13.990449 IP 172.16.0.1.37264 > 10.10.10.10.http: Flags [.], ack 133, win 237, options [nop,nop,TS val 3961739881 ecr 3371233127], length 0
13:06:13.990503 IP 10.10.10.10.http > 172.16.0.1.37264: Flags [.], ack 76, win 211, options [nop,nop,TS val 3371233127 ecr 3961739881], length 0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069