Description of problem:

We have a small baremetal cluster. Whenever we change the externalTrafficPolicy from Cluster to Local we lose connectivity to the services (this happens in both L2 and L3 mode). We are using OVNKubernetes.

Version-Release number of selected component (if applicable):
- OCP 4.10 nightly
- MetalLB operator v4.10.0-202112241546 (downstream)

How reproducible:
100%

Steps to Reproduce:
1. Create an LB-type service with externalTrafficPolicy=Local
2. Try to reach the service through its ExternalIP

Actual results:
1. The service is not reachable

Expected results:
1. The service should be reachable (as it is when externalTrafficPolicy=Cluster)

Additional info:

This is our setup with Local mode and L3:

$ oc get svc
NAME             TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
web-service-l3   LoadBalancer   172.30.118.196   10.10.10.10   8080:30719/TCP,80:32376/TCP   6m38s

The routes are there:

$ ip r
10.10.10.10 proto bgp metric 20
        nexthop via 192.168.216.13 dev baremetal weight 1
        nexthop via 192.168.216.14 dev baremetal weight 1

We can reach both workers through the node IPs:

$ curl 192.168.216.13:32376
<!DOCTYPE html>
$ curl 192.168.216.14:32376
<!DOCTYPE html>

However, no luck from the external IP (this only happens in Local mode; Cluster mode works just fine):

$ curl 10.10.10.10 --connect-timeout 3
curl: (28) Connection timed out after 3001 milliseconds
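For reference, a minimal sketch of the kind of Service that reproduces this. The names and the MetalLB address-pool annotation mirror the ones used in this report; the exact pool name on any given cluster is an assumption.

```shell
# Hypothetical reproduction manifest; names/annotation values mirror this
# report and would need to match your cluster's MetalLB address pool.
make_svc_manifest() {
cat <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web-service-l3
  annotations:
    metallb.universe.tf/address-pool: addresspool-l3
spec:
  type: LoadBalancer
  # Switching this from Cluster to Local is what triggers the problem.
  externalTrafficPolicy: Local
  selector:
    app: web-server-l3
  ports:
  - name: http
    port: 8080
    targetPort: 8080
  - name: http-2
    port: 80
    targetPort: 80
EOF
}

# Apply with: make_svc_manifest | oc apply -f -
make_svc_manifest
```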
Troubleshooting steps:

We created a deployment; all endpoints landed on the same node, worker000:

[kni@f12-h17-b07-5039ms surya]$ oc get pods -owide
NAME                             READY   STATUS    RESTARTS   AGE   IP             NODE               NOMINATED NODE   READINESS GATES
web-server-l3-7855447cdc-cn5nm   1/1     Running   0          87m   10.128.3.198   worker000-5039ms   <none>           <none>
web-server-l3-7855447cdc-l25wx   1/1     Running   0          87m   10.128.3.200   worker000-5039ms   <none>           <none>
web-server-l3-7855447cdc-r25g9   1/1     Running   0          87m   10.128.3.197   worker000-5039ms   <none>           <none>
web-server-l3-7855447cdc-rx9jv   1/1     Running   0          87m   10.128.3.199   worker000-5039ms   <none>           <none>
web-server-l3-7855447cdc-snngh   1/1     Running   0          87m   10.128.3.196   worker000-5039ms   <none>           <none>

We created the LB service:

$ oc describe svc web-service-l3
Name:                     web-service-l3
Namespace:                default
Labels:                   app=http-1
                          group=kb-mb-wl
Annotations:              metallb.universe.tf/address-pool: addresspool-l3
Selector:                 app=web-server-l3
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.30.226.131
IPs:                      172.30.226.131
LoadBalancer Ingress:     10.10.10.10
Port:                     http  8080/TCP
TargetPort:               8080/TCP
NodePort:                 http  32229/TCP
Endpoints:                10.128.3.196:8080,10.128.3.197:8080,10.128.3.198:8080 + 2 more...
Port:                     http-2  80/TCP
TargetPort:               80/TCP
NodePort:                 http-2  31220/TCP
Endpoints:                10.128.3.196:80,10.128.3.197:80,10.128.3.198:80 + 2 more...
Session Affinity:         None
External Traffic Policy:  Local

We checked the flows on br-ex:

sh-4.4# ovs-ofctl dump-flows br-ex | grep 10.10.10.10
cookie=0x2920e98f218c68aa, duration=1348.311s, table=0, n_packets=0, n_bytes=0, idle_age=1348, priority=110,tcp,in_port=1,nw_dst=10.10.10.10,tp_dst=8080 actions=ct(commit,table=6,zone=64003)
cookie=0xd093673aa729ad2, duration=1348.311s, table=0, n_packets=19, n_bytes=1406, idle_age=388, priority=110,tcp,in_port=1,nw_dst=10.10.10.10,tp_dst=80 actions=ct(commit,table=6,zone=64003)
cookie=0x2920e98f218c68aa, duration=1348.311s, table=0, n_packets=0, n_bytes=0, idle_age=1348, priority=110,tcp,in_port=LOCAL,nw_src=10.10.10.10,tp_src=8080 actions=ct(table=7,zone=64003)
cookie=0xd093673aa729ad2, duration=1348.311s, table=0, n_packets=0, n_bytes=0, idle_age=1348, priority=110,tcp,in_port=LOCAL,nw_src=10.10.10.10,tp_src=80 actions=ct(table=7,zone=64003)

Looking at the second flow, it was clear that packets were entering the node (n_packets=19 on the inbound flow), but responses were not coming back out (n_packets=0 on the corresponding return flow).

We confirmed these were the ETP=Local flows for local gateway (LGW) mode:

I0107 19:12:38.020566 40592 config.go:1714] Gateway config: {Mode:local Interface:br-ex EgressGWInterface: NextHop: VLANID:0 NodeportEnable:true DisableSNATMultipleGWs:false V4JoinSubnet:100.64.0.0/16 V6JoinSubnet:fd98::/64 DisablePacketMTUCheck:false RouterSubnet:}

Nothing in the logs looked suspicious:

I0107 19:23:05.177756 40592 port_claim.go:182] Handle NodePort service web-service-l3 port 32347
I0107 19:23:05.177794 40592 port_claim.go:40] Opening socket for service: default/web-service-l3, port: 32347 and protocol TCP
I0107 19:23:05.177803 40592 port_claim.go:63] Opening socket for LocalPort "nodePort for default/web-service-l3:http" (:32347/tcp)
I0107 19:23:05.177913 40592 port_claim.go:182] Handle NodePort service web-service-l3 port 31609
I0107 19:23:05.177921 40592 port_claim.go:40] Opening socket for service: default/web-service-l3, port: 31609 and protocol TCP
I0107 19:23:05.177926 40592 port_claim.go:63] Opening socket for LocalPort "nodePort for default/web-service-l3:http-2" (:31609/tcp)
I0107 19:23:05.177966 40592 healthcheck.go:142] Opening healthcheck "default/web-service-l3" on port 32043
I0107 19:23:05.178019 40592 gateway_shared_intf.go:528] Adding service web-service-l3 in namespace default
I0107 19:23:05.178031 40592 gateway_shared_intf.go:532] No endpoint found for service web-service-l3 in namespace default during service Add
I0107 19:23:05.178037 40592 gateway_shared_intf.go:541] Service Add web-service-l3 event in namespace default came before endpoint event setting svcConfig
I0107 19:23:05.178048 40592 gateway_shared_intf.go:207] Adding flows on breth0 for Nodeport Service web-service-l3 in Namespace: default since ExternalTrafficPolicy=local
I0107 19:23:05.178063 40592 gateway_shared_intf.go:207] Adding flows on breth0 for Nodeport Service web-service-l3 in Namespace: default since ExternalTrafficPolicy=local
I0107 19:23:05.178090 40592 healthcheck.go:167] Starting goroutine for healthcheck "default/web-service-l3" on port 32043
I0107 19:23:05.191957 40592 healthcheck.go:222] Reporting 5 endpoints for healthcheck "default/web-service-l3"
I0107 19:23:05.191965 40592 gateway_shared_intf.go:644] Adding endpoints web-service-l3 in namespace default
I0107 19:23:05.234680 40592 gateway_shared_intf.go:567] Deleting old service rules for: &Service{ObjectMeta:{web-service-l3 default 43096508-5382-4bba-8ea8-fb590b32d8f5 11650838 0 2022-01-07 19:23:05 +0000 UTC <nil> <nil> map[app:http-1 group:kb-mb-wl] map[metallb.universe.tf/address-pool:addresspool-l3-b] [] [] [{kubectl-create Update v1 2022-01-07 19:23:05 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:metallb.universe.tf/address-pool":{}},"f:labels":{".":{},"f:app":{},"f:group":{}}},"f:spec":{"f:allocateLoadBalancerNodePorts":{},"f:externalTrafficPolicy":{},"f:internalTrafficPolicy":{},"f:ports":{".":{},"k:{\"port\":80,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}},"k:{\"port\":8080,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}}},"f:selector":{},"f:sessionAffinity":{},"f:type":{}}} }]},Spec:ServiceSpec{Ports:[]ServicePort{ServicePort{Name:http,Protocol:TCP,Port:8080,TargetPort:{0 8080 },NodePort:32347,AppProtocol:nil,},ServicePort{Name:http-2,Protocol:TCP,Port:80,TargetPort:{0 80 },NodePort:31609,AppProtocol:nil,},},Selector:map[string]string{app:web-server-l3,},ClusterIP:172.30.10.231,Type:LoadBalancer,ExternalIPs:[],SessionAffinity:None,LoadBalancerIP:,LoadBalancerSourceRanges:[],ExternalName:,ExternalTrafficPolicy:Local,HealthCheckNodePort:32043,PublishNotReadyAddresses:false,SessionAffinityConfig:nil,IPFamilyPolicy:*SingleStack,ClusterIPs:[172.30.10.231],IPFamilies:[IPv4],AllocateLoadBalancerNodePorts:*true,LoadBalancerClass:nil,InternalTrafficPolicy:*Cluster,},Status:ServiceStatus{LoadBalancer:LoadBalancerStatus{Ingress:[]LoadBalancerIngress{},},Conditions:[]Condition{},},}
I0107 19:23:05.286234 40592 gateway_shared_intf.go:572] Adding new service rules for: &Service{ObjectMeta:{web-service-l3 default 43096508-5382-4bba-8ea8-fb590b32d8f5 11650839 0 2022-01-07 19:23:05 +0000 UTC <nil> <nil> map[app:http-1 group:kb-mb-wl] map[metallb.universe.tf/address-pool:addresspool-l3-b] [] [] [{controller Update v1 2022-01-07 19:23:05 +0000 UTC FieldsV1 {"f:status":{"f:loadBalancer":{"f:ingress":{}}}} status} {kubectl-create Update v1 2022-01-07 19:23:05 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:metallb.universe.tf/address-pool":{}},"f:labels":{".":{},"f:app":{},"f:group":{}}},"f:spec":{"f:allocateLoadBalancerNodePorts":{},"f:externalTrafficPolicy":{},"f:internalTrafficPolicy":{},"f:ports":{".":{},"k:{\"port\":80,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}},"k:{\"port\":8080,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}}},"f:selector":{},"f:sessionAffinity":{},"f:type":{}}} }]},Spec:ServiceSpec{Ports:[]ServicePort{ServicePort{Name:http,Protocol:TCP,Port:8080,TargetPort:{0 8080 },NodePort:32347,AppProtocol:nil,},ServicePort{Name:http-2,Protocol:TCP,Port:80,TargetPort:{0 80 },NodePort:31609,AppProtocol:nil,},},Selector:map[string]string{app:web-server-l3,},ClusterIP:172.30.10.231,Type:LoadBalancer,ExternalIPs:[],SessionAffinity:None,LoadBalancerIP:,LoadBalancerSourceRanges:[],ExternalName:,ExternalTrafficPolicy:Local,HealthCheckNodePort:32043,PublishNotReadyAddresses:false,SessionAffinityConfig:nil,IPFamilyPolicy:*SingleStack,ClusterIPs:[172.30.10.231],IPFamilies:[IPv4],AllocateLoadBalancerNodePorts:*true,LoadBalancerClass:nil,InternalTrafficPolicy:*Cluster,},Status:ServiceStatus{LoadBalancer:LoadBalancerStatus{Ingress:[]LoadBalancerIngress{LoadBalancerIngress{IP:6.6.6.6,Hostname:,Ports:[]PortStatus{},},},},Conditions:[]Condition{},},}
I0107 19:23:05.286339 40592 gateway_shared_intf.go:207] Adding flows on breth0 for Nodeport Service web-service-l3 in Namespace: default since ExternalTrafficPolicy=local
I0107 19:23:05.286355 40592 gateway_shared_intf.go:325] Adding flows on breth0 for Ingress Service web-service-l3 in Namespace: default since ExternalTrafficPolicy=local
I0107 19:23:05.286365 40592 gateway_shared_intf.go:207] Adding flows on breth0 for Nodeport Service web-service-l3 in Namespace: default since ExternalTrafficPolicy=local
I0107 19:23:05.286376 40592 gateway_shared_intf.go:325] Adding flows on breth0 for Ingress Service web-service-l3 in Namespace: default since ExternalTrafficPolicy=local

Next we checked whether all the iptables rules were intact. We could see the rules for NodePort:

[0:0] -A OVN-KUBE-NODEPORT -p tcp -m addrtype --dst-type LOCAL -m tcp --dport 31220 -j DNAT --to-destination 169.254.169.3:31220
[0:0] -A OVN-KUBE-NODEPORT -p tcp -m addrtype --dst-type LOCAL -m tcp --dport 32229 -j DNAT --to-destination 169.254.169.3:32229

and the return rule for preventing SNAT:

[0:0] -A OVN-KUBE-SNAT-MGMTPORT -p tcp -m tcp --dport 31220 -j RETURN
[0:0] -A OVN-KUBE-SNAT-MGMTPORT -p tcp -m tcp --dport 32229 -j RETURN

but the OVN-KUBE-EXTERNALIP rule, which is necessary to get the packet into ovn-k8s-mp0, was missing.
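The shape of the missing rule can be seen from the working NodePort rules above: a DNAT from the service VIP/port to the node-local 169.254.169.3 address (OVN-Kubernetes' well-known host-side gateway address, as used by the existing rules) on the NodePort. The helper below is purely illustrative, not ovn-kubernetes code; it just builds the rule string we expected to find:

```shell
# Illustrative only: construct the OVN-KUBE-EXTERNALIP DNAT rule string
# that was missing, in the same shape as the existing NodePort rules.
# 169.254.169.3 is the node-local address the other OVN-KUBE rules DNAT to.
external_ip_rule() {
  vip=$1 port=$2 nodeport=$3
  printf -- '-A OVN-KUBE-EXTERNALIP -d %s/32 -p tcp -m tcp --dport %s -j DNAT --to-destination 169.254.169.3:%s\n' \
    "$vip" "$port" "$nodeport"
}

# For this service's http-2 port (VIP 10.10.10.10, port 80, NodePort 31220):
external_ip_rule 10.10.10.10 80 31220
```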
Since the externalIP was a LB ingress VIP, we were missing this fix: https://github.com/openshift/ovn-kubernetes/pull/888, which creates the iptables rules for ingress IPs as well.

We created a custom image that includes this fix (quay.io/itssurya/dev-images:af09cb6c-37b6-4d12-a463-8e2b91f49c19) and tested it on the cluster; we could indeed see the OVN-KUBE-EXTERNALIP iptables rules getting created:

[1:60] -A OVN-KUBE-EXTERNALIP -d 10.10.10.10/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 169.254.169.3:31220
[0:0] -A OVN-KUBE-EXTERNALIP -d 10.10.10.10/32 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 169.254.169.3:32229

A curl then worked:

[kni@f12-h17-b07-5039ms ~]$ curl 10.10.10.10
<!DOCTYPE html>
<html>
<head>
<title>Hello World</title>

This is a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=2031012
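To re-verify a node after applying the fix, one can grep the nat-table dump for the VIP's DNAT rule. The sketch below is self-contained: it checks a captured sample line rather than running `iptables-save -t nat` on a live node, which is where the dump would actually come from.

```shell
# Sketch of a post-fix verification; uses a sample captured rule line so it
# runs anywhere. On a node you would pipe `iptables-save -t nat` instead.
sample_dump='[1:60] -A OVN-KUBE-EXTERNALIP -d 10.10.10.10/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 169.254.169.3:31220'

# Reads an iptables-save dump on stdin; succeeds if a DNAT rule for the
# given VIP exists in the OVN-KUBE-EXTERNALIP chain.
has_externalip_rule() {
  vip=$1
  grep -q -- "-A OVN-KUBE-EXTERNALIP -d ${vip}/32"
}

if printf '%s\n' "$sample_dump" | has_externalip_rule 10.10.10.10; then
  echo "rule present"
fi
```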
*** This bug has been marked as a duplicate of bug 2031012 ***