Bug 2018481
| Field | Value |
|---|---|
| Summary | [osp][octavia lb] Route shard not consistently served in a LoadBalancerService type IngressController |
| Product | OpenShift Container Platform |
| Component | Cloud Compute |
| Cloud Compute sub component | OpenStack Provider |
| Reporter | Jon Uriarte <juriarte> |
| Assignee | Michał Dulko <mdulko> |
| QA Contact | Jon Uriarte <juriarte> |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | medium |
| CC | aos-bugs, cholman, egarcia, m.andre, mbridges, mfedosin, mmahmoud, pprinett, stephenfin, surya |
| Version | 4.9 |
| Keywords | Reopened, Triaged |
| Target Release | 4.12.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Doc Type | No Doc Update |
| Type | Bug |
| Last Closed | 2023-01-17 19:46:45 UTC |
Description (Jon Uriarte, 2021-10-29 13:06:28 UTC)
How reproducible is this? There could be a lot of reasons why this is happening. Is OVN stable in OpenStack? This is either an issue with OVN reliability in the cloud platform, or an OpenShift networking problem. All we do is provision resources, and as far as I can tell, that has been done correctly, since you are able to reach the end service at least some of the time.

I forgot to say that the same test works with OpenShiftSDN and Kuryr, but not with OVNKubernetes, which tells us it is not the underlying OpenStack OVN. It's reproducible almost 100% of the time.

Moving over to ovn-kubernetes as it's only happening with OVNKubernetes according to comment 2.

The svc is using ExternalTrafficPolicy set to Local, and the svc in use here has 2 endpoints, on nodes ostest-cdmd8-worker-0-4mwck and ostest-cdmd8-worker-0-p85j4. When curl works, the traffic hits one of those worker endpoint pods; when curl fails, the traffic goes to one of the master nodes, as shown in the tcpdump traces:

```
oc debug node/ostest-cdmd8-master-0
Starting pod/ostest-cdmd8-master-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.196.1.62
If you don't see a command prompt, try pressing enter.

sh-4.4# tcpdump -i any host 10.46.22.194 -nn -vvv
dropped privs to tcpdump
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
12:50:16.190193 IP (tos 0x0, ttl 63, id 56587, offset 0, flags [DF], proto TCP (6), length 60)
    10.46.22.194.43792 > 10.196.1.62.32321: Flags [S], cksum 0x77e6 (correct), seq 536703276, win 29200, options [mss 1460,sackOK,TS val 3505563049 ecr 0,nop,wscale 7], length 0
12:50:16.191991 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    10.196.1.62.32321 > 10.46.22.194.43792: Flags [R.], cksum 0xc862 (correct), seq 0, ack 536703277, win 0, length 0
```

So either the Octavia LB doesn't support ETP Local and we need to change it to Cluster, or the Octavia LB has a bug where, with ETP Local, it doesn't send the packets to the right nodes, causing the curl failures.

(In reply to Mohamed Mahmoud from comment #13)
> so either the Octavia LB doesn't support ETP Local and we need to change it
> to Cluster, or the Octavia LB has a bug where, with ETP Local, it doesn't
> send the packets to the right nodes, causing the curl failures

...or the traffic goes to a worker node that doesn't have an endpoint; in all of those cases the curl will fail.
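For reference, a quick way to see which nodes should answer under ExternalTrafficPolicy: Local is the Service's health check node port, which the node-level service proxy should only report as healthy on nodes hosting a local endpoint. A minimal sketch, assuming the node IPs from the outputs in this report and a host that can reach the node ports:

```
# Read the health check node port allocated for the ETP=Local Service
$ HC_PORT=$(oc -n openshift-ingress get svc router-sharding-test-cluster \
    -o jsonpath='{.spec.healthCheckNodePort}')

# A worker hosting a shard router pod should answer 200 (localEndpoints > 0)
$ curl -s http://10.196.0.250:${HC_PORT}/healthz

# A master node with no router pod should answer 503 (localEndpoints: 0),
# matching the TCP resets seen in the tcpdump above
$ curl -s http://10.196.1.62:${HC_PORT}/healthz
```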
Setting the ExternalTrafficPolicy to Cluster makes the curl work 100% of the time.

Steps:

## 1. Create a new LoadBalancerService type IngressController:

```
$ cat ingress_controller_cluster.yaml
apiVersion: v1
items:
- apiVersion: operator.openshift.io/v1
  kind: IngressController
  metadata:
    name: sharding-test-cluster
    namespace: openshift-ingress-operator
  spec:
    domain: sharding-test-cluster.internalapps.apps.ostest.shiftstack.com
    endpointPublishingStrategy:
      type: LoadBalancerService
    nodePlacement:
      nodeSelector:
        matchLabels:
          node-role.kubernetes.io/worker: ""
    routeSelector:
      matchLabels:
        type: cluster
  status: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

$ oc apply -f ingress_controller_cluster.yaml
```

## 2. Check the router pods are created:

```
$ oc -n openshift-ingress get pods -o wide
NAME                                           READY   STATUS    RESTARTS   AGE   IP             NODE                          NOMINATED NODE   READINESS GATES
router-default-84d67fb69f-b7hmn                1/1     Running   0          26h   10.196.0.250   ostest-cdmd8-worker-0-4mwck   <none>           <none>
router-default-84d67fb69f-fftwp                1/1     Running   0          26h   10.196.1.104   ostest-cdmd8-worker-0-jcwmq   <none>           <none>
router-sharding-test-cluster-9c9ff8898-fv6vq   1/1     Running   0          39m   10.128.2.28    ostest-cdmd8-worker-0-4mwck   <none>           <none>
router-sharding-test-cluster-9c9ff8898-x2bcg   1/1     Running   0          39m   10.131.0.123   ostest-cdmd8-worker-0-p85j4   <none>           <none>
```

## 3. Check the router LoadBalancer type svc is created:

```
[stack@undercloud-0 ~]$ oc -n openshift-ingress get svc -o wide
NAME                                    TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE    SELECTOR
router-internal-default                 ClusterIP      172.30.217.73   <none>         80/TCP,443/TCP,1936/TCP      2d4h   ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
router-internal-sharding-test-cluster   ClusterIP      172.30.159.7    <none>         80/TCP,443/TCP,1936/TCP      40m    ingresscontroller.operator.openshift.io/deployment-ingresscontroller=sharding-test-cluster
router-sharding-test-cluster            LoadBalancer   172.30.93.207   10.46.22.230   80:32661/TCP,443:32584/TCP   40m    ingresscontroller.operator.openshift.io/deployment-ingresscontroller=sharding-test-cluster
```

## 4. Check the LB IP:

```
$ oc -n openshift-ingress get services/router-sharding-test-cluster -o yaml
[...]
status:
  loadBalancer:
    ingress:
    - ip: 10.46.22.230   <---
[...]
```
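At this point the shard can be exercised directly against the LB VIP. A sketch of such a check, assuming a hypothetical route named hello-openshift exposed through the shard (the hostname is illustrative; 10.46.22.230 is the LB IP from step 4):

```
# The route must carry the label type=cluster to be admitted by this shard
# (see the routeSelector in step 1), e.g.: oc label route hello-openshift type=cluster

# Force name resolution to the shard's Octavia VIP and request the route repeatedly.
# With ETP=Local on the ovn provider, some of these requests are expected to fail.
$ HOST=hello-openshift.sharding-test-cluster.internalapps.apps.ostest.shiftstack.com
$ for i in $(seq 1 10); do curl -s -o /dev/null -w "%{http_code}\n" --resolve ${HOST}:80:10.46.22.230 http://${HOST}/; done
```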
## 5. Check the router-sharding-test-cluster ExternalTrafficPolicy type:

```
$ oc -n openshift-ingress describe svc router-sharding-test-cluster
Name:                     router-sharding-test-cluster
Namespace:                openshift-ingress
Labels:                   app=router
                          ingresscontroller.operator.openshift.io/owning-ingresscontroller=sharding-test-cluster
                          router=router-sharding-test-cluster
Annotations:              traffic-policy.network.alpha.openshift.io/local-with-fallback:
Selector:                 ingresscontroller.operator.openshift.io/deployment-ingresscontroller=sharding-test-cluster
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.30.93.207
IPs:                      172.30.93.207
LoadBalancer Ingress:     10.46.22.230
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  32661/TCP
Endpoints:                10.128.2.28:80,10.131.0.123:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  32584/TCP
Endpoints:                10.128.2.28:443,10.131.0.123:443
Session Affinity:         None
External Traffic Policy:  Local        <<<<<<<<<
HealthCheck NodePort:     32413
Events:
  Type    Reason                Age   From                Message
  ----    ------                ----  ----                -------
  Normal  EnsuringLoadBalancer  92s   service-controller  Ensuring load balancer
  Normal  EnsuredLoadBalancer   51s   service-controller  Ensured load balancer
```

The svc was created with Local policy.

## 6. The issue is reproduced at this point: curl works only intermittently (note the svc is still using "External Traffic Policy: Local").

## 7. Change External Traffic Policy to Cluster:

```
$ oc -n openshift-ingress edit svc router-sharding-test-cluster
service/router-sharding-test-cluster edited

$ oc -n openshift-ingress describe svc router-sharding-test-cluster
Name:                     router-sharding-test-cluster
Namespace:                openshift-ingress
[...]
Session Affinity:         None
External Traffic Policy:  Cluster      <<<<<<<<<
Events:
  Type    Reason                 Age                    From                Message
  ----    ------                 ----                   ----                -------
  Normal  EnsuringLoadBalancer   2m45s (x2 over 10m)    service-controller  Ensuring load balancer
  Normal  ExternalTrafficPolicy  2m45s                  service-controller  Local -> Cluster
  Normal  EnsuredLoadBalancer    2m44s (x2 over 9m19s)  service-controller  Ensured load balancer
```

## 8. The curl starts working 100% of the time.

Removing the Triaged keyword because:
* the QE automation assessment (flag qe_test_coverage) is missing

How should this be solved? Is it expected that with ETP=Local the LB created in Octavia will only include the nodes with the Service pods? How can the cloud provider know that? Should it analyze the Service endpoints and check where the pods are placed? Normally we could use health monitors to solve this (members on nodes without pods would just be marked as down), but ovn-octavia-provider doesn't support them.
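For context, this is roughly what such a health monitor looks like when created by hand against the Octavia pool that backs the Service's NodePort (a sketch; the monitor name and pool ID are placeholders, and the ovn provider rejects this since it does not support health monitors):

```
# Attach a TCP health monitor to the pool so that members on nodes without
# local endpoints are marked DOWN and taken out of rotation (amphora provider only).
$ openstack loadbalancer healthmonitor create \
    --name router-sharding-test-cluster-http \
    --delay 5 --timeout 5 --max-retries 3 \
    --type TCP \
    <pool-id>
```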
Okay, I think that the AWS and GCP cloud providers solve this using health checks to make sure traffic is not directed to the nodes that will not answer. It's a pickle to solve this for ovn-octavia-provider as health monitors are not supported there yet.

As an alternative we could attempt to only add the nodes that are hosting the Service pods to the LB, but that would require us to watch Pods, so it's not ideal as it's not really the model the cloud provider interfaces are designed for.

At this moment I believe we should document that OVN LBs + ETP=Local won't work. Then in the cloud provider we can attempt to implement an option to force Amphora for any Service that has ETP=Local.

(In reply to Michał Dulko from comment #22)
> At this moment I believe we should document that OVN LBs + ETP=Local won't
> work.

This statement is not accurate: we are able to run with a MetalLB LB with ETP Local without any issues, so again this is a limitation of the OVN Octavia LB, and if you want to document that, that is fine by me. For ETP Local to work, the LB should aim traffic only at nodes that have endpoint(s); that is the basic requirement for the LB.

(In reply to Mohamed Mahmoud from comment #24)
> This statement is not accurate: we are able to run with a MetalLB LB with
> ETP Local without any issues

Yeah, by "OVN LB" I meant LBs backed by Octavia and its octavia-ovn-provider. It's quite confusing, I get that. ;)

BTW, I realized today that upstream cloud-provider-openstack has a note about this on the `create-monitor` option [1]:

```
create-monitor
  Indicates whether or not to create a health monitor for the service load balancer.
  A health monitor is required for services that declare externalTrafficPolicy: Local.
  Default: false
```

The LB annotation documentation provides a bit more detail [2]:

```
The health monitor can be created or deleted dynamically. A health monitor is required
for services with externalTrafficPolicy: Local. Not supported when lb-provider=ovn is
configured in openstack-cloud-controller-manager.
```

If we were to set `create-monitor=true` for the in-tree cloud provider, we would also have to set `monitor-delay`, `monitor-timeout`, and `monitor-max-retries`, as they do not get default values there [3]. We would also have to force the Amphora LB provider with `lb-provider=amphora`.

We should fix this in the docs.

[1] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/openstack-cloud-controller-manager/using-openstack-cloud-controller-manager.md#load-balancer
[2] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/openstack-cloud-controller-manager/expose-applications-using-loadbalancer-type-service.md#service-annotations
[3] https://kubernetes-docsy-staging.netlify.app/docs/concepts/cluster-administration/cloud-providers/#load-balancer
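A sketch of what that could look like in the cloud provider's load balancer configuration (the values below are illustrative, not recommendations; the option names are the ones from the cloud-provider-openstack documentation quoted above, and the exact value formats should be checked against those docs):

```
# [LoadBalancer] section of the OpenStack cloud provider config.
# Forces the amphora provider and enables health monitors so that an
# ETP=Local Service only receives traffic on nodes that pass the check.
[LoadBalancer]
lb-provider=amphora
create-monitor=true
monitor-delay=5s
monitor-timeout=3s
monitor-max-retries=1
```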
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399