Bug 1960469 - [4.7] - vsphere keepalived fails with ingress controllers shards - causing incorrect routing
Keywords:
Status: CLOSED DUPLICATE of bug 1988102
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Importance: urgent high
Target Milestone: ---
Assignee: Ben Nemec
QA Contact: Rio Liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-05-14 00:19 UTC by Vladislav Walek
Modified: 2021-08-23 14:30 UTC
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-23 14:30:26 UTC
Target Upstream Version:
Embargoed:



Description Vladislav Walek 2021-05-14 00:19:15 UTC
Description of problem:

In the current configuration, keepalived is set up to add the ingress VIP to any node that runs any ingress controller.
When router sharding is used, keepalived therefore does not necessarily place the ingress VIP on the node running the "default" router; it effectively picks any node running any (sharded) ingress controller.
The customer runs multiple ingress controller instances in the cluster for route sharding.
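For context, one way to see where the default and sharded router pods are scheduled is the standard router namespace listing (illustrative command only; pod names will vary per shard):

~~~
# List router pods for all ingress controllers together with the node each one runs on.
oc -n openshift-ingress get pods -o wide
~~~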

In the vSphere configuration, keepalived runs on each host as a static pod.
The static pod manifest comes from the machine config.

In the worker machine config, the keepalived template specifies that the ingress VIP should land on a node where the health check script finds a router listening on port 1936.

// keepalived.conf.tmpl

~~~
# TODO: Improve this check. The port is assumed to be alive.
# Need to assess what is the ramification if the port is not there.
vrrp_script chk_ingress {
    script "/usr/bin/timeout 0.9 /usr/bin/curl -o /dev/null -Lfs http://localhost:1936/healthz/ready"
    interval 1
    weight 50
}

{{$nonVirtualIP := .NonVirtualIP}}

vrrp_instance {{ .Cluster.Name }}_INGRESS {
    state BACKUP
    interface {{ .VRRPInterface }}
    virtual_router_id {{ .Cluster.IngressVirtualRouterID }}
    priority 40
    advert_int 1
    {{if .EnableUnicast}}
    unicast_src_ip {{.NonVirtualIP}}
    unicast_peer {
        {{range .IngressConfig.Peers}}
        {{if ne $nonVirtualIP .}}{{.}}{{end}}
        {{end}}
    }
    {{end}}
    authentication {
        auth_type PASS
        auth_pass {{ .Cluster.Name }}_ingress_vip
    }
    virtual_ipaddress {
        {{ .Cluster.IngressVIP }}/{{ .Cluster.VIPNetmask }}
    }
    track_script {
        chk_ingress
    }
}
~~~

The check has a major flaw when multiple ingress controllers are in use (and not only then: any application running on hostNetwork, listening on port 1936 and answering the path "/healthz/ready", will make keepalived believe it is the router).
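A minimal sketch of the flaw, assuming a lab host with python3 available (the /tmp/fakerouter path is hypothetical and only stands in for "some process answering on port 1936"):

~~~
# Serve an empty file at /healthz/ready on port 1936 -- any such listener will do.
mkdir -p /tmp/fakerouter/healthz
touch /tmp/fakerouter/healthz/ready
( cd /tmp/fakerouter && python3 -m http.server 1936 & )
sleep 1

# The exact check from keepalived.conf.tmpl now exits 0, so keepalived
# considers this host a valid holder of the ingress VIP even though no router runs here.
/usr/bin/timeout 0.9 /usr/bin/curl -o /dev/null -Lfs http://localhost:1936/healthz/ready
echo $?
~~~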

// Workarounds

The immediate workaround is to remove the other ingress shards so that the ingress VIP stays on the infra nodes hosting the default ingress.

The short-term workaround is to apply a new machine config that removes the keepalived component from the machines that do not host the default ingress.

The long-term solution would most likely be a change in this behavior.


Version-Release number of selected component (if applicable):
OpenShift Container Platform 4.7
vSphere IPI


How reproducible:
- create a cluster with 4 worker nodes on vSphere IPI with keepalived
- create two ingress controllers, the default one and one for sharding (schedule their pods on 2 workers each)
- check which node the keepalived ingress VIP is configured on, and confirm whether it is the right node (see the sketch after this list)
- disable the default ingress controller and check whether the VIP bounces to one of the other worker nodes
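
A hedged sketch for the VIP checks above; the label selector is the standard worker role label, and INGRESS_VIP is a placeholder to be replaced with the cluster's actual ingress VIP:

~~~
# Placeholder value -- substitute the ingress VIP configured for this cluster.
INGRESS_VIP=192.0.2.10

# For each worker node, look for the ingress VIP among its addresses.
for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
  echo "== ${node}"
  oc debug "${node}" -- chroot /host ip -o addr show 2>/dev/null | grep -F "${INGRESS_VIP}" || true
done
~~~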


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 17 Ben Nemec 2021-08-03 17:35:13 UTC
I'm going to duplicate this to 1988102 since that's where the fix is being tracked.

Comment 18 Sinny Kumari 2021-08-19 14:15:50 UTC
Assigning this to Ben as he would know more about this bug; feel free to reassign to the relevant people on the networking team.

Comment 19 Ben Nemec 2021-08-23 14:30:26 UTC
Hmm, I said I was going to duplicate this and then didn't. :-/

Let's see if I get it right this time...

*** This bug has been marked as a duplicate of bug 1988102 ***

