Description of problem:
keepalived virtual router IDs clash when running several clusters, causing some VIPs not to come up.

Version-Release number of selected component (if applicable):
>=4.4

How reproducible:
Always, given specific cluster names.

Steps to Reproduce:
1. Deploy a cluster named cnf10 and a second one named cnf11.
2. The API virtual router ID on the first cluster conflicts with the ingress virtual router ID on the second one, as both are computed with this function: https://github.com/openshift/baremetal-runtimecfg/pull/54/files#diff-3b5c896aef01987443b23dc503e418eaR147

Actual results:
Conflicts, resulting in the ingress VIP not coming up on the workers.

Expected results:
No conflicts.

Additional info:
A tool should at least anticipate the generated IDs and warn the end user not to use those two cluster names together. Something like this, for instance:

```
package main

import "fmt"

// FletcherChecksum8 folds the input string into an 8-bit checksum:
// two running sums (each modulo 0xf) packed into the high and low nibbles.
func FletcherChecksum8(inp string) uint8 {
	var ckA, ckB uint8
	for i := 0; i < len(inp); i++ {
		ckA = (ckA + inp[i]) % 0xf
		ckB = (ckB + ckA) % 0xf
	}
	return (ckB << 4) | ckA
}

func main() {
	cluster1 := "cnf10"
	cluster2 := "cnf11"

	// Each virtual router ID is the checksum of "<cluster>-<service>" plus one.
	apiID1 := FletcherChecksum8(cluster1+"-api") + 1
	dnsID1 := FletcherChecksum8(cluster1+"-dns") + 1
	ingressID1 := FletcherChecksum8(cluster1+"-ingress") + 1
	apiID2 := FletcherChecksum8(cluster2+"-api") + 1
	dnsID2 := FletcherChecksum8(cluster2+"-dns") + 1
	ingressID2 := FletcherChecksum8(cluster2+"-ingress") + 1

	fmt.Printf("cluster: %s api: %d dns: %d ingress: %d\n", cluster1, apiID1, dnsID1, ingressID1)
	fmt.Printf("cluster: %s api: %d dns: %d ingress: %d\n", cluster2, apiID2, dnsID2, ingressID2)
}
```
Just to clarify, keepalived virtual router IDs clash only if the clusters are deployed on the same L2 domain.
Verified on 4.5.0-0.nightly-2020-04-14-031010

Checked from a master node:

[master-0-0 ~]$ sudo crictl exec $(sudo crictl ps --name keepalived-monitor | awk 'FNR==2{ print $1}') runtimecfg vr-ids cnf10
APIVirtualRouterID: 147
DNSVirtualRouterID: 158
IngressVirtualRouterID: 2

[core@master-0-0 ~]$ sudo crictl exec $(sudo crictl ps --name keepalived-monitor | awk 'FNR==2{ print $1}') runtimecfg vr-ids cnf11
APIVirtualRouterID: 228
DNSVirtualRouterID: 239
IngressVirtualRouterID: 147

Checked on an external host, following the documentation provided here:
https://github.com/openshift/installer/blob/master/docs/user/metal/install_ipi.md

[~]# podman run quay.io/openshift/origin-baremetal-runtimecfg:4.5 vr-ids cnf11
APIVirtualRouterID: 228
DNSVirtualRouterID: 239
IngressVirtualRouterID: 147
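
The vr-ids output for cnf10 and cnf11 shows the clash directly: the APIVirtualRouterID of cnf10 and the IngressVirtualRouterID of cnf11 are both 147. For anyone who wants to compare several cluster names in one go, here is a minimal sketch of such a check. It reuses the FletcherChecksum8 function quoted in the description; vrIDs and findClashes are hypothetical helper names introduced only for this illustration and are not part of runtimecfg.

```go
package main

import (
	"fmt"
	"sort"
)

// FletcherChecksum8 is copied from the snippet in the bug description.
func FletcherChecksum8(inp string) uint8 {
	var ckA, ckB uint8
	for i := 0; i < len(inp); i++ {
		ckA = (ckA + inp[i]) % 0xf
		ckB = (ckB + ckA) % 0xf
	}
	return (ckB << 4) | ckA
}

// vrIDs (hypothetical helper) returns the virtual router ID of each
// service, computed as checksum("<cluster>-<service>") + 1.
func vrIDs(cluster string) map[string]uint8 {
	ids := make(map[string]uint8)
	for _, svc := range []string{"api", "dns", "ingress"} {
		ids[svc] = FletcherChecksum8(cluster+"-"+svc) + 1
	}
	return ids
}

// findClashes (hypothetical helper) returns one sorted entry for every
// virtual router ID that is claimed by more than one cluster/service pair.
func findClashes(clusters ...string) []string {
	owners := make(map[uint8][]string)
	for _, c := range clusters {
		for svc, id := range vrIDs(c) {
			owners[id] = append(owners[id], c+"-"+svc)
		}
	}
	var clashes []string
	for id, names := range owners {
		if len(names) > 1 {
			sort.Strings(names) // map iteration order is random
			clashes = append(clashes, fmt.Sprintf("%d: %v", id, names))
		}
	}
	sort.Strings(clashes)
	return clashes
}

func main() {
	for _, c := range findClashes("cnf10", "cnf11") {
		fmt.Println("virtual router ID clash", c)
	}
}
```

For cnf10 and cnf11 this prints "virtual router ID clash 147: [cnf10-api cnf11-ingress]". A reported clash only matters when the clusters involved share an L2 domain.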
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409