Description of problem:
The Keepalived chk_default_ingress track script [1] should succeed on nodes running an instance of the default router pod. However, the Keepalived logs from a node running a default router pod instance show that the chk_default_ingress track script failed [2]. As a result of this issue, the Ingress VIP may be mistakenly assigned to the wrong node when multiple ingress controllers run in the cluster.

Version-Release number of selected component (if applicable):
[kni@worker-0 dev-scripts]$ oc version
Client Version: 4.9.0-0.nightly-2021-08-28-220206

How reproducible:

Steps to Reproduce:
1. Deploy a cluster.
2. Check which nodes run an instance of the default router pod (you can use the command in [3]).
3. ssh into one of the nodes from step 2.
4. Check the Keepalived container logs.

Actual results:
The chk_default_ingress track script fails.

Expected results:
The chk_default_ingress track script should succeed.

Additional info:
[1] https://github.com/openshift/machine-config-operator/blob/master/templates/master/00-master/on-prem/files/keepalived-keepalived.yaml#L49

[2] Mon Sep 13 11:27:48 2021: Script `chk_default_ingress` now returning 1
    Mon Sep 13 11:27:48 2021: VRRP_Script(chk_default_ingress) failed (exited with status 1)

[3] [kni@worker-0 dev-scripts]$ oc get pods -n openshift-ingress -o wide
    NAME                             READY   STATUS    RESTARTS   AGE   IP               NODE       NOMINATED NODE   READINESS GATES
    router-default-6548d747b-6cw9f   1/1     Running   4          8d    192.168.111.23   worker-0   <none>           <none>
    router-default-6548d747b-pjrw5   1/1     Running   0          8d    192.168.111.24   worker-1   <none>           <none>
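For step 2, the nodes hosting the default router can be pulled straight out of the `oc get pods -n openshift-ingress -o wide` listing. Below is a minimal sketch of that extraction; the `ingress_nodes` helper name is hypothetical, and the sample input is the pod listing from this report (the NODE column is field 7):

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get pods -n openshift-ingress -o wide` output
# on stdin and prints the NODE column (field 7) for router-default pods.
ingress_nodes() {
  awk '/^router-default/ { print $7 }'
}

# Sample input taken from the bug report's own pod listing:
ingress_nodes <<'EOF'
NAME                             READY   STATUS    RESTARTS   AGE   IP               NODE       NOMINATED NODE   READINESS GATES
router-default-6548d747b-6cw9f   1/1     Running   4          8d    192.168.111.23   worker-0   <none>           <none>
router-default-6548d747b-pjrw5   1/1     Running   0          8d    192.168.111.24   worker-1   <none>           <none>
EOF
# prints:
# worker-0
# worker-1
```

In a live cluster you would pipe the real command output instead, e.g. `oc get pods -n openshift-ingress -o wide | ingress_nodes`, then ssh to one of the printed nodes for step 3.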
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056