Description of problem:
When keepalived is starting, the startup script calls iptables to add a rule allowing VRRP multicast. However, iptables is not present in the image, so the error "iptables: command not found" is thrown.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Launch a pod using the image registry.redhat.io/openshift4/ose-keepalived-ipfailover:latest
2. Run oc logs <pod-name>
3.

Actual results:
$ oc logs ipfailover-keepalived-7d95646cb5-7vkcm
... snip ...
- check for iptables rule for keepalived multicast (224.0.0.18) ...
/var/lib/ipfailover/keepalived/lib/failover-functions.sh: line 55: iptables: command not found
- adding iptables rule to INPUT to access 224.0.0.18.
/var/lib/ipfailover/keepalived/lib/failover-functions.sh: line 58: iptables: command not found
... snip ...

Expected results:

Additional info:
We suspect that the requirement to use the host's iptables command comes from the RHEL8 rebase, which landed in OpenShift 4.6. Let's get this fixed in 4.8 and then check if we need to backport changes to 4.7 and 4.6.
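For anyone triaging this, a minimal sketch of how to confirm that the binary is really missing from the container's PATH, as opposed to the call failing for some other reason. The helper name have_cmd is hypothetical and is not part of failover-functions.sh; the snippet is meant to be pasted into a shell inside the pod (for example via oc rsh).

```shell
#!/bin/sh
# Hypothetical helper (not from failover-functions.sh):
# succeeds if the named command can be resolved in the current PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd iptables; then
  echo "iptables found at $(command -v iptables)"
else
  echo "iptables not found in PATH"
fi
```

If this reports that iptables is not found, the "command not found" lines in the pod log come from the image contents rather than from the rule arguments.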
Attempted to verify with 4.8.0-0.nightly-2021-06-07-034343 with image path set to: registry.redhat.io/openshift4/ose-keepalived-ipfailover:latest

$ oc get deploy/ipfailover -oyaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "3"
  creationTimestamp: "2021-06-07T12:40:06Z"
  generation: 3
  labels:
    ipfailover: ipfailover
  name: ipfailover
  namespace: default
  resourceVersion: "56239"
  uid: 9167c2cd-263e-417e-9f64-dcbaa365c4b8
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      ipfailover: ipfailover
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        ipfailover: ipfailover
    spec:
      containers:
      - env:
        - name: OPENSHIFT_HA_CHECK_INTERVAL
          value: "2"
        - name: OPENSHIFT_HA_CHECK_SCRIPT
        - name: OPENSHIFT_HA_CHECK_SCRIPT
          value: /etc/keepalive/mycheckscript.sh
        - name: OPENSHIFT_HA_CONFIG_NAME
          value: ipfailover
        - name: OPENSHIFT_HA_IPTABLES_CHAIN
          value: INPUT
        - name: OPENSHIFT_HA_MONITOR_PORT
          value: "8080"
        - name: OPENSHIFT_HA_NETWORK_INTERFACE
        - name: OPENSHIFT_HA_NOTIFY_SCRIPT
        - name: OPENSHIFT_HA_PREEMPTION
          value: nopreempt
        - name: OPENSHIFT_HA_REPLICA_COUNT
          value: "1"
        - name: OPENSHIFT_HA_USE_UNICAST
          value: "true"
        - name: OPENSHIFT_HA_VIP_GROUPS
          value: "0"
        - name: OPENSHIFT_HA_VIRTUAL_IPS
          value: 192.168.1.9
        - name: OPENSHIFT_HA_VRRP_ID_OFFSET
          value: "0"
        - name: OPENSHIFT_HA_UNICAST_PEERS
          value: 10.0.212.193,10.0.155.217,10.0.161.172
        image: registry.redhat.io/openshift4/ose-keepalived-ipfailover:latest
        imagePullPolicy: IfNotPresent
<--snip-->

Still seeing "iptables: command not found" error messages:

$ oc logs pod/ipfailover-66cb8499b9-9nsnm
- Loading ip_vs module ...
- Checking if ip_vs module is available ...
ip_vs 172032 0
- Module ip_vs is loaded.
- check for iptables rule for keepalived multicast (224.0.0.18) ...
- adding iptables rule to INPUT to access 224.0.0.18.
- Generating and writing config to /etc/keepalived/keepalived.conf
/var/lib/ipfailover/keepalived/lib/failover-functions.sh: line 55: iptables: command not found
/var/lib/ipfailover/keepalived/lib/failover-functions.sh: line 58: iptables: command not found
- Starting failover services ...
<--snip-->
Update from Ryan: Looking at the Dockerfile, the keepalived image version is out of date:

# pwd
/var/lib/ipfailover/keepalived
# cat Dockerfile
--- snip ---
LABEL \
        io.k8s.display-name="OpenShift Container Platform IP Failover" \
        io.k8s.description="This is a component of OpenShift Container Platform and runs a clustered keepalived instance across multiple hosts to allow highly available IP addresses." \
        io.openshift.tags="openshift,ha,ip,failover" \
        License="GPLv2+" \
        vendor="Red Hat" \
        name="openshift/ose-keepalived-ipfailover" \
        com.redhat.component="openshift-enterprise-keepalived-ipfailover-container" \
        io.openshift.maintainer.product="OpenShift Container Platform" \
        release="202105210300.p0" \
        io.openshift.build.commit.id="0e45f638fbf5fa9e9bdb507d81b2cb9f12fadbaf" \
        io.openshift.build.source-location="https://github.com/openshift/images" \
        io.openshift.build.commit.url="https://github.com/openshift/images/commit/0e45f638fbf5fa9e9bdb507d81b2cb9f12fadbaf" \
        version="v4.7.0"
Used image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:644bf2d63cc24035ec82a39e0b14e6d61e3ca4ba39181b409590132f59bfc2cf in the keepalived YAML file and verified the fix in 4.8.0-0.nightly-2021-06-08-034312.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-06-08-034312   True        False         101m    Cluster version is 4.8.0-0.nightly-2021-06-08-034312

$ oc get pod
NAME                         READY   STATUS    RESTARTS   AGE
ipfailover-c895f99bf-8v5jc   1/1     Running   0          52m
ipfailover-c895f99bf-8wdkw   1/1     Running   0          52m
ipfailover-c895f99bf-vrlgf   1/1     Running   0          52m
web-server-rc-2dxtd          1/1     Running   0          88m
web-server-rc-lj9pd          1/1     Running   0          88m
web-server-rc-tzpr8          1/1     Running   0          88m

$ oc logs ipfailover-c895f99bf-8v5jc | grep "iptables"
- check for iptables rule for keepalived multicast (224.0.0.18) ...
- adding iptables rule to INPUT to access 224.0.0.18.

$ oc logs ipfailover-c895f99bf-8wdkw | grep "iptables"
- check for iptables rule for keepalived multicast (224.0.0.18) ...
- adding iptables rule to INPUT to access 224.0.0.18.

$ oc logs ipfailover-c895f99bf-vrlgf | grep "iptables"
- check for iptables rule for keepalived multicast (224.0.0.18) ...
- adding iptables rule to INPUT to access 224.0.0.18.

The "iptables: command not found" error message is no longer seen.
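The per-pod grep above could be wrapped in a small script so the check is repeated the same way on every ipfailover pod. This is only a sketch: the function names count_iptables_errors and check_ipfailover_pods are made up for illustration, and the label selector ipfailover=ipfailover is assumed from the deployment YAML posted earlier in this bug.

```shell
#!/bin/sh
# Hypothetical helper: reads a pod log on stdin and prints how many
# lines contain the error string from this bug.
count_iptables_errors() {
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  grep -c 'iptables: command not found' || true
}

# Hypothetical wrapper: prints "<pod>: <count>" for each ipfailover pod.
# Assumes oc is logged in and the pods carry the label ipfailover=ipfailover.
check_ipfailover_pods() {
  for pod in $(oc get pods -l ipfailover=ipfailover -o name); do
    n=$(oc logs "$pod" | count_iptables_errors)
    echo "$pod: $n"
  done
}
```

Calling check_ipfailover_pods on a cluster with the fixed image should print a count of 0 for each pod; a non-zero count would mean the error is still being logged.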
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438