Description of problem:
VIP support (VRRP protocol) relies on the multicast address 224.0.0.18/32 to determine the master. If this address is blocked by iptables rules, all nodes become master. 224.0.0.18/32 must not be blocked when VIP (VRRP protocol) is configured.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Set up VIP (high availability).
2. Block 224.0.0.18 in iptables.
3. Observe that all VIP nodes report as master.

Actual results:

Expected results:
224.0.0.18 must not be blocked when VIP is configured.

Additional info:
Phil, please lead a discussion about this. As I see it, the choices are:
1) Make ipfailover change iptables (needs a privileged container, and scares me)
2) Document this better in the failover docs (manual config is annoying)
To me, 2 feels better, but please work this out.
I lean towards #1. Isn't ipfailover already privileged to be able to add/remove ip addresses?
I like #1 as well. We should quietly configure everything that is needed to make the feature work. Should we validate traffic can flow over 224.0.0.18 as part of configuring HA? The down side is the same IP on multiple nodes. Who needs to be involved in this discussion? I have added Eric P, Clayton and Jordan to the cc:
I'm not sure what you mean by validate. Do you mean assign a multicast address to an interface and try to use it in some way? No.
I was thinking of a side channel, like pinging the host to establish that connectivity exists. Also verify that keepalived is running. If keepalived can't talk to the peer keepalived, something is wrong (and both/many/all become MASTER).
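To illustrate the symptom described above, here is a minimal sketch of a check that flags any VIP for which more than one node reports MASTER. The input format (a list of node, VIP, state tuples) and the function name are assumptions made for illustration; how the states are collected from each keepalived instance is left open.

```python
from collections import defaultdict


def find_split_brain(reports):
    """Return {vip: [nodes]} for every VIP with more than one MASTER.

    `reports` is an iterable of (node, vip, state) tuples. This shape is
    hypothetical; it is not something keepalived or ipfailover emits.
    """
    masters = defaultdict(list)
    for node, vip, state in reports:
        if state.upper() == "MASTER":
            masters[vip].append(node)
    # A healthy VIP has exactly one MASTER; more than one suggests the
    # VRRP advertisements are not reaching the peers.
    return {vip: sorted(nodes) for vip, nodes in masters.items() if len(nodes) > 1}
```

An empty result means no VIP has competing masters. It does not prove that multicast traffic is flowing, only that the all-MASTER symptom is absent.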
Spoke with Ben. I will change the ipf container to create the iptables rule if it is not present.
openshift/origin PR 11327
openshift/openshift-docs PR 3051

Ended up adding the --iptables-chain option to oadm ipfailover. The iptables rule is added if one doesn't exist. Re-wrote much of the high availability doc to describe this and the general operation.
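The "add the rule only if it doesn't exist" behavior can be sketched roughly as follows. This is not the code from PR 11327; it is an illustration that assumes the standard `iptables -C` (check) and `-A` (append) operations, root privileges on the host, and a hypothetical function name. The `run` parameter exists only so that the logic can be exercised without touching real iptables.

```python
import subprocess

VRRP_MULTICAST = "224.0.0.18/32"


def ensure_vrrp_rule(chain="INPUT", run=subprocess.call):
    """Append an ACCEPT rule for VRRP multicast unless one already exists.

    Returns True if the rule was added, False if it was already present.
    `run` takes an argv list and returns the exit status.
    """
    spec = [chain, "-d", VRRP_MULTICAST, "-j", "ACCEPT"]
    # `iptables -C` exits with 0 when a matching rule is already in the chain.
    if run(["iptables", "-C"] + spec) == 0:
        return False
    if run(["iptables", "-A"] + spec) != 0:
        raise RuntimeError("could not add the VRRP multicast rule to " + chain)
    return True
```

Because the check runs before the append, calling this from several ipfailover pods on the same node should leave a single rule in place, apart from a small window if two pods start at the same moment.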
PR 11327 merged.
Documentation - openshift-docs PR 3051 - being reviewed.
The rule '-A INPUT -d 224.0.0.18/32 -j ACCEPT' is now added after the ipfailover pod is created, but I found that the rule is not deleted when the ipfailover pod is deleted.
Correcting a typo in comment 10: I found that the rule was not deleted when the ipfailover pod was deleted.
The rule is common to all ipfailover setups, so if you have more than one ipfailover, all of them share the same rule. The rule cannot be deleted until all of the ipfailover DCs are deleted. As long as keepalived is running on any node in the cluster, the rule must remain.
To clarify, we cannot safely remove the rule automatically from inside a container, so it is expected that the rule remains even after the pod has been deleted. Phil, can you clarify that in your docs? Given that, I think this issue is resolved.
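Since the rule has to be cleaned up by hand, a manual removal step might look something like the sketch below. It assumes the administrator already knows how many ipfailover deployment configs remain (the `remaining_ipfailover_dcs` argument is hypothetical, as is the function name) and that the rule lives in the INPUT chain; `iptables -D` with a rule specification is the standard delete operation.

```python
import subprocess

VRRP_MULTICAST = "224.0.0.18/32"


def remove_vrrp_rule_if_unused(remaining_ipfailover_dcs, chain="INPUT",
                               run=subprocess.call):
    """Delete the shared ACCEPT rule only when no ipfailover DCs remain.

    Returns True if the rule was deleted, False if it was left in place.
    `run` takes an argv list and returns the exit status.
    """
    if remaining_ipfailover_dcs > 0:
        # keepalived may still be running somewhere, so the rule must stay.
        return False
    spec = [chain, "-d", VRRP_MULTICAST, "-j", "ACCEPT"]
    return run(["iptables", "-D"] + spec) == 0
```

This would have to be run on each node that hosted an ipfailover pod, and only after the last ipfailover DC has been deleted.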
Ben, I have changed the docs in PR 3051.
OK, thanks for clarifying this. Verified this bug.