Bug 1381632 - vip (vrrp) requires 224.0.0.18/32
Summary: vip (vrrp) requires 224.0.0.18/32
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Networking
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Phil Cameron
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-10-04 15:27 UTC by Phil Cameron
Modified: 2017-05-30 12:51 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-05-30 12:51:24 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
Github openshift/openshift-docs pull 3051 (last updated 2016-10-31 17:41:02 UTC)
Origin (Github) 11327 (last updated 2016-10-14 19:10:35 UTC)

Description Phil Cameron 2016-10-04 15:27:16 UTC
Description of problem:
VIP support (VRRP protocol) relies on the multicast address 224.0.0.18/32 to determine the master. If this address is blocked by iptables rules, all nodes become master.

224.0.0.18/32 must not be blocked when VIP (VRRP protocol) is configured.
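
For reference, the kind of rule that keeps the election traffic flowing looks like this (a minimal sketch; the INPUT chain and rule placement are assumptions and may need adjusting to the local iptables setup):

  # Allow the VRRP multicast address used for master election;
  # if a broad DROP/REJECT rule comes earlier in the chain, use -I instead of -A
  iptables -A INPUT -d 224.0.0.18/32 -j ACCEPT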

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Set up VIP (high availability).
2. Block 224.0.0.18 in iptables (see the example below).
3. Observe that all VIP nodes report as MASTER.
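
For step 2, blocking the address can be simulated with something like this (illustrative only; assumes direct iptables access on each ipfailover node):

  # Drop the VRRP multicast traffic to simulate a filtered network
  iptables -I INPUT -d 224.0.0.18/32 -j DROP
  # Then check the addresses on each node; with VRRP blocked,
  # every node ends up claiming the virtual IP
  ip addr show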

Actual results:


Expected results:
224.0.0.18 must not be blocked when VIP is configured.

Additional info:

Comment 1 Ben Bennett 2016-10-05 19:13:52 UTC
Phil, please lead a discussion about this.  As I see it, the choices are:

1) Make ipfailover change iptables (needs a privileged container, and scares me)
2) Document this better in the failover docs (manual config is annoying)

To me, 2 feels better, but please work this out.

Comment 2 Eric Paris 2016-10-06 13:33:18 UTC
I lean towards #1. Isn't ipfailover already privileged to be able to add/remove ip addresses?

Comment 3 Phil Cameron 2016-10-06 13:43:41 UTC
I like #1 as well. We should quietly configure everything that is needed to make the feature work.

Should we validate that traffic can flow over 224.0.0.18 as part of configuring HA? The downside is the same IP on multiple nodes.

Who needs to be involved in this discussion? I have added Eric P, Clayton and Jordan to the cc:

Comment 4 Eric Paris 2016-10-06 13:50:13 UTC
I'm not sure what you mean by validate. Do you mean assign a multicast address to an interface and try to use it in some way? No.

Comment 5 Phil Cameron 2016-10-06 14:00:37 UTC
I was thinking of a side channel, like pinging the host to establish that connectivity exists. Also verify that keepalived is running. If keepalived can't talk to its peer keepalived, something is wrong (and both/many/all become MASTER).

Comment 6 Phil Cameron 2016-10-06 21:22:25 UTC
Spoke with Ben. I will change the ipfailover container to create the iptables rule if it is not present.
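
In other words, roughly this idiom (a sketch of the intended behavior, not the actual container code; assumes an iptables version that supports -C):

  # Add the ACCEPT rule only if it is not already present
  iptables -C INPUT -d 224.0.0.18/32 -j ACCEPT 2>/dev/null || \
    iptables -I INPUT -d 224.0.0.18/32 -j ACCEPT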

Comment 7 Phil Cameron 2016-10-14 19:10:36 UTC
openshift/origin PR 11327
openshift/openshift-docs PR 3051

Ended up adding the --iptables-chain option to oadm ipfailover.
The iptables rule is added if one doesn't exist.

Rewrote much of the high availability doc to describe this and the general operation.
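
For example (the deployment name, VIP range, port, and replica count below are made up for illustration; only --iptables-chain is the new option):

  oadm ipfailover ipf-ha \
    --virtual-ips="10.1.1.100-102" \
    --watch-port=80 \
    --replicas=2 \
    --iptables-chain=INPUT \
    --create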

Comment 8 Phil Cameron 2016-10-21 14:00:25 UTC
PR 11327 merged.

Comment 9 Phil Cameron 2016-10-21 14:12:22 UTC
Documentation - openshift-docs PR 3051 - being reviewed.

Comment 10 zhaozhanqi 2016-11-02 03:27:57 UTC
The rule '-A INPUT -d 224.0.0.18/32 -j ACCEPT' is now added when the ipfailover pod is created.

However, the rule is not deleted when the ipfailover pod is deleted.
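
A quick way to see the rule on a node (inspection only):

  # List INPUT rules in iptables-save format and look for the VRRP address
  iptables -S INPUT | grep 224.0.0.18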

Comment 11 zhaozhanqi 2016-11-02 03:29:19 UTC
Correction to the typo in comment 10: the rule is not deleted when the ipfailover pod is deleted.

Comment 12 Phil Cameron 2016-11-02 12:48:24 UTC
The rule is common to all ipfailover setups. So if you have more than one ipfailover deployment, all of them share the same rule. The rule cannot be deleted until all of the ipfailover DCs are deleted.

As long as keepalived is running on any node in the cluster the rule must remain.
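
If someone does want to clean up after the last ipfailover DC is gone, the manual removal would be roughly (run on each node, and only once no keepalived remains anywhere in the cluster):

  # Remove the shared ACCEPT rule; only safe after all ipfailover DCs are deleted
  iptables -D INPUT -d 224.0.0.18/32 -j ACCEPT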

Comment 13 Ben Bennett 2016-11-02 13:13:35 UTC
To clarify, we cannot automatically remove the rule safely from inside a container, so it is expected that the rule remains even after the pod has been deleted.

Phil, can you clarify that in your docs?

Given that, I think this issue is resolved.

Comment 14 Phil Cameron 2016-11-02 14:29:24 UTC
Ben, I have updated the docs in PR 3051.

Comment 15 zhaozhanqi 2016-11-03 02:01:41 UTC
OK, thanks for clarifying this. Verified this bug.

