Bug 1411501
| Summary: | Only one ipfailover container can be run on a node | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ben Bennett <bbennett> |
| Component: | Networking | Assignee: | Phil Cameron <pcameron> |
| Networking sub component: | router | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | aloughla, aos-bugs, bmeng, eparis, ramr, tdawson |
| Version: | 3.5.0 | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Cause: pods can start on same node<br>Consequence:<br>Fix: force pods to start on different nodes<br>Result: | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-04-12 19:08:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Ben Bennett 2017-01-09 20:56:51 UTC
bbennett: Digging around I found "Pod anti-affinity does not work in OpenShift" (Jan 2017), https://access.redhat.com/solutions/2840171. Also, port 1985 does not provide anti-affinity: the pod is started and just gets stuck on the port; it is not moved to another node. ramr, do you know the intended use for port 1985?

@phil - There is nothing binding to port 1985, if that's what you are asking. As Ben mentioned, it was a cheap way (and the only way when this was done) to ensure that two ipfailover pods are not placed on the same node/host when Kubernetes scheduled the pods. You can't run two keepalived instances on the same node, as they would clash managing the same network interfaces, and the VRRP messages would have the same source/destination for both pods. Also note the ipfailover pod _has_ to run in host networking mode.

@ramr - We can run two _different_ configurations of keepalived (i.e. managing different addresses and with different virtual_router_ids) on the same node, right? The problem arises only if you run the same config twice on one node, where the two instances would fight.

Phil and I tried both the same config and different configs. With the same config, keepalived detects a problem and logs vociferously; with different configs all was well (and we already set virtual_router_id differently). http://serverfault.com/questions/473058/keepaliveds-virtual-router-id-should-it-be-unique-per-node seems to back up this assessment.

Proposal: continue to use a port number, with each ipfailover config getting a different port. The port for a config could be 1985 (the current port) plus the vrrp_id in the config. Since vrrp_id is in the range 0-255, the actual port would be in the range 1985-2240 (assuming that range is available). One port per config is taken from that range. When pod affinity/anti-affinity becomes GA we can switch to that. Affinity in 1.4 and 1.5 is alpha, using annotations; in 1.6 it becomes beta, using a field. When beta arrives, alpha is deprecated.
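The port arithmetic in the proposal above is simple enough to sketch. This is a hypothetical helper, not code from the actual patch; the names `BASE_PORT` and `check_port` are illustrative:

```python
# Sketch of the proposed per-configuration check port: each ipfailover
# configuration reserves BASE_PORT + vrrp_id, so two configurations on
# the same node never collide on a port. Hypothetical helper, not the
# merged fix.

BASE_PORT = 1985    # historical ipfailover ServicePort
VRRP_ID_MAX = 255   # VRRP virtual_router_id is an 8-bit value, 0-255

def check_port(vrrp_id: int) -> int:
    """Return the per-configuration port, in the range 1985-2240."""
    if not 0 <= vrrp_id <= VRRP_ID_MAX:
        raise ValueError(f"vrrp_id must be 0-{VRRP_ID_MAX}, got {vrrp_id}")
    return BASE_PORT + vrrp_id
```

Since the port is only reserved (nothing binds to it), the scheme costs one port per configuration out of a 256-port window.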
I think we need to figure out an upgrade path for customers that use this. Hopefully there are not very many. The port-based configurations would continue to work going forward; the affinity-based solution would require customers to modify the deployment config as part of upgrades.

pcameron: That seems reasonable. If we do that in 1.5 (and perhaps also apply the change in 1.4), then we should not have an upgrade problem. We do need to flip to anti-affinity at some point, but the port hack doesn't hurt. So let's do that now, and make a card for the future anti-affinity change so we don't lose track of it.

Please reconsider the port range we are using. I'm not sure there's a good reason to start at 1985, and we need to make sure there's nothing between 1985 and 1985 + 255 that we care about. I suspect there is, so we need to work out whether there's a better range to use. Since nothing actually binds to the ports, we could use a high range (in the dynamically assigned area) to avoid conflicts.

In merge queue.

Commit pushed to master at https://github.com/openshift/origin:
https://github.com/openshift/origin/commit/959a4bfd2fb30b64400124a50d59219901dda012

Allow multiple ipfailover configs on same node

The ipfailover pods for a given configuration must run on different nodes. We are using the ServicePort as a mechanism to prevent multiple pods for the same configuration from starting on the same node. Since pods for different configurations can run on the same node, a different ServicePort is used for each configuration. In the future, this may be changed to pod anti-affinity.

bug 1411501
https://bugzilla.redhat.com/show_bug.cgi?id=1411501
Signed-off-by: Phil Cameron <pcameron>

This has been merged into OCP and is in OCP v3.5.0.16 or newer.

Verified this bug on:

openshift v3.5.0.16+a26133a
kubernetes v1.5.2+43a9be4
etcd 3.1.0

When creating two ipfailover pods (from different configurations) on the same node, both work well.
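The "future anti-affinity" direction mentioned in the commit message would look roughly like the fragment below in a pod template, using the Kubernetes `podAntiAffinity` field that went beta in 1.6. This is a sketch, not part of the merged fix; the label key/value `ipfailover: ipf1` is illustrative:

```yaml
# Sketch: require replicas of one ipfailover configuration to land on
# different nodes, replacing the reserved-port trick. The label
# "ipfailover: ipf1" is a hypothetical per-configuration label.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          ipfailover: ipf1
      topologyKey: kubernetes.io/hostname
```

With `topologyKey: kubernetes.io/hostname`, the scheduler refuses to place two pods matching the selector on the same node, which is exactly what the per-configuration ServicePort currently enforces indirectly.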
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0884