Bug 1666820

Summary: [3.11] [vsphere] The "Internal IP/Host IP" of the infra nodes starts changing to the VIPs, and changes constantly/randomly all on its own, to any of these VIPs on eth0 (confirmed by oc get hostsubnet output).
Product: OpenShift Container Platform
Reporter: Dan Winship <danw>
Component: Cloud Compute
Assignee: Dan Winship <danw>
Status: CLOSED ERRATA
QA Contact: Jianwei Hou <jhou>
Severity: high
Docs Contact:
Priority: high
Version: 3.11.0
CC: adeshpan, bmeng, openshift-bugs-escalate
Target Milestone: ---
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: A change introduced in Kubernetes 1.11 affected nodes with many IP addresses in vSphere deployments.
Consequence: Under vSphere, a node hosting several Egress IPs or Router HA addresses would sporadically "forget" which of its IPs was its official "node IP" (even if that node IP had been explicitly specified in the node configuration) and start using one of the others, causing networking problems.
Fix: If a "node IP" is specified in the node configuration, it is now used correctly, regardless of how many other IPs the node has.
Result: Networking should work reliably.
Story Points: ---
Clone Of: 1643348
Environment:
Last Closed: 2019-02-20 14:11:02 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
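
The behaviour described in the Doc Text can be illustrated with a short sketch. This is not the code from the fix; the function name choose_node_ip, its parameters, and the "first address" fallback are assumptions made up for illustration. It only shows the rule stated under "Fix": an explicitly configured node IP wins no matter how many other addresses the interface carries or in what order they are seen.

```python
from typing import Optional, Sequence


def choose_node_ip(interface_ips: Sequence[str],
                   configured_node_ip: Optional[str]) -> str:
    """Pick the address a node should report as its node IP.

    Hypothetical helper for illustration only; it is not code from
    OpenShift or Kubernetes.
    """
    if configured_node_ip:
        # An explicitly configured node IP always wins, regardless of
        # how many other addresses are present or how they are ordered.
        return configured_node_ip
    if not interface_ips:
        raise ValueError("no addresses available and no node IP configured")
    # Assumed fallback for this sketch: take the first address seen.
    # An order-dependent choice like this is unstable when VIPs come and go.
    return interface_ips[0]


# Addresses taken from the ens192 output in comment 3.
ips = ["10.66.147.200", "10.66.146.238", "10.66.147.201"]
print(choose_node_ip(ips, "10.66.146.238"))
print(choose_node_ip(list(reversed(ips)), "10.66.146.238"))
```

Both calls return the configured address, whereas the fallback branch would return whichever VIP happened to be listed first.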

Comment 1 Dan Winship 2019-01-16 16:20:29 UTC
https://github.com/openshift/origin/pull/21808

Comment 3 Meng Bo 2019-02-12 06:51:05 UTC
Checked on OCP build v3.11.82 on the vSphere platform.

After adding VIPs to the node via ipfailover, the node IP is no longer switched to one of the virtual IPs.


# ip a s ens192
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:9f:9d:89 brd ff:ff:ff:ff:ff:ff
    inet 10.66.146.238/22 brd 10.66.147.255 scope global noprefixroute dynamic ens192
       valid_lft 79369sec preferred_lft 79369sec
    inet 10.66.147.200/32 scope global ens192
       valid_lft forever preferred_lft forever
    inet 10.66.147.201/32 scope global ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::379a:d274:c448:d980/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
    inet6 fe80::fc1c:bae6:8dac:d2ed/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
    inet6 fe80::f497:122c:8d5:60e1/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever

# oc get hostsubnet
NAME                          HOST                          HOST IP         SUBNET          EGRESS CIDRS   EGRESS IPS
ocp311.master.vsphere.local   ocp311.master.vsphere.local   10.66.144.168   10.128.0.0/23   []             []
ocp311.node1.vsphere.local    ocp311.node1.vsphere.local    10.66.146.238   10.129.0.0/23   []             []
ocp311.node2.vsphere.local    ocp311.node2.vsphere.local    10.66.147.208   10.130.0.0/23   []             []

# oc get node -o yaml ocp311.node1.vsphere.local
status:
  addresses:
  - address: 10.66.146.238
    type: ExternalIP
  - address: 10.66.146.238
    type: InternalIP
  - address: ocp311.node1.vsphere.local
    type: Hostname
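
The manual check above can also be expressed as a small script. This is a sketch only: the helper name addresses_using_vips is hypothetical, and the input is assumed to be the status.addresses list already parsed from the "oc get node -o yaml" output, together with the VIPs configured for ipfailover.

```python
import ipaddress
from typing import Dict, List


def addresses_using_vips(node_addresses: List[Dict[str, str]],
                         vips: List[str]) -> List[Dict[str, str]]:
    """Return InternalIP/ExternalIP entries whose address is a VIP.

    Hypothetical helper; an empty result means the node still reports
    its own address rather than one of the virtual IPs.
    """
    vip_set = {ipaddress.ip_address(v) for v in vips}
    return [
        entry for entry in node_addresses
        # The type check comes first so Hostname entries are never parsed as IPs.
        if entry["type"] in ("InternalIP", "ExternalIP")
        and ipaddress.ip_address(entry["address"]) in vip_set
    ]


# Values copied from the output above.
status_addresses = [
    {"address": "10.66.146.238", "type": "ExternalIP"},
    {"address": "10.66.146.238", "type": "InternalIP"},
    {"address": "ocp311.node1.vsphere.local", "type": "Hostname"},
]
vips = ["10.66.147.200", "10.66.147.201"]

print(addresses_using_vips(status_addresses, vips))
```

For the addresses shown in this comment the result is an empty list, which matches the observation that the node IP stayed at 10.66.146.238.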

Comment 5 errata-xmlrpc 2019-02-20 14:11:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0326