Bug 1696628 - Using AWS when a secondary IP address is added the order of the list of InternalIPs is not preserved.
Summary: Using AWS when a secondary IP address is added the order of the list of Inter...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.2.0
Assignee: Robert Krawitz
QA Contact: Weinan Liu
URL:
Whiteboard:
Depends On:
Blocks: 1729276 1734385
Reported: 2019-04-05 10:01 UTC by Oscar Casal Sanchez
Modified: 2019-11-18 06:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The algorithm for merging additional IP addresses into a node's address list was incorrect.
Consequence: After an additional IP address was added to a node, the list of addresses was out of order, leaving the node unable to talk to the API server.
Fix: The merge algorithm for addresses was changed so that it no longer reorders the addresses.
Result: Adding secondary IP addresses to a node no longer changes the ordering, and the node can continue communicating with the API server.
Clone Of:
: 1729276 (view as bug list)
Environment:
Last Closed: 2019-10-16 06:28:05 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Github kubernetes kubernetes pull 79391 'None' closed Don't use strategic merge patch on Node.Status.Addresses 2020-05-18 03:25:57 UTC
Github openshift origin pull 23345 'None' closed Bug 1696628: Don't use strategic merge patch on Node.Status.Addresses 2020-05-18 03:25:57 UTC
Red Hat Knowledge Base (Solution) 4130021 None None None 2019-05-11 05:23:06 UTC
Red Hat Product Errata RHBA-2019:2922 None None None 2019-10-16 06:28:22 UTC

Comment 37 Phil Cameron 2019-05-14 15:14:55 UTC
Changed the title to reflect what is actually happening. On AWS, new additional IP addresses are added to the end of the instance's list of IP addresses, as expected. However, when a new address is added to the Node's list of InternalIP addresses, the ordering of the IP addresses in that list is not preserved. Kubernetes/OpenShift always uses the first address in the list as the node address for cluster operations. When this address changes (because of the reordering), the cluster loses access to the node.
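
A minimal sketch of why the ordering matters, assuming the node address is simply the first InternalIP entry in the list, as described above. The helper names (first_internal_ip, node_ip_changed) and the reordered list are hypothetical and only for illustration; this is not the kubelet's actual code, and the IPs are borrowed from the verification output later in this bug.

```python
from typing import Dict, List, Optional

Address = Dict[str, str]


def first_internal_ip(addresses: List[Address]) -> Optional[str]:
    """Return the first InternalIP in list order, or None if there is none."""
    for addr in addresses:
        if addr["type"] == "InternalIP":
            return addr["address"]
    return None


def node_ip_changed(old: List[Address], new: List[Address]) -> bool:
    """True if the address picked for cluster operations differs between lists."""
    return first_internal_ip(old) != first_internal_ip(new)


# Address list before any secondary IP is added.
before = [
    {"type": "InternalIP", "address": "10.0.145.136"},
    {"type": "InternalDNS", "address": "ip-10-0-145-136.us-east-2.compute.internal"},
]

# Expected result: the secondary IP is appended after the primary one.
appended = [
    {"type": "InternalIP", "address": "10.0.145.136"},
    {"type": "InternalIP", "address": "10.0.150.98"},
    {"type": "InternalDNS", "address": "ip-10-0-145-136.us-east-2.compute.internal"},
]

# Hypothetical buggy result: same entries, but the secondary IP now comes first.
reordered = [
    {"type": "InternalIP", "address": "10.0.150.98"},
    {"type": "InternalIP", "address": "10.0.145.136"},
    {"type": "InternalDNS", "address": "ip-10-0-145-136.us-east-2.compute.internal"},
]

if __name__ == "__main__":
    print("appended changes node IP: ", node_ip_changed(before, appended))
    print("reordered changes node IP:", node_ip_changed(before, reordered))
```

Both lists contain the same set of addresses; only the reordered one changes which address is treated as the node address.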

Comment 38 Dan Winship 2019-07-08 16:20:21 UTC
Fixed in upstream master. Not sure if we want to do backports upstream or just backport it here? The patch ought to apply pretty cleanly back to 3.11.

Comment 39 Dan Williams 2019-07-09 14:57:36 UTC
Upstream PR is linked in the External trackers above, but I'll paste here too: https://github.com/kubernetes/kubernetes/pull/79391

Comment 41 Dan Winship 2019-07-11 14:59:02 UTC
It looks like this bug would have been reported sooner and by more people, except for the fact that you can mostly use --node-ip to work around it in 3.x. But in 4.x that's not available, and so more people are running into this (https://github.com/openshift/machine-config-operator/issues/944). We may need to backport this to 4.1.z.

Comment 42 Sunil Choudhary 2019-07-19 11:22:57 UTC
Verified on 4.2.0-0.nightly-2019-07-18-120653.

From the AWS console, added a 4th secondary IP address to node ip-10-0-137-237.us-east-2.compute.internal. Also rebooted the node; it is still using the original IP address.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-07-18-120653   True        False         170m    Cluster version is 4.2.0-0.nightly-2019-07-18-120653

$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE    VERSION             INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                   KERNEL-VERSION               CONTAINER-RUNTIME
ip-10-0-137-203.us-east-2.compute.internal   Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.137.203   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-137-237.us-east-2.compute.internal   Ready    worker   136m   v1.14.0+bbfcbc8ac   10.0.137.237   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-145-136.us-east-2.compute.internal   Ready    worker   136m   v1.14.0+bbfcbc8ac   10.0.145.136   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-154-217.us-east-2.compute.internal   Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.154.217   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-173-64.us-east-2.compute.internal    Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.173.64    <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8

$ aws ec2 describe-instances --instance-ids i-0632805cfdff1e0c4 --query 'Reservations[].Instances[].NetworkInterfaces[].PrivateIpAddresses'
[
    [
        {
            "Primary": true,
            "PrivateDnsName": "ip-10-0-134-237.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.134.237"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-137-187.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.137.187"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-131-93.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.131.93"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-132-193.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.132.193"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-135-119.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.135.119"
        }
    ]
]

$ oc describe node ip-10-0-137-237.us-east-2.compute.internal
Name:               ip-10-0-137-237.us-east-2.compute.internal
...
Addresses:
  InternalIP:   10.0.137.237
  InternalDNS:  ip-10-0-137-237.us-east-2.compute.internal
  Hostname:     ip-10-0-137-237.us-east-2.compute.internal

Comment 43 Sunil Choudhary 2019-07-19 11:52:14 UTC
Ignore previous comment. Output from wrong node. Below is output from correct node. Verified on 4.2.0-0.nightly-2019-07-18-120653.

From the AWS console, added a 4th secondary IP address to node ip-10-0-145-136.us-east-2.compute.internal. Also rebooted the node; it is still using the original IP address.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-07-18-120653   True        False         170m    Cluster version is 4.2.0-0.nightly-2019-07-18-120653

$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE    VERSION             INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                   KERNEL-VERSION               CONTAINER-RUNTIME
ip-10-0-137-203.us-east-2.compute.internal   Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.137.203   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-137-237.us-east-2.compute.internal   Ready    worker   136m   v1.14.0+bbfcbc8ac   10.0.137.237   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-145-136.us-east-2.compute.internal   Ready    worker   136m   v1.14.0+bbfcbc8ac   10.0.145.136   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-154-217.us-east-2.compute.internal   Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.154.217   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-173-64.us-east-2.compute.internal    Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.173.64    <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8

Added 4 secondary IPs

sh-4.4# curl -s http://169.254.169.254/latest/meta-data/network/interfaces/macs/06:8b:1b:6c:bc:24/local-ipv4s
10.0.145.136
10.0.150.98
10.0.156.131
10.0.146.68
10.0.155.234

$ aws ec2 describe-instances --instance-ids i-0c27102bad9011ea1 --query 'Reservations[].Instances[].NetworkInterfaces[].PrivateIpAddresses'
[
    [
        {
            "Primary": true,
            "PrivateDnsName": "ip-10-0-145-136.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.145.136"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-150-98.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.150.98"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-156-131.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.156.131"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-146-68.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.146.68"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-155-234.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.155.234"
        }
    ]
]

$ oc describe node ip-10-0-145-136.us-east-2.compute.internal
Name:               ip-10-0-145-136.us-east-2.compute.internal
...
Addresses:
  InternalIP:   10.0.145.136
  InternalIP:   10.0.150.98
  InternalIP:   10.0.156.131
  InternalIP:   10.0.146.68
  InternalIP:   10.0.155.234
  InternalDNS:  ip-10-0-145-136.us-east-2.compute.internal
  Hostname:     ip-10-0-145-136.us-east-2.compute.internal

$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE     VERSION             INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                   KERNEL-VERSION               CONTAINER-RUNTIME
ip-10-0-137-203.us-east-2.compute.internal   Ready    master   4h35m   v1.14.0+bbfcbc8ac   10.0.137.203   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-137-237.us-east-2.compute.internal   Ready    worker   4h26m   v1.14.0+bbfcbc8ac   10.0.137.237   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-145-136.us-east-2.compute.internal   Ready    worker   4h26m   v1.14.0+bbfcbc8ac   10.0.145.136   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-154-217.us-east-2.compute.internal   Ready    master   4h35m   v1.14.0+bbfcbc8ac   10.0.154.217   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-173-64.us-east-2.compute.internal    Ready    master   4h35m   v1.14.0+bbfcbc8ac   10.0.173.64    <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
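
The manual check in this comment could be scripted roughly as follows. This is only a sketch: it assumes the text layout of the "oc describe node" Addresses section and the one-address-per-line format of the metadata local-ipv4s endpoint shown above, and the function names are hypothetical. It takes the two outputs as strings and does not call oc or the metadata service itself.

```python
from typing import List


def internal_ips_from_describe(describe_output: str) -> List[str]:
    """Collect InternalIP values, in order, from `oc describe node` text."""
    ips = []
    for line in describe_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "InternalIP":
            ips.append(value.strip())
    return ips


def ips_from_metadata(local_ipv4s: str) -> List[str]:
    """Split the metadata `local-ipv4s` response into a list, skipping blanks."""
    return [line.strip() for line in local_ipv4s.splitlines() if line.strip()]


def order_preserved(metadata_ips: List[str], node_ips: List[str]) -> bool:
    """True if the node lists the same InternalIPs in the same order as AWS."""
    return metadata_ips == node_ips


# Sample data copied from the output in this comment.
METADATA = """10.0.145.136
10.0.150.98
10.0.156.131
10.0.146.68
10.0.155.234
"""

DESCRIBE = """Name:               ip-10-0-145-136.us-east-2.compute.internal
Addresses:
  InternalIP:   10.0.145.136
  InternalIP:   10.0.150.98
  InternalIP:   10.0.156.131
  InternalIP:   10.0.146.68
  InternalIP:   10.0.155.234
  InternalDNS:  ip-10-0-145-136.us-east-2.compute.internal
  Hostname:     ip-10-0-145-136.us-east-2.compute.internal
"""

if __name__ == "__main__":
    node_ips = internal_ips_from_describe(DESCRIBE)
    aws_ips = ips_from_metadata(METADATA)
    print("primary still first:", node_ips[:1] == aws_ips[:1])
    print("full order preserved:", order_preserved(aws_ips, node_ips))
```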

Comment 44 errata-xmlrpc 2019-10-16 06:28:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

