Bug 1696628
Summary: | Using AWS when a secondary IP address is added the order of the list of InternalIPs is not preserved. | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Oscar Casal Sanchez <ocasalsa> | |
Component: | Node | Assignee: | Robert Krawitz <rkrawitz> | |
Status: | CLOSED ERRATA | QA Contact: | Weinan Liu <weinliu> | |
Severity: | medium | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 3.11.0 | CC: | aos-bugs, bbennett, danw, dcbw, dmoessne, florin-alexandru.peter, jokerman, lmartinh, mmccomas, pcameron, rkrawitz, rsandu, schoudha, weliang | |
Target Milestone: | --- | |||
Target Release: | 4.2.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause: The merge algorithm for additional IP addresses added to a node was incorrect.
Consequence: After an additional IP address was added to a node, the list of addresses was out of order, leaving the node unable to talk to the API server.
Fix: The merge algorithm for addresses was changed so that it no longer reorders the addresses.
Result: Adding secondary IP addresses to a node no longer changes the ordering, and the node can continue communicating with the API server.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1729276 | Environment: | |
Last Closed: | 2019-10-16 06:28:05 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1729276, 1734385 |
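The Doc Text above attributes the fix to a merge that no longer reorders a node's addresses. The following is a minimal sketch of that idea only, not the upstream Kubernetes patch: the function name `merge_internal_ips` and the list-of-strings representation are assumptions made for illustration.

```python
from typing import List


def merge_internal_ips(current: List[str], reported: List[str]) -> List[str]:
    """Hypothetical order-preserving merge of a node's InternalIP list.

    Addresses already on the node keep their relative order, addresses that
    are no longer reported are dropped, and newly reported addresses are
    appended in the order they were reported.
    """
    still_present = set(reported)
    merged: List[str] = []
    seen = set()
    # Keep existing addresses first so the primary address stays in front.
    for ip in current:
        if ip in still_present and ip not in seen:
            merged.append(ip)
            seen.add(ip)
    # Append addresses the node did not have before.
    for ip in reported:
        if ip not in seen:
            merged.append(ip)
            seen.add(ip)
    return merged


if __name__ == "__main__":
    current = ["10.0.145.136"]
    # Reported order is deliberately shuffled to mimic the failure mode.
    reported = ["10.0.150.98", "10.0.145.136", "10.0.156.131"]
    print(merge_internal_ips(current, reported))
```

With this approach the first InternalIP, which the node uses to reach the API server, stays first regardless of the order in which secondary addresses are reported.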
Comment 37
Phil Cameron
2019-05-14 15:14:55 UTC
Fixed in upstream master. Not sure if we want to do backports upstream or just backport it here? The patch ought to apply pretty cleanly back to 3.11.

Upstream PR is linked in the External trackers above, but I'll paste here too: https://github.com/kubernetes/kubernetes/pull/79391

It looks like this bug would have been reported sooner and by more people, except for the fact that you can mostly use --node-ip to work around it in 3.x. But in 4.x that's not available, and so more people are running into this (https://github.com/openshift/machine-config-operator/issues/944). We may need to backport this to 4.1.z.

Verified on 4.2.0-0.nightly-2019-07-18-120653. From the AWS console, added 4 secondary IP addresses to node ip-10-0-137-237.us-east-2.compute.internal. Also rebooted the node; it is still using the original IP address.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-07-18-120653   True        False         170m    Cluster version is 4.2.0-0.nightly-2019-07-18-120653

$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE    VERSION             INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                   KERNEL-VERSION               CONTAINER-RUNTIME
ip-10-0-137-203.us-east-2.compute.internal   Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.137.203   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-137-237.us-east-2.compute.internal   Ready    worker   136m   v1.14.0+bbfcbc8ac   10.0.137.237   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-145-136.us-east-2.compute.internal   Ready    worker   136m   v1.14.0+bbfcbc8ac   10.0.145.136   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-154-217.us-east-2.compute.internal   Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.154.217   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-173-64.us-east-2.compute.internal    Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.173.64    <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8

$ aws ec2 describe-instances --instance-ids i-0632805cfdff1e0c4 --query 'Reservations[].Instances[].NetworkInterfaces[].PrivateIpAddresses'
[
    [
        {
            "Primary": true,
            "PrivateDnsName": "ip-10-0-134-237.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.134.237"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-137-187.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.137.187"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-131-93.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.131.93"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-132-193.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.132.193"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-135-119.ap-south-1.compute.internal",
            "PrivateIpAddress": "10.0.135.119"
        }
    ]
]

$ oc describe node ip-10-0-137-237.us-east-2.compute.internal
Name:         ip-10-0-137-237.us-east-2.compute.internal
...
Addresses:
  InternalIP:   10.0.137.237
  InternalDNS:  ip-10-0-137-237.us-east-2.compute.internal
  Hostname:     ip-10-0-137-237.us-east-2.compute.internal

Ignore previous comment. Output from wrong node. Below is output from correct node.

Verified on 4.2.0-0.nightly-2019-07-18-120653. From the AWS console, added 4 secondary IP addresses to node ip-10-0-145-136.us-east-2.compute.internal. Also rebooted the node; it is still using the original IP address.
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-07-18-120653   True        False         170m    Cluster version is 4.2.0-0.nightly-2019-07-18-120653

$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE    VERSION             INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                   KERNEL-VERSION               CONTAINER-RUNTIME
ip-10-0-137-203.us-east-2.compute.internal   Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.137.203   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-137-237.us-east-2.compute.internal   Ready    worker   136m   v1.14.0+bbfcbc8ac   10.0.137.237   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-145-136.us-east-2.compute.internal   Ready    worker   136m   v1.14.0+bbfcbc8ac   10.0.145.136   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-154-217.us-east-2.compute.internal   Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.154.217   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-173-64.us-east-2.compute.internal    Ready    master   145m   v1.14.0+bbfcbc8ac   10.0.173.64    <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8

Added 4 secondary IPs:

sh-4.4# curl -s http://169.254.169.254/latest/meta-data/network/interfaces/macs/06:8b:1b:6c:bc:24/local-ipv4s
10.0.145.136
10.0.150.98
10.0.156.131
10.0.146.68
10.0.155.234

$ aws ec2 describe-instances --instance-ids i-0c27102bad9011ea1 --query 'Reservations[].Instances[].NetworkInterfaces[].PrivateIpAddresses'
[
    [
        {
            "Primary": true,
            "PrivateDnsName": "ip-10-0-145-136.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.145.136"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-150-98.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.150.98"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-156-131.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.156.131"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-146-68.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.146.68"
        },
        {
            "Primary": false,
            "PrivateDnsName": "ip-10-0-155-234.us-east-2.compute.internal",
            "PrivateIpAddress": "10.0.155.234"
        }
    ]
]

$ oc describe node ip-10-0-145-136.us-east-2.compute.internal
Name:         ip-10-0-145-136.us-east-2.compute.internal
...
Addresses:
  InternalIP:   10.0.145.136
  InternalIP:   10.0.150.98
  InternalIP:   10.0.156.131
  InternalIP:   10.0.146.68
  InternalIP:   10.0.155.234
  InternalDNS:  ip-10-0-145-136.us-east-2.compute.internal
  Hostname:     ip-10-0-145-136.us-east-2.compute.internal

$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE     VERSION             INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                   KERNEL-VERSION               CONTAINER-RUNTIME
ip-10-0-137-203.us-east-2.compute.internal   Ready    master   4h35m   v1.14.0+bbfcbc8ac   10.0.137.203   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-137-237.us-east-2.compute.internal   Ready    worker   4h26m   v1.14.0+bbfcbc8ac   10.0.137.237   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-145-136.us-east-2.compute.internal   Ready    worker   4h26m   v1.14.0+bbfcbc8ac   10.0.145.136   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-154-217.us-east-2.compute.internal   Ready    master   4h35m   v1.14.0+bbfcbc8ac   10.0.154.217   <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8
ip-10-0-173-64.us-east-2.compute.internal    Ready    master   4h35m   v1.14.0+bbfcbc8ac   10.0.173.64    <none>        Red Hat Enterprise Linux CoreOS 420.8.20190718.1 (Ootpa)   4.18.0-80.4.2.el8_0.x86_64   cri-o://1.14.8-3.rhaos4.2.el8

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922
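The verification in the comments above compares the primary private IP reported by `aws ec2 describe-instances` with the first InternalIP listed for the node. A minimal sketch of automating that comparison follows. The helper names `primary_private_ip` and `node_uses_primary_ip` are hypothetical, and the code assumes the JSON has the shape shown in the comments (a list of interfaces, each a list of address records); with several network interfaces it simply returns the first primary address it encounters.

```python
import json
from typing import List


def primary_private_ip(describe_output: str) -> str:
    """Return the first address record marked Primary in the JSON text."""
    interfaces = json.loads(describe_output)
    for addresses in interfaces:
        for record in addresses:
            if record["Primary"]:
                return record["PrivateIpAddress"]
    raise ValueError("no primary private IP found")


def node_uses_primary_ip(describe_output: str, internal_ips: List[str]) -> bool:
    """True when the node's first InternalIP is the instance's primary IP."""
    if not internal_ips:
        return False
    return internal_ips[0] == primary_private_ip(describe_output)


if __name__ == "__main__":
    # Abbreviated sample data taken from the verification comment.
    sample = json.dumps([[
        {"Primary": True, "PrivateIpAddress": "10.0.145.136"},
        {"Primary": False, "PrivateIpAddress": "10.0.150.98"},
    ]])
    print(node_uses_primary_ip(sample, ["10.0.145.136", "10.0.150.98"]))
```

The InternalIP list would come from the `Addresses` section of `oc describe node`; how it is collected is left out of this sketch.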