1949827 – Kubelet bound to incorrect IPs, referring to incorrect NICs in 4.5.x

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Target Release:	4.11.0
Assignee:	Ben Nemec
QA Contact:	Victor Voronkov
Docs Contact:
URL:
Whiteboard:
Duplicates (2):	1897743 1957615
Depends On:
Blocks:	2071696

Reported:	2021-04-15 08:23 UTC by Yash Chouksey
Modified:	2024-10-01 17:56 UTC
CC List:	34 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-08-10 10:36:17 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift machine-config-operator pull 2888	0	None	open	Bug 1949827: Add KUBELET_NODEIP_HINT to nodeip-configuration	2022-02-24 20:43:11 UTC
Red Hat Bugzilla	1920282	1	high	CLOSED	kubelet bound to incorrect nic causing change in the IPs	2024-10-01 17:23:03 UTC
Red Hat Knowledge Base (Solution)	5800261	0	None	None	None	2021-08-12 10:33:56 UTC
Red Hat Product Errata	RHSA-2022:5069	0	None	None	None	2022-08-10 10:36:37 UTC

Comment 15 Matthew Staebler 2021-05-06 16:13:00 UTC

*** Bug 1957615 has been marked as a duplicate of this bug. ***

Comment 48 Ben Nemec 2021-06-11 15:38:19 UTC

This bug is where most of the discussion of the problem has been happening, but since the comments are private I wanted to capture the status publicly so we can duplicate other bugs to this one. Here are the main points:

- This configuration was never officially supported, but in some cases it may have happened to work due to quirks of the old implementation.
- In 4.6 a change was made that made behavior in multi-NIC scenarios more consistent, but it broke some environments that may have been working on 4.5.
- Far more environments had problems with the old behavior, so we can't just revert the change, but there is a workaround.
- Documentation of the workaround is not yet available (but is forthcoming). Please let us know if you need details in the meantime.

Comment 49 Ben Nemec 2021-06-11 15:41:58 UTC

*** Bug 1897743 has been marked as a duplicate of this bug. ***

Comment 59 Dan Winship 2021-07-22 14:45:19 UTC

(In reply to comment #58)
> What are the next steps for this BZ?


1. We need a better solution for the future.

It seems to me that the fix is to make UPI work more like IPI; the node IP should be the IP from the interface that has the most direct route to the other nodes, not the interface that has the default route (assuming those are different).

Currently, for UPI, if the MCO knows the apiserver IP at install time, it will write it into the nodeip-configuration service, and so we will pick the IP with the most direct route to the apiserver (just like in the IPI case). However, in some cases (notably vSphere UPI), the MCO does not know the apiserver IP at install time and thus cannot do this. From what I can tell though, the apiserver IP is still *known* (e.g., to the installer) in this case; it's just not recorded anywhere that the MCO can see at install time. So if we fix the install-time config plumbing so that the MCO always has access to the configured apiserver IP, then it should always be able to pass that to nodeip-configuration, and we should always get the right IP. (So this probably requires one or more additions to openshift/api, plus changes to openshift/installer to fill in the new API, and changes to openshift/machine-config-operator to consume the new API. It should not require any changes to openshift/baremetal-runtimecfg.)
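To make "the IP with the most direct route to the apiserver" concrete, here is a minimal Python sketch of the directly connected case only: given the node's local addresses and a hint IP, it returns the local address whose subnet contains the hint. This is an illustration, not the actual baremetal-runtimecfg logic; the function name `pick_node_ip` and the CIDR-string input format are assumptions made for the example, and a real implementation would also have to consult the routing table.

```python
import ipaddress
from typing import Iterable, Optional


def pick_node_ip(local_addrs: Iterable[str], hint: str) -> Optional[str]:
    """Return the local address whose subnet contains `hint`, or None.

    `local_addrs` are interface addresses in CIDR form, for example
    "192.168.162.10/24" (hypothetical input; a real tool would read
    them from the kernel). If several subnets contain the hint, the
    most specific one (longest prefix) wins.
    """
    hint_ip = ipaddress.ip_address(hint)
    best = None
    for cidr in local_addrs:
        iface = ipaddress.ip_interface(cidr)
        if iface.version != hint_ip.version:
            continue  # never compare IPv4 against IPv6
        if hint_ip in iface.network:
            if best is None or iface.network.prefixlen > best.network.prefixlen:
                best = iface
    return str(best.ip) if best is not None else None


if __name__ == "__main__":
    # Two NICs; only the second shares a subnet with the hint.
    print(pick_node_ip(["10.0.0.5/24", "192.168.162.10/24"], "192.168.162.3"))
```

When no local subnet contains the hint, the sketch returns `None`; that is where a route lookup (or a fallback to the default-route interface) would be needed.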

(It is theoretically possible that "most direct route to the apiserver" is not identical to "most direct route to the other nodes" in some UPI configurations, though this seems like it would require a pretty weird network configuration... Maybe it would be better to pick "the interface that has the most direct route to the `machineNetwork`", but I'm not sure `machineNetwork` is guaranteed to be set/correct for all UPI platforms...)



2. We need a better solution for existing customers until we have the better future solution

The workaround we've discussed here *works*, but it's ugly, and has customer-specific bits which make it hard to document and to provide to other customers running into the same bug.

If we think that "pick the IP from the interface with the most direct route to the other nodes" should work for everyone, then the next-best workaround would be to provide a standardized way to get that, so that instead of needing a complex customer-specific MachineConfig like in comment 42, they'd just have to write something like:

     apiVersion: machineconfiguration.openshift.io/v1
     kind: MachineConfig
     metadata:
       labels:
         machineconfiguration.openshift.io/role: worker
       name: 99-upi-node-ip-override
     spec:
       config:
         ignition:
           version: 3.1.0
         storage:
           files:
           - path: /etc/default/nodeip-configuration
             contents:
               source: data:,KUBE_APISERVER_HINT=192.168.162.3
             mode: 0644
             overwrite: true

and then we provide a more complicated MachineConfig that they install verbatim (e.g., "curl http://access.redhat.com/... | oc apply -f -") which reads the file created by the MachineConfig above and passes its value to `baremetal-runtimecfg node-ip`, so that it picks the corresponding local IP on the same network.
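For reference, the `source: data:,...` line in the MachineConfig above is a data URL, so the file written to /etc/default/nodeip-configuration contains just the decoded payload, a single KEY=VALUE line. The Python sketch below decodes such a URL and parses the result. The helper names `decode_data_url` and `parse_env` are hypothetical, and the parser handles only plain KEY=VALUE lines, not the full systemd environment-file syntax.

```python
import base64
from typing import Dict
from urllib.parse import unquote_to_bytes


def decode_data_url(source: str) -> str:
    """Decode a `data:` URL into the text it carries."""
    if not source.startswith("data:"):
        raise ValueError("not a data URL")
    header, sep, payload = source[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URL has no comma separator")
    if header.endswith(";base64"):
        raw = base64.b64decode(payload)
    else:
        raw = unquote_to_bytes(payload)  # percent-decoding
    return raw.decode("utf-8")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


if __name__ == "__main__":
    content = decode_data_url("data:,KUBE_APISERVER_HINT=192.168.162.3")
    print(parse_env(content))
```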

(The next possible improvement after this would be to merge the "more complicated MachineConfig" into the existing nodeip-configuration service and then backport it, so you can just create the "99-upi-node-ip-override" MachineConfig without needing to manually create the other more complicated MachineConfig as well:

     --- a/templates/common/on-prem/units/nodeip-configuration.service.yaml
     +++ b/templates/common/on-prem/units/nodeip-configuration.service.yaml
     @@ -26,4 +26,5 @@ contents: |
          node-ip \
          set --retry-on-failure \
     +    $KUBE_APISERVER_HINT \
          {{ onPremPlatformAPIServerInternalIP . }}; \
          do \
     @@ -32,4 +33,5 @@ contents: |
        ExecStart=/bin/systemctl daemon-reload
      
     +  EnvironmentFile=-/etc/default/nodeip-configuration
        {{if .Proxy -}}
        EnvironmentFile=/etc/mco/proxy.env

But this is kind of "backporting a new feature" so people may not want to do it, unless we think it's going to be a long time before we can have the proper future fix available.)
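The effect of the proposed diff can be sketched in Python as follows. In systemd, the leading `-` in `EnvironmentFile=-...` means a missing file is not an error, so nodes without the override file behave exactly as before. When the file exists, the hint is passed ahead of the templated apiserver IP. The function names here are hypothetical, and the argument list only mirrors the lines visible in the diff rather than the full command in the real unit.

```python
import os
from typing import Dict, List


def load_optional_env(path: str) -> Dict[str, str]:
    """Read KEY=VALUE lines; a missing file yields {} (like `EnvironmentFile=-`)."""
    if not os.path.isfile(path):
        return {}
    env: Dict[str, str] = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                env[key] = value
    return env


def node_ip_args(env: Dict[str, str], api_ip: str) -> List[str]:
    """Assemble the arguments, adding the hint only when it is set."""
    args = ["node-ip", "set", "--retry-on-failure"]
    hint = env.get("KUBE_APISERVER_HINT", "")
    if hint:
        args.append(hint)
    if api_ip:  # may be empty when the MCO does not know the apiserver IP
        args.append(api_ip)
    return args


if __name__ == "__main__":
    env = load_optional_env("/etc/default/nodeip-configuration")
    print(node_ip_args(env, ""))
```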

Comment 69 errata-xmlrpc 2022-08-10 10:36:17 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

Comment 70 Red Hat Bugzilla 2023-09-15 01:33:41 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days.
