Bug 1949827 - Kubelet bound to incorrect IPs, referring to incorrect NICs in 4.5.x
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.11.0
Assignee: Ben Nemec
QA Contact: Victor Voronkov
URL:
Whiteboard:
Duplicates: 1897743 1957615
Depends On:
Blocks: 2071696
Reported: 2021-04-15 08:23 UTC by Yash Chouksey
Modified: 2023-12-15 20:12 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 10:36:17 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 2888 0 None open Bug 1949827: Add KUBELET_NODEIP_HINT to nodeip-configuration 2022-02-24 20:43:11 UTC
Red Hat Bugzilla 1920282 1 high CLOSED kubelet bound to incorrect nic causing change in the IPs 2024-03-25 18:00:04 UTC
Red Hat Knowledge Base (Solution) 5800261 0 None None None 2021-08-12 10:33:56 UTC
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 10:36:37 UTC

Comment 15 Matthew Staebler 2021-05-06 16:13:00 UTC
*** Bug 1957615 has been marked as a duplicate of this bug. ***

Comment 48 Ben Nemec 2021-06-11 15:38:19 UTC
This bug is where most of the discussion of the problem has been happening, but since the comments are private I wanted to capture the status publicly so we can duplicate other bugs to this one. Here are the main points:

- This configuration was never officially supported, but in some cases it may have happened to work due to quirks of the old implementation.
- In 4.6, a change made behavior in multi-NIC scenarios more consistent, but it broke some environments that may have been working on 4.5.
- Far more environments had problems with the old behavior, so we can't just revert the change, but there is a workaround.
- Documentation of the workaround is not yet available (but is forthcoming). Please let us know if you need details in the meantime.

Comment 49 Ben Nemec 2021-06-11 15:41:58 UTC
*** Bug 1897743 has been marked as a duplicate of this bug. ***

Comment 59 Dan Winship 2021-07-22 14:45:19 UTC
(In reply to comment #58)
> What are the next steps for this BZ?


1. We need a better solution for the future.

It seems to me that the fix is to make UPI work more like IPI; the node IP should be the IP from the interface that has the most direct route to the other nodes, not the interface that has the default route (assuming those are different).

Currently, for UPI, if the MCO knows the apiserver IP at install time, it writes it into the nodeip-configuration service, so we pick the IP with the most direct route to the apiserver (just like in the IPI case). However, in some cases (notably vSphere UPI), the MCO does not know the apiserver IP at install time and thus cannot do this. From what I can tell, though, the apiserver IP is still *known* in this case (e.g., to the installer); it's just not recorded anywhere that the MCO can see at install time. So if we fix the install-time config plumbing so that the MCO always has access to the configured apiserver IP, then it should always be able to pass that to nodeip-configuration, and we should always get the right IP. (This probably requires one or more additions to openshift/api, plus changes to openshift/installer to fill in the new API, and changes to openshift/machine-config-operator to consume the new API. It should not require any changes to openshift/baremetal-runtimecfg.)

(It is theoretically possible that "most direct route to the apiserver" is not identical to "most direct route to the other nodes" in some UPI configurations, though this seems like it would require a pretty weird network configuration... Maybe it would be better to pick "the interface that has the most direct route to the `machineNetwork`", but I'm not sure `machineNetwork` is guaranteed to be set/correct for all UPI platforms...)
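
To make the "most direct route" idea concrete, here is a minimal Python sketch of the selection rule: given a node's local addresses and a hint IP (such as the apiserver IP), pick the local address whose subnet contains the hint. This is only an illustration and is not the actual baremetal-runtimecfg logic; the function name `pick_node_ip` and the interface names `ens192` and `ens224` are hypothetical.

```python
import ipaddress


def pick_node_ip(local_addrs, hint):
    """Return (ifname, ip) for the local address that shares a subnet with hint.

    local_addrs is a list of (ifname, "ip/prefix") pairs; hint is an IP string.
    If several subnets contain the hint, the longest prefix (most specific
    network) wins. Returns None when no local subnet contains the hint.
    """
    target = ipaddress.ip_address(hint)
    best = None
    for ifname, cidr in local_addrs:
        iface = ipaddress.ip_interface(cidr)
        # Never compare IPv4 addresses against IPv6 networks or vice versa.
        if iface.version != target.version:
            continue
        if target in iface.network:
            if best is None or iface.network.prefixlen > best[1].network.prefixlen:
                best = (ifname, iface)
    if best is None:
        return None
    return best[0], str(best[1].ip)


# Hypothetical two-NIC node: the default route might live on ens192,
# but the hint is directly reachable only through ens224.
addrs = [("ens192", "10.0.0.15/24"), ("ens224", "192.168.162.20/24")]
print(pick_node_ip(addrs, "192.168.162.3"))
```

With a rule like this, the interface holding the default route is irrelevant; only direct reachability of the hint matters, which is the behavior argued for above.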



2. We need a better solution for existing customers until we have the better future solution.

The workaround we've discussed here *works*, but it's ugly, and has customer-specific bits which make it hard to document and to provide to other customers running into the same bug.

If we think that "pick the IP from the interface with the most direct route to the other nodes" should work for everyone, then the next-best workaround would be to provide a standardized way to get that, so that instead of needing a complex customer-specific MachineConfig like in comment 42, they'd just have to write something like:

     apiVersion: machineconfiguration.openshift.io/v1
     kind: MachineConfig
     metadata:
       labels:
         machineconfiguration.openshift.io/role: worker
       name: 99-upi-node-ip-override
     spec:
       config:
         ignition:
           version: 3.1.0
         storage:
           files:
           - path: /etc/default/nodeip-configuration
             contents:
               source: data:,KUBE_APISERVER_HINT=192.168.162.3
             mode: 0644
             overwrite: true

and then we provide a more complicated MachineConfig that they install verbatim (e.g., "curl http://access.redhat.com/... | oc apply -f -"). It reads the file created by the MachineConfig above and passes the hint to `baremetal-runtimecfg node-ip`, so that it picks the corresponding local IP on the same network.
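
For readers unfamiliar with the `source: data:,...` form in the MachineConfig above: it is a `data:` URL (RFC 2397) whose payload becomes the file contents. The following Python sketch decodes such a URL to show what would end up in /etc/default/nodeip-configuration. The helper name `decode_data_url` is made up for illustration, and this is not how Ignition itself is implemented.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_url(source):
    """Decode a data: URL into the bytes it carries.

    Handles the plain percent-encoded form ("data:,text") and the
    ";base64" form ("data:text/plain;base64,...").
    """
    if not source.startswith("data:"):
        raise ValueError("not a data: URL")
    header, sep, payload = source[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data: URL is missing the comma separator")
    raw = unquote_to_bytes(payload)
    if header.endswith(";base64"):
        return base64.b64decode(raw)
    return raw


# The source string from the MachineConfig above.
contents = decode_data_url("data:,KUBE_APISERVER_HINT=192.168.162.3")
print(contents.decode())
```

So the file written to the node is a single `KEY=VALUE` line, which is the format a systemd `EnvironmentFile=` expects.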

(The next possible improvement after this would be to merge the "more complicated MachineConfig" into the existing nodeip-configuration service and then backport it, so you can just create the "99-upi-node-ip-override" MachineConfig without needing to manually create the other more complicated MachineConfig as well:

     --- a/templates/common/on-prem/units/nodeip-configuration.service.yaml
     +++ b/templates/common/on-prem/units/nodeip-configuration.service.yaml
     @@ -26,4 +26,5 @@ contents: |
          node-ip \
          set --retry-on-failure \
     +    $KUBE_APISERVER_HINT \
          {{ onPremPlatformAPIServerInternalIP . }}; \
          do \
     @@ -32,4 +33,5 @@ contents: |
        ExecStart=/bin/systemctl daemon-reload
      
     +  EnvironmentFile=-/etc/default/nodeip-configuration
        {{if .Proxy -}}
        EnvironmentFile=/etc/mco/proxy.env

But this is kind of "backporting a new feature" so people may not want to do it, unless we think it's going to be a long time before we can have the proper future fix available.)
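
The two pieces of the diff above work together: the `-` prefix on `EnvironmentFile=` tells systemd to ignore a missing file, and an unset `$KUBE_APISERVER_HINT` expands to nothing, so nodes without the override keep their current behavior. Here is a rough Python model of that behavior. The function names are hypothetical, the env-file parsing is a simplification of systemd's rules, and the argument order simply mirrors the diff.

```python
from pathlib import Path


def load_optional_env_file(path):
    """Model of 'EnvironmentFile=-path': a missing file yields no variables."""
    try:
        text = Path(path).read_text()
    except FileNotFoundError:
        return {}
    env = {}
    for line in text.splitlines():
        line = line.strip()
        # Simplified: skip blanks, comments, and lines without KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def node_ip_args(env, platform_api_ip=""):
    """Build the argument list in the same order as the patched unit file.

    The hint (if any) comes first, followed by the platform-provided
    apiserver IP (which may be empty, e.g. on vSphere UPI).
    """
    args = ["node-ip", "set", "--retry-on-failure"]
    args += env.get("KUBE_APISERVER_HINT", "").split()
    args += platform_api_ip.split()
    return args


# Without the override file, the arguments are unchanged from today.
print(node_ip_args(load_optional_env_file("/nonexistent/nodeip-configuration")))
# With the override, the hint is passed through.
print(node_ip_args({"KUBE_APISERVER_HINT": "192.168.162.3"}))
```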

Comment 69 errata-xmlrpc 2022-08-10 10:36:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

Comment 70 Red Hat Bugzilla 2023-09-15 01:33:41 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days