Bug 1948533 - Several cluster operators are going to degraded state after upgrading to OCP v4.6.22
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.6.z
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.6.z
Assignee: Gal Zaidman
QA Contact: Michael Burman
URL:
Whiteboard:
Duplicates: 1948020
Depends On: 1957530
Blocks:
Reported: 2021-04-12 11:49 UTC by Anandhu B Raj
Modified: 2021-06-02 15:11 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-27 08:05:54 UTC
Target Upstream Version:
Embargoed:
palonsor: needinfo-


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 2567 | 0 | None | closed | Bug 1948533: Add nodeip-configuration.service for oVirt | 2021-05-19 11:42:24 UTC
Red Hat Knowledge Base (Solution) | 6002051 | 0 | None | None | None | 2021-04-29 07:56:23 UTC

Comment 22 Dan Winship 2021-04-22 19:43:55 UTC
> Additional info:
> - I have asked the customer to work around the issue by creating a drop-in for the kubelet.service systemd unit. It sets `KUBELET_NODE_IP` so that kubelet starts with `--node-ip $WHATEVER_IP`, which forces it to always use the correct IP. However, I understand this cannot be kept forever.

Normally KUBELET_NODE_IP gets set by nodeip-configuration.service on platforms where we expect that to be needed.

It appears that in 4.6, we run nodeip-configuration.service for bare-metal, vsphere, and openstack, but *not* ovirt. In 4.7, apparently due to a refactoring of the "on-prem" platform types, we run nodeip-configuration.service for ovirt too. Not running it for ovirt in 4.6 is probably a bug. (Maybe there just aren't enough ovirt users for us to have noticed?)

Specifically: if the installer/MCO is setting up a node that has both its own IP and a keepalived VIP IP, then we need to be running nodeip-configuration because kubelet won't reliably pick the right IP on its own.
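
To illustrate why the selection has to be explicit, here is a minimal Python sketch of the kind of choice described above. It is not the actual nodeip-configuration implementation. The function names (`pick_node_ip`, `render_dropin`), the sample addresses, and the rule "take the first non-VIP address whose subnet contains a VIP" are all assumptions made for this example.

```python
import ipaddress


def pick_node_ip(interface_addrs, vips):
    """Return the first non-VIP address whose subnet contains one of the VIPs.

    interface_addrs: addresses in CIDR form as seen on the host (hypothetical input).
    vips: keepalived virtual IPs that must never be used as the node IP.
    """
    vip_set = {ipaddress.ip_address(v) for v in vips}
    for cidr in interface_addrs:
        iface = ipaddress.ip_interface(cidr)
        if iface.ip in vip_set:
            # The VIP can move to another node, so it is never a valid node IP.
            continue
        if any(vip in iface.network for vip in vip_set):
            return str(iface.ip)
    return None


def render_dropin(node_ip):
    """Render a kubelet.service drop-in body that sets KUBELET_NODE_IP."""
    return '[Service]\nEnvironment="KUBELET_NODE_IP={}"\n'.format(node_ip)


if __name__ == "__main__":
    # The VIP is listed first on purpose: naively taking the first address would be wrong.
    addrs = ["192.0.2.5/32", "192.0.2.10/24", "10.88.0.1/16"]
    node_ip = pick_node_ip(addrs, ["192.0.2.5"])
    print(render_dropin(node_ip))
```

With the sample input, the VIP 192.0.2.5 is skipped and 192.0.2.10 is chosen because its /24 contains the VIP; 10.88.0.1 is never considered since it shares no subnet with the VIP.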

(If installer/MCO is not setting up that VIP and it's being set up by some third-party operator or customer pod instead, then there's some argument that this isn't our bug, but it would still be pretty easy for us to just run nodeip-configuration.service on ovirt and fix it...)

Comment 24 Pablo Alonso Rodriguez 2021-04-23 07:37:07 UTC
Thanks for your insights, Dan. It was my mistake to believe that the kubelet would be able to figure out the correct IP on its own, so thanks for correctly re-routing it.

Important to note: The VIP is ours, as per https://github.com/openshift/machine-config-operator/blob/release-4.6/templates/master/00-master/ovirt/files/ovirt-keepalived-keepalived.yaml and https://github.com/openshift/machine-config-operator/blob/release-4.6/templates/master/00-master/ovirt/files/ovirt-keepalived-script.yaml

Comment 25 Dan Winship 2021-04-26 19:08:59 UTC
*** Bug 1948020 has been marked as a duplicate of this bug. ***

