Bug 1948533
| Summary: | Several cluster operators are going to degraded state after upgrading to OCP v4.6.22 | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Anandhu B Raj <abraj> |
| Component: | Machine Config Operator | Assignee: | Gal Zaidman <gzaidman> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Michael Burman <mburman> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 4.6.z | CC: | aconstan, adeshpan, aos-bugs, apjagtap, aprajapa, chdeshpa, dahernan, danw, gzaidman, jcrumple, jerzhang, juanluis.alarcon, mburman, mfojtik, nkashyap, oarribas, openshift-bugs-escalate, palonsor, pelauter, prdeshpa, rupatel, sbatsche, skolicha, vpagar, wduan, wking, xxia |
| Target Milestone: | --- | Flags: | palonsor: needinfo- |
| Target Release: | 4.6.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-05-27 08:05:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1957530 | ||
| Bug Blocks: | |||
Thanks for your insights, Dan. It was my mistake to believe that the kubelet would be able to figure out the correct IP on its own, so thanks for correctly re-routing it.

Important to note: the VIP is ours, as per https://github.com/openshift/machine-config-operator/blob/release-4.6/templates/master/00-master/ovirt/files/ovirt-keepalived-keepalived.yaml and https://github.com/openshift/machine-config-operator/blob/release-4.6/templates/master/00-master/ovirt/files/ovirt-keepalived-script.yaml

*** Bug 1948020 has been marked as a duplicate of this bug. ***
> Additional info:
> - I have asked the customer to work around the issue by creating a drop-in for the kubelet.service systemd unit. It sets `KUBELET_NODE_IP` so that kubelet starts with `--node-ip $WHATEVER_IP` and is forced to always use the correct IP. However, I understand this cannot just be kept forever.

Normally KUBELET_NODE_IP gets set by nodeip-configuration.service on platforms where we expect that to be needed. It appears that in 4.6, we run nodeip-configuration.service for bare-metal, vsphere, and openstack, but *not* ovirt. In 4.7, it looks like, due to a refactoring of the "on-prem" platform types, we run nodeip-configuration.service for ovirt too.

It seems like it is probably a bug that we are not running it for ovirt in 4.6. (Maybe there just aren't enough ovirt users for us to have noticed?)

Specifically: if the installer/MCO is setting up a node that has both its own IP and a keepalived VIP, then we need to be running nodeip-configuration.service, because kubelet won't reliably pick the right IP on its own. (If the installer/MCO is not setting up that VIP and it's being set up by some third-party operator or customer pod instead, then there's some argument that this isn't our bug, but it would still be pretty easy for us to just run nodeip-configuration.service on ovirt and fix it...)
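To make the failure mode concrete, here is a minimal sketch of the kind of selection a node-IP helper has to make when a node carries both its own address and a keepalived VIP. This is not the actual nodeip-configuration.service logic. The function name `pick_node_ip`, the sample addresses, and the selection rule itself (prefer a non-VIP address whose subnet contains the VIP) are assumptions made purely for illustration.

```python
import ipaddress


def pick_node_ip(interface_addrs, vips):
    """Return the first address that shares a subnet with a VIP but is not a VIP.

    interface_addrs: addresses in CIDR form as seen on the node (hypothetical input).
    vips: virtual IPs managed by keepalived (hypothetical input).
    Returns None when no suitable address is found.
    """
    vip_set = {ipaddress.ip_address(v) for v in vips}
    for cidr in interface_addrs:
        iface = ipaddress.ip_interface(cidr)
        if iface.ip in vip_set:
            # This address is the VIP itself; it can move to another node,
            # so it must never be used as the node IP.
            continue
        if any(vip in iface.network for vip in vip_set):
            # A real node address on the same subnet as the VIP.
            return str(iface.ip)
    return None


# The VIP is listed first here, which is the case where a naive
# "take the first address" choice would go wrong.
addrs = ["192.168.1.5/32", "192.168.1.20/24", "10.0.0.3/24"]
node_ip = pick_node_ip(addrs, vips=["192.168.1.5"])
print(node_ip)
```

The result of a selection like this is what would end up in `KUBELET_NODE_IP`, so that kubelet is started with an explicit `--node-ip` instead of guessing.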