Bug 1792749
Summary: | worker pool is not able to reconcile change in machine config | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Alay Patel <alpatel> |
Component: | Node | Assignee: | Urvashi Mohnani <umohnani> |
Status: | CLOSED ERRATA | QA Contact: | Sunil Choudhary <schoudha> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | | |
Version: | 4.4 | CC: | aconstan, amurdaca, aos-bugs, augol, fpaoline, fsimonce, jokerman, miabbott, minmli, msluiter, mvirgil, rbartal, rphillips, sbatsche, scuppett, smilner, umohnani, wking |
Target Milestone: | --- | | |
Target Release: | 4.4.0 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | No Doc Update |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-05-04 11:25:32 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1771572 | | |
Description
Alay Patel
2020-01-19 16:35:23 UTC
Is this something for the SDN team? The only network-related change I can find that went in recently is https://github.com/openshift/machine-config-operator/commit/b0637fc8a51618842aac832b7363ce219ac731c2 (cleanup sdn ips on reboot). cc'ing Ryan on that and moving to SDN.

*** Bug 1793012 has been marked as a duplicate of this bug. ***

The RHCOS/CRI-O rollback brought the CI in MCO back to normal. There's at least one PR in CRI-O meant to fix this bug, so I'm moving this to Node.

Might be a dup of bug 1794493. We'll see if this clears up once [1] lands (and we revert the RHCOS rollback?).

[1]: https://github.com/openshift/machine-config-operator/pull/1405

Sorry to bother. Just to understand what to track: is this going to be fixed by the RHCOS rollback, by the CRI-O fix [1] or via https://bugzilla.redhat.com/show_bug.cgi?id=1794493 ?

[1]: https://github.com/cri-o/cri-o/pull/3138

(In reply to Federico Paolinelli from comment #11)
> Sorry to bother.
> Just to understand what to track: is this going to be fixed by the RHCOS
> rollback, by the CRI-O fix [1] or via
> https://bugzilla.redhat.com/show_bug.cgi?id=1794493 ?
>
> [1]: https://github.com/cri-o/cri-o/pull/3138

This has been fixed temporarily in the CI by rolling back CRI-O to v1.16.x. The CRI-O PR aims at fixing this permanently (it's just a revert tho, I'll let the crio team speak for it).
The BZ you linked (https://bugzilla.redhat.com/show_bug.cgi?id=1794493) also has nothing to do with this; that's another BZ being tracked there (which targets 4.4).

Hi Alay, can you tell me how to get the repo in your reproduce step: Run `make test-e2e`?

Confirmed cri-o 1.17 with CoreOS 44.81.202003092122-0, verified!
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.4.0-0.nightly-2020-03-10-002851 True False 26h Cluster version is 4.4.0-0.nightly-2020-03-10-002851

$ oc get node -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-56-239.us-east-2.compute.internal Ready master 27h v1.17.1 10.0.56.239 <none> Red Hat Enterprise Linux CoreOS 44.81.202003092122-0 (Ootpa) 4.18.0-147.5.1.el8_1.x86_64 cri-o://1.17.0-8.dev.rhaos4.4.git36920a5.el8
ip-10-0-60-181.us-east-2.compute.internal Ready worker 24h v1.17.1 10.0.60.181 <none> Red Hat Enterprise Linux Server 7.6 (Maipo) 3.10.0-1062.12.1.el7.x86_64 cri-o://1.17.0-8.dev.rhaos4.4.git36920a5.el7
ip-10-0-62-127.us-east-2.compute.internal Ready master 27h v1.17.1 10.0.62.127 <none> Red Hat Enterprise Linux CoreOS 44.81.202003092122-0 (Ootpa) 4.18.0-147.5.1.el8_1.x86_64 cri-o://1.17.0-8.dev.rhaos4.4.git36920a5.el8
ip-10-0-62-200.us-east-2.compute.internal Ready worker 24h v1.17.1 10.0.62.200 <none> Red Hat Enterprise Linux Server 7.6 (Maipo) 3.10.0-1062.12.1.el7.x86_64 cri-o://1.17.0-8.dev.rhaos4.4.git36920a5.el7
ip-10-0-63-25.us-east-2.compute.internal Ready worker 26h v1.17.1 10.0.63.25 <none> Red Hat Enterprise Linux CoreOS 44.81.202003092122-0 (Ootpa) 4.18.0-147.5.1.el8_1.x86_64 cri-o://1.17.0-8.dev.rhaos4.4.git36920a5.el8
ip-10-0-63-67.us-east-2.compute.internal Ready worker 24h v1.17.1 10.0.63.67 <none> Red Hat Enterprise Linux Server 7.6 (Maipo) 3.10.0-1062.12.1.el7.x86_64 cri-o://1.17.0-8.dev.rhaos4.4.git36920a5.el7
ip-10-0-63-75.us-east-2.compute.internal Ready worker 26h v1.17.1 10.0.63.75 <none> Red Hat Enterprise Linux CoreOS 44.81.202003092122-0 (Ootpa) 4.18.0-147.5.1.el8_1.x86_64 cri-o://1.17.0-8.dev.rhaos4.4.git36920a5.el8
ip-10-0-68-250.us-east-2.compute.internal Ready master 27h v1.17.1 10.0.68.250 <none> Red Hat Enterprise Linux CoreOS 44.81.202003092122-0 (Ootpa) 4.18.0-147.5.1.el8_1.x86_64 cri-o://1.17.0-8.dev.rhaos4.4.git36920a5.el8
ip-10-0-69-52.us-east-2.compute.internal Ready worker 26h v1.17.1 10.0.69.52 <none> Red Hat Enterprise Linux CoreOS 44.81.202003092122-0 (Ootpa) 4.18.0-147.5.1.el8_1.x86_64 cri-o://1.17.0-8.dev.rhaos4.4.git36920a5.el8

$ oc debug node/ip-10-0-56-239.us-east-2.compute.internal
sh-4.4# rpm -qa | grep cri-o
cri-o-1.17.0-8.dev.rhaos4.4.git36920a5.el8.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
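The per-node runtime check from the verification comment above could be scripted roughly as follows. This is only a sketch, not part of the original verification: the helper name `check_runtime` is hypothetical, and it assumes that CONTAINER-RUNTIME is the last column of the `oc get node -owide` output, as it is in the listing in this bug.

```shell
# Hypothetical helper (not from the bug report): reads `oc get node -owide`
# output on stdin and prints the name and runtime of every node whose
# CONTAINER-RUNTIME value (assumed to be the last field) does not start
# with the expected prefix. Prints nothing when every node matches.
check_runtime() {
  expected_prefix="$1"
  awk -v want="$expected_prefix" \
    'NR > 1 && index($NF, want) != 1 { print $1, $NF }'
}

# Usage against a live cluster (requires a logged-in `oc` session):
#   oc get node -owide | check_runtime "cri-o://1.17."
```

Empty output would mean every node reports a runtime with the expected prefix; any line printed names a node that still runs a different CRI-O build.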