Bug 2155991
| Field | Value |
|---|---|
| Summary | VLAN device doesn't activate after reload |
| Product | Red Hat Enterprise Linux 9 |
| Reporter | Andrea Panattoni <apanatto> |
| Component | NetworkManager |
| Assignee | Beniamino Galvani <bgalvani> |
| Status | CLOSED ERRATA |
| QA Contact | Filip Pokryvka <fpokryvk> |
| Severity | low |
| Priority | medium |
| Version | 9.0 |
| CC | bgalvani, fpokryvk, lrintel, manrodri, rkhan, sdodson, sfaye, sukulkar, thaller, till, vbenes |
| Target Milestone | rc |
| Keywords | Triaged |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | NetworkManager-1.43.4-1.el9 |
| Doc Type | No Doc Update |
| Story Points | --- |
| Clone Of | |
| | 2159758 |
| Last Closed | 2023-11-07 08:37:57 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | --- |
| Cloudforms Team | --- |
| Bug Blocks | 2159758 |

Description
Andrea Panattoni
2022-12-23 09:57:56 UTC
If this is the root cause of the referenced OCPBUGS-3612 [1], it represents a regression for all OCP clusters upgrading from 4.10 to 4.11 or 4.12. Since 4.11 has already shipped and 4.12 is shipping imminently, we need this to be looked into urgently.

[1] https://issues.redhat.com/browse/OCPBUGS-3612

Hi Andrea, I'm working on a patch to fix this bug. Unfortunately, I can't reproduce the problem locally; would you test (or ask the customer to test) a scratch build once there is a proposed fix?

Hi @bgalvani, thanks for the update. @manrodri Do you think it's possible to test a scratch build (presumably an RPM) on the distributed CI you used for https://issues.redhat.com/browse/OCPBUGS-3612? We should test the configure-ovs.sh script without the `touch /run/configure-ovs-boot-done` workaround.

Hi @apanatto, in our Distributed CI we can install clusters from several releases (nightly builds, EC, RC) of OCP. If I understand correctly, we'll need an OCP cluster running 4.12 without the workaround; then we'll upgrade an RPM package and, I guess, reboot the nodes to test? Please let me know if that's the case and I can prepare a cluster. Thanks,

It would be better to install the RPM package before the OpenShift installation, but I suppose that's not easy since it's all automated. If you install the package later in the process, we need to be sure that a simple reboot does not solve the problem. So the steps are slightly different:

1. Set up an OCP 4.12 cluster
2. Apply the MachineConfig that makes configure-ovs.sh fail
3. Reboot the node and check whether it still fails
4. Install the RPM provided by Beniamino
5. Reboot the node
6. Check whether it comes up without errors

@manrodri do you think it is feasible? @bgalvani any feedback on the above process?

The test procedure above looks ok.
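The thread does not say how step 6 ("check whether it comes up without errors") was verified. As a minimal sketch only, one way to check from the node is to parse the terse output of `nmcli -t -f DEVICE,TYPE,STATE device status` and list VLAN devices that are not in the `connected` state. The function names and the sample device names (`eno1`, `eno1.100`, `eno1.200`) are hypothetical, and treating only the exact state `connected` as activated is an assumption.

```python
import subprocess


def inactive_vlans(nmcli_output: str) -> list[str]:
    """Return the VLAN devices whose state is not exactly 'connected'.

    Expects the terse format printed by
    `nmcli -t -f DEVICE,TYPE,STATE device status`, one
    `device:type:state` record per line.
    """
    inactive = []
    for line in nmcli_output.splitlines():
        if not line.strip():
            continue
        # Split from the right so the device name is kept whole.
        device, dev_type, state = line.rsplit(":", 2)
        # Assumption: any state other than "connected" counts as not activated.
        if dev_type == "vlan" and state != "connected":
            inactive.append(device)
    return inactive


def query_nmcli() -> str:
    """Run nmcli on the node and return its terse device listing."""
    result = subprocess.run(
        ["nmcli", "-t", "-f", "DEVICE,TYPE,STATE", "device", "status"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Hypothetical sample output, not taken from the bug report.
SAMPLE_OUTPUT = """\
eno1:ethernet:connected
eno1.100:vlan:disconnected
eno1.200:vlan:connected
lo:loopback:unmanaged
"""

print(inactive_vlans(SAMPLE_OUTPUT))
```

On a real node, `inactive_vlans(query_nmcli())` would be run after each reboot (steps 3 and 6); an empty list after step 6 would suggest the VLAN devices activated, though it does not by itself prove that configure-ovs.sh completed without errors.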
@apanatto thanks for the details, that looks good to me. I've never installed an RPM on an OCP node, but I'm up for testing, so please let me know when you have an RPM available and I'll run the procedure.

@manrodri did you have the chance to do more tests on this? Although the nature of this problem is not so deterministic, can we say it improves the overall stability of the startup process?

*** Bug 2159758 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6585