Bug 2092204
Summary: | Modifying nncp with running virtual machines causes disconnection of VMs from network | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | nijin ashok <nashok> |
Component: | Networking | Assignee: | Quique Llorente <ellorent> |
Networking sub component: | kubernetes-nmstate | QA Contact: | Aleksandra Malykhin <amalykhi> |
Status: | CLOSED WONTFIX | Docs Contact: | |
Severity: | urgent | | |
Priority: | high | CC: | bnemec, cnv-qe-bugs, davegord, ellorent, fdeutsch, gveitmic, jhopper, maydin, phoracek |
Version: | 4.10 | | |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2022-12-14 19:31:30 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 2092762 | | |
Bug Blocks: | | | |
Description
nijin ashok
2022-06-01 05:23:43 UTC
This bug can also cause a major network outage of all VMs in the cluster during an upgrade from 2.6 to 4.8. It looks like VLAN filtering was not enabled by default in 2.6, and after the upgrade the new nmstate-handler adds the vlan to the nnce [1]. This causes reconfiguration of all the bridges, which results in veth ports getting detached from the bridge. I didn't test this, but the logs from an affected environment point to it.

[1] https://github.com/nmstate/kubernetes-nmstate/pull/793/commits/bc345316695f1189e4c6fd9e6c731c19735dcb02

I will keep the severity, but lower the priority, to make sure we leave some headroom for cases where a quick fix is needed to resolve an unavoidable breakage.

Since the standalone operator was moved out of CNV with 4.11, I'm moving this BZ to the OpenShift Network team. We will be happy to backport new fixes on this topic to CNV 4.10 and 4.11.

Note that the issue with veths getting disconnected is not a single issue. A few BZs have already been opened to address this problem in various situations, for example:
https://bugzilla.redhat.com/show_bug.cgi?id=2076131
https://bugzilla.redhat.com/show_bug.cgi?id=2035519

We would like to fix these issues at their root, by making sure NetworkManager supports editing of bridges without disconnecting any interfaces. However, we may also consider a more protective approach where we reject some operations at a higher level.

Reassigning to Quique since he has been driving this fix.
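
The symptom discussed in this report, veth ports dropping off a bridge when the bridge is reconfigured, can be checked on a node by comparing the bridge's port list before and after a policy change. The following is only a minimal sketch, assuming a Linux node where the ports of a bridge appear as entries under /sys/class/net/<bridge>/brif. The function names and the bridge name `br1` are hypothetical and are not part of kubernetes-nmstate or NetworkManager.

```python
import os


def bridge_ports(bridge, sysfs_root="/sys/class/net"):
    """Return the set of interface names currently attached to `bridge`.

    Assumption: the kernel lists bridge ports as entries in
    <sysfs_root>/<bridge>/brif. An empty set is returned when that
    directory does not exist (no such bridge, or not a bridge).
    """
    brif = os.path.join(sysfs_root, bridge, "brif")
    if not os.path.isdir(brif):
        return set()
    return set(os.listdir(brif))


def detached_ports(before, after):
    """Return the sorted names of ports present in `before` but missing in `after`."""
    return sorted(set(before) - set(after))


if __name__ == "__main__":
    # Hypothetical usage: snapshot the ports, apply the nncp change
    # out of band, then snapshot again and compare.
    before = bridge_ports("br1")  # "br1" is a placeholder bridge name
    input("Apply the policy change, then press Enter... ")
    after = bridge_ports("br1")
    for port in detached_ports(before, after):
        print("port detached from bridge:", port)
```

A check like this only observes the disconnection after the fact. It does not prevent it, and it is not the fix discussed in the comments above.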