Bug 1896469
| Summary: | In cluster with OVN Kubernetes networking - a node doesn't recover when configuring linux-bridge over its default NIC | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Yossi Segev <ysegev> |
| Component: | Networking | Assignee: | Quique Llorente <ellorent> |
| Status: | CLOSED ERRATA | QA Contact: | Meni Yakove <myakove> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 2.5.0 | CC: | cnv-qe-bugs, ellorent, phoracek, yboaron |
| Target Milestone: | --- | Flags: | yboaron: needinfo- |
| Target Release: | 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | kubernetes-nmstate-handler-container-v4.9.0-10 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-11-02 15:57:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1915850 | | |
| Bug Blocks: | | | |
| Attachments: | | | |

Description
Yossi Segev
2020-11-10 16:06:32 UTC
Created attachment 1738972 [details]
nmstate handler pod logs
Deferring this to 2.7. OVN is still a tech preview, and we document that reconfiguring the default interface with knmstate is not allowed when OVN is used.

@yboaron, can we retest this with the latest CNV?

Created attachment 1746999 [details]
nmstate crictl logs cnv 2.6
These logs were taken directly from the node. Since we lose TCP connectivity, they were collected through OpenStack noVNC.

It looks like nmstate is not able to roll back this kind of configuration, since it involves both linux-bridge and OVS. The ping we do after the rollback also fails (because nmstate could not do the rollback), and the handler ends up trying to mark the NNCE as successful (which is wrong), but it cannot because API server connectivity is broken.

I also suspect that nmstate 1.0 will fix this, since it does not allow the same slave to be attached to multiple devices in the first place, so it should be fixed in CNV 2.8.

Also note that restarting the node makes it accessible again.

@ellorent, I think you tagged the wrong Yossi.

Created attachment 1747053 [details]
NetworkManager at debug level
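
The comment above attributes the failure to the same NIC being attached to more than one device (an OVS bridge and the new linux-bridge). As a rough illustration only, the sketch below scans an nmstate-style interface list for ports claimed by more than one bridge. The helper `find_shared_ports` and the interface names `br-ex`, `br1`, and `ens3` are hypothetical and are not taken from the attached logs or from nmstate itself.

```python
from collections import defaultdict

# Bridge interface types that list their ports under bridge.port
BRIDGE_TYPES = {"linux-bridge", "ovs-bridge"}


def find_shared_ports(interfaces):
    """Return {port_name: [bridge names]} for ports claimed by more than one bridge."""
    owners = defaultdict(list)
    for iface in interfaces:
        if iface.get("type") not in BRIDGE_TYPES:
            continue
        for port in iface.get("bridge", {}).get("port", []):
            owners[port["name"]].append(iface["name"])
    return {port: bridges for port, bridges in owners.items() if len(bridges) > 1}


# Hypothetical state: the default NIC already belongs to an OVS bridge,
# and a policy additionally attaches it to a new Linux bridge.
interfaces = [
    {"name": "br-ex", "type": "ovs-bridge", "bridge": {"port": [{"name": "ens3"}]}},
    {"name": "br1", "type": "linux-bridge", "bridge": {"port": [{"name": "ens3"}]}},
    {"name": "ens3", "type": "ethernet"},
]

conflicts = find_shared_ports(interfaces)
print(conflicts)
```

A non-empty result marks the kind of configuration that, per the comment, nmstate 1.0 is expected to reject up front.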
Created attachment 1747055 [details]
NodeNetworkState before apply the policy
Created attachment 1747056 [details]
policy applied
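
The attached policy is not reproduced in this report. For orientation, a policy of the kind described (a linux-bridge whose only port is the node's default NIC) might look roughly like the sketch below, built as a Python dict and printed as JSON. The names `br1-over-default-nic`, `br1`, and `ens3` are hypothetical, and the `apiVersion` is an assumption that varies by kubernetes-nmstate release.

```python
import json


def bridge_policy(policy_name, bridge_name, nic_name):
    """Build a NodeNetworkConfigurationPolicy that bridges over a single NIC."""
    return {
        # Assumption: v1beta1 for this era; other releases may use a different version.
        "apiVersion": "nmstate.io/v1beta1",
        "kind": "NodeNetworkConfigurationPolicy",
        "metadata": {"name": policy_name},
        "spec": {
            "desiredState": {
                "interfaces": [
                    {
                        "name": bridge_name,
                        "type": "linux-bridge",
                        "state": "up",
                        # The bridge takes over addressing from the NIC.
                        "ipv4": {"enabled": True, "dhcp": True},
                        "bridge": {
                            "options": {"stp": {"enabled": False}},
                            "port": [{"name": nic_name}],
                        },
                    }
                ]
            }
        },
    }


# Hypothetical names; the real NIC name depends on the node.
policy = bridge_policy("br1-over-default-nic", "br1", "ens3")
print(json.dumps(policy, indent=2))
```

The printed JSON could be saved to a file and applied with `oc apply -f`. On an OVN Kubernetes cluster this is exactly the reconfiguration that the comments above describe as unsupported at the time.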
Bug opened with the nmstate team: https://bugzilla.redhat.com/show_bug.cgi?id=1915850

Just as a side note, restarting the worker restores the connectivity.

Moving this tracker to NEW. Keeping it until the linked nmstate bug gets resolved.

Rollback is working fine at CNV 4.8 with nmstate 1.0.2; now we have to see if veth works fine too.

(In reply to Quique Llorente from comment #14)
> Rollback is working fine at CNV 4.8 with nmstate 1.0.2, now we have to see
> if veth works fine too.

The cluster was using openshift-sdn, not OVNKubernetes.

This should now be addressed in the latest rebuild of 4.9.

Verified on a cluster with OVN Kubernetes networking.

Version verified: kubernetes-nmstate-handler-container v4.9.0-18

Steps verified:

1. Created and applied a Linux bridge over the default NIC (taken from https://bugzilla.redhat.com/show_bug.cgi?id=1885605).
2. The nodes that the NNCP was applied on recovered and are in status Ready:

    [cnv-qe-jenkins@onash-490-ovn-9nbdm-executor extract-cnv-image-versions]$ oc get nodes
    NAME                                 STATUS   ROLES    AGE    VERSION
    onash-490-ovn-9nbdm-master-0         Ready    master   134m   v1.21.1+8268f88
    onash-490-ovn-9nbdm-master-1         Ready    master   134m   v1.21.1+8268f88
    onash-490-ovn-9nbdm-master-2         Ready    master   133m   v1.21.1+8268f88
    onash-490-ovn-9nbdm-worker-0-8btgf   Ready    worker   117m   v1.21.1+8268f88
    onash-490-ovn-9nbdm-worker-0-fvqv8   Ready    worker   117m   v1.21.1+8268f88
    onash-490-ovn-9nbdm-worker-0-vzgqb   Ready    worker   113m   v1.21.1+8268f88

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.9.0 Images security and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4104
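
The node check in the verification comment can also be scripted. The following is a minimal sketch, assuming the plain-text column layout of `oc get nodes` shown in that comment; `not_ready_nodes` is a hypothetical helper, not part of any `oc` tooling, and the sample input is the output quoted in this report.

```python
def not_ready_nodes(oc_output):
    """Return names of nodes whose STATUS column does not include Ready."""
    rows = [line.split() for line in oc_output.strip().splitlines()]
    header, nodes = rows[0], rows[1:]
    name_col, status_col = header.index("NAME"), header.index("STATUS")
    # STATUS may be a comma-separated list such as "Ready,SchedulingDisabled".
    return [n[name_col] for n in nodes if "Ready" not in n[status_col].split(",")]


# Output of `oc get nodes` as quoted in the verification comment.
REPORTED_OUTPUT = """
NAME STATUS ROLES AGE VERSION
onash-490-ovn-9nbdm-master-0 Ready master 134m v1.21.1+8268f88
onash-490-ovn-9nbdm-master-1 Ready master 134m v1.21.1+8268f88
onash-490-ovn-9nbdm-master-2 Ready master 133m v1.21.1+8268f88
onash-490-ovn-9nbdm-worker-0-8btgf Ready worker 117m v1.21.1+8268f88
onash-490-ovn-9nbdm-worker-0-fvqv8 Ready worker 117m v1.21.1+8268f88
onash-490-ovn-9nbdm-worker-0-vzgqb Ready worker 113m v1.21.1+8268f88
"""

print(not_ready_nodes(REPORTED_OUTPUT))
```

For the reported output the list is empty, which matches the statement that all nodes recovered. In the failing scenario the affected node would be expected to show up here as NotReady, although the report does not include that output.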