Bug 1967771
| Summary: | nmstate is not progressing on a node and not configuring vlan filtering that causes an outage for VMs | |||
|---|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | kseremet | |
| Component: | Networking | Assignee: | Quique Llorente <ellorent> | |
| Status: | CLOSED ERRATA | QA Contact: | Ofir Nash <onash> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | high | |||
| Version: | 2.6.3 | CC: | bverschu, cnv-qe-bugs, ncocker, onash, phoracek | |
| Target Milestone: | --- | |||
| Target Release: | 4.8.0 | |||
| Hardware: | x86_64 | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | kubernetes-nmstate-handler-container-v4.8.0-18 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1967887 (view as bug list) | Environment: | ||
| Last Closed: | 2021-07-27 14:32:39 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1967887 | |||
|
Description
kseremet
2021-06-03 20:44:32 UTC
As a workaround we can bypass the rollout by setting "parallel: false" on the NNCP, but we have to be sure that the cluster will be OK if all the nodes are configured in parallel.

The u/s kubernetes-nmstate version is 0.37.

u/s fix for CNV 2.6: https://github.com/nmstate/kubernetes-nmstate/pull/763

The workaround is "parallel: true", not "parallel: false".

We are going to keep this bz open, since the solution for 4.8 is different from the one for 2.6, which already has its own bz: https://bugzilla.redhat.com/show_bug.cgi?id=1967887

This may be a blocker for 4.8.

In-progress fix: https://github.com/nmstate/kubernetes-nmstate/pull/771

*** Bug 1973734 has been marked as a duplicate of this bug. ***

Team,
I had a question. Would node reboots put us back into a state where the NNCP is stuck in progressing again? We had node reboots in the cluster and see NNCPs with "nomatchingnodes" or progressing. Just want clarification.
Thanks,
Nabeel

I talked with Quique and he confirmed that after a reboot, the issue is expected to reappear. This should also be solved by the fix.

Are configured interfaces persisted after the reboot, even though the status gets stuck there?

Verified on nmstate-handler version v4.8.0-18.
Scenario checked:
1. Create an NNCP that configures a Linux bridge on worker node X.
2. Delete the matching nmstate-handler pod of worker node X while the NNCP is progressing (Status: ConfigurationProgressing).
3. Verify that a new nmstate-handler pod is created and releases the lock, allowing the NNCP to progress and be created successfully.

We have automation that verifies this exact scenario: https://code.engineering.redhat.com/gerrit/c/cnv-tests/+/250668 (currently under CR; it will be merged once approved).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.8.0 Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2920

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
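
For readers applying the "parallel: true" workaround mentioned earlier in this thread, the following is a minimal sketch of what such an NNCP manifest might look like, built as a plain Python dict and printed as JSON. It is not taken from the bug report: the `apiVersion` value, the policy name `br1-policy`, the node name `worker-x`, the bridge `br1` and its port `eth1` are all assumptions for illustration, and the exact spec fields should be checked against the kubernetes-nmstate version actually deployed.

```python
import json


def build_nncp(name, node_hostname, desired_state, parallel=True):
    """Return a NodeNetworkConfigurationPolicy manifest as a plain dict.

    Assumption: the policy spec accepts a boolean `parallel` field, as the
    workaround in this bug describes. The apiVersion below is also an
    assumption and may differ between releases.
    """
    return {
        "apiVersion": "nmstate.io/v1beta1",  # assumed API version
        "kind": "NodeNetworkConfigurationPolicy",
        "metadata": {"name": name},
        "spec": {
            # Workaround from this bug: apply to all matching nodes at once.
            "parallel": parallel,
            "nodeSelector": {"kubernetes.io/hostname": node_hostname},
            "desiredState": desired_state,
        },
    }


# Hypothetical desired state: a Linux bridge "br1" with a single port "eth1".
EXAMPLE_DESIRED_STATE = {
    "interfaces": [
        {
            "name": "br1",
            "type": "linux-bridge",
            "state": "up",
            "bridge": {
                "options": {"stp": {"enabled": False}},
                "port": [{"name": "eth1"}],
            },
        }
    ]
}

if __name__ == "__main__":
    manifest = build_nncp("br1-policy", "worker-x", EXAMPLE_DESIRED_STATE)
    print(json.dumps(manifest, indent=2))
```

As noted in the first comment, applying a policy to every node in parallel is only safe if the cluster can tolerate all matching nodes being reconfigured at the same time.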