Bug 2130287
| Summary: | ports can be left attached when controller dependency fails early | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Thomas Haller <thaller> |
| Component: | NetworkManager | Assignee: | Thomas Haller <thaller> |
| Status: | CLOSED ERRATA | QA Contact: | Matej Berezny <mberezny> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 9.1 | CC: | bgalvani, lrintel, rkhan, sfaye, sukulkar, till, vbenes |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | NetworkManager-1.41.3-1.el9 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-05-09 08:17:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Patch on review at https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1385

(In reply to Thomas Haller from comment #1)
> patch on review at
> https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1385

New proposed fix on MR https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1406

Seems this was a regression introduced in 1.40 by commit 1fe8166fc9fb93dc64992325e31e7611725aaeb2.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2485
Created attachment 1914638: logfile showing the issue

See the attached logfile. It was produced on the th/mlag-bonding-slb branch (on top of current `main`, 3871c670ab9417fc54d3c0450e91e08ced4a98b4).

First, we create a bond profile plus 5 port profiles. Then the bond gets activated with autoconnect-slaves on. During the "ip-config" state something happens, and

    _LOGD(LOGD_BOND, "balance-slb: failed");
    nm_device_state_changed(NM_DEVICE(self), NM_DEVICE_STATE_FAILED, NM_DEVICE_STATE_REASON_CONFIG_FAILED);

gets called:

    <info> [1664299540.9661] device (bond0): state change: secondaries -> failed (reason 'config-failed', sys-iface-state: 'managed')

The first time, we are already in state "secondaries". Consequently we see

    <trace> [1664299540.9667] device[6b76ac7314eb0b53] (bond0): master: release one slave a9f10ea824bb1725/eth1 (enslaved) (configure)

and all the port profiles get correctly deactivated.

Later, we try the same again. This time:

    <info> [1664299566.1065] device (bond0): state change: ip-config -> failed (reason 'config-failed', sys-iface-state: 'managed')
    ...
    <trace> [1664299566.1073] device[6b76ac7314eb0b53] (bond0): master: release one slave a9f10ea824bb1725/eth1 (not enslaved) (configure)

The result is that the port devices linger indefinitely in the ip-config state and don't get wrapped up, although the controller is gone.
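To make the difference between the two runs easier to follow, here is a toy model in C. It is only a sketch and not NetworkManager code: every type and function name (`ToyPort`, `toy_release_port_buggy`, `toy_release_port_expected`, `toy_controller_failed`) is hypothetical, and the "expected" variant only models the outcome this report asks for, not the patch in the merge requests. The assumption is that the release path does its cleanup only for ports marked as attached ("enslaved" in the log), so ports that were still waiting for the controller are skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device states; names loosely follow the log output. */
typedef enum {
    STATE_DISCONNECTED,
    STATE_IP_CONFIG,
    STATE_SECONDARIES,
    STATE_FAILED,
} ToyState;

typedef struct {
    const char *ifname;
    ToyState    state;
    bool        attached; /* "enslaved" vs. "not enslaved" in the trace lines */
} ToyPort;

/* Models the reported behavior: a port that is not yet attached is skipped,
 * so it keeps whatever state it had (here: ip-config). */
void toy_release_port_buggy(ToyPort *port)
{
    if (!port->attached)
        return;
    port->attached = false;
    port->state    = STATE_FAILED;
}

/* Models the expected outcome: the port is wrapped up whether or not it
 * was already attached when the controller failed. */
void toy_release_port_expected(ToyPort *port)
{
    port->attached = false;
    if (port->state != STATE_DISCONNECTED)
        port->state = STATE_FAILED;
}

/* The controller failed: release every port with the given strategy and
 * return how many ports are still stuck in the ip-config state afterwards. */
size_t toy_controller_failed(ToyPort *ports, size_t n_ports,
                             void (*release)(ToyPort *))
{
    size_t lingering = 0;

    assert(ports != NULL || n_ports == 0);

    for (size_t i = 0; i < n_ports; i++) {
        release(&ports[i]);
        if (ports[i].state == STATE_IP_CONFIG)
            lingering++;
    }
    return lingering;
}
```

In this model, calling `toy_controller_failed()` with `toy_release_port_buggy` on ports that have `attached == false` leaves all of them in `STATE_IP_CONFIG`, which corresponds to the second run in the logfile. With `attached == true` (the first run), or with `toy_release_port_expected`, no port lingers.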