Bug 2092215 - [NMCI] dracut_NM_vlan_over_team_no_boot test fails
Summary: [NMCI] dracut_NM_vlan_over_team_no_boot test fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: NetworkManager
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Beniamino Galvani
QA Contact: Filip Pokryvka
URL:
Whiteboard:
Depends On: 2066816 2166257
Blocks:
Reported: 2022-06-01 06:13 UTC by Thomas Haller
Modified: 2023-03-06 16:09 UTC

Fixed In Version: NetworkManager-1.41.4-1.el9
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2166257
Environment:
Last Closed: 2023-03-06 16:09:05 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | dracutdevs dracut pull 2177 | 0 | None | Merged | fix(network-manager): add "After" dependency on dbus.service | 2023-02-13 14:50:36 UTC
Red Hat Issue Tracker | NMT-26 | 0 | None | None | None | 2023-01-20 14:45:54 UTC
Red Hat Issue Tracker | RHELPLAN-123833 | 0 | None | None | None | 2022-06-01 06:19:36 UTC
freedesktop.org Gitlab | NetworkManager NetworkManager-ci merge_requests 1281 | 0 | None | merged | dracut: update and enable team tests | 2023-02-13 14:50:32 UTC
freedesktop.org Gitlab | NetworkManager NetworkManager merge_requests 1351 | 0 | None | merged | device: don't emit recheck-assume if there is a queued activation request | 2023-02-13 14:50:31 UTC
freedesktop.org Gitlab | NetworkManager NetworkManager merge_requests 1523 | 0 | None | merged | device: preserve assume state if updating port fails | 2023-02-13 14:50:30 UTC

Comment 1 Beniamino Galvani 2022-07-13 09:46:24 UTC
RHEL 9 is missing this dracut commit, which is needed to properly support teaming in the initrd: https://github.com/dracutdevs/dracut/commit/a97d2cedcf65a9a2fbff2591171f0163c7d3cb46

Comment 2 Pavel Valena 2022-07-13 22:01:45 UTC
Hello, please test, if you can, whether the planned rebase for 9.1 works for you.

RPMS: https://github.com/pvalena/rpms/tree/main/dracut/2066816

Comment 3 Vladimir Benes 2022-07-15 08:51:06 UTC
(In reply to Pavel Valena from comment #2)
> Hello, please test, if you can, whether the planned rebase for 9.1 works for
> you.
> 
> RPMS: https://github.com/pvalena/rpms/tree/main/dracut/2066816

It's far better, but we are still not 100% stable. Switching back to NetworkManager. When can we expect the 57 version to be in the compose?

Comment 4 Beniamino Galvani 2022-07-19 16:09:32 UTC
(In reply to Vladimir Benes from comment #3)
> (In reply to Pavel Valena from comment #2)
> > Hello, please test, if you can, whether the planned rebase for 9.1 works for
> > you.
> > 
> > RPMS: https://github.com/pvalena/rpms/tree/main/dracut/2066816
> 
> It's far better, but we are still not 100% stable.

Hi Vladimir,

do you have a report for a failure using the new dracut? It seems I can't reproduce the problem.

Comment 10 Filip Pokryvka 2022-11-21 10:05:53 UTC
A fix in teamd is still needed; waiting for the upstream merge. Moving ITM.

Comment 12 Filip Pokryvka 2023-01-10 07:58:46 UTC
The test is re-enabled (together with the new team tests, which seem to be alive now thanks to the libteam fixes). If no failure occurs, we can verify.

Comment 13 Filip Pokryvka 2023-01-16 07:58:25 UTC
The problem happened again; setting FailedQA and moving back to ASSIGNED. It has happened only once so far, but we need the test to be stable.

https://desktopqe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/beaker-NetworkManager-main-veth-rhel9-upstream/1954/artifact/artifacts/FAIL_report_NetworkManager-ci_Test0631_dracut_NM_vlan_over_team_no_boot.html
(jenkins build kept forever).

Comments from Beniamino:

> the problem is that after switch-root NM tries to assume the previous
> connection, but since it can't connect to teamd it doesn't re-use the
> existing connection; instead it creates a new one:
> 
>   Jan 11 14:23:38 localhost.localdomain NetworkManager[787]: <debug> [1673465018.6242] device[bbd8a218654cdaa5] (eth0): assume-state: set guess-assume=0, connection="171b737e-afd5-45b7-b929-7c76183a35d6"
>   [...]
>   Jan 11 14:23:38 localhost.localdomain NetworkManager[787]: <debug> [1673465018.6278] device[3c034abd74aecce8] (team0): failure to connect to teamdctl with cli_type=dbus, err=-22
>   Jan 11 14:23:38 localhost.localdomain NetworkManager[787]: <debug> [1673465018.6280] device[3c034abd74aecce8] (team0): failure to connect to teamdctl, err=-22
>   Jan 11 14:23:38 localhost.localdomain NetworkManager[787]: <debug> [1673465018.6280] device[bbd8a218654cdaa5] (eth0): assume-state: set guess-assume=0, connection=(null)
> 
> I don't know the exact reason, this needs to be investigated.

Maybe the original problem is fixed and this is a follow-on issue, but the result is the same: the test is unstable. Assumed connections should not be there, and the connection to teamdctl should be stable. In 1.34 and 1.38 the test sometimes fails with the VLAN connection not activated within 45 seconds, which might also be related to unstable teamdctl.
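
For triaging further failed runs, the signature in the log excerpt above (a teamdctl connection failure followed by an assume-state reset to connection=(null)) can be searched for mechanically. The following is a minimal sketch, not part of NetworkManager or NetworkManager-ci: the function name `scan_log` and both regular expressions are assumptions derived only from the four debug lines quoted in this comment, and other NetworkManager versions may word these messages differently.

```python
import re

# Both patterns are derived only from the debug lines quoted above; they are
# assumptions about the message wording, not a stable NetworkManager interface.
TEAMD_FAIL = re.compile(
    r"\((?P<dev>[\w.-]+)\): failure to connect to teamdctl.*?err=(?P<err>-?\d+)"
)
ASSUME_LOST = re.compile(
    r"\((?P<dev>[\w.-]+)\): assume-state: set guess-assume=\d+, connection=\(null\)"
)


def scan_log(lines):
    """Collect teamdctl connection failures and devices whose assumed connection was reset."""
    result = {"teamd_failures": [], "assume_lost": []}
    for line in lines:
        m = TEAMD_FAIL.search(line)
        if m:
            # Record the device name and the numeric error code.
            result["teamd_failures"].append((m.group("dev"), int(m.group("err"))))
            continue
        m = ASSUME_LOST.search(line)
        if m:
            # Only lines with connection=(null) match, i.e. the assumed connection is gone.
            result["assume_lost"].append(m.group("dev"))
    return result
```

Passing the lines of a saved journal to `scan_log` returns the failing team devices with their error codes and the ports that lost their assumed connection; a run that shows both is a candidate for the failure described here.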

Comment 17 Beniamino Galvani 2023-01-20 14:43:50 UTC
Hi Filip,

looking at the failed executions for this test on RHEL 9.1, I see that most are crashes. The libteam version there is 1.31-14.el9.

The crash is solved by the RHEL-9.1.0.1 z-stream update (libteam-1.31-16.el9_1).

The z-stream package also contains a fix to improve the transition between the initrd and real root, and I believe it will help also with the other failures.

Can you please try the new libteam package?

Comment 18 Filip Pokryvka 2023-01-20 16:31:04 UTC
This was on RHEL 9.1 updates, with the fixed libteam. So far the issue has happened only once (excluding the crashes of the older libteam, which for some reason reappear in a newer compose?).

Comment 19 Beniamino Galvani 2023-01-31 15:52:47 UTC
@fpokryvk I have linked two MRs, one for NM and one for dracut. With those, I got >100 runs of the test without failures, while normally they would fail after <10. Please check them, thanks!
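
As a rough sanity check on the ">100 runs without failures" figure, the chance of seeing that many consecutive passes on unpatched code can be estimated. This is a sketch under stated assumptions: runs are treated as independent, and "fail after <10" is read as a per-run failure rate of about 10%, which is an illustrative guess and not a measured value. The function name `prob_all_pass` is hypothetical.

```python
def prob_all_pass(per_run_failure_rate: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass at the given failure rate."""
    if not 0.0 <= per_run_failure_rate <= 1.0:
        raise ValueError("per_run_failure_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return (1.0 - per_run_failure_rate) ** runs


# Assumed 10% failure rate per run, 100 consecutive runs.
p = prob_all_pass(0.1, 100)
print(f"chance of 100 clean runs without the fix: {p:.2e}")
```

Under these assumptions the probability is on the order of a few in 100,000, so a clean streak of that length is unlikely to be luck.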

Comment 20 Filip Pokryvka 2023-02-01 06:53:28 UTC
Great news! I will check the patched versions. Does this mean we will need a clone of this bug for dracut once the upstream MR is merged?

Comment 21 Beniamino Galvani 2023-02-01 07:56:49 UTC
> Does this mean we will need a clone of this bug for dracut once the upstream MR is merged?

Yes, that's correct.

Comment 22 Filip Pokryvka 2023-02-01 10:04:08 UTC
I have checked all dracut tests and they are stable again. Thanks a lot! I will then create the clone of this bug and add a "blocked by" flag to this bug.

Comment 24 Fernando F. Mancera 2023-03-06 16:09:05 UTC
This test is now passing and stable.

