
Bug 2050216

Summary: Device is failing on DHCPv4 after NM restart
Product: Red Hat Enterprise Linux 9
Reporter: Vladimir Benes <vbenes>
Component: NetworkManager
Assignee: Fernando F. Mancera <ferferna>
Status: CLOSED ERRATA
QA Contact: Vladimir Benes <vbenes>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 9.0
CC: bgalvani, ferferna, lrintel, miabbott, rkhan, sdodson, sukulkar, thaller, till
Target Milestone: rc
Keywords: Triaged
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: NetworkManager-1.39.3-1.el9
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-15 10:49:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2077605

Description Vladimir Benes 2022-02-03 13:09:57 UTC
Description of problem:
    @rhbz1086906
    @veth @delete_testeth0 @newveth @con_general_remove @teardown_testveth @restart_if_needed
    @wait-online-for-both-ips
    Scenario: NM - general - wait-online - for both ipv4 and ipv6
    * Prepare simulated test "testG" device
    * Add a new connection of type "ethernet" and options "ifname testG con-name con_general ipv4.may-fail no ipv6.may-fail no"
    * Restart NM
    * Execute "/usr/bin/nm-online -s -q --timeout=30"
    When "inet .* global" is visible with command "ip a s testG"
    Then "inet6 .* global" is visible with command "ip a s testG"

This fails here and there when IPv4 is slower than IPv6 and the profile is not fully connected and the device is not fully activated. We need to define what to do here. The test was updated with waiting for the connected state so please remove the line if you want to reproduce it. We need to add another test with a delayed DHCP server anyway.
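
The "wait for the connected state" step mentioned above could look roughly like the following Python sketch. The helper names `nmcli_device_state` and `wait_for_connected` are hypothetical and not taken from NMCI. The only external command assumed is `nmcli -g GENERAL.STATE device show <ifname>`, whose output starts with the numeric device state (100 means connected).

```python
import subprocess
import time


def nmcli_device_state(ifname):
    """Return the GENERAL.STATE value printed by nmcli, e.g. '100 (connected)'."""
    result = subprocess.run(
        ["nmcli", "-g", "GENERAL.STATE", "device", "show", ifname],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def wait_for_connected(get_state, timeout=30.0, interval=0.5,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it reports state code 100 or the timeout expires.

    Returns True if the connected state was seen, False on timeout.
    The sleep and clock parameters exist only to make the helper testable.
    """
    deadline = clock() + timeout
    while True:
        # Compare only the leading numeric code, not the descriptive text.
        if get_state().split(" ", 1)[0] == "100":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In the scenario above this would be called as `wait_for_connected(lambda: nmcli_device_state("testG"))` between the "Add a new connection" and "Restart NM" steps. Leaving it out keeps the race reproducible.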

Version-Release number of selected component (if applicable):
1.36.0

How reproducible:
when DHCP is slow

Steps to Reproduce:
1. run the above-mentioned test from NMCI

Actual results:
the test is racy

Expected results:
determinism

Additional info:

Comment 2 Thomas Haller 2022-02-03 13:34:49 UTC
> This fails here and there when IPv4 is slower than IPv6 and the profile is not fully connected and the device is not fully activated.

Not really. The two steps

      * Add a new connection of type "ethernet" and options "ifname testG con-name con_general ipv4.may-fail no ipv6.may-fail no"
      * Restart NM

can happen in quick succession, so that the new profile has not yet completed (auto)activation and is still activating.

Then, when NM restarts, it "assumes" the device, which was not fully configured earlier.


> We need to define what to do here.

I guess the solution is that during stop, we tear down interfaces that are still activating.
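
As a toy illustration of that idea (not NetworkManager's actual implementation), the policy "tear down on stop only what is still activating" can be modelled in Python. The numeric values are the NMDeviceState codes from the NetworkManager D-Bus API; the function names and the device dictionary are hypothetical.

```python
from enum import IntEnum


class DeviceState(IntEnum):
    """Subset of NMDeviceState numeric values."""
    DISCONNECTED = 30
    PREPARE = 40
    CONFIG = 50
    NEED_AUTH = 60
    IP_CONFIG = 70
    IP_CHECK = 80
    SECONDARIES = 90
    ACTIVATED = 100


def is_activating(state):
    """True for states between the start of activation and full activation."""
    return DeviceState.PREPARE <= state < DeviceState.ACTIVATED


def devices_to_tear_down(devices):
    """Given {ifname: state}, return the names to tear down on stop, sorted."""
    return sorted(name for name, state in devices.items() if is_activating(state))
```

With this policy a device that is still in `IP_CONFIG` at stop time is torn down, so after the restart there is no half-configured device left to be "assumed", while fully activated devices are left alone.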


> The test was updated with waiting for the connected state so please remove the line if you want to reproduce it.

This: https://gitlab.freedesktop.org/NetworkManager/NetworkManager-ci/-/commit/07beacb8b540a134a9732b4d8beac522d7c57a5c

Comment 6 Vladimir Benes 2022-05-03 14:20:16 UTC
    @rhbz1086906
    @delete_testeth0 @restart_if_needed
    @wait-online-for-both-ips
    Scenario: NM - general - wait-online - for both ipv4 and ipv6
    * Prepare simulated test "testG" device
    * Add "ethernet" connection named "con_general" for device "testG" with options "ipv4.may-fail no ipv6.may-fail no"
    * Restart NM
    * Execute "/usr/bin/nm-online -s -q --timeout=30"
    When "inet .* global" is visible with command "ip a s testG"
    Then "inet6 .* global" is visible with command "ip a s testG"

Tested 100 times without any issue, using the main branch copr package and the original test as shown above.

Comment 9 Vladimir Benes 2022-05-19 07:29:59 UTC
We have a new test in NMCI
https://gitlab.freedesktop.org/NetworkManager/NetworkManager-ci/-/merge_requests/1050

It randomly slows down the DHCP server and then restarts the service. This covers both cases: DHCPv4 already done, or not.
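
Why a random delay covers both cases can be seen in a small Python model. This is only a sketch of the idea, not the code from the merge request. The names `dhcp_done_before_restart` and `sample_runs`, the uniform delay distribution, and all numeric parameters are assumptions made for illustration.

```python
import random


def dhcp_done_before_restart(dhcp_delay, restart_after):
    """True if the simulated DHCPv4 reply arrives before the service restart."""
    return dhcp_delay < restart_after


def sample_runs(runs, max_delay, restart_after, seed=0):
    """Draw a random DHCP delay per run and count which case each run hits."""
    rng = random.Random(seed)  # fixed seed keeps the model reproducible
    outcomes = {"done": 0, "pending": 0}
    for _ in range(runs):
        delay = rng.uniform(0.0, max_delay)
        done = dhcp_done_before_restart(delay, restart_after)
        outcomes["done" if done else "pending"] += 1
    return outcomes
```

When the maximum delay is larger than the time until the restart, repeated runs land on both sides of the race, which is what a single fixed delay would not give.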

Comment 11 errata-xmlrpc 2022-11-15 10:49:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8265