1763062 – device reapply when connection is not fully up does not work


Summary: device reapply when connection is not fully up does not work

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 8
Classification:	Red Hat
Component:	NetworkManager
Sub Component:
Version:	8.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	8.0
Assignee:	Beniamino Galvani
QA Contact:	Desktop QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-10-18 07:52 UTC by Filip Pokryvka
Modified:	2020-04-28 16:53 UTC
CC List:	9 users
Fixed In Version:	NetworkManager-1.22.0-0.2.el8
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-04-28 16:53:15 UTC
Type:	Bug
Target Upstream Version:
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2020:1847	0	None	None	None	2020-04-28 16:53:41 UTC

Description Filip Pokryvka 2019-10-18 07:52:22 UTC

Description of problem:
"nmcli device reapply" does not work when the connection is not fully up yet.

Version-Release number of selected component (if applicable):
1.20.0-3.el8

How reproducible:
Reproducible when the DHCP server on eth4 is slow or does not reply at all.

Steps to Reproduce:
1. nmcli connection add type ethernet ifname eth4 con-name test
2. nmcli con mod id test ipv4.method manual ipv4.addresses 192.168.8.5/24
3. nmcli device reapply eth4

Actual results:
If the DHCP server is slow enough that eth4 is still acquiring its IP configuration at step 3, the reapply fails and eth4 does not get the static address:

Connection 'test' (83f86191-35dd-479d-b939-9d825ab70ba6) successfully added.
Error: Reapplying connection to device 'eth4' (/org/freedesktop/NetworkManager/Devices/8) failed: Device is not activated
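
The reproduction steps above can be scripted roughly as follows. This is only a minimal sketch: it assumes nmcli is on the PATH and that eth4 is backed by a slow or silent DHCP server, and the names reproduce, run_cmd and the injectable run parameter are hypothetical helpers introduced here so the sequence can be exercised without a real device.

```python
import subprocess

# Error text taken from the "Actual results" output in this report.
NOT_ACTIVATED = "Device is not activated"


def run_cmd(args):
    """Run a command and return (exit code, combined stdout + stderr)."""
    proc = subprocess.run(args, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def reproduce(ifname="eth4", con="test", address="192.168.8.5/24", run=run_cmd):
    """Run the three steps from the report.

    Returns True if the final reapply fails with the error from this bug.
    `run` is a hypothetical hook so a fake runner can be substituted.
    """
    setup = [
        ["nmcli", "connection", "add", "type", "ethernet",
         "ifname", ifname, "con-name", con],
        ["nmcli", "con", "mod", "id", con,
         "ipv4.method", "manual", "ipv4.addresses", address],
    ]
    for args in setup:
        code, output = run(args)
        if code != 0:
            raise RuntimeError("setup step failed: " + output.strip())
    # Step 3: reapply while the device may still be waiting for DHCP.
    _, output = run(["nmcli", "device", "reapply", ifname])
    return NOT_ACTIVATED in output
```

Calling reproduce() on an affected version (1.20.0-3.el8 in this report) would be expected to return True as long as the reapply happens before DHCP completes.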

Comment 1 Beniamino Galvani 2019-10-23 07:07:38 UTC

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/merge_requests/318

Comment 2 Beniamino Galvani 2019-10-23 14:15:19 UTC

Fixed:

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/commit/01920d3d523795c6f2917007bbf471ac28603371

Comment 4 Vladimir Benes 2020-02-11 10:12:43 UTC

To get a proper DHCP reply I sometimes need to wait more than 60 seconds. Is that something we would like to solve here, or should I file a separate bug?

So far the test looks like this:
    @rhbz1763062
    @ver+=1.22
    @con_general_remove @teardown_testveth @dhcpd
    @device_reapply_routes
    Scenario: NM - device - reapply just routes
    * Prepare simulated test "testG" device
    * Execute "ip netns exec testG_ns kill -SIGSTOP $(cat /tmp/testG_ns.pid)"
    * Add a new connection of type "ethernet" and options "ifname testG con-name con_general"
    * Modify connection "con_general" changing options "ipv4.routes '192.168.5.0/24 192.168.99.111 1' ipv4.route-metric 21 ipv6.method static ipv6.addresses 2000::2/126 ipv6.routes '1010::1/128 2000::1 1'"
    * "Error.*" is not visible with command "nmcli device reapply testG" in "1" seconds
    * Execute "ip netns exec testG_ns kill -SIGCONT $(cat /tmp/testG_ns.pid)"
    When "connected" is visible with command "nmcli -g GENERAL.STATE dev show testG" in "25" seconds
    Then "1010::1 via 2000::1 dev testG\s+proto static\s+metric 1" is visible with command "ip -6 route" in "5" seconds
    And "2000::/126 dev testG\s+proto kernel\s+metric 1" is visible with command "ip -6 route"
    And "192.168.5.0/24 via 192.168.99.111 dev testG\s+proto static\s+metric" is visible with command "ip route"
    And "routers = 192.168.99.1" is visible with command "nmcli con show con_general" in "70" seconds
^^ here we have the long waiting period
    And "default via 192.168.99.1 dev testG\s+proto dhcp\s+metric 21" is visible with command "ip r"

The test passes like this.
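
The '"X" is visible with command "Y" in "N" seconds' steps in the scenario above amount to polling a command until a pattern shows up. The sketch below illustrates that idea only; it is not the step implementation used by the test suite, and visible_within, shell_output, run and interval are hypothetical names chosen for this example.

```python
import re
import subprocess
import time


def shell_output(command):
    """Run a shell command and return its combined stdout + stderr."""
    proc = subprocess.run(command, shell=True, capture_output=True, text=True)
    return proc.stdout + proc.stderr


def visible_within(pattern, command, seconds, run=shell_output, interval=1.0):
    """Poll `command` until `pattern` matches its output.

    Returns True on the first match, or False once `seconds` have elapsed
    without one. The command is always run at least once.
    """
    deadline = time.monotonic() + seconds
    while True:
        if re.search(pattern, run(command)):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With this helper the long wait in the scenario would correspond to visible_within("routers = 192.168.99.1", "nmcli con show con_general", 70).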

Comment 5 Beniamino Galvani 2020-02-11 17:04:44 UTC

The long timeout doesn't seem related to this bz. From the log you provided:

 <info>  [1581414454.0367] audit: op="device-reapply" interface="testG" ifindex=93 args="ipv6.method,ipv6.routes,ipv6.addresses,ipv4.route-metric,ipv4.routes" pid=26382 uid=0 result="success"
 <info>  [1581414455.5971] device (testG): Activation: successful, device activated.
 <debug> [1581414455.6737] dhcp4 (testG): sent DISCOVER to 255.255.255.255
 <debug> [1581414459.9469] dhcp4 (testG): sent DISCOVER to 255.255.255.255
 <debug> [1581414468.7730] dhcp4 (testG): sent DISCOVER to 255.255.255.255
 ...
 <info>  [1581414481.5651] audit: op="connection-delete" uuid="06a90131-43ad-4529-913b-fefd0ca85ae1" name="con_general" pid=26537 uid=0 result="success"

So for some reason the server is not replying, or takes a long time to respond. Maybe there are issues in the setup. I tried it in an F30 VM and it succeeds immediately:

Scenario: NM - device - reapply just routes
    * Prepare simulated test "testG" device ... passed in 3.150s
    * Execute "ip netns exec testG_ns kill -SIGSTOP $(cat /tmp/testG_ns.pid)" ... passed in 0.306s
    * Add a new connection of type "ethernet" and options "ifname testG con-name con_general" ... passed in 0.157s
    * Modify connection "con_general" changing options "ipv4.routes '192.168.5.0/24 192.168.99.111 1' ipv4.route-metric 21 ipv6.method static ipv6.addresses 2000::2/126 ipv6.routes '1010::1/128 2000::1 1'" ... passed in 0.048s
    * "Error.*" is not visible with command "nmcli device reapply testG" in "1" seconds ... passed in 0.166s
    * Execute "ip netns exec testG_ns kill -SIGCONT $(cat /tmp/testG_ns.pid)" ... passed in 0.305s
    When "connected" is visible with command "nmcli -g GENERAL.STATE dev show testG" in "25" seconds ... passed in 2.372s
    Then "1010::1 via 2000::1 dev testG\s+proto static\s+metric 1" is visible with command "ip -6 route" in "5" seconds ... passed in 0.119s
    And "2000::/126 dev testG\s+proto kernel\s+metric 1" is visible with command "ip -6 route" ... passed in 0.118s
    And "192.168.5.0/24 via 192.168.99.111 dev testG\s+proto static\s+metric" is visible with command "ip route" ... passed in 0.123s
    And "routers = 192.168.99.1" is visible with command "nmcli con show con_general" in "70" seconds ... passed in 0.152s
    And "default via 192.168.99.1 dev testG\s+proto dhcp\s+metric 21" is visible with command "ip r" ... passed in 0.119s

Do you have a machine where the problem can be reproduced?
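
To see how long the client keeps waiting for the server, the gaps between the "sent DISCOVER" lines in a log like the excerpt above can be computed with a short script. This is a sketch under the assumption that the log lines keep the format shown in this comment; discover_gaps and LOG_LINE are hypothetical names.

```python
import re

# Matches lines such as:
#  <debug> [1581414455.6737] dhcp4 (testG): sent DISCOVER to 255.255.255.255
LOG_LINE = re.compile(r"\[(\d+\.\d+)\]\s+dhcp4 \(([^)]+)\): sent DISCOVER")


def discover_gaps(log_text, ifname):
    """Return the seconds between consecutive DHCPv4 DISCOVER lines
    logged for `ifname`, rounded to four decimal places."""
    stamps = [
        float(match.group(1))
        for match in LOG_LINE.finditer(log_text)
        if match.group(2) == ifname
    ]
    return [round(later - earlier, 4)
            for earlier, later in zip(stamps, stamps[1:])]
```

For the three DISCOVER lines quoted above the gaps come out at roughly 4.3 and 8.8 seconds, which fits a client that keeps retransmitting without getting a reply.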

Comment 6 Vladimir Benes 2020-02-12 17:06:31 UTC

The device_reapply_routes test was altered to run reapply while the DHCP request is still in progress. This reproduced the bug and now also confirms the fix.

Comment 8 errata-xmlrpc 2020-04-28 16:53:15 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1847
