Bug 1450219 - [NMCI] race in bond_rename test
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: NetworkManager
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Beniamino Galvani
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-05-11 21:24 UTC by Vladimir Benes
Modified: 2018-04-10 13:23 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-10 13:22:08 UTC
Target Upstream Version:


Attachments
debug log (159.06 KB, text/plain)
2017-05-11 21:24 UTC, Vladimir Benes
[PATCH] manager: avoid that auto-activations preempt user activations (2.57 KB, patch)
2017-06-16 15:51 UTC, Beniamino Galvani
[PATCH v2] manager: avoid that auto-activations preempt user activations (2.63 KB, patch)
2017-06-18 12:48 UTC, Beniamino Galvani


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0778 None None None 2018-04-10 13:23:39 UTC

Description Vladimir Benes 2017-05-11 21:24:01 UTC
Description of problem:
    Scenario: NM - bond - device rename
     * Add connection type "bond" named "bond0" for device "bondy"
     * Add slave connection for master "nm-bond" on device "eth1" named "bond0.0"
     * Add slave connection for master "nm-bond" on device "eth2" named "bond0.1"
     * Bring "down" connection "bond0"
     * Open editor for connection "bond0"
     * Set a property named "connection.interface-name" to "nm-bond" in editor
     * Save in editor
     Then Value saved message showed in editor
     * Quit editor
^^ The test often fails here with:
Error: Connection activation failed: New connection activation was enqueued

     * Bring "up" connection "bond0"
     * Bring "up" connection "bond0.0"
     * Bring "up" connection "bond0.1"
     Then Check bond "nm-bond" link state is "up"


Version-Release number of selected component (if applicable):
I think it has been present from 1.4 through 1.6 and is still present in 1.8

Comment 1 Vladimir Benes 2017-05-11 21:24:30 UTC
Created attachment 1278036
debug log

Comment 2 Vladimir Benes 2017-05-11 21:25:12 UTC
The workaround is to add a sleep of a few seconds after
 * Bring "down" connection "bond0"

Comment 3 Beniamino Galvani 2017-06-16 15:51:31 UTC
Created attachment 1288394
[PATCH] manager: avoid that auto-activations preempt user activations

Comment 4 Thomas Haller 2017-06-16 16:14:40 UTC
+    if (nm_auth_subject_is_internal (nm_active_connection_get_subject (active))) 

if (success &&

why the check for
+    && nm_streq0 (nm_active_connection_get_specific_object (candidate), 
                   nm_active_connection_get_specific_object (active))) 

? It seems that is not necessary? The specific-object is like the path to the WifiAP. Seems to me, it doesn't matter if they differ...



Should the check however consider the state of candidate? E.g. if candidate is already about to disconnect, it seems right to proceed with new activation? Dunno.


But good catch, for this issue!!

Comment 5 Beniamino Galvani 2017-06-18 12:48:30 UTC
Created attachment 1288844
[PATCH v2] manager: avoid that auto-activations preempt user activations

(In reply to Thomas Haller from comment #4)
> why the check for
> +    && nm_streq0 (nm_active_connection_get_specific_object (candidate), 
>                    nm_active_connection_get_specific_object (active))) 
> 
> ? It seems that is not necessary? The specific-object is like the path to
> the WifiAP. Seems to me, it doesn't matter if they differ...

Good point, fixed.

> Should the check however consider the state of candidate? E.g. if candidate
> is already about to disconnect, it seems right to proceed with new
> activation? Dunno.

Yes, makes sense.
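
For readers following the thread, the check discussed above can be sketched roughly as follows. This is a minimal standalone C model under stated assumptions, not the attached patch: the `ActiveConnection` struct, the `AcState` enum and the `should_drop_activation()` helper are hypothetical stand-ins for NetworkManager's real types, and the rule that the candidate must share the same profile and device is an assumption of this sketch. It only illustrates the two points agreed on here: the specific-object is not compared, and a candidate that is already deactivating does not block the new activation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for NetworkManager's types. */
typedef enum {
    AC_STATE_ACTIVATING,
    AC_STATE_ACTIVATED,
    AC_STATE_DEACTIVATING,
    AC_STATE_DEACTIVATED,
} AcState;

typedef struct {
    const char *connection; /* profile name, e.g. "bond0" */
    const char *device;     /* interface name, e.g. "nm-bond" */
    bool        internal;   /* true: auto-activation, false: user request */
    AcState     state;
} ActiveConnection;

/* A candidate only counts while it is still coming up or already up. */
static bool candidate_is_live(const ActiveConnection *c)
{
    return c->state == AC_STATE_ACTIVATING || c->state == AC_STATE_ACTIVATED;
}

/*
 * Returns true when the new activation should be dropped because it is
 * internal (an auto-activation) and an equivalent live activation exists.
 * User-requested activations are never dropped by this check.
 * Assumption of this sketch: "equivalent" means same profile and device.
 */
static bool should_drop_activation(const ActiveConnection *active,
                                   const ActiveConnection *existing,
                                   size_t n_existing)
{
    assert(active != NULL);
    assert(existing != NULL || n_existing == 0);

    if (!active->internal)
        return false;

    for (size_t i = 0; i < n_existing; i++) {
        const ActiveConnection *candidate = &existing[i];

        if (candidate == active)
            continue;
        if (strcmp(candidate->connection, active->connection) == 0
            && strcmp(candidate->device, active->device) == 0
            && candidate_is_live(candidate))
            return true;
    }
    return false;
}
```

In this model, an auto-activation of "bond0" on "nm-bond" is dropped while a user activation of the same profile is in progress, but it proceeds once that candidate has moved to the deactivating state.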

Comment 6 Thomas Haller 2017-06-19 08:49:22 UTC

+    if (nm_auth_subject_is_internal (nm_active_connection_get_subject (active))) 

if (success && ...

Comment 7 Beniamino Galvani 2017-06-19 14:10:49 UTC
(In reply to Thomas Haller from comment #6)
> 
> +    if (nm_auth_subject_is_internal (nm_active_connection_get_subject
> (active))) 
> 
> if (success && ...

Oops, fixed.

Applied to master:

https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=0922a177385be188b9c9c8ad39c1068533f5a4b3

and nm-1-8:

https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?h=nm-1-8&id=2236c3c728c49d2ebd68e83f1096b5180b2f41dd


After this fix, the following CI test should work reliably without
the extra delay:

https://github.com/NetworkManager/NetworkManager-ci/blob/82dd537b29b5652dc269ef89ca229098877d9100/nmcli/features/bond.feature#L1267

Comment 9 Vladimir Benes 2017-12-06 08:23:11 UTC
A new version of the test for 1.8.1 has been introduced without the delay after bringing the bond down, i.e. without:
     # VVV Workaround for rhbz1450219
     * Wait for at least "2" seconds

Comment 12 errata-xmlrpc 2018-04-10 13:22:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0778

