Bug 1438476
| Summary: | [NMCI] nmcli agent race | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Vladimir Benes <vbenes> | ||||||
| Component: | NetworkManager | Assignee: | Beniamino Galvani <bgalvani> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 7.4 | CC: | aloughla, atragler, bgalvani, fgiudici, lrintel, rkhan, sukulkar, thaller | ||||||
| Target Milestone: | rc | Keywords: | Reopened | ||||||
| Target Release: | --- | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2018-04-10 13:22:08 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 1470965 | ||||||||
| Attachments: |
|
||||||||
This [NMCI] failure is old. Either (1) the issue has been fixed in the meantime (and CI tests are now green) (2) or the issue is still present, in which case we still have failing tests. Since our CI tests are currently in a good shape, I suspect this issue is long fixed, and I am not going to spend time verifying that it is. If the test is still failing, we anyway will fix it because there is the strong aim to have all green for CI. [NMCI] bugs are very useful to discuss/track an issue. But if they are stale for a while, they are obsolete. Closing. I don't think this is fixed, we have a workaround in place, autoconnect no. And a small delay to have everything settled. Reopening. (In reply to Vladimir Benes from comment #3) > I don't think this is fixed, we have a workaround in place, autoconnect no. > And a small delay to have everything settled. Reopening. I think, if the CI shows a real bug (probably it does), disabling the test only to get it green is wrong. Let's not do such workarounds. If a test fails, it fails. The failing test should be the reminder that this needs fixing. Not a 5 month old bug, where it takes me considerable effort to figure out how to remove the workaround and reproduce the issue. I have one more use case showing this symptom. Running this fails from time to time: nmcli device wifi connect wpa2-eap password secret123 Should I file another bug? (In reply to Vladimir Benes from comment #8) > I have one more use case showing this symptom. > > Running this fails from time to time: > nmcli device wifi connect wpa2-eap password secret123 > > Should I file another bug? If I remember well, this new failure was caused by bug 1490885. In nmcli, since we first register a secret agent and then activate the
connection, the agent receives a GetSecrets() request for the
autoactivation first and for the user activation after, like in the
following sequence:
NMCLI NM DAEMON
* connection add ------->
* start autoactivating the
connection
* autoactivation fails due to
missing secrets
* connection up:
- register the secret agent ------->
* unblock all failed connection
due to missing secrets
* start again autoactivation
* ask secrets to the agent
- activate connection
asynchronously ------->
- ask secrets for the
autoactivation ------->
* user activation preempts
autoactivation
* ask secrets to the agent
- show another prompt
The test only provides credentials once and thus the activation fails.
A possible solution to this would be to add a parameter to the
GetSecrets() call from NM to the agent specifying for which
ActiveConnection the secrets are needed. In this way nmcli could
filter out requests for the wrong ACs.
Since the D-Bus API doesn't allow that at the moment, we have either
to hijack the 'hints' parameter, or to change the API.
The plan to extend the API sounds right to me.
How about adding a "GetSecrets2", that is like "GetSecrets" but has two additional arguments:
"active_connection_path", "o"
"extra_args", "a{sv}"
the last argument is for future additions, so we can avoid "GetSecrets3".
Also, let's check with Lubomir. I think for pkcs11-remoting, he was missing that GetSecrets cannot pass a file descriptor. If we extend the API now, maybe we need a UNIX_FD argument (which we cannot later put inside "extra_args" (I think??).
also, the agent should "RegisterWithCapabilities" with a new capability NM_SECRET_AGENT_CAPABILITY_GETSECRETS2, so that the server knows whether to call v1 or v2. That of course makes it more annoying to implement a new agent, because the agent now must explicitly announce that he supports the non-deprecated function.
The alternatives would be, to first try calling GetSecrets2, and on failure retry with GetSecret. Maybe that's better??
Or, the other alternative might be, to introspect D-Bus to check whether the agent support GetSecrets2. But probably that is more expensive then just try-fail-retry.
while at it, maybe GetSecret2 should also have an
<arg name="extra_results" type="a{sv}}" direction="out"/>
argument.
Created attachment 1344329 [details]
[PATCH 1/2] clients: implement CancelGetSecrets() secret-agent API
Created attachment 1344330 [details]
[PATCH 2/2] cli: enable secret-agent only after activation
Scratch what I said in previous comments, how about these 2 patches instead? (In reply to Beniamino Galvani from comment #15) > Scratch what I said in previous comments, how about these 2 patches instead? lgtm Applied to master: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=4a9ec4d39bd912d925c86e38efd33b7f43e6a39c Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0778 |
Description of problem: @8021x_without_password Scenario: nmcli - ethernet - connect to 8021x - md5 - ask for password * Add a new connection of type "ethernet" and options "ifname testX con-name ethie 802-1x.eap md5 802-1x.identity user" * Spawn "nmcli -a con up ethie" command * Expect "identity.*user" * Enter in editor * Send "password" in editor * Enter in editor Then "testX:connected:ethie" is visible with command "nmcli -t -f DEVICE,STATE,CONNECTION device" in "20" seconds Version-Release number of selected component (if applicable): 1.8 How reproducible: race Steps to Reproduce: see above Actual results: sometimes, nmcli agent seems to be racing for two requests and second connection fails even if everything is entered correctly Expected results: nmcli -a con up should lead to connection successfully connected Additional info: