Bug 1446367
| Field | Value |
|---|---|
| Summary | New IPv6 DAD support lets activation without carrier hang indefinitely |
| Product | Red Hat Enterprise Linux 7 |
| Component | NetworkManager |
| Version | 7.4 |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Reporter | Thomas Haller <thaller> |
| Assignee | Lubomir Rintel <lrintel> |
| QA Contact | Desktop QE <desktop-qa-list> |
| CC | aloughla, atragler, bgalvani, fgiudici, lrintel, rkhan, sukulkar, thaller, vbenes |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | NetworkManager-1.8.0-1.el7 |
| Doc Type | If docs needed, set a value |
| Story Points | --- |
| Last Closed | 2017-08-01 09:27:08 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | --- |
| Cloudforms Team | --- |
Description
Thomas Haller
2017-04-27 18:33:54 UTC
Created attachment 1275293
Proposed patch

Well, I believe what we do is the correct thing: we should not disable DAD by default in any case (and currently have no way to disable it), and until DAD finishes the addresses are not useful (programs can't even bind to them). Thus pretending the connection is "activated" is certainly not correct.

Nevertheless, the hanging nmcli, and thus the poor interaction with network.service's ifup, is something that needs fixing.

I'd prefer to work around this the same way as we treat master connections that have no slaves: don't bother waiting for states beyond IP_CONFIG. Sadly, there currently doesn't seem to be any way for the client to discover that the pending IPv6 DAD is the only thing blocking activation.

Thus I've decided to revert to the old behaviour and activate the connection if the carrier is not present. Hopefully whoever activates a connection without carrier knows what they are doing. At the very least this behavior is consistent with what we've been doing previously.

(In reply to Thomas Haller from comment #0)
> DAD can only complete if the device has carrier. Hence, with the new DAD
> support, NM waits until the device has carrier before even trying to do DAD.
>
> As a result, activating a connection with static IPv6 addresses on a device
> without carrier hangs, and nmcli fails after timeout.
>
> It's not clear what to do.
>
> - at the very least, if the device is set to have ipv6.may-fail=yes and the
>   device has some static IPv4 addresses, the fully activated state should be
>   reached together with IPv4. Currently, the device hangs in IP config state.
>   Note that ipv6.may-fail only helps if IPv4 completes.
>
> - the activation hangs indefinitely waiting for carrier. That may or may
>   not be correct, but at least it's different from what happens with
>   waiting for other address methods to complete. Why would waiting for
>   carrier block indefinitely, but waiting for a DHCP response time-out?

Starting DHCP without carrier is silly and probably just done by accident. Nevertheless, I believe considering the auto methods (be it DHCP or SLAAC) is not too useful. Where this really matters is manual configuration, which succeeds immediately with IPv4 but awaits DAD with IPv6.

> - for SLAAC/DHCP mode, the activation request is rejected right away if
>   the device has no carrier (regardless of ignore-carrier setting).
>   Maybe, if there are static IPv6 addresses, NM should do the same. Which
>   basically would mean, you cannot activate a connection without carrier
>   anymore. That seems bad.

Well, I can see why this would upset the users, especially those with existing configurations that don't care about the addresses being tentative until the carrier appears.

_LOGI (LOGD_DEVICE | LOGD_IP6, "IPv6 DAD: carrier missing and ignored

Seems a bit too much noise for info level. _LOGD()?

Later, when carrier comes up, will DAD start?

When I have 'ipv6.may-fail no' the connection up still hangs. Is it something we want to fix as well? Should I file a separate bug? ipv4.may-fail no doesn't matter.

And I can see no difference in NetworkManager-1.8.0-0.4.rc3.el7, where this shouldn't be fixed, with this test:

@nmcli_general_finish_dad_without_carrier
Scenario: nmcli - general - finish dad with no carrier
* Add a new connection of type "ethernet" and options "ifname testX con-name ethernet0 autoconnect no"
* Prepare simulated veth device "testX" wihout carrier
* Execute "nmcli con modify ethernet0 ipv4.may-fail no ipv4.method manual ipv4.addresses 1.2.3.4/24"
* Execute "nmcli con modify ethernet0 ipv6.method manual ipv6.addresses 2001::2/128"
* Bring "up" connection "ethernet0"
* "connected:ethernet0" is visible with command "nmcli -t -f STATE,CONNECTION device" in "60" seconds
Then "1.2.3.4" is visible with command "ip a s testX" in "60" seconds
Then "2001::2" is visible with command "ip a s testX" in "60" seconds

(In reply to Vladimir Benes from comment #5)
> When I have 'ipv6.may-fail no' the connection up still hangs. Is it
> something we want to fix as well? Should I file separate bug? ipv4.may-fail
> no doesn't matter.

No, that is the correct behavior. The connection would proceed activating when the carrier appears and DAD can start.

The scenario looks like this:

* NetworkManager-config-server is installed (thus the carrier is ignored)
* The carrier is off
* The connection has manually configured ipv6 addressing
* The connection has ipv6.may-fail yes

With the older NetworkManager, the connection should hang while activating, while with the fixed one it should reach the active state. That pretty much looks like your test case. Does the test case succeed?

Does DAD start? (You would see the "tentative" addresses being added on "testX", then the "tentative" flag disappear, with "ip monitor".) It should not. Maybe the veth devices are different in this respect?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:2299
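
For anyone re-checking the "tentative" question raised in the discussion, here is a minimal sketch of a helper that lists IPv6 addresses still flagged as tentative in `ip -6 addr show` style output. The function name `tentative_addresses` and the `SAMPLE` text are assumptions made for illustration: the sample is hand-written to resemble typical iproute2 output and was not captured from the system in this report.

```python
from typing import List

# Hand-written illustration of `ip -6 addr show dev testX` output (an assumption,
# not a capture from this bug): one address still awaiting DAD, one already done.
SAMPLE = """\
5: testX: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 state DOWN qlen 1000
    inet6 2001::2/128 scope global tentative
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever
"""


def tentative_addresses(ip_output: str) -> List[str]:
    """Return the inet6 addresses whose flag list contains 'tentative'."""
    found = []
    for line in ip_output.splitlines():
        fields = line.split()
        # Address lines look like: inet6 <addr>/<prefix> scope <scope> [flags...]
        if len(fields) >= 2 and fields[0] == "inet6" and "tentative" in fields[2:]:
            found.append(fields[1])
    return found
```

Feeding it the real output of `ip -6 addr show dev testX` before and after the carrier appears would show whether DAD has actually run: an address that stays in the returned list has not completed DAD.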