Bug 2096386
Summary: | crio fails to bind to tentative IP, causing service failure since RHOCS was rebased on RHEL 8.6 | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Dan Williams <dcbw> |
Component: | NetworkManager | Assignee: | Beniamino Galvani <bgalvani> |
Status: | CLOSED ERRATA | QA Contact: | David Jaša <djasa> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 8.5 | CC: | bgalvani, dcbw, derekh, ferferna, lrintel, mko, rkhan, sfaye, stbenjam, sukulkar, thaller, till, vbenes |
Target Milestone: | rc | Keywords: | Triaged |
Target Release: | --- | Flags: | mko: needinfo- |
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | NetworkManager-1.40.2-1.el8 | Doc Type: | No Doc Update |
Doc Text: | Story Points: | --- | |
Clone Of: | 2096226 | Environment: | |
Last Closed: | 2023-05-16 09:04:54 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 2096226 | ||
Bug Blocks: | |||
Attachments: |
Description
Dan Williams
2022-06-13 17:02:18 UTC
It appears that the DHCPv6 code doesn't wait for address DAD to complete before signaling that the connection is activated...

Seen in 1.36.0-5.el8_6

Testing using WIP scenarios, I tend to think this is FailedQA, but I'd need confirmation. When run in a loop, the scenarios I made occasionally fail for both the DHCP-assigned address and the global EUI64 SLAAC address. Even at trace level, the logs only mention DAD taking place for fe80:: link-local addresses. The failure does not show up reliably unless the test is run in a loop.

Created attachment 1903249 [details]
DAD not effective for DHCP address
Created attachment 1903250 [details]
DAD not effective for EUI64 address
Setting as FailedQA. While less likely to happen, it still can happen, and the logs do not indicate DAD taking place for global addresses; even when this happens, NM marks the connection as active/connected.

Created attachment 1903934 [details]
EUI64 address - 'full' connectivity with link-local address only
One more thing. In this case, the system ended up in a state where the testX6 device (con_ipv6 connection) has only a link-local v6 address, yet NM reports this device as having 'full' v6 connectivity. I couldn't get the system into this state in a 'regular' way (one not involving a duplicate address in the network).
Created attachment 1904662 [details]
DAD not effective for EUI64 address
With the logging settings fixed in the test system, I could only hit the failure in the EUI64 scenario in ~5 % of runs.
Created attachment 1904674 [details]
EUI64 address - 'full' connectivity with link-local address only
... in the other 95 % of EUI64 scenario runs, the system reliably ends up in a state with 'full' ipv6 connectivity over a link-local address.
Created attachment 1904686 [details]
DAD not effective for DHCP address
Running it in a loop again, we're back to a ratio of 5 failures per 100 runs for the DHCP scenario, as before, even with proper level=TRACE logging. One of those runs is attached.
Hi David,

I think there is a race condition in the test. The test activates con_ipv6 and then checks that the DHCPv6 address is not configured on the interface. But since IPv6 DAD is performed by the kernel, the way it works in NM is that NM first adds the address to the interface; the address initially has the flag "tentative".

When DAD completes successfully, the kernel removes the "tentative" flag and then NM considers the interface ready.

In case of collision, the kernel sets the "dadfailed" flag on the address and immediately removes the address from the interface. So, there is a small interval in which the address is configured (as tentative) on the interface, and this causes the test failure.

I think we need 2 tests:

- one to check that DAD works. Basically, the one that you wrote, but it should wait some time before checking that the address is missing, to ensure that DAD detected the collision;

- another test without address collision. In that case, we need to ensure that NM waits for DAD to complete before the connection becomes "activated". To do that, it's sufficient to check that the address is non-tentative when "nmcli con up" returns.

What do you think?

(In reply to Beniamino Galvani from comment #18)
> Hi David,
>
> I think there is a race condition in the test. The test activates
> con_ipv6 and then checks that the DHCPv6 address is not configured on
> the interface. But since IPv6 DAD is performed by kernel, the way it
> works in NM is that NM first adds the address to the interface; the
> address initially has flag "tentative".
>
> When DAD completes successfully, kernel removes the "tentative" flag
> and then NM considers the interface ready.
>
> In case of collision, the kernel sets the "dadfailed" flag on the
> address and immediately removes the address from the interface. So,
> there is a small interval in which the address is configured (as
> tentative) on the interface, and this causes the test failure.
>

Yeah.
The problem as of now is multi-layered:
1) as you say, the test doesn't give time for DAD done this way to take place
2) NM gives no explicit indication of the DAD result (the kernel does*)
3) NM accepts the lease in spite of DAD failure when given just one address (it can be offered e.g. a ::1234:5678/126 range by the DHCP server; then, in case of DAD failure of ::1234:5678, it can choose from ::1234:567[9ab])

I'd like to hear Dan Williams's opinion on what is the correct or least harmful behaviour after an address collision of a single DHCPv6 address. The interface ends up in a state where:
- it has a valid link-local address
- it has a valid stateless global address (derived from RA), but
- it fails to get the DHCPv6 address, which I assume is being given to it so that it gets a known, predictable address that is to be used directly or that the machine's DNS name points to?

So from the POV of the machine itself, the network on this interface is fully working; for the outside world, however, the machine is invisible/down. What should NM do or report in such a case? Full connectivity, limited connectivity? Keep the configuration as-is, try configuring again right away or after some wait, report the connection as failed?

* stuff like this in dmesg or journal:
[ 5910.336066] IPv6: dup: IPv6 duplicate address 2620:dead:beaf::1234:5678 used by b2:a0:29:95:cc:84 detected!

(In reply to Beniamino Galvani from comment #18)
> I think we need 2 tests:
>
> - one to check that DAD works. Basically, the one that you wrote, but
> it should wait some time before checking that the address is
> missing, to ensure that DAD detected the collision;
>
> - another test without address collision. In that case, we need to
> ensure that NM waits DAD to complete before the connection becomes
> "activated". To do that, it's sufficient to check that the address
> is non tentative when "nmcli con up" returns.
>
> What do you think?
What you said, plus:
- a 3rd test verifying that in case of a DHCPv6 address with prefix < 128, some other address is chosen (this works but should probably also be covered by the CI)
- NM clearly indicating the DAD result in the logs at a sensible level (as you said, warning in case of collision and probably info or debug if nothing comes)
- NM needs to act upon a DAD collision:
  * it must at the very least send a DHCP DECLINE for that address, if I read the RFC correctly: https://datatracker.ietf.org/doc/html/rfc8415#page-70
  * it must do some sensible action next. However, I can't see what that really is when a single address was offered and a stateless address is present on the interface; see what I ask Dan Williams above. When a range of addresses was offered, rinse & repeat with the available addresses...

Created attachment 1906581 [details]
report where behaviour described in previous comment is clearly visible
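
The check proposed earlier in this thread (the address must no longer be tentative when "nmcli con up" returns, and a collision shows up as "dadfailed" or as a missing address) can be sketched roughly as follows. This is only an illustration, not the actual NM CI test: the function name dad_state and the SAMPLE text are made up for this example, and SAMPLE is hand-written input in the general shape of "ip -6 addr show dev testX6" output, not a captured log.

```python
import ipaddress

def dad_state(ip_addr_output: str, address: str) -> str:
    """Classify `address` as 'missing', 'tentative', 'dadfailed' or 'ready'.

    `ip_addr_output` is text in the shape printed by `ip -6 addr show`.
    Hypothetical helper written for this illustration only.
    """
    wanted = ipaddress.ip_address(address)
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # Only "inet6 <addr>/<prefixlen> scope ... [flags]" lines matter.
        if len(fields) < 2 or fields[0] != "inet6":
            continue
        if ipaddress.ip_interface(fields[1]).ip != wanted:
            continue
        flags = set(fields[2:])
        if "dadfailed" in flags:
            return "dadfailed"   # kernel detected a duplicate
        if "tentative" in flags:
            return "tentative"   # DAD still in progress
        return "ready"           # DAD finished without a collision
    return "missing"             # address not configured on the interface

# Hand-written sample input (an assumption, not a real capture).
SAMPLE = """\
7: testX6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
    inet6 2620:dead:beaf::1234:5678/128 scope global tentative dynamic noprefixroute
       valid_lft 3599sec preferred_lft 3599sec
    inet6 fe80::1/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
"""

if __name__ == "__main__":
    print(dad_state(SAMPLE, "2620:dead:beaf::1234:5678"))
    print(dad_state(SAMPLE, "fe80::1"))
```

In a real run the text would come from the "ip" command right after "nmcli con up" returns; for the no-collision test anything other than 'ready' would count as a failure, while the collision test would wait for DAD to finish and then expect 'dadfailed' or 'missing'.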
Hi David,

I updated the code to send a DHCPv6 decline when all addresses in the lease fail DAD:

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1363

At the moment only the dhclient plugin supports sending a DECLINE. For the internal one we reuse systemd code and it doesn't support declining the lease. In case of DAD failure we report an error in the logs, and the transaction continues because it could possibly get another address.

This issue was first observed by the Assisted Installer team in OCP 4.11; my understanding is that fixing it in RHEL 8.7 makes that fix available only in OCP 4.12. If this is correct, can we please track the backport?

> Fixed In Version: NetworkManager-1.39.10-1.el8 → NetworkManager-1.40.2-1.el8
> Status: POST → MODIFIED

The follow-up fix was https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/fabefd9bd541345fbbb3b6a258e78405c818f346, which is in 1.40.2. Update the fixed-in-version and move forward to errata.

(In reply to Mat Kowalski from comment #26)
> This issue has been very initially observed by the Assisted Installer team
> in OCP 4.11; my understanding is that fixing it in RHEL 8.7 makes that fix
> available only in OCP 4.12. If this is correct, can we please track the
> backport?

It's "worse". The fix is slated for the next RHEL release, which will be rhel-8.8.

If a Z-stream update for an earlier RHEL release is required (rhel-8.6.z or rhel-8.7.z), then the Z-stream process needs to be followed. Where exactly do you need this fixed? What exact version of NetworkManager are you currently using (in that environment)? Sorry, I am not familiar with which package versions are in which OCP versions.

bug 2099794 seems to be a duplicate of this one. And that bug already has Z-stream updates for rhel-8.6 and rhel-8.7 scheduled...

The only open question is that this issue was partly fixed a while ago. That fix is present in 8.7 already, and no further action would be necessary.
However, this bug then FailedQA (for some test cases), was reopened, and will finally be fixed in 8.8. The question is whether 8.7 now needs a Z-stream update for the remaining case.

(In reply to Thomas Haller from comment #31)
> bug 2099794 seems to be a duplicate of this one.

Setting as VERIFIED for NM 1.40.2-1.el8 with the same test as in Bug 2099794 comment 37.

>If a Z-stream update for an earlier RHEL release is required (rhel-8.6.z or rhel-8.7.z), then the Z-stream process needs to be followed. Where exactly do you need this fixed? What exact version of NetworkManager are you currently using (in that environment)? Sorry, I am not familiar, which package versions are in which OCP versions.
@thaller to get it propagated down to OCP 4.11 we need to have it in RHEL 8.6; this seems to be in line with what we initially observed, i.e. that the issue appeared for the first time in RHEL 8.6
(In reply to Mat Kowalski from comment #35)
> >If a Z-stream update for an earlier RHEL release is required (rhel-8.6.z or rhel-8.7.z), then the Z-stream process needs to be followed. Where exactly do you need this fixed? What exact version of NetworkManager are you currently using (in that environment)? Sorry, I am not familiar, which package versions are in which OCP versions.
>
> @thaller to get it propagated down to OCP 4.11 we need to have it
> in RHEL 8.6; this seems to be in line with what we initially observed, i.e.
> that the issue appeared for the first time in RHEL 8.6

OK, thanks. As said in comment 31, there is a z-stream update for rhel-8.6 and rhel-8.7 already in progress. Please track those bugs, or comment if anything is missing. Thanks.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2968