Bug 1592596
| Field | Value | Field | Value |
|---|---|---|---|
| Summary: | Regression: with stacked VPNs active, second VPN to activate uses "incorrect" routing to connect to its server | | |
| Product: | [Fedora] Fedora | Reporter: | Dimitris <dimitris.on.linux> |
| Component: | NetworkManager-openvpn | Assignee: | David Sommerseth <dazo> |
| Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 30 | CC: | bgalvani, choeger, code, dazo, dcbw, dimitris.on.linux, huzaifas, klember, lkundrak, steve, thaller |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-26 17:44:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | NM log at TRACE level starting the devops (non-default-route) VPN | | |
Description
Dimitris
2018-06-18 23:55:57 UTC
How do you start the VPN tunnels? Via NetworkManager? systemd?

Both are started with NetworkManager. I'm under the impression, though, that the remote-host bypass route is added by openvpn; is that correct?

I'm fuzzy about the implementation details between OpenVPN and NetworkManager. I do know the NM-openvpn plugin picks up configuration details for the tun/tap device and lets NM do some of the work, while some is handled by OpenVPN itself. In addition, F-26, F-27 and F-28 use the exact same upstream OpenVPN version. The configuration changes OpenVPN makes are handled via iproute2 on Fedora. Since this worked fine on Fedora versions older than F-28, this smells a bit like either iproute2 changed (perhaps OpenVPN needs to do something slightly differently) and/or NetworkManager-openvpn does something different. Can you please try to start your VPN tunnels from the command line (openvpn --config /path/to/config) and see if it behaves differently?

Unfortunately, under NetworkManager-openvpn there's no "real" openvpn config file. The NM config is translated into command-line options, which include:

--up /usr/libexec/nm-openvpn-service-openvpn-helper
--management /var/run/NetworkManager/nm-openvpn-<uuid>

so I think you're right: the interesting changes seem to be in NM and/or iproute2.

That's right. You can extract most of what you need for the configuration file outside of NM via the command line. The --up and --management options can be ignored for now (you might need to do DNS changes manually, though). It would be interesting to know whether it is an NM issue, an iproute2 issue, or OpenVPN 2's use of iproute2. So if you have a chance to fully test this, it would be appreciated a lot!

Hi, can you please provide NM journal logs for the issue, possibly at TRACE level [1]? Thanks.

[1] https://cgit.freedesktop.org/NetworkManager/NetworkManager/tree/contrib/fedora/rpm/NetworkManager.conf#n28

Created attachment 1453630 [details]
NM log at TRACE level starting the devops (non-default-route) VPN
Attaching TRACE log while starting the second, non-default-route, VPN. I've sanitized some IP addresses:
<remote IP> is this VPN's remote IP address
<remote port> is the port for the same
<default VPN remote IP> is the already-connected VPN's remote IP address
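For context, the TRACE level used to capture this log is the one referenced in [1] above. A minimal sketch of a configuration drop-in that enables it (the file name is illustrative, not from this report):

```
# /etc/NetworkManager/conf.d/95-trace.conf  (illustrative file name)
[logging]
level=TRACE
domains=ALL
```

Alternatively, `nmcli general logging level TRACE domains ALL` switches the level at runtime without restarting the daemon.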
The "offending" route is added after the VPN connection is established; it points at 192.168.1.1, the local WLAN gateway, even though the pre-existing tunnel's peer route is in the routing table with a lower metric, and the UDP connection to the new remote started out over that route:
Jun 21 18:56:03 vimes NetworkManager[1602]: <trace> [1529632563.9994] platform: route: get IPv4 route for: <remote IP> oif 4
Jun 21 18:56:03 vimes NetworkManager[1602]: <trace> [1529632563.9995] platform-linux: event-notification: RTM_NEWROUTE, flags 0, seq 351: <remote IP>/32 via 192.168.1.1 dev 4 metric 0 mss 0 rt-src rt-unspec rtm_flags cloned scope global pref-src 192.168.1.169
Jun 21 18:56:03 vimes NetworkManager[1602]: <debug> [1529632563.9995] platform: route: get IPv4 route for: <remote IP> succeeded: <remote IP>/32 via 192.168.1.1 dev 4 metric 0 mss 0 rt-src rt-unspec rtm_flags cloned scope global pref-src 192.168.1.169
Jun 21 18:56:03 vimes NetworkManager[1602]: <debug> [1529632563.9996] device[0x55d5837b8050] (wlp3s0): ip4-config: update (commit=1, new-config=0x55d5838866c0)
Jun 21 18:56:03 vimes NetworkManager[1602]: <debug> [1529632563.9996] platform: address: adding or updating IPv4 address: 192.168.1.169/24 lft 28402sec pref 28402sec lifetime 36197-0[28402,28402] dev 4 flags noprefixroute src unknown
Jun 21 18:56:03 vimes NetworkManager[1602]: <trace> [1529632563.9997] platform-linux: event-notification: RTM_NEWADDR, flags 0, seq 352: 192.168.1.169/24 lft 28402sec pref 28402sec lifetime 36197-36197[28402,28402] dev 4 flags noprefixroute src kernel
Jun 21 18:56:03 vimes NetworkManager[1602]: <debug> [1529632563.9997] platform: signal: address 4 changed: 192.168.1.169/24 lft 28402sec pref 28402sec lifetime 36197-36197[28402,28402] dev 4 flags noprefixroute src kernel
Jun 21 18:56:03 vimes NetworkManager[1602]: <debug> [1529632563.9997] device[0x55d5837b8050] (wlp3s0): queued IP4 config change
Jun 21 18:56:03 vimes NetworkManager[1602]: <debug> [1529632563.9997] platform-linux: do-add-ip4-address[4: 192.168.1.169/24]: success
Jun 21 18:56:03 vimes NetworkManager[1602]: <debug> [1529632563.9998] platform: route: append IPv4 route: <remote IP>/32 via 192.168.1.1 dev 4 metric 600 mss 0 rt-src vpn
Jun 21 18:56:03 vimes NetworkManager[1602]: <trace> [1529632563.9998] platform-linux: event-notification: RTM_NEWROUTE, flags excl,create, seq 353: <remote IP>/32 via 192.168.1.1 dev 4 metric 600 mss 0 rt-src rt-static scope global
Jun 21 18:56:03 vimes NetworkManager[1602]: <debug> [1529632563.9998] platform: signal: route 4 added: <remote IP>/32 via 192.168.1.1 dev 4 metric 600 mss 0 rt-src rt-static scope global
Jun 21 18:56:03 vimes NetworkManager[1602]: <debug> [1529632563.9999] platform-linux: do-add-ip4-route[<remote IP>/32 via 192.168.1.1 dev 4 metric 600 mss 0 rt-src rt-static scope global]: success
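The log above shows NM appending a /32 host route for the VPN server via the WLAN gateway. Why that route then captures the traffic follows from standard kernel route selection: longest prefix wins first, and metric only breaks ties. A minimal sketch of that logic, using hypothetical table entries (203.0.113.7 stands in for the sanitized remote IP; the tun0 metric is assumed, not taken from the log):

```python
import ipaddress

def select_route(routes, dst):
    """Pick the route for dst: longest prefix wins; lower metric breaks ties.

    routes: list of (prefix, via, metric) tuples.
    Returns the winning tuple, or None if nothing matches.
    """
    addr = ipaddress.ip_address(dst)
    candidates = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    if not candidates:
        return None
    # Sort key mirrors kernel behavior: prefix length first, then metric.
    return max(candidates, key=lambda r: (ipaddress.ip_network(r[0]).prefixlen, -r[2]))

# Hypothetical table resembling the bug: a default route through the first
# tunnel (tun0), plus the /32 host route NM appended via the WLAN gateway.
table = [
    ("0.0.0.0/0",      "10.8.0.1 dev tun0",       50),   # first VPN's default route (assumed metric)
    ("203.0.113.7/32", "192.168.1.1 dev wlp3s0",  600),  # the "offending" host route from the log
    ("192.168.1.0/24", "on-link dev wlp3s0",      600),  # local WLAN subnet
]

# The /32 matches the VPN server exactly and beats the default route even
# though its metric (600) is much higher: prefix length trumps metric.
print(select_route(table, "203.0.113.7"))
```

This is why the new tunnel's packets leave via the WLAN interface regardless of the lower-metric default route through tun0.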
When the first VPN activates, an in-memory connection for tun0 gets created. On NM 1.8 such a connection became the primary connection, and NM then used it to reach the second VPN gateway. Since 1.10, it seems we don't update the primary connection after the first VPN connects, and so the second VPN gateway is still reached through the Wi-Fi interface. But even if the primary connection were properly set to tun0, the tun0 connection is 'external', and so normally NM wouldn't add a new route through it. I'm investigating how to fix these two issues.

Thanks for looking into this. For what it's worth, routing the second VPN's tunnel through the WLAN interface isn't bad: it actually improves performance (tunnel overhead and MTU). If it's possible to move the addition of the /32 route to the VPN server to *before* the VPN connection is made, I think this will let us have the best of both worlds. It should result in tun1 traffic also using the previous default route, despite its higher metric. What I'm seeing now is that, after the second VPN connection is made, outbound tun1 tunnel packets go out of the WLAN interface, but the remote server sends back traffic over the route that has them arrive over tun0.

I've tried adding --float to the tun1 client's config. The very first time I tried it, tun1 actually worked, and I thought that was a fix/workaround. However, it seems to have been a race where I got lucky just that once; every time after that, --float didn't help.

(In reply to Dimitris from comment #9)
> I've tried adding --float to the tun1 client's config. The very first time
> I tried it, tun1 actually worked, and I thought that was a fix/workaround.
> However it does seem to have been a race where I got lucky just that once,
> every time after that one --float didn't help.

--float is a fairly obscure feature. When used on the client side, it only makes the client a little bit more relaxed if the public server IP changes. When used on the server side, the server is a bit more relaxed if the public client IP changes. By "relaxed" I mean that it won't do a full TLS reconnect, but will continue to use the negotiated TLS parameters and session keys. With OpenVPN 2.4 servers and 2.4 (or more recent 2.3) clients, the --float feature is even less "useful", as those implement something called peer-id per session. This peer-id is used as a key to reuse the negotiated TLS parameters and session keys if the client IP changes. It does not need to be explicitly enabled; it enables itself automatically if both sides support it. Bottom line: I would not expect --float to make much difference in this context, as one of the client sessions would have to be "moved" from one public IP address to another.

(In reply to Dimitris from comment #9)
> Thanks for looking into this. For what it's worth, routing the second VPN's
> tunnel through the WLAN interface isn't bad - it actually improves
> performance (tunnel overhead and MTU). If it's possible to move the
> addition of the /32 route to the VPN server to *before* the VPN connection
> is made, I think this will let us have the best of both worlds. It should
> result in tun1 traffic also using the previous default route, despite its
> higher metric.

This is not possible at the moment due to how NM and plugins interact. The IP configuration, including the external gateway, is known to NetworkManager only after the connection is established. I've pushed branch bg/stacked-vpn-rh1592596 that should restore the previous (NM 1.8) behavior. However, there are still some issues, as the fact that we now manage the tun device causes the addition of duplicate routes with different metrics.
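For reference, --float is simply a standalone flag in an OpenVPN client profile. An illustrative fragment (the remote host and port are placeholders, not from this report):

```
client
dev tun
proto udp
remote vpn.example.com 1194
# Accept authenticated packets from a changed peer address. As explained
# above, this does not change how the local kernel routes outbound packets.
float
```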
FWIW, while waiting for this patch to hit a release, I can restore the previous behavior by adding an explicit host-route push to the "always on for privacy" VPN server config:

push "route <devops VPN server IP> 255.255.255.255 vpn_gateway 50"

Unfortunately, trying to use net_gateway instead, to avoid the double-tunneling and potential MTU issues (i.e. having the cake and eating it too), doesn't work. Even though the route is pushed, per journald:

Static Route: <devops VPN server IP>/32, Next Hop: 192.168.1.1

the route doesn't actually make it to the routing table. *Probably* a different bug, though, correct?

This message is a reminder that Fedora 28 is nearing its end of life. On 2019-May-28 Fedora will stop maintaining and issuing updates for Fedora 28. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 28 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed, as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Still happening with F29.

This message is a reminder that Fedora 29 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26. It is Fedora's policy to close all bug reports from releases that are no longer maintained.
At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 29 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed, as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Still happening with F30.

This message is a reminder that Fedora 30 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 30 on 2020-05-26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '30'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 30 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed, as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 30 changed to end-of-life (EOL) status on 2020-05-26. Fedora 30 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result, we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen it against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.