Description of problem:

Tenant networks are sporadically leaking private IPv4 addresses to the external networks outside of the Network node. The customer has deployed the Network node in a VM and is tcpdumping on the VM interface:

~~~
# tcpdump -ne -i bond1 net 10.1.0.0/16
19:38:24.743953 fa:16:3e:4d:f3:b2 > 00:22:bd:f8:19:ff, ethertype 802.1Q (0x8100), length 58: vlan 70, p 0, ethertype IPv4, 10.1.1.23.https > 10.47.125.8.59077: Flags [R], seq 441238831, win 0, length 0
19:38:36.395012 fa:16:3e:d9:8f:b5 > 00:22:bd:f8:19:ff, ethertype 802.1Q (0x8100), length 58: vlan 70, p 0, ethertype IPv4, 10.1.3.43.34497 > 144.136.251.168.ssh: Flags [R], seq 4213481779, win 0, length 0
19:38:51.400689 fa:16:3e:d9:8f:b5 > 00:22:bd:f8:19:ff, ethertype 802.1Q (0x8100), length 58: vlan 70, p 0, ethertype IPv4, 10.1.3.43.34503 > 144.136.251.168.ssh: Flags [R], seq 2250784302, win 0, length 0
19:39:18.061263 fa:16:3e:d9:8f:b5 > 00:22:bd:f8:19:ff, ethertype 802.1Q (0x8100), length 58: vlan 70, p 0, ethertype IPv4, 10.1.1.113.55861 > 144.136.12.179.8089: Flags [R], seq 2426279263, win 0, length 0
19:39:29.063175 fa:16:3e:4d:f3:b2 > 00:22:bd:f8:19:ff, ethertype 802.1Q (0x8100), length 58: vlan 70, p 0, ethertype IPv4, 10.1.1.14.squid > 10.60.108.107.41426: Flags [R], seq 3089189479, win 0, length 0
19:40:11.415589 fa:16:3e:d9:8f:b5 > 00:22:bd:f8:19:ff, ethertype 802.1Q (0x8100), length 58: vlan 70, p 0, ethertype IPv4, 10.1.3.43.34531 > 144.136.251.168.ssh: Flags [R], seq 288086706, win 0, length 0
19:40:28.661888 fa:16:3e:c2:27:81 > 00:22:bd:f8:19:ff, ethertype 802.1Q (0x8100), length 58: vlan 70, p 0, ethertype IPv4, 10.1.1.5.55865 > 10.126.163.10.ssh: Flags [R], seq 1681597371, win 0, length 0
...
~~~

* The OpenStack tenant networks are in the 10.1.0.0/16 range. We see only TCP RST packets (as shown in the output above).
Version-Release number of selected component (if applicable): OSP 6

Additional info:

We have tried to tweak the nf_conntrack_tcp_timeout_close_wait timeout (and additionally some of the other conntrack timeouts) without success. The thinking is that neutron is keeping half-closed TCP connections around, and at some point (for example, on instance network activity?) new data has to be sent; but since the connection is half-closed, conntrack is no longer tracking it (due to a standard timeout, which we tried to tweak), so the RST packets do not hit the correct NAT rule in iptables and are instead forwarded by another rule, which hypothetically causes the leakage seen above in ordinary application client-to-server communication. The conntrack timeout bump was applied in every namespace, since setting it in the root namespace does not automatically propagate the values to the qdhcp- and qrouter- namespaces.
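For reference, the per-namespace tweak described above can be applied with a loop like the following sketch. The timeout value shown is just an example; the namespace name pattern follows the standard Neutron qrouter-/qdhcp- scheme:

```shell
# Sketch: apply a conntrack timeout inside every Neutron namespace,
# since values set in the root namespace do not propagate there.
# The timeout value (3600s) is illustrative, not a recommendation.
for ns in $(ip netns list | awk '/^(qrouter|qdhcp)-/ { print $1 }'); do
    ip netns exec "$ns" sysctl -w \
        net.netfilter.nf_conntrack_tcp_timeout_close_wait=3600
done
```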
Note that RHEL 7.2 is not supported. Please have the customer update to RHEL 7.3 with latest kernel. I'm closing the bug as WONTFIX for now, please re-open if this reproduces on RHEL 7.3.
Ok, I got it reproduced on a RHEL 7.2 kernel + OSP6 AIO packstack, by creating a port on the private network:

~~~
sudo yum install -y tcpdump
source keystonerc_admin

NETWORK_NAME=private
NETWORK_ID=$(openstack network show -f value -c id $NETWORK_NAME)
HOST_ID=$(hostname)
PORT_ID=$(neutron port-create $NETWORK_ID --device-owner compute:container \
    --binding:host_id=$HOST_ID -f value -c id)
PORT_MAC=$(neutron port-show $PORT_ID -f value -c mac_address)
neutron port-show $PORT_ID -f value -c status

ovs-vsctl -- --may-exist add-port br-int test_interf0 \
    -- set Interface test_interf0 type=internal \
    -- set Interface test_interf0 external-ids:iface-status=active \
    -- set Interface test_interf0 external-ids:attached-mac=$PORT_MAC \
    -- set Interface test_interf0 external-ids:iface-id=$PORT_ID
neutron port-show $PORT_ID -f value -c status

ip link set dev test_interf0 address $PORT_MAC
ip netns add test-ns
ip link set test_interf0 netns test-ns
ip netns exec test-ns ip link set dev test_interf0 up

cp /etc/resolv.conf /tmp/resolv.conf
ip netns exec test-ns dhclient -I test_interf0 --no-pid test_interf0 -v
cp /tmp/resolv.conf /etc/resolv.conf

IP_ON_BR_EX=$(ip a show dev br-ex | awk ' /inet / { print $2 }' | cut -d/ -f1)

# this will start an Apache benchmark against the br-ex IP address, from
# the port created on "private" network which resides on test-ns
ip netns exec test-ns ab -c 10 -n 100000000 http://$IP_ON_BR_EX/ &

# now...
~~~
~~~
tcpdump -ne -i br-ex net 10.0.0.0/24 &
service neutron-l3-agent restart
~~~

~~~
08:54:34.951996 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46131 > 172.24.4.225.http: Flags [S], seq 2106238134, win 27200, options [mss 1360,sackOK,TS val 7438626 ecr 0,nop,wscale 7], length 0
08:54:34.952071 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46132 > 172.24.4.225.http: Flags [S], seq 658608540, win 27200, options [mss 1360,sackOK,TS val 7438626 ecr 0,nop,wscale 7], length 0
08:54:34.952141 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46133 > 172.24.4.225.http: Flags [S], seq 581455475, win 27200, options [mss 1360,sackOK,TS val 7438626 ecr 0,nop,wscale 7], length 0
08:54:34.952217 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46134 > 172.24.4.225.http: Flags [S], seq 449604465, win 27200, options [mss 1360,sackOK,TS val 7438626 ecr 0,nop,wscale 7], length 0
08:54:34.952303 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46135 > 172.24.4.225.http: Flags [S], seq 2599578450, win 27200, options [mss 1360,sackOK,TS val 7438626 ecr 0,nop,wscale 7], length 0
08:54:34.952477 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46136 > 172.24.4.225.http: Flags [S], seq 733184804, win 27200, options [mss 1360,sackOK,TS val 7438626 ecr 0,nop,wscale 7], length 0
08:54:34.952971 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46137 > 172.24.4.225.http: Flags [S], seq 2551253028, win 27200, options [mss 1360,sackOK,TS val 7438627 ecr 0,nop,wscale 7], length 0
08:54:34.953108 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46138 > 172.24.4.225.http: Flags [S], seq 832821460, win 27200, options [mss 1360,sackOK,TS val 7438627 ecr 0,nop,wscale 7], length 0
08:54:34.953553 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46139 > 172.24.4.225.http: Flags [S], seq 1257265829, win 27200, options [mss 1360,sackOK,TS val 7438627 ecr 0,nop,wscale 7], length 0
08:54:35.953724 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.46136 > 172.24.4.225.http: Flags [S], seq 733184804, win 27200, options [mss 1360,sackOK,TS val 7439628 ecr 0,nop,wscale 7], length 0
08:54:35.953724 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype
~~~

I have tried floating IP manipulation on the router, without getting the same effect. I will be trying the RHEL 7.3 kernel in a minute.
This is also happening on the RHEL 7.3 kernel, apparently right after this iptables-restore call:

~~~
2017-06-14 09:25:54.158 18931 DEBUG neutron.agent.linux.utils [-] Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-348c6c7a-26a8-4fc1-928e-a943d4b0f869', 'iptables-restore', '-c']
~~~

~~~
09:25:54.358087 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.57902 > 172.24.4.225.http: Flags [S], seq 2777775089, win 27200, options [mss 1360,sackOK,TS val 1237038 ecr 0,nop,wscale 7], length 0
09:25:54.358091 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.57906 > 172.24.4.225.http: Flags [S], seq 3449881047, win 27200, options [mss 1360,sackOK,TS val 1237038 ecr 0,nop,wscale 7], length 0
09:25:54.358194 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.57904 > 172.24.4.225.http: Flags [S], seq 802128951, win 27200, options [mss 1360,sackOK,TS val 1237038 ecr 0,nop,wscale 7], length 0
09:25:54.358198 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.57908 > 172.24.4.225.http: Flags [S], seq 3417875851, win 27200, options [mss 1360,sackOK,TS val 1237038 ecr 0,nop,wscale 7], length 0
...
~~~
ok, in an attempt to understand what could be the issue, I ran:

~~~
ip netns exec test-ns ab -c 10 -n 10000000 http://172.24.4.225/ &
ip netns exec test-ns conntrack -L > before.txt ; service neutron-l3-agent restart ; sleep 4 ; ip netns exec test-ns conntrack -L > after.txt
~~~

and I saw:

~~~
10:02:20.189322 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.41020 > 172.24.4.225.http: Flags [S], seq 2758578046, win 27200, options [mss 1360,sackOK,TS val 3422869 ecr 0,nop,wscale 7], length 0
10:02:20.189420 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.41022 > 172.24.4.225.http: Flags [S], seq 128484768, win 27200, options [mss 1360,sackOK,TS val 3422869 ecr 0,nop,wscale 7], length 0
10:02:20.194446 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.41024 > 172.24.4.225.http: Flags [S], seq 3757443744, win 27200, options [mss 1360,sackOK,TS val 3422874 ecr 0,nop,wscale 7], length 0
10:02:20.194568 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.41026 > 172.24.4.225.http: Flags [S], seq 1063508575, win 27200, options [mss 1360,sackOK,TS val 3422874 ecr 0,nop,wscale 7], length 0
10:02:20.194673 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.41028 > 172.24.4.225.http: Flags [S], seq 3768317291, win 27200, options [mss 1360,sackOK,TS val 3422874 ecr 0,nop,wscale 7], length 0
10:02:20.194774 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.41030 > 172.24.4.225.http: Flags [S], seq 1937854537, win 27200, options [mss 1360,sackOK,TS val 3422874 ecr 0,nop,wscale 7], length 0
10:02:20.194877 fa:16:3e:ba:9b:5f > de:e9:cd:b6:37:4a, ethertype IPv4 (0x0800), length 74: 10.0.0.3.41032 > 172.24.4.225.http: Flags [S], seq 588795097, win 27200, options [mss 1360,sackOK,TS val 3422874 ecr 0,nop,wscale 7], length 0
~~~

Then:

~~~
# grep 41020 *.txt
after.txt:tcp 6 117 SYN_SENT src=10.0.0.3 dst=172.24.4.225 sport=41020 dport=80 [UNREPLIED] src=172.24.4.225 dst=10.0.0.3 sport=80 dport=41020 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
before.txt:tcp 6 118 TIME_WAIT src=10.0.0.3 dst=172.24.4.225 sport=41020 dport=80 src=172.24.4.225 dst=10.0.0.3 sport=80 dport=41020 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
# grep 41022 *.txt
after.txt:tcp 6 117 SYN_SENT src=10.0.0.3 dst=172.24.4.225 sport=41022 dport=80 [UNREPLIED] src=172.24.4.225 dst=10.0.0.3 sport=80 dport=41022 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
before.txt:tcp 6 118 TIME_WAIT src=10.0.0.3 dst=172.24.4.225 sport=41022 dport=80 src=172.24.4.225 dst=10.0.0.3 sport=80 dport=41022 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
# grep 41026 *.txt
after.txt:tcp 6 117 SYN_SENT src=10.0.0.3 dst=172.24.4.225 sport=41026 dport=80 [UNREPLIED] src=172.24.4.225 dst=10.0.0.3 sport=80 dport=41026 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
before.txt:tcp 6 118 TIME_WAIT src=10.0.0.3 dst=172.24.4.225 sport=41026 dport=80 src=172.24.4.225 dst=10.0.0.3 sport=80 dport=41026 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
~~~

I have also noticed that if I flush the table in the middle, I get traces of un-NATed packets:

~~~
ip netns exec qrouter-348c6c7a-26a8-4fc1-928e-a943d4b0f869 conntrack -F
~~~
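The before/after comparison above can be automated. This is a sketch that diffs two `conntrack -L` dumps and reports flows that went from TIME_WAIT before the l3-agent restart to an unreplied SYN_SENT afterwards, i.e. connections conntrack has forgotten. Abridged sample lines from the dumps above are inlined for illustration; normally before.txt/after.txt would come straight from `ip netns exec test-ns conntrack -L`:

```shell
# Inline sample data (abridged from the conntrack dumps above).
cat > before.txt <<'EOF'
tcp 6 118 TIME_WAIT src=10.0.0.3 dst=172.24.4.225 sport=41020 dport=80 src=172.24.4.225 dst=10.0.0.3 sport=80 dport=41020 [ASSURED] mark=0 use=1
EOF
cat > after.txt <<'EOF'
tcp 6 117 SYN_SENT src=10.0.0.3 dst=172.24.4.225 sport=41020 dport=80 [UNREPLIED] src=172.24.4.225 dst=10.0.0.3 sport=80 dport=41020 mark=0 use=1
EOF

# Print the first (original-direction) sport= field of each line,
# skipping the reply-direction sport=80 that appears later on the line.
extract_sport() {
    awk '{ for (i = 1; i <= NF; i++)
               if ($i ~ /^sport=/) { print $i; break } }'
}

grep TIME_WAIT before.txt | extract_sport | sort -u > before.ports
grep 'SYN_SENT.*UNREPLIED' after.txt | extract_sport | sort -u > after.ports

# Ports present in both lists are flows that conntrack dropped
# across the restart; here this prints: sport=41020
comm -12 before.ports after.ports
```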
The related patch was for a DVR issue where packets were "leaking" out the wrong interface under load. The solution was to change the tcp_loose sysctl setting to zero in the snat namespace:

https://review.openstack.org/#/c/366297/
https://bugs.launchpad.net/neutron/+bug/1620824

Since that only involved traffic using the default SNAT IP, changing that setting was OK. I'm not sure we can make the same change in a legacy router environment, since doing so could cause dropped connections to the floating IP if the router was ever migrated/failed over to another network node: conntrack won't start tracking an in-flight connection, only those it has seen the SYN for.

Do we know for sure whether using 7.3 solves the problem? I'm just trying to narrow this down to either a kernel or an openstack bug.
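For the record, the DVR-side change amounts to a sysctl tweak inside the namespace; a minimal sketch (the namespace name is hypothetical, taken from `ip netns list` on a real node):

```shell
# Sketch of the tcp_loose change referenced above: disable mid-stream
# connection pickup inside a snat namespace. Namespace ID is illustrative.
NS=snat-348c6c7a-26a8-4fc1-928e-a943d4b0f869
ip netns exec "$NS" sysctl net.netfilter.nf_conntrack_tcp_loose      # default is 1
ip netns exec "$NS" sysctl -w net.netfilter.nf_conntrack_tcp_loose=0
```

With tcp_loose=0, conntrack only creates entries for connections whose SYN it has seen, which is exactly why the setting is risky for failed-over legacy routers.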
Thanks @bhaley, it must be something else. I checked OSP9, where it didn't reproduce for me, and the setting wasn't set on legacy routers:

~~~
ip netns exec qrouter-b44c1e26-6422-4386-961a-19d95ab1584f sysctl -a | grep loose
1
~~~

(1 is the default)

Btw, it does reproduce in OSP6 regardless of the RHEL version, and it doesn't reproduce in OSP9, but I haven't checked OSP7..OSP8 yet. I'll do that ASAP.
Thanks Miguel, it could just be an iptables and/or security group patch needs a backport.
*** Bug 1051615 has been marked as a duplicate of this bug. ***
Ok, I have deployed OSP6, 7, 8 and 9 and I'm trying to figure out when it stops failing, so we can scope where the fix was implemented.
ok, OSP7 doesn't show the issue. I'm going to try git bisect to see if we can find the point where it started working.
Ok, this bug is being very elusive. Apparently it is fixed somewhere between OSP6 and OSP7. I've been trying to bisect it without success because, as soon as you run the test on a commit between OSP6 and OSP7 where it's failing, the failure persists even after rolling forward to OSP7. So something is left broken in the system that requires a reboot to clear. I'm slowly bisecting git history through reboots...
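The bisect procedure looks roughly like this sketch. The tag names are illustrative, and since we are hunting for the commit that *fixed* the bug rather than one that broke it, the good/bad labels are deliberately inverted; each step needs a reinstall plus reboot, so `git bisect run` is not usable here:

```shell
# Illustrative bisect between the Juno (OSP6) and Kilo (OSP7) neutron
# trees; tag names are assumptions. Labels are inverted because we are
# looking for the FIX: "bad" = fix present, "good" = leak reproduces.
git bisect start
git bisect bad  2015.1.0    # OSP7 / Kilo: leak does not reproduce
git bisect good 2014.2      # OSP6 / Juno: leak reproduces
# At each step: reinstall neutron from the checkout, reboot the node,
# re-run the reproducer, then mark `git bisect good` or `git bisect bad`.
```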
Created attachment 1314742 [details] Reproducer for AIO
Created attachment 1329574 [details] OSP6 qrouter iptables
Created attachment 1329580 [details] OSP7 qrouter iptables
I have found that the OSP7 iptables rules are slightly different, using marks for packets under certain conditions. I'm going to look at related differences in the l3 agent across both versions to see if anything rings a bell.
A difference I see in the rules is this:

~~~
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-l3-agent-snat -o qg-6ae915a1-08 -j SNAT --to-source 172.24.4.226   <--- not present in OSP6
~~~

This review introduces the change, and is likely to fix it:

[1] https://review.openstack.org/#/c/131905/

But apparently it is buggy, and it is resolved in [3], which depends (at least) on [2]. The changes related to the marks are:

[2] https://review.openstack.org/#/c/133484/
[3] https://review.openstack.org/#/c/161947/

I'm going to give those patches a try to see where backportability stands.
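A quick way to confirm this kind of rule drift between deployments is to dump and diff the NAT table of the qrouter namespace on each version; a sketch, with hypothetical output file names (the router UUID comes from `ip netns list` on each node):

```shell
# Sketch: capture the qrouter NAT rules on each deployment, then diff.
# File names and the router UUID are illustrative.
ip netns exec qrouter-348c6c7a-26a8-4fc1-928e-a943d4b0f869 \
    iptables-save -t nat > osp6-nat.rules
# ... repeat on the OSP7 node into osp7-nat.rules, then:
diff -u osp6-nat.rules osp7-nat.rules
```

This is how the missing per-interface SNAT rule above stands out immediately.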
Added an internal review, but we need to check backportability of [2] or [3], because this backport as-is introduces other *more severe* bugs.
Applying only [1] doesn't make any difference:

~~~
08:07:39.330827 fa:16:3e:9b:41:39 > ca:f7:1e:f4:68:49, ethertype IPv4 (0x0800), length 74: 10.0.0.3.43852 > 172.24.4.230.http: Flags [S], seq 974883762, win 27200, options [mss 1360,sackOK,TS val 20762 ecr 0,nop,wscale 7], length 0
08:07:39.330962 fa:16:3e:9b:41:39 > ca:f7:1e:f4:68:49, ethertype IPv4 (0x0800), length 74: 10.0.0.3.43854 > 172.24.4.230.http: Flags [S], seq 2849972098, win 27200, options [mss 1360,sackOK,TS val 20762 ecr 0,nop,wscale 7], length 0
08:07:39.331040 fa:16:3e:9b:41:39 > ca:f7:1e:f4:68:49, ethertype IPv4 (0x0800), length 74: 10.0.0.3.43856 > 172.24.4.230.http: Flags [S], seq 2982552839, win 27200, options [mss 1360,sackOK,TS val 20762 ecr 0,nop,wscale 7], length 0
08:07:39.331119 fa:16:3e:9b:41:39 > ca:f7:1e:f4:68:49, ethertype IPv4 (0x0800), length 74: 10.0.0.3.43858 > 172.24.4.230.http: Flags [S], seq 3875241932, win 27200, options [mss 1360,sackOK,TS val 20762 ecr 0,nop,wscale 7], length 0
~~~
Created attachment 1330587 [details] updated reproducer for AIO
I have an environment setup now to reproduce the problem but am still investigating.
I have hand-crafted a set of iptables chains and rules based on upstream code that seems to fix the issue. Since it looks like they were probably added in multiple patches, I will have to narrow things down a bit. From Comment 43 it seems the customer is still running OSP6 and needs this patch there, but hopefully given the above information I can figure this out.
@Brian, any luck with the patches?
Brian, any timeframe for identifying the issue in the code?
I was able to solve the problem in my test environment by adding these two changes on top of Miguel's existing change:

https://code.engineering.redhat.com/gerrit/#/c/121929/
https://code.engineering.redhat.com/gerrit/#/c/121930/

That second one had to be hand-applied, since the neutron code was completely refactored between OSP6 and OSP7 upstream, so file names have changed, etc. I will let those two run CI tests and then re-assess, since it's tricky to know what I might have broken.
OSP6 has been retired, and will not receive further updates. See https://access.redhat.com/support/policy/updates/openstack/platform/ for details.