+++ This bug was initially created as a clone of Bug #1022980 +++

Description of problem:
If a bunch of floating IPs are created, assigned, and then deleted, recreated, and reassigned, quantum often creates duplicate rules like

-A quantum-l3-agent-OUTPUT -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.11
-A quantum-l3-agent-OUTPUT -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.10

or

-A quantum-l3-agent-OUTPUT -d 10.1.2.207/32 -j DNAT --to-destination 172.16.0.11
-A quantum-l3-agent-OUTPUT -d 10.1.2.202/32 -j DNAT --to-destination 172.16.0.11

Version-Release number of selected component (if applicable):
openstack-quantum-2013.1.3-1.el6ost.noarch

How reproducible:
99%

Steps to Reproduce:
1. reproducer attached

Actual results:
duplicate rules, VMs unreachable. IIRC the error was "No route to host".

Expected results:
no duplicates, VMs reachable

Additional info:

--- Additional comment from Jaroslav Henner on 2013-10-24 13:56:25 CEST ---

iptables -tnat -S
-P PREROUTING ACCEPT
-P POSTROUTING ACCEPT
-P OUTPUT ACCEPT
-N quantum-l3-agent-OUTPUT
-N quantum-l3-agent-POSTROUTING
-N quantum-l3-agent-PREROUTING
-N quantum-l3-agent-float-snat
-N quantum-l3-agent-snat
-N quantum-postrouting-bottom
-A PREROUTING -j quantum-l3-agent-PREROUTING
-A POSTROUTING -j quantum-l3-agent-POSTROUTING
-A POSTROUTING -j quantum-postrouting-bottom
-A OUTPUT -j quantum-l3-agent-OUTPUT
-A quantum-l3-agent-OUTPUT -d 10.1.2.203/32 -j DNAT --to-destination 172.16.0.10
-A quantum-l3-agent-OUTPUT -d 10.1.2.207/32 -j DNAT --to-destination 172.16.0.11
-A quantum-l3-agent-OUTPUT -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.11
-A quantum-l3-agent-OUTPUT -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.10
-A quantum-l3-agent-OUTPUT -d 10.1.2.202/32 -j DNAT --to-destination 172.16.0.11
-A quantum-l3-agent-OUTPUT -d 10.1.2.196/32 -j DNAT --to-destination 172.16.0.12
-A quantum-l3-agent-OUTPUT -d 10.1.2.237/32 -j DNAT --to-destination 172.16.0.16
-A quantum-l3-agent-POSTROUTING ! -i qg-19f7ff97-cc ! -o qg-19f7ff97-cc -m conntrack ! --ctstate DNAT -j ACCEPT
-A quantum-l3-agent-PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A quantum-l3-agent-PREROUTING -d 10.1.2.203/32 -j DNAT --to-destination 172.16.0.10
-A quantum-l3-agent-PREROUTING -d 10.1.2.207/32 -j DNAT --to-destination 172.16.0.11
-A quantum-l3-agent-PREROUTING -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.11
-A quantum-l3-agent-PREROUTING -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.10
-A quantum-l3-agent-PREROUTING -d 10.1.2.202/32 -j DNAT --to-destination 172.16.0.11
-A quantum-l3-agent-PREROUTING -d 10.1.2.196/32 -j DNAT --to-destination 172.16.0.12
-A quantum-l3-agent-PREROUTING -d 10.1.2.237/32 -j DNAT --to-destination 172.16.0.16
-A quantum-l3-agent-PREROUTING -d 172.16.0.1/32 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A quantum-l3-agent-float-snat -s 172.16.0.10/32 -j SNAT --to-source 10.1.2.203
-A quantum-l3-agent-float-snat -s 172.16.0.11/32 -j SNAT --to-source 10.1.2.207
-A quantum-l3-agent-float-snat -s 172.16.0.11/32 -j SNAT --to-source 10.1.2.198
-A quantum-l3-agent-float-snat -s 172.16.0.10/32 -j SNAT --to-source 10.1.2.198
-A quantum-l3-agent-float-snat -s 172.16.0.11/32 -j SNAT --to-source 10.1.2.202
-A quantum-l3-agent-float-snat -s 172.16.0.12/32 -j SNAT --to-source 10.1.2.196
-A quantum-l3-agent-float-snat -s 172.16.0.16/32 -j SNAT --to-source 10.1.2.237
-A quantum-l3-agent-snat -j quantum-l3-agent-float-snat
-A quantum-l3-agent-snat -s 172.16.0.0/16 -j SNAT --to-source 10.1.2.204
-A quantum-postrouting-bottom -j quantum-l3-agent-snat

--- Additional comment from Jaroslav Henner on 2013-10-24 21:26:15 CEST ---

This is quite a serious issue. It can be used for a DoS. Sorry for not reporting this as private; I didn't realize it at first.

--- Additional comment from Lon Hohberger on 2013-11-01 19:30:47 CET ---

Bob - reassign as required; wasn't sure if you or Terry had seen this one.

--- Additional comment from Jaroslav Henner on 2013-11-05 12:12:51 CET ---

After a restart of the L3 agent, all the NAT rules disappear and only the correct ones reappear after a few seconds. Therefore I think the rules are stored correctly in the database, but the L3 agent misses some delete event or fails to delete the old rules.

--- Additional comment from Scott Lewis on 2013-11-11 15:23:03 CET ---

not completed in time for 3.0.z3 release; moved to next async

--- Additional comment from Terry Wilson on 2013-11-14 00:06:16 CET ---

I had to modify the test script somewhat to get it to run on my setup.
Diff:

--- stress      2013-08-06 02:32:51.377999955 -0500
+++ stress_terry        2013-08-06 02:32:47.614999956 -0500
@@ -1,7 +1,6 @@
 #!/bin/bash -x
 vm_count=5
-nova_boot_cmdline="--flavor=m1.tiny --image=d09e66b8-6ddb-468c-912e-a7acd34a8d32 floating_bug"
-floatingip_create_cmdline="notrouted-shared"
+nova_boot_cmdline="--flavor=m1.tiny --image=cirros floating_bug"
 
 vm_ids=""
 fips=""
@@ -16,7 +15,7 @@
 
 function create_assing_fips {
     for vm_id in $vm_ids; do
-        fip=`nova floating-ip-create | awk '/ None / { print $2 }'`
+        fip=`nova floating-ip-create public | awk '/ None / { print $2 }'`
         echo $fip
         fips="$fips $fip"
     done
@@ -24,9 +23,10 @@
 
 function randomly_assign_fips {
     newline=$'\n'
-    shuffled_vm_ids=`echo $vm_ids | replace ' ' "$newline" | sort -r`
-    for vm_id in $shuffled_vm_ids; do
-        nova add-floating-ip "$vm_id" "$fip"
+    shuffled_vm_ids=(`echo $vm_ids | replace ' ' "$newline" | sort -R`)
+    fips_arr=($fips)
+    for ((i=0;i<$vm_count;i++));do
+        nova add-floating-ip "${shuffled_vm_ids[${i}]}" "${fips_arr[${i}]}"
     done
 }
 
@@ -34,6 +34,7 @@
     for fip in $fips; do
        nova floating-ip-delete "$fip"
     done
+    fips=""
 }

I have run this many, many times now and have not been able to reproduce it (against openstack-quantum-2013.1.4-3.el6ost.noarch) installed via packstack --allinone.

My output after running:

[root@rhel-6 ~(keystone_demo)]# ip netns exec qrouter-44384dbc-81dc-44e3-8c30-7ed6301b1873 iptables -tnat -S | grep quantum-l3-agent-OUTPUT
-N quantum-l3-agent-OUTPUT
-A OUTPUT -j quantum-l3-agent-OUTPUT
-A quantum-l3-agent-OUTPUT -d 172.24.4.232/32 -j DNAT --to-destination 10.0.0.4
-A quantum-l3-agent-OUTPUT -d 172.24.4.233/32 -j DNAT --to-destination 10.0.0.2
-A quantum-l3-agent-OUTPUT -d 172.24.4.234/32 -j DNAT --to-destination 10.0.0.7
-A quantum-l3-agent-OUTPUT -d 172.24.4.235/32 -j DNAT --to-destination 10.0.0.6
-A quantum-l3-agent-OUTPUT -d 172.24.4.236/32 -j DNAT --to-destination 10.0.0.5

jhenner: I notice that the example shown uses iptables -tnat -S to get the list of iptables rules without an ip netns. Do you have network namespace support disabled, or did you leave that out in the name of brevity? I'm pretty sure we only support using network namespaces now. Also, can you see if you can replicate this with version openstack-quantum-2013.1.4-3.el6ost.noarch on your setup?

--- Additional comment from Pavel Sedlák on 2013-11-20 16:12:36 CET ---

Maybe this could be related to (already fixed for Havana) https://bugzilla.redhat.com/show_bug.cgi?id=971518 - https://review.openstack.org/#/c/33254/ ?

--- Additional comment from Terry Wilson on 2013-11-20 19:35:06 CET ---

Looking at the attached launchpad bug https://bugs.launchpad.net/neutron/+bug/1191768, it looks like someone reported afterwards that they were still seeing the issue.

--- Additional comment from Jaroslav Henner on 2013-12-05 14:46:42 CET ---

(In reply to Terry Wilson from comment #6)
> I had to modify the test script somewhat to get it to run on my setup.
> [...]
> jhenner: I notice that the example shown uses iptables -tnat -S to get the
> list of iptables rules without an ip netns. Do you have network namespace
> support disabled, or did you leave that out in the name of brevity? I'm
> pretty sure we only support using network namespaces now. Also, can you see
> if you can replicate this with version
> openstack-quantum-2013.1.4-3.el6ost.noarch on your setup?

We _are_ using namespaces. The namespace selection was simply not copy-pasted here, which is why it looks like it wasn't used. I cannot reproduce this with openstack-quantum-2013.1.4-3.el6ost.noarch.

--- Additional comment from Attila Darazs on 2013-12-05 18:15:39 CET ---

I experienced the issue the other day with python-neutron-2013.2-9.el6ost.noarch too, so I'm cloning the bug to RHOS 4.0.
The version of neutron is of course openstack-neutron-2013.2-9.el6ost.noarch. We use namespaces; the duplicate NAT rule was in the namespace of the relevant router. I couldn't find a reliable way to reproduce it, but it happens often when the system is allocating and assigning floating IPs more heavily. One way it could be checked is to create several tiny instances, repeatedly allocate and associate floating IPs with them, and meanwhile watch the iptables nat table in the relevant router's namespace.
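A minimal sketch of such a watch loop (not from the original report; it assumes the usual qrouter-<router-uuid> namespace naming and that duplicates show up in the *-l3-agent-OUTPUT chain, quantum- or neutron- prefixed, as in the dumps above):

#!/bin/bash
# Hypothetical watcher: poll the router's namespace and print any floating IP
# that has more than one DNAT rule in the l3-agent OUTPUT chain.
ROUTER_ID=${1:?usage: $0 <router-uuid>}
while true; do
    ip netns exec "qrouter-${ROUTER_ID}" iptables -t nat -S \
        | awk '/l3-agent-OUTPUT/ && /-j DNAT/ { for (i = 1; i < NF; i++) if ($i == "-d") print $(i+1) }' \
        | sort | uniq -d
    sleep 5
done

Any address printed by the loop is a floating IP with more than one DNAT rule, i.e. the duplication described in this bug.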
I'll keep trying to find a way to reproduce. On the other 3.0.z issue, the last comment from the reporter was that they couldn't reproduce it on 3.0 anymore. So I'll switch to 4.0 and see if I can hammer on it until it happens. Until then, it would be helpful if someone could post actual logs from when this happens for them (L3 agent logs, API logs, qpid logs, and also the packstack answer file so I can see whether this is multi-host, etc.), especially since I've run the above script for 24 hours solid without hitting the issue.
I have just hit it on Grizzly with openstack-quantum-2013.1.4-3.el6ost.noarch:

-A quantum-l3-agent-PREROUTING -d 10.34.68.207/32 -j DNAT --to-destination 172.16.0.13
-A quantum-l3-agent-PREROUTING -d 10.34.68.207/32 -j DNAT --to-destination 172.16.0.15
-A quantum-l3-agent-OUTPUT -d 10.34.68.207/32 -j DNAT --to-destination 172.16.0.13
-A quantum-l3-agent-OUTPUT -d 10.34.68.207/32 -j DNAT --to-destination 172.16.0.15
-A quantum-l3-agent-float-snat -s 172.16.0.13/32 -j SNAT --to-source 10.34.68.207
-A quantum-l3-agent-float-snat -s 172.16.0.15/32 -j SNAT --to-source 10.34.68.207

I don't know why I was unable to reproduce it with the script above.
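In case it helps catch it elsewhere, a hypothetical sweep over every qrouter namespace on the network node; it flags floating IPs that appear more than once either as a DNAT destination or as a float-snat --to-source, which is the pattern shown above (the chain-name matching is an assumption based on the quantum/neutron naming in these dumps):

# Hypothetical sweep, not part of the original report.
for ns in $(ip netns list | awk '/^qrouter-/ { print $1 }'); do
    rules=$(ip netns exec "$ns" iptables -t nat -S)
    dnat_dups=$(echo "$rules" \
        | awk '/l3-agent-OUTPUT/ && /-j DNAT/ { for (i = 1; i < NF; i++) if ($i == "-d") print $(i+1) }' \
        | sort | uniq -d)
    snat_dups=$(echo "$rules" \
        | awk '/float-snat/ && /-j SNAT/ { for (i = 1; i < NF; i++) if ($i == "--to-source") print $(i+1) }' \
        | sort | uniq -d)
    [ -n "$dnat_dups$snat_dups" ] && echo "$ns: duplicated NAT entries: $dnat_dups $snat_dups"
done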
I believe this was fixed by https://github.com/openstack/neutron/commit/a65188fab01f29d095031abbc8d1d194548cd8be#diff-0b4d77c924888b648beb73c622bf5869, which we pulled in with the latest rebase. I can no longer reproduce the bug with my test script.
Verified on openstack-neutron-2013.2.1-2.el6ost.noarch.

I created 10 VMs, created new floating IPs, assigned them, and checked iptables; later I disassociated and deleted the floating IPs and repeated the steps. No duplicates appeared.
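A rough sketch of that kind of verification loop (not the exact commands used; it assumes the VMs were booted with the floating_bug name from the test script above, a floating IP pool called "public", and the nova client syntax of that era):

#!/bin/bash
# Hypothetical verification loop: allocate/associate floating IPs, check the
# router namespace for duplicate DNAT rules, disassociate, delete, repeat.
ROUTER_ID=${1:?usage: $0 <router-uuid>}
PAIRS=$(mktemp)
for run in $(seq 1 10); do
    : > "$PAIRS"
    # Allocate one floating IP per floating_bug VM and associate it.
    for vm in $(nova list | awk '/floating_bug/ { print $2 }'); do
        fip=$(nova floating-ip-create public | awk '/ None / { print $2 }')
        nova add-floating-ip "$vm" "$fip"
        echo "$vm $fip" >> "$PAIRS"
    done
    sleep 10
    # Any floating IP printed here has more than one DNAT rule -> the bug.
    ip netns exec "qrouter-${ROUTER_ID}" iptables -t nat -S \
        | awk '/l3-agent-OUTPUT/ && /-j DNAT/ { for (i = 1; i < NF; i++) if ($i == "-d") print $(i+1) }' \
        | sort | uniq -d
    # Disassociate and delete everything, then repeat.
    while read -r vm fip; do
        nova remove-floating-ip "$vm" "$fip"
        nova floating-ip-delete "$fip"
    done < "$PAIRS"
done
rm -f "$PAIRS"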
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2014-0091.html