Bug 1022980 - quantum is creating dup NAT rules when under stress
Status: CLOSED NEXTRELEASE
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-quantum
Version: 3.0
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Target Milestone: z4
Target Release: 3.0
Assigned To: Terry Wilson
QA Contact: Ofer Blaut
Whiteboard: network
Keywords: TestBlocker, ZStream
Reported: 2013-10-24 07:53 EDT by Jaroslav Henner
Modified: 2016-04-26 17:43 EDT
Doc Type: Bug Fix
Clones: 1038737
Last Closed: 2014-01-22 09:59:25 EST
Type: Bug


Attachments:
just a dirty stressing thingy in bash (927 bytes, application/x-shellscript), 2013-10-24 07:53 EDT, Jaroslav Henner

Description Jaroslav Henner 2013-10-24 07:53:55 EDT
Created attachment 815744
just a dirty stressing thingy in bash

Description of problem:
If a bunch of floating IPs are created and assigned, then deleted, recreated, and reassigned, quantum often creates duplicate NAT rules. In some cases the same floating IP is mapped to two fixed IPs:

    -A quantum-l3-agent-OUTPUT -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.11
    -A quantum-l3-agent-OUTPUT -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.10

In others, two floating IPs are mapped to the same fixed IP:

    -A quantum-l3-agent-OUTPUT -d 10.1.2.207/32 -j DNAT --to-destination 172.16.0.11 
    -A quantum-l3-agent-OUTPUT -d 10.1.2.202/32 -j DNAT --to-destination 172.16.0.11 
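
A quick way to spot both kinds of conflict is to scan a dump of the NAT table for them. The following is only a sketch: nat_conflicts is a hypothetical helper name, not part of quantum, and it assumes rules in the "iptables -S" format shown above.

```shell
#!/bin/bash
# Hypothetical helper (not part of quantum): reads "iptables -t nat -S"
# output on stdin and reports, per chain,
#   - floating IPs that are DNATed to more than one fixed IP, and
#   - fixed IPs that are the DNAT target of more than one floating IP.
nat_conflicts() {
    awk '
        $1 == "-A" && /-j DNAT/ {
            chain = $2; fip = ""; fixed = ""
            for (i = 3; i < NF; i++) {
                if ($i == "-d") fip = $(i + 1)
                if ($i == "--to-destination") fixed = $(i + 1)
            }
            if (fip == "" || fixed == "") next
            sub(/\/32$/, "", fip)
            # Count each distinct (chain, floating IP, fixed IP) triple once.
            if ((chain, fip, fixed) in seen) next
            seen[chain, fip, fixed] = 1
            by_fip[chain " " fip] = by_fip[chain " " fip] " " fixed
            n_fip[chain " " fip]++
            by_fixed[chain " " fixed] = by_fixed[chain " " fixed] " " fip
            n_fixed[chain " " fixed]++
        }
        END {
            for (k in n_fip) if (n_fip[k] > 1) print "floating-ip", k, "->" by_fip[k]
            for (k in n_fixed) if (n_fixed[k] > 1) print "fixed-ip", k, "<-" by_fixed[k]
        }
    ' | LC_ALL=C sort
}
```

Usage, inside the router namespace (the router ID is a placeholder): ip netns exec qrouter-<router-id> iptables -t nat -S | nat_conflicts. No output means no conflicting DNAT rules were found. For the four rules quoted above, it reports 10.1.2.198 as mapped to two fixed IPs and 172.16.0.11 as the target of three floating IPs.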


Version-Release number of selected component (if applicable):
openstack-quantum-2013.1.3-1.el6ost.noarch

How reproducible:
99%

Steps to Reproduce:
1. Run the attached reproducer script.


Actual results:
Duplicate rules; VMs are unreachable. IIRC the error was "No route to host".


Expected results:
No duplicate rules; VMs are reachable.

Additional info:
Comment 1 Jaroslav Henner 2013-10-24 07:56:25 EDT
   iptables -tnat -S
    -P PREROUTING ACCEPT
    -P POSTROUTING ACCEPT
    -P OUTPUT ACCEPT
    -N quantum-l3-agent-OUTPUT
    -N quantum-l3-agent-POSTROUTING
    -N quantum-l3-agent-PREROUTING
    -N quantum-l3-agent-float-snat
    -N quantum-l3-agent-snat
    -N quantum-postrouting-bottom
    -A PREROUTING -j quantum-l3-agent-PREROUTING 
    -A POSTROUTING -j quantum-l3-agent-POSTROUTING 
    -A POSTROUTING -j quantum-postrouting-bottom 
    -A OUTPUT -j quantum-l3-agent-OUTPUT 
    -A quantum-l3-agent-OUTPUT -d 10.1.2.203/32 -j DNAT --to-destination 172.16.0.10 
    -A quantum-l3-agent-OUTPUT -d 10.1.2.207/32 -j DNAT --to-destination 172.16.0.11 
    -A quantum-l3-agent-OUTPUT -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.11 
    -A quantum-l3-agent-OUTPUT -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.10 
    -A quantum-l3-agent-OUTPUT -d 10.1.2.202/32 -j DNAT --to-destination 172.16.0.11 
    -A quantum-l3-agent-OUTPUT -d 10.1.2.196/32 -j DNAT --to-destination 172.16.0.12 
    -A quantum-l3-agent-OUTPUT -d 10.1.2.237/32 -j DNAT --to-destination 172.16.0.16 
    -A quantum-l3-agent-POSTROUTING ! -i qg-19f7ff97-cc ! -o qg-19f7ff97-cc -m conntrack ! --ctstate DNAT -j ACCEPT 
    -A quantum-l3-agent-PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697 
    -A quantum-l3-agent-PREROUTING -d 10.1.2.203/32 -j DNAT --to-destination 172.16.0.10 
    -A quantum-l3-agent-PREROUTING -d 10.1.2.207/32 -j DNAT --to-destination 172.16.0.11 
    -A quantum-l3-agent-PREROUTING -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.11 
    -A quantum-l3-agent-PREROUTING -d 10.1.2.198/32 -j DNAT --to-destination 172.16.0.10 
    -A quantum-l3-agent-PREROUTING -d 10.1.2.202/32 -j DNAT --to-destination 172.16.0.11 
    -A quantum-l3-agent-PREROUTING -d 10.1.2.196/32 -j DNAT --to-destination 172.16.0.12 
    -A quantum-l3-agent-PREROUTING -d 10.1.2.237/32 -j DNAT --to-destination 172.16.0.16 
    -A quantum-l3-agent-PREROUTING -d 172.16.0.1/32 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697 
    -A quantum-l3-agent-float-snat -s 172.16.0.10/32 -j SNAT --to-source 10.1.2.203 
    -A quantum-l3-agent-float-snat -s 172.16.0.11/32 -j SNAT --to-source 10.1.2.207 
    -A quantum-l3-agent-float-snat -s 172.16.0.11/32 -j SNAT --to-source 10.1.2.198 
    -A quantum-l3-agent-float-snat -s 172.16.0.10/32 -j SNAT --to-source 10.1.2.198 
    -A quantum-l3-agent-float-snat -s 172.16.0.11/32 -j SNAT --to-source 10.1.2.202 
    -A quantum-l3-agent-float-snat -s 172.16.0.12/32 -j SNAT --to-source 10.1.2.196 
    -A quantum-l3-agent-float-snat -s 172.16.0.16/32 -j SNAT --to-source 10.1.2.237 
    -A quantum-l3-agent-snat -j quantum-l3-agent-float-snat 
    -A quantum-l3-agent-snat -s 172.16.0.0/16 -j SNAT --to-source 10.1.2.204 
    -A quantum-postrouting-bottom -j quantum-l3-agent-snat
Comment 4 Jaroslav Henner 2013-11-05 06:12:51 EST
After restarting the L3 agent, all the NAT rules disappear, and only the correct ones seem to reappear a few seconds later. I therefore think the rules are stored correctly in the database, but the L3 agent either misses some delete events or fails to delete the old rules.
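
One way to check this is to save the NAT rules to a file before and after the agent restart and list the rules that did not come back. This is a minimal sketch: stale_rules and the snapshot file names are hypothetical, and the snapshots are assumed to be plain "iptables -t nat -S" dumps.

```shell
#!/bin/bash
# Hypothetical helper: prints the rules that appear in the "before" snapshot
# but not in the "after" snapshot, i.e. rules the restarted agent did not
# recreate. $1 = snapshot taken before the restart, $2 = snapshot taken after.
stale_rules() {
    awk '
        { sub(/[ \t]+$/, "") }                       # ignore trailing whitespace
        FILENAME == ARGV[1] { after[$0] = 1; next }  # remember the "after" rules
        !($0 in after)                               # print rules that vanished
    ' "$2" "$1"
}
```

Usage: dump the rules to before.txt, restart the L3 agent, wait for the rules to reappear, dump them to after.txt, then run stale_rules before.txt after.txt. Any line printed is a rule that existed only before the restart.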
Comment 6 Terry Wilson 2013-11-13 18:06:16 EST
I had to modify the test script somewhat to get it to run on my setup. Diff:

--- stress	2013-08-06 02:32:51.377999955 -0500
+++ stress_terry	2013-08-06 02:32:47.614999956 -0500
@@ -1,7 +1,6 @@
 #!/bin/bash -x
 vm_count=5
-nova_boot_cmdline="--flavor=m1.tiny --image=d09e66b8-6ddb-468c-912e-a7acd34a8d32 floating_bug"
-floatingip_create_cmdline="notrouted-shared"
+nova_boot_cmdline="--flavor=m1.tiny --image=cirros floating_bug"
 
 vm_ids=""
 fips=""
@@ -16,7 +15,7 @@
 
 function create_assing_fips {
 	for vm_id in $vm_ids; do
-		fip=`nova floating-ip-create | awk '/ None / { print $2 }'`
+		fip=`nova floating-ip-create public | awk '/ None / { print $2 }'`
 		echo $fip
 		fips="$fips $fip"
 	done
@@ -24,9 +23,10 @@
 
 function randomly_assign_fips {
 	newline=$'\n'
-	shuffled_vm_ids=`echo $vm_ids | replace ' ' "$newline" | sort -r`
-	for vm_id in $shuffled_vm_ids; do
-		nova add-floating-ip "$vm_id" "$fip"
+	shuffled_vm_ids=(`echo $vm_ids | replace ' ' "$newline" | sort -R`)
+	fips_arr=($fips)
+	for ((i=0;i<$vm_count;i++));do 
+		nova add-floating-ip "${shuffled_vm_ids[${i}]}" "${fips_arr[${i}]}"
 	done
 }
 
@@ -34,6 +34,7 @@
 	for fip in $fips; do
 		nova floating-ip-delete "$fip"
 	done
+	fips=""
 }
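
The change to randomly_assign_fips in the diff above is more than cosmetic. "sort -r" only reverses the list, while "sort -R" shuffles it. As far as the diff context shows, the original loop also passed "$fip" without assigning it inside the function, so every VM would get whatever value an earlier loop left in that variable. The pairing logic of the new loop can be sketched on its own as follows; pair_randomly is a hypothetical name, and the sketch only prints the pairs rather than calling nova.

```shell
#!/bin/bash
# Hypothetical helper: pairs each VM ID with a distinct floating IP, taking
# the VM IDs in random order. Prints one "vm_id floating_ip" pair per line.
# $1 = space-separated VM IDs, $2 = space-separated floating IPs.
pair_randomly() {
    remaining_fips=$2
    # Unquoted $1 is intentional: split the list into one ID per line, then
    # shuffle it with sort -R (as the modified script does).
    for vm in $(printf '%s\n' $1 | sort -R); do
        set -- $remaining_fips          # load the unused floating IPs
        [ $# -gt 0 ] || break           # stop if we run out of floating IPs
        printf '%s %s\n' "$vm" "$1"     # pair this VM with the next unused IP
        shift
        remaining_fips=$*
    done
}
```

Usage: pair_randomly "$vm_ids" "$fips" | while read -r vm fip; do nova add-floating-ip "$vm" "$fip"; done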


I have run this many, many times now and have not been able to reproduce it against openstack-quantum-2013.1.4-3.el6ost.noarch, installed via packstack --allinone.
My output after running:

[root@rhel-6 ~(keystone_demo)]# ip netns exec qrouter-44384dbc-81dc-44e3-8c30-7ed6301b1873 iptables -tnat -S|grep quantum-l3-agent-OUTPUT
-N quantum-l3-agent-OUTPUT
-A OUTPUT -j quantum-l3-agent-OUTPUT 
-A quantum-l3-agent-OUTPUT -d 172.24.4.232/32 -j DNAT --to-destination 10.0.0.4 
-A quantum-l3-agent-OUTPUT -d 172.24.4.233/32 -j DNAT --to-destination 10.0.0.2 
-A quantum-l3-agent-OUTPUT -d 172.24.4.234/32 -j DNAT --to-destination 10.0.0.7 
-A quantum-l3-agent-OUTPUT -d 172.24.4.235/32 -j DNAT --to-destination 10.0.0.6 
-A quantum-l3-agent-OUTPUT -d 172.24.4.236/32 -j DNAT --to-destination 10.0.0.5 

jhenner: I notice that the example runs iptables -tnat -S to list the rules without ip netns exec. Do you have network namespace support disabled, or did you leave that out for the sake of brevity? I'm pretty sure we only support using network namespaces now. Also, can you check whether you can reproduce this with openstack-quantum-2013.1.4-3.el6ost.noarch on your setup?
Comment 7 Pavel Sedlák 2013-11-20 10:12:36 EST
Maybe this is related to https://bugzilla.redhat.com/show_bug.cgi?id=971518 (https://review.openstack.org/#/c/33254/), which is already fixed for Havana?
Comment 8 Terry Wilson 2013-11-20 13:35:06 EST
Looking at the attached Launchpad bug, https://bugs.launchpad.net/neutron/+bug/1191768, it looks like someone reported afterwards that they were still seeing the issue.
Comment 9 Jaroslav Henner 2013-12-05 08:46:42 EST
(In reply to Terry Wilson from comment #6)
> jhenner: I notice that the example shown shows iptables -tnat -S to get the
> list of iptables rules without an ip netns. Do you have network namespace
> support disabled, or did you leave that out in the name of brevity? I'm
> pretty sure we only support using network namespaces now. Also, can you see
> if you can replicate this with version
> openstack-quantum-2013.1.4-3.el6ost.noarch on your setup?

We _are_ using namespaces. The namespace selection was not copy-pasted here, so it looks like it wasn't used.
I cannot reproduce with openstack-quantum-2013.1.4-3.el6ost.noarch.
Comment 10 Attila Darazs 2013-12-05 12:15:39 EST
I experienced the issue the other day with python-neutron-2013.2-9.el6ost.noarch too, so I'm cloning the bug to RHOS 4.0.
Comment 11 Jaroslav Henner 2013-12-09 11:36:29 EST
Created attachment 834390
server.log

I have just hit it on the Grizzly openstack-quantum-2013.1.4-3.el6ost.noarch:

    -A quantum-l3-agent-PREROUTING -d 10.34.68.207/32 -j DNAT --to-destination 172.16.0.13 
    -A quantum-l3-agent-PREROUTING -d 10.34.68.207/32 -j DNAT --to-destination 172.16.0.15 

    -A quantum-l3-agent-OUTPUT -d 10.34.68.207/32 -j DNAT --to-destination 172.16.0.13 
    -A quantum-l3-agent-OUTPUT -d 10.34.68.207/32 -j DNAT --to-destination 172.16.0.15 

    -A quantum-l3-agent-float-snat -s 172.16.0.13/32 -j SNAT --to-source 10.34.68.207 
    -A quantum-l3-agent-float-snat -s 172.16.0.15/32 -j SNAT --to-source 10.34.68.207 

I don't know why I was unable to reproduce it with the script I posted earlier.
Comment 12 lpeer 2014-01-22 09:59:25 EST
After discussing with jhenner, we agreed to close this bug: many race-related issues are being handled in Icehouse, and we have no customers using Grizzly with Neutron.
If the issue can be reproduced in Havana or Icehouse, we'll report a new bug with the relevant logs.
