Bug 877792

Summary: LinuxBridge plugin doesn't preserve the order of routing table after rebooting the network node.
Product: Red Hat OpenStack
Reporter: Etsuji Nakai <enakai>
Component: openstack-quantum
Assignee: Gary Kotton <gkotton>
Status: CLOSED ERRATA
QA Contact: Nir Magnezi <nmagnezi>
Severity: unspecified
Priority: unspecified
Version: 2.0 (Folsom)
CC: apevec, maurizio.antillon
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-quantum-2012.2.1-1.el6ost
Doc Type: Bug Fix
Last Closed: 2012-12-11 13:08:23 UTC
Type: Bug
Attachments:
Description                                               Flags
part1 of patches for routing table order.                 none
part2 of patches for routing table order.                 none
Replacement: part2 of patches for routing table order.    none
sosreport from network-node                               none

Description Etsuji Nakai 2012-11-18 20:39:38 UTC
Description of problem:
With Quantum's LinuxBridge plugin, the order of routing table entries is not preserved after rebooting the network node. This causes a problem with dnsmasq when the virtual network is attached to a virtual router.


Version-Release number of selected component (if applicable):
$ rpm -qa | grep quantum
openstack-quantum-linuxbridge-2012.2-2.1.el6.noarch
python-quantumclient-2.1.1-0.el6.noarch
python-quantum-2012.2-2.1.el6.noarch
openstack-quantum-2012.2-2.1.el6.noarch


How reproducible:

The setup is one network node and one compute node connected through a VLAN switch.
Use Quantum's LinuxBridge plugin. Network namespace is disabled.

1. Create a VLAN network and attach it to a virtual router. (router01 is assumed to be defined in advance.)

# quantum net-create --tenant-id XXXX net01 --provider:network_type vlan --provider:physical_network physnet1 --provider:segmentation_id 101
# quantum subnet-create --tenant-id XXXX --name subnet01 net01 192.168.101.0/24
# quantum router-interface-add router01 subnet01

2. Check the routing table of the network node.

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 ns-416260ac-5f
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 qr-55dcd739-f8
...

The dnsmasq port (ns-416260ac-5f) comes before the router port (qr-55dcd739-f8). DHCP works well with this ordering.

3. Reboot the network node and check the routing table again.

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 qr-55dcd739-f8
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 ns-416260ac-5f
...

Now the router port (qr-55dcd739-f8) is at the top of the table. In this state, the BOOTP reply from dnsmasq does not reach the compute node. I have to remove the first entry by hand for dnsmasq to work.


Expected results:
The dnsmasq port (ns-416260ac-5f) comes before the router port (qr-55dcd739-f8) in the routing table even after rebooting the network node.
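
The expected ordering can be checked mechanically. The following is a minimal sketch, not part of Quantum: the function name dhcp_port_first is made up for illustration, and it assumes the 8-column `route -n` layout shown above and the ns-*/qr-* interface naming used in this report.

```python
def dhcp_port_first(route_output):
    """Report, per destination, whether the ns-* route precedes the qr-* route.

    `route_output` is the text printed by `route -n`. Only destinations that
    have both an ns-* and a qr-* entry appear in the result.
    """
    order = {}  # (destination, genmask) -> interface names in table order
    for line in route_output.splitlines():
        fields = line.split()
        # Data rows have 8 columns and start with a dotted IPv4 address;
        # this skips the two header lines and any "..." filler.
        if len(fields) != 8 or fields[0].count(".") != 3:
            continue
        dest, genmask, iface = fields[0], fields[2], fields[7]
        order.setdefault((dest, genmask), []).append(iface)

    result = {}
    for (dest, _genmask), ifaces in order.items():
        ns = [i for i, name in enumerate(ifaces) if name.startswith("ns-")]
        qr = [i for i, name in enumerate(ifaces) if name.startswith("qr-")]
        if ns and qr:
            result[dest] = min(ns) < min(qr)
    return result
```

Feeding it the table from step 2 yields True for 192.168.101.0, and the table from step 3 yields False.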

Comment 2 Etsuji Nakai 2012-11-21 06:45:01 UTC
Created attachment 649079 [details]
part1 of patches for routing table order.

Comment 3 Etsuji Nakai 2012-11-21 06:45:30 UTC
Created attachment 649081 [details]
part2 of patches for routing table order.

Comment 4 Etsuji Nakai 2012-11-21 06:49:47 UTC
Hi, I did some investigation on this.

1. Why the routing table order affects the dnsmasq.

I'm not sure, but it is probably due to how dnsmasq behaves. When another interface (sharing the same subnet) comes before dnsmasq's binding interface in the routing table, dnsmasq fails to send back DHCPOFFER packets.

2. Why the routing table order is not preserved after reboot.

It's due to the order of agent start-ups. Even if the dhcp-agent starts before the l3-agent, it takes some time until dnsmasq's binding interface is created. By that time, the l3-agent has already created a router interface, and its entry is at the top of the routing table.

As a workaround, I added patches to the dhcp-agent (quantum/agent/linux/dhcp.py and quantum/agent/linux/ip_lib.py) which reshuffle the routing table order once dnsmasq has started. Please take a look at the patches attached in Comment 2 and 3.
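
To illustrate the idea of reshuffling, here is a rough sketch of one possible approach. It is not the attached patch. It assumes that deleting a route with `ip route del` and re-adding it with `ip route append` places it after the other routes for the same subnet. The function name reorder_commands and its input format are made up for illustration. It only builds the command lines and does not run them; a real implementation would likely also need to preserve route attributes such as the source address.

```python
def reorder_commands(routes, dhcp_device):
    """Build `ip` commands that move other devices' routes below the DHCP port.

    `routes` is a list of (subnet_cidr, device) tuples in current table order.
    Returns a list of argv lists; nothing is executed here.
    """
    # Subnets that the DHCP port has a route for.
    dhcp_subnets = {subnet for subnet, dev in routes if dev == dhcp_device}

    commands = []
    dhcp_route_seen = set()
    for subnet, dev in routes:
        if dev == dhcp_device:
            dhcp_route_seen.add(subnet)
        elif subnet in dhcp_subnets and subnet not in dhcp_route_seen:
            # This route sits above the DHCP port's route for the same
            # subnet: delete it and append it again so it ends up below.
            commands.append(["ip", "route", "del", subnet, "dev", dev])
            commands.append(["ip", "route", "append", subnet, "dev", dev])
    return commands
```

For the post-reboot table in the description, with routes given as [("192.168.101.0/24", "qr-55dcd739-f8"), ("192.168.101.0/24", "ns-416260ac-5f")], this produces one del/append pair for qr-55dcd739-f8 and nothing when the order is already correct.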

I'm not sure this is the right way to resolve the problem. At least, I tested it with the following case:

$ quantum net-list
+--------------------------------------+----------+--------------------------------------+
| id                                   | name     | subnets                              |
+--------------------------------------+----------+--------------------------------------+
| 313c7d4e-f649-41c0-bfd3-738af79b4f14 | net01    | dd9dd15c-8d99-47f2-9135-014982f8ef95 |
| 634356a4-ce08-44f2-98da-812197f8bca9 | public01 | 13432bbd-cfb6-4c72-9be9-d8bf93f31227 |
| f9371645-638d-41e6-a8d1-a78067db68de | net02    | 6cbe87c0-1a47-4d18-87da-aff7258903ec |
|                                      |          | ab36549c-086a-45c9-a353-df0c2cd09359 |
+--------------------------------------+----------+--------------------------------------+

net01 has one subnet (192.168.101.0/24). net02 has two subnets (192.168.102.0/24 and 192.168.103.0/24).

After reboot:
# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 ns-416260ac-5f
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 qr-55dcd739-f8
192.168.102.0   0.0.0.0         255.255.255.0   U     0      0        0 ns-329eaa5e-5b
192.168.102.0   0.0.0.0         255.255.255.0   U     0      0        0 qr-346a9ee4-d7
192.168.103.0   0.0.0.0         255.255.255.0   U     0      0        0 ns-329eaa5e-5b
192.168.103.0   0.0.0.0         255.255.255.0   U     0      0        0 qr-a49aa99c-ab

ns-* comes before qr-* for each subnet.

Comment 5 Etsuji Nakai 2012-11-21 10:46:46 UTC
Created attachment 649138 [details]
Replacement: part2 of patches for routing table order.

Added exception handling for dhcp-disabled interfaces.

Comment 6 Gary Kotton 2012-11-22 09:21:21 UTC
Hi,
Thanks for all of the useful information. Are you using RHEL or Fedora? I followed your description:

[openstack@openstack devstack]$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.35.70.254    0.0.0.0         UG    0      0        0 eth0
10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 ns-377c1e0b-71
10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 qr-2d8fd41e-9e
10.35.70.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.122.0   0.0.0.0         255.255.255.0   U     0      0        0 virbr0

The problem does not reproduce for me, either when the qr routing entry is before the ns entry (the l3 agent configures its tap device prior to the dhcp agent) or when it is after the ns entry (the dhcp agent configures its tap device prior to the l3 agent).

In both cases when I deploy a VM it receives the IP address from the DHCP agent.

I think that it is important for us to understand which dnsmasq version is running. This may provide a clue.

Thanks
Gary

Comment 7 Etsuji Nakai 2012-11-22 10:35:45 UTC
Created attachment 649670 [details]
sosreport from network-node

Comment 8 Etsuji Nakai 2012-11-22 10:40:28 UTC
Hi Gary,

I'm using RHEL6.3.

[root@opst01 ~]# rpm -q dnsmasq
dnsmasq-2.48-6.el6.x86_64

[root@opst01 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.3 (Santiago)

[root@opst01 ~]# uname -a
Linux opst01 2.6.32-279.14.1.el6.x86_64 #1 SMP Mon Oct 15 13:44:51 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

Please see the attached sosreport from the network node (Comment 7) for more details.

If you need additional information (specific to dnsmasq or anything else), pls let me know.

Thanks.

Comment 9 Gary Kotton 2012-11-22 11:39:57 UTC
Thanks for the info. Let me dig around and I'll get back to you.
A few questions:
1. How many networks do you have defined?
2. Did you define the external network to have dhcp disabled?
3. Does the problem reproduce if there is only one network?
Thanks
Gary

Comment 10 Etsuji Nakai 2012-11-22 15:58:24 UTC
Hi Gary,

Here are the answers to your questions and the result of reproduction testing in my setup.

>1. How many networks do you have defined?
One external(public) network and one private network.

>2. Did you define the external network to have dhcp disabled?
Yes. It's disabled.

>3. Does the problem reproduce if there is only one network?
Yes, I succeeded in reproducing the problem with a simpler setup: one private network (net01) attached to a virtual router (router01), and no external network. Please see the following log.

1. Starting from a router "router01" without any networks.
# quantum router-list
+--------------------------------------+----------+-----------------------+
| id                                   | name     | external_gateway_info |
+--------------------------------------+----------+-----------------------+
| ccf3a044-56e9-40c3-85bf-eacc059606d6 | router01 | null                  |
+--------------------------------------+----------+-----------------------+
# grep ccf3a044-56e9-40c3-85bf-eacc059606d6 /etc/quantum/l3_agent.ini 
router_id = ccf3a044-56e9-40c3-85bf-eacc059606d6

2. Create private network "net01" and its subnet "subnet01".
# tenant=$(keystone tenant-list|awk '/redhat/ {print $2}')
# quantum net-create --tenant-id $tenant  net01 --provider:network_type vlan --provider:physical_network physnet2 --provider:segmentation_id 101
Created a new network:
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| id                        | 89f6f5d9-f642-4861-834b-9a5adf153d9a |
| name                      | net01                                |
| provider:network_type     | vlan                                 |
| provider:physical_network | physnet2                             |
| provider:segmentation_id  | 101                                  |
| router:external           | False                                |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   |                                      |
| tenant_id                 | 5e308a4f4a73488d9facbc3fb23c7d38     |
+---------------------------+--------------------------------------+

# quantum subnet-create --tenant-id $tenant --name subnet01 net01 192.168.101.0/24
Created a new subnet:
+------------------+------------------------------------------------------+
| Field            | Value                                                |
+------------------+------------------------------------------------------+
| allocation_pools | {"start": "192.168.101.2", "end": "192.168.101.254"} |
| cidr             | 192.168.101.0/24                                     |
| dns_nameservers  |                                                      |
| enable_dhcp      | True                                                 |
| gateway_ip       | 192.168.101.1                                        |
| host_routes      |                                                      |
| id               | f990bd89-2190-459d-a496-2624585d4193                 |
| ip_version       | 4                                                    |
| name             | subnet01                                             |
| network_id       | 89f6f5d9-f642-4861-834b-9a5adf153d9a                 |
| tenant_id        | 5e308a4f4a73488d9facbc3fb23c7d38                     |
+------------------+------------------------------------------------------+

3. Attach the subnet to the router.
# quantum router-interface-add router01 subnet01
Added interface to router router01

4. Now the routing table and bridge are as below:
# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 ns-fd37d7a5-97
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 qr-77db9df0-43
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 em2
10.64.200.0     0.0.0.0         255.255.254.0   U     0      0        0 em1
169.254.0.0     0.0.0.0         255.255.0.0     U     1007   0        0 em2
0.0.0.0         10.64.201.254   0.0.0.0         UG    99     0        0 em1

# brctl show
bridge name	bridge id		STP enabled	interfaces
brq89f6f5d9-f6		8000.e89a8fbe1f79	no		em2.101
							tap77db9df0-43
							tapfd37d7a5-97

5. Reboot the node and check the routing table again.
# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 qr-77db9df0-43
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 ns-fd37d7a5-97
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 em2
10.64.200.0     0.0.0.0         255.255.254.0   U     0      0        0 em1
169.254.0.0     0.0.0.0         255.255.0.0     U     1006   0        0 em1
169.254.0.0     0.0.0.0         255.255.0.0     U     1007   0        0 em2
0.0.0.0         10.64.201.254   0.0.0.0         UG    0      0        0 em1

The router port (qr-*) is above dnsmasq's binding port (ns-*).

6. Launch a VM attached to net01.

In /var/log/messages, dnsmasq says it's replying with DHCPOFFER.

Nov 22 23:15:39 opst01 dnsmasq-dhcp[2964]: DHCPDISCOVER(ns-fd37d7a5-97) fa:16:3e:95:66:bb 
Nov 22 23:15:39 opst01 dnsmasq-dhcp[2964]: DHCPOFFER(ns-fd37d7a5-97) 192.168.101.3 fa:16:3e:95:66:bb 
Nov 22 23:15:47 opst01 dnsmasq-dhcp[2964]: DHCPDISCOVER(ns-fd37d7a5-97) fa:16:3e:95:66:bb 
Nov 22 23:15:47 opst01 dnsmasq-dhcp[2964]: DHCPOFFER(ns-fd37d7a5-97) 192.168.101.3 fa:16:3e:95:66:bb 

But tcpdump on the physical port (for private network) records no reply from dnsmasq.

# tcpdump -nlSi em2
tcpdump: WARNING: em2: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em2, link-type EN10MB (Ethernet), capture size 65535 bytes
23:15:26.192047 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:95:66:bb, length 300
23:15:26.193082 ARP, Request who-has 192.168.101.3 tell 192.168.101.1, length 28
23:15:27.193081 ARP, Request who-has 192.168.101.3 tell 192.168.101.1, length 28
23:15:28.193027 ARP, Request who-has 192.168.101.3 tell 192.168.101.1, length 28
23:15:33.141672 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:95:66:bb, length 300
23:15:33.142025 ARP, Request who-has 192.168.101.3 tell 192.168.101.1, length 28
23:15:34.142083 ARP, Request who-has 192.168.101.3 tell 192.168.101.1, length 28
23:15:35.142090 ARP, Request who-has 192.168.101.3 tell 192.168.101.1, length 28

7. Remove the routing table entry for qr-* by hand.
# ip route del 192.168.101.0/24 dev qr-77db9df0-43

Now tcpdump records the actual reply packets.

23:16:51.373285 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:95:66:bb, length 300
23:16:51.373504 IP 192.168.101.2.bootps > 192.168.101.3.bootpc: BOOTP/DHCP, Reply, length 323
23:16:51.374385 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:95:66:bb, length 300
23:16:51.374600 IP 192.168.101.2.bootps > 192.168.101.3.bootpc: BOOTP/DHCP, Reply, length 323
23:16:56.374078 ARP, Request who-has 192.168.101.3 tell 192.168.101.2, length 28
23:16:56.374509 ARP, Reply 192.168.101.3 is-at fa:16:3e:95:66:bb, length 42

This matches /var/log/messages.

Nov 22 23:16:51 opst01 dnsmasq-dhcp[2964]: DHCPDISCOVER(ns-fd37d7a5-97) fa:16:3e:95:66:bb 
Nov 22 23:16:51 opst01 dnsmasq-dhcp[2964]: DHCPOFFER(ns-fd37d7a5-97) 192.168.101.3 fa:16:3e:95:66:bb 
Nov 22 23:16:51 opst01 dnsmasq-dhcp[2964]: DHCPREQUEST(ns-fd37d7a5-97) 192.168.101.3 fa:16:3e:95:66:bb 
Nov 22 23:16:51 opst01 dnsmasq-dhcp[2964]: DHCPACK(ns-fd37d7a5-97) 192.168.101.3 fa:16:3e:95:66:bb 192-168-101-3

And the VM has been successfully assigned an IP from dnsmasq.
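
The difference between the two log excerpts above (DHCPOFFER repeated with no DHCPACK, versus a complete exchange) can be spotted with a small script. This is a minimal sketch that assumes the dnsmasq-dhcp syslog line format shown in this comment; the names handshake_state and stalled_clients are made up for illustration. It reflects only what dnsmasq logged, not what actually reached the wire.

```python
import re

# Matches e.g. "dnsmasq-dhcp[2964]: DHCPOFFER(ns-fd37d7a5-97) 192.168.101.3 fa:16:3e:95:66:bb"
# The IP address is optional because DHCPDISCOVER lines carry only the MAC.
DHCP_LINE = re.compile(
    r"dnsmasq-dhcp\[\d+\]: (DHCP[A-Z]+)\((\S+)\) "
    r"(?:(\d+\.\d+\.\d+\.\d+) )?([0-9a-f:]{17})"
)


def handshake_state(log_text):
    """Map each client MAC to the set of DHCP message types logged for it."""
    seen = {}
    for line in log_text.splitlines():
        match = DHCP_LINE.search(line)
        if match:
            message, _iface, _ip, mac = match.groups()
            seen.setdefault(mac, set()).add(message)
    return seen


def stalled_clients(log_text):
    """Return MACs that were offered an address but never got a DHCPACK."""
    return sorted(
        mac
        for mac, messages in handshake_state(log_text).items()
        if "DHCPOFFER" in messages and "DHCPACK" not in messages
    )
```

Run against the 23:15 excerpt, stalled_clients reports fa:16:3e:95:66:bb; against the 23:16 excerpt it reports nothing.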

Comment 11 Etsuji Nakai 2012-11-22 16:13:04 UTC
One thing to add. As RHEL6.3's iproute2 doesn't support network namespaces, use_namespaces is disabled in my setup.

/etc/quantum/dhcp_agent.ini:use_namespaces = False
/etc/quantum/l3_agent.ini:use_namespaces = False
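
For anyone checking the same setting on their own node, here is a minimal sketch using Python's standard configparser. It assumes the option lives in the [DEFAULT] section of the agent ini file and that namespaces are treated as enabled when the option is absent; both are assumptions, not taken from this report. The function name namespaces_enabled is made up for illustration.

```python
import configparser


def namespaces_enabled(ini_text):
    """Return the use_namespaces value found in the given ini file contents."""
    # interpolation=None keeps any literal "%" in other options from
    # being treated as configparser interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # Assumption: the option is in [DEFAULT]; fallback=True is a guess at
    # the behavior when the option is not set at all.
    return parser.getboolean("DEFAULT", "use_namespaces", fallback=True)
```

With the contents "[DEFAULT]\nuse_namespaces = False\n" this returns False, which corresponds to the configuration described here.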

Comment 12 Gary Kotton 2012-11-26 13:38:05 UTC
Hi,
Thanks for the information. I am currently investigating.
I wonder if this also affects the nova flat networking. It also uses the dnsmasq in the same way that Quantum uses it.
It would be best to find the actual problem and then correct it at the source.
Thanks
Gary

Comment 13 Gary Kotton 2012-11-26 14:00:14 UTC
Hi,
I was looking in the nova code and came upon the following comment:

    # NOTE(vish): The ip for dnsmasq has to be the first address on the
    #             bridge for it to respond to reqests properly

The code then proceeds to ensure that the IP address of the dnsmasq is the first added.
I'll take care of this upstream.
Thanks
Gary

Comment 14 Gary Kotton 2012-11-26 17:12:43 UTC
Upstream patch based on Etsuji Nakai's code :)
(https://review.openstack.org/#/c/16907/)

Comment 16 Gary Kotton 2012-12-05 15:15:56 UTC
Please see comment #10. This has all of the information required.
Etsuji Nakai did brilliant work here isolating, describing and even fixing the problem.
Kudos

Comment 20 errata-xmlrpc 2012-12-11 13:08:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-1561.html