Bug 1271777 - router gateway is not reachable after tenant router reschedules to a new L3 agent
Summary: router gateway is not reachable after tenant router reschedules to a new L3 agent
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: z5
Target Release: 7.0 (Kilo)
Assignee: lpeer
QA Contact: Ofer Blaut
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-10-14 16:39 UTC by bigswitch
Modified: 2023-09-14 03:06 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-04 16:18:14 UTC
Target Upstream Version:
Embargoed:



Description bigswitch 2015-10-14 16:39:39 UTC
Description of problem:
After the tenant router is rescheduled to a new L3 agent, the router gateway is not reachable.

Version-Release number of selected component (if applicable):
RHOSP 7.1

How reproducible:


Steps to Reproduce:
1. Create a tenant router and a network, create a VM on the network, and make sure all are reachable.
2. Reboot the OpenStack controller/L3 agent that hosts the router.
3. The router gets rescheduled to a new L3 agent, and the router namespace is created on the new L3 agent.

Ping the gateway from the VM. ARP requests are received on the router namespace port, but no response is sent from it.

Actual results:


Expected results:


Additional info:

[root@overcloud-controller-0 heat-admin]# ip netns exec qrouter-7f7d64da-79ac-45cd-849a-835c6cb510be tcpdump -i qr-b76cd758-da
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on qr-b76cd758-da, link-type EN10MB (Ethernet), capture size 65535 bytes
17:41:03.149508 IP 10.17.0.1 > 224.0.0.5: OSPFv2, Hello, length 44
17:41:06.868205 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
17:41:07.866440 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
17:41:08.866271 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
17:41:09.868217 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
17:41:10.866293 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
17:41:11.866308 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
17:41:12.868542 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
17:41:13.229388 IP 10.17.0.1 > 224.0.0.5: OSPFv2, Hello, length 44
17:41:13.866297 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
17:41:14.866301 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
^C
11 packets captured
11 packets received by filter
0 packets dropped by kernel
[root@overcloud-controller-0 heat-admin]# ip netns exec qrouter-7f7d64da-79ac-45cd-849a-835c6cb510be tcpdump -i qr-b76cd758-da -xxx
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on qr-b76cd758-da, link-type EN10MB (Ethernet), capture size 65535 bytes
17:41:18.869140 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
        0x0000:  ffff ffff ffff fa16 3e8c 7e34 8100 0006
        0x0010:  0806 0001 0800 0604 0001 fa16 3e8c 7e34
        0x0020:  0101 0105 0000 0000 0000 0101 0101 0000
        0x0030:  0000 0000 0000 0000 0000 0000 0000 0000
17:41:19.866283 ARP, Request who-has 1.1.1.1 tell 1.1.1.5, length 46
        0x0000:  ffff ffff ffff fa16 3e8c 7e34 8100 0006
        0x0010:  0806 0001 0800 0604 0001 fa16 3e8c 7e34
        0x0020:  0101 0105 0000 0000 0000 0101 0101 0000
        0x0030:  0000 0000 0000 0000 0000 0000 0000 0000
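
To make the hex dump above easier to read, here is a minimal Python sketch that decodes one of the captured frames. It assumes a plain Ethernet frame with at most one 802.1Q tag followed by an IPv4-over-Ethernet ARP payload; the helper names (fmt_mac, fmt_ip, decode_arp_frame) are made up for this illustration and are not part of any OpenStack tooling.

```python
import struct

# Frame bytes copied from the first packet in the tcpdump -xxx output (64 bytes).
FRAME_HEX = """
ffff ffff ffff fa16 3e8c 7e34 8100 0006
0806 0001 0800 0604 0001 fa16 3e8c 7e34
0101 0105 0000 0000 0000 0101 0101 0000
0000 0000 0000 0000 0000 0000 0000 0000
"""


def fmt_mac(raw):
    """Format 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def fmt_ip(raw):
    """Format 4 raw bytes as a dotted-quad IPv4 address."""
    return ".".join(str(b) for b in raw)


def decode_arp_frame(frame):
    """Decode an Ethernet frame carrying ARP, with an optional single 802.1Q tag."""
    dst_mac, src_mac = frame[0:6], frame[6:12]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    offset, vlan_id = 14, None
    if ethertype == 0x8100:  # 802.1Q tag present
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        vlan_id = tci & 0x0FFF  # low 12 bits of the TCI are the VLAN ID
        offset = 18
    if ethertype != 0x0806:
        raise ValueError(f"not an ARP frame (ethertype 0x{ethertype:04x})")
    # Fixed ARP header: hardware type, protocol type, address lengths, operation.
    _htype, _ptype, _hlen, _plen, operation = struct.unpack(
        "!HHBBH", frame[offset:offset + 8]
    )
    body = frame[offset + 8:offset + 28]
    return {
        "dst_mac": fmt_mac(dst_mac),
        "src_mac": fmt_mac(src_mac),
        "vlan_id": vlan_id,
        "operation": operation,  # 1 = request, 2 = reply
        "sender_mac": fmt_mac(body[0:6]),
        "sender_ip": fmt_ip(body[6:10]),
        "target_mac": fmt_mac(body[10:16]),
        "target_ip": fmt_ip(body[16:20]),
    }


frame = bytes.fromhex("".join(FRAME_HEX.split()))
info = decode_arp_frame(frame)
for key, value in info.items():
    print(f"{key}: {value}")
```

Decoded this way, the frame is a broadcast ARP request from fa:16:3e:8c:7e:34 (1.1.1.5) for 1.1.1.1, and it carries an 802.1Q tag with VLAN ID 6.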

[root@overcloud-controller-0 heat-admin]# ip netns exec qrouter-7f7d64da-79ac-45cd-849a-835c6cb510be route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.17.0.5       0.0.0.0         UG    0      0        0 qg-d92fd1ef-10
1.1.1.0         0.0.0.0         255.255.255.0   U     0      0        0 qr-b76cd758-da
1.1.2.0         0.0.0.0         255.255.255.0   U     0      0        0 qr-8ea964bd-d6
10.17.0.0       0.0.0.0         255.255.192.0   U     0      0        0 qg-d92fd1ef-10

[root@overcloud-controller-0 heat-admin]# ip netns exec qrouter-7f7d64da-79ac-45cd-849a-835c6cb510be ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 92  bytes 7652 (7.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 92  bytes 7652 (7.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

qg-d92fd1ef-10: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.17.0.14  netmask 255.255.192.0  broadcast 10.17.63.255
        inet6 fe80::f816:3eff:fe06:1bef  prefixlen 64  scopeid 0x20<link>
        ether fa:16:3e:06:1b:ef  txqueuelen 1000  (Ethernet)
        RX packets 5003  bytes 331632 (323.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 19  bytes 1326 (1.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

qr-8ea964bd-d6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 1.1.2.1  netmask 255.255.255.0  broadcast 1.1.2.255
        inet6 fe80::f816:3eff:fe93:efaf  prefixlen 64  scopeid 0x20<link>
        ether fa:16:3e:93:ef:af  txqueuelen 1000  (Ethernet)
        RX packets 74  bytes 6218 (6.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 10  bytes 864 (864.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

qr-b76cd758-da: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 1.1.1.1  netmask 255.255.255.0  broadcast 1.1.1.255
        inet6 fe80::f816:3eff:fec8:e3f8  prefixlen 64  scopeid 0x20<link>
        ether fa:16:3e:c8:e3:f8  txqueuelen 1000  (Ethernet)
        RX packets 5036  bytes 334168 (326.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 10  bytes 864 (864.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Comment 2 bigswitch 2015-10-14 18:17:39 UTC
After the router is rescheduled, the status of all the router ports is set to "BUILD", which is why the port is not responding to any packets.

[stack@rhel-dell-71 ~]$ neutron router-port-list 7f7d64da-79ac-45cd-849a-835c6cb510be
+--------------------------------------+------+-------------------+-----------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                         |
+--------------------------------------+------+-------------------+-----------------------------------------------------------------------------------+
| 8ea964bd-d6e5-4b34-9b97-ab49557b976a |      | fa:16:3e:93:ef:af | {"subnet_id": "647233bd-1406-4e73-ad03-436a07b7fcc5", "ip_address": "1.1.2.1"}    |
| b76cd758-da75-439a-831d-737912a5e03f |      | fa:16:3e:c8:e3:f8 | {"subnet_id": "69323284-51f4-4b71-8567-8e7f376fb63e", "ip_address": "1.1.1.1"}    |
| d92fd1ef-10b5-4b3e-af5f-d3741ea563ae |      | fa:16:3e:06:1b:ef | {"subnet_id": "a94db0a8-c137-41b1-bac4-f4d98676c25f", "ip_address": "10.17.0.14"} |
+--------------------------------------+------+-------------------+-----------------------------------------------------------------------------------+
[stack@rhel-dell-71 ~]$ neutron port-show b76cd758-da75-439a-831d-737912a5e03f
+-----------------------+--------------------------------------------------------------------------------+
| Field                 | Value                                                                          |
+-----------------------+--------------------------------------------------------------------------------+
| admin_state_up        | True                                                                           |
| allowed_address_pairs |                                                                                |
| binding:host_id       | overcloud-controller-0.localdomain                                             |
| binding:profile       | {}                                                                             |
| binding:vif_details   | {"port_filter": true, "ovs_hybrid_plug": true}                                 |
| binding:vif_type      | ovs                                                                            |
| binding:vnic_type     | normal                                                                         |
| device_id             | 7f7d64da-79ac-45cd-849a-835c6cb510be                                           |
| device_owner          | network:router_interface                                                       |
| extra_dhcp_opts       |                                                                                |
| fixed_ips             | {"subnet_id": "69323284-51f4-4b71-8567-8e7f376fb63e", "ip_address": "1.1.1.1"} |
| id                    | b76cd758-da75-439a-831d-737912a5e03f                                           |
| mac_address           | fa:16:3e:c8:e3:f8                                                              |
| name                  |                                                                                |
| network_id            | f5da3160-37dd-4609-b3ee-2268b1e5d9ba                                           |
| security_groups       |                                                                                |
| status                | BUILD                                                                          |
| tenant_id             | d5e56f3c0c4d4474aeba54eb79d754a8                                               |
+-----------------------+--------------------------------------------------------------------------------+
[stack@rhel-dell-71 ~]$
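
For anyone checking many ports, the status check above could be scripted. The following is only a rough sketch: it parses the ASCII table that the neutron CLI prints (here an excerpt of the port-show output above, pasted in as a string) and flags a port whose status is not ACTIVE. The function names parse_cli_table and port_is_active are hypothetical, and the parser assumes that field values contain no "|" characters.

```python
# Excerpt of the `neutron port-show` output from this comment.
PORT_SHOW_OUTPUT = """
+-----------------------+--------------------------------------+
| Field                 | Value                                |
+-----------------------+--------------------------------------+
| admin_state_up        | True                                 |
| binding:host_id       | overcloud-controller-0.localdomain   |
| device_owner          | network:router_interface             |
| id                    | b76cd758-da75-439a-831d-737912a5e03f |
| status                | BUILD                                |
+-----------------------+--------------------------------------+
"""


def parse_cli_table(text):
    """Turn a two-column Field/Value CLI table into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders and blank lines
        key, value = (cell.strip() for cell in line.strip("|").split("|", 1))
        if key == "Field" and value == "Value":
            continue  # skip the header row
        fields[key] = value
    return fields


def port_is_active(fields):
    """A port is only expected to pass traffic once its status is ACTIVE."""
    return fields.get("status") == "ACTIVE"


port = parse_cli_table(PORT_SHOW_OUTPUT)
if not port_is_active(port):
    print(f"port {port['id']} on {port['binding:host_id']} is {port['status']}")
```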

Comment 3 Assaf Muller 2015-10-14 18:39:22 UTC
I gather you're not using HA routers, and that you enabled allow_automatic_l3agent_failover in neutron.conf? Can I ask why?
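
As a quick way to answer the question above, the two relevant settings can be read out of neutron.conf. This is a sketch only: it uses Python's standard configparser as a stand-in for oslo.config, the sample file contents are hypothetical (not taken from the reporter's environment), and it assumes both options live in the [DEFAULT] section and are treated as False when unset.

```python
import configparser

# Hypothetical neutron.conf excerpt, for illustration only.
SAMPLE_NEUTRON_CONF = """
[DEFAULT]
allow_automatic_l3agent_failover = True
l3_ha = False
"""


def l3_failover_settings(conf_text):
    """Return the router failover related booleans from neutron.conf text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    defaults = parser["DEFAULT"]
    return {
        # Reschedule routers away from L3 agents that are considered dead.
        "allow_automatic_l3agent_failover": defaults.getboolean(
            "allow_automatic_l3agent_failover", fallback=False
        ),
        # Create new routers as HA routers.
        "l3_ha": defaults.getboolean("l3_ha", fallback=False),
    }


settings = l3_failover_settings(SAMPLE_NEUTRON_CONF)
for name, enabled in settings.items():
    print(f"{name} = {enabled}")
```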

Comment 4 bigswitch 2015-10-14 21:23:59 UTC
The reason is an existing known bug, described in:

https://bugzilla.redhat.com/show_bug.cgi?id=1260298

Comment 5 Nir Yechiel 2015-10-15 07:12:32 UTC
Can we mark this one closed as a duplicate of 
https://bugzilla.redhat.com/show_bug.cgi?id=1260298?

Comment 6 bigswitch 2015-10-16 15:46:42 UTC
Is router HA a mandatory configuration for a setup with multiple L3 agents?

Comment 12 Assaf Muller 2016-02-06 17:39:53 UTC
Did you try HA routers with the fix in https://bugzilla.redhat.com/show_bug.cgi?id=1253953?

Comment 13 Assaf Muller 2016-06-04 16:18:14 UTC
Waiting for needinfo from February, closing for now. Please re-open if still relevant.

Comment 14 Red Hat Bugzilla 2023-09-14 03:06:42 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

