Bug 1692861 - [OSP 14] DVR+L3HA External connectivity is sometimes lost after a switchover between controllers
Summary: [OSP 14] DVR+L3HA External connectivity is sometimes lost after a switchover ...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Rodolfo Alonso
QA Contact: Roee Agiman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-26 15:17 UTC by Candido Campos
Modified: 2019-07-22 14:02 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-22 14:02:39 UTC
Target Upstream Version:
Embargoed:


Attachments

Description Candido Campos 2019-03-26 15:17:24 UTC
Description of problem:

[OSP 14] DVR+L3HA External connectivity is sometimes lost after a switchover between controllers


Version-Release number of selected component (if applicable):

(undercloud) [stack@undercloud-0 ~]$ cat /etc/rhosp-release 
Red Hat OpenStack Platform release 14.0.1 RC (Rocky)
(undercloud) [stack@undercloud-0 ~]$ cat core_puddle_version 
2019-03-06.1(undercloud) [stack@undercloud-0 ~]$ 



How reproducible:

Deploy a test bed with DVR and 3 controllers plus at least 1 compute node.

Steps to Reproduce:
1. Create a network and a router with an external gateway.
2. Create a VM and run a continuous ping to 8.8.8.8 from it.
3. Force a switchover between controllers, for example by bringing the HA interface down in the snat namespace of the current master (a sketch to confirm the new master follows the example below):

[root@controller-0 heat-admin]# ip netns exec snat-91f1de98-c00d-4645-a1f8-e680354e932d ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
53: ha-d714c74c-df: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:81:96:4a brd ff:ff:ff:ff:ff:ff
    inet 169.254.192.2/18 brd 169.254.255.255 scope global ha-d714c74c-df
       valid_lft forever preferred_lft forever
    inet 169.254.0.1/24 scope global ha-d714c74c-df
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe81:964a/64 scope link 
       valid_lft forever preferred_lft forever
56: sg-29eff4c6-b0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:cd:07:34 brd ff:ff:ff:ff:ff:ff
    inet 10.1.0.19/24 scope global sg-29eff4c6-b0
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fecd:734/64 scope link nodad 
       valid_lft forever preferred_lft forever
57: qg-2de4affb-d1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:79:c1:7d brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.210/24 scope global qg-2de4affb-d1
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe79:c17d/64 scope link nodad 
       valid_lft forever preferred_lft forever
[root@controller-0 heat-admin]# ip netns exec snat-91f1de98-c00d-4645-a1f8-e680354e932d ip link set ha-d714c74c-df down
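To confirm that the switchover actually happened before checking connectivity, one option (a sketch, not part of the original reproduction; the router UUID is the one from this environment and the deprecated neutron CLI is assumed to still be available in Rocky) is to check which agent reports the HA state as active, or simply to look for the 169.254.0.1 keepalived VIP in the snat namespaces:

# With admin credentials sourced: the ha_state column should move away from the
# controller where the HA interface was brought down.
neutron l3-agent-list-hosting-router 91f1de98-c00d-4645-a1f8-e680354e932d

# On each controller: the new master is the one whose snat namespace still owns the VIP.
ip netns exec snat-91f1de98-c00d-4645-a1f8-e680354e932d ip a | grep 169.254.0.1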




Actual results:

External connectivity is lost.

Expected results:

No impact

Additional info:
If you cannot reproduce the issue, you can try disabling logging and restarting the Neutron components:

docker restart neutron_l3_agent neutron_ovs_agent neutron_dhcp

The issue seems to be a race condition between the OVS agent, the L3 agent and OVS.

The problem is that the qg interface of the snat namespace is not configured in OVS after the switchover (a direct tag check is sketched after the output below):

[root@controller-0 heat-admin]# ip netns exec snat-91f1de98-c00d-4645-a1f8-e680354e932d ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
53: ha-d714c74c-df: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:81:96:4a brd ff:ff:ff:ff:ff:ff
    inet 169.254.192.2/18 brd 169.254.255.255 scope global ha-d714c74c-df
       valid_lft forever preferred_lft forever
    inet 169.254.0.1/24 scope global ha-d714c74c-df
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe81:964a/64 scope link 
       valid_lft forever preferred_lft forever
56: sg-29eff4c6-b0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:cd:07:34 brd ff:ff:ff:ff:ff:ff
    inet 10.1.0.19/24 scope global sg-29eff4c6-b0
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fecd:734/64 scope link nodad 
       valid_lft forever preferred_lft forever
57: qg-2de4affb-d1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:79:c1:7d brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.210/24 scope global qg-2de4affb-d1
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe79:c17d/64 scope link nodad 
       valid_lft forever preferred_lft forever
[root@controller-0 heat-admin]# ovs-vsctl show | grep -A 5 qg-2de4affb-d1                                                                                                                                          
        Port "qg-2de4affb-d1"
            tag: 4095
            Interface "qg-2de4affb-d1"
                type: internal
        Port "sg-29eff4c6-b0"
            tag: 1
            Interface "sg-29eff4c6-b0"
                type: internal
[root@controller-0 heat-admin]# 
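A quicker way to spot this state (a sketch using the port name from this reproduction) is to query the tag of the qg port directly; 4095 is the dead VLAN tag the OVS agent puts on ports it considers unbound, while a small local VLAN id (3 in the working case shown further below) means the port is actually wired into br-int:

# Returns 4095 while the port is in the broken state, a normal local VLAN id once it is bound.
ovs-vsctl get Port qg-2de4affb-d1 tag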


If the L3 agent is restarted, connectivity is recovered (restarting the OVS agent alone does not help):

[root@controller-0 heat-admin]# ip netns exec snat-91f1de98-c00d-4645-a1f8-e680354e932d ip a^C
[root@controller-0 heat-admin]# ip netns exec snat-91f1de98-c00d-4645-a1f8-e680354e932d ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1000ms

[root@controller-0 heat-admin]# docker restart neutron_ovs_agent
neutron_ovs_agent
[root@controller-0 heat-admin]# ip netns exec snat-91f1de98-c00d-4645-a1f8-e680354e932d ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
From 10.0.0.210 icmp_seq=1 Destination Host Unreachable
From 10.0.0.210 icmp_seq=2 Destination Host Unreachable
From 10.0.0.210 icmp_seq=3 Destination Host Unreachable
From 10.0.0.210 icmp_seq=4 Destination Host Unreachable
^C
--- 8.8.8.8 ping statistics ---
6 packets transmitted, 0 received, +4 errors, 100% packet loss, time 5001ms
pipe 4
[root@controller-0 heat-admin]# docker restart neutron_l3_agent
neutron_l3_agent
[root@controller-0 heat-admin]# ip netns exec snat-91f1de98-c00d-4645-a1f8-e680354e932d ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
From 10.0.0.210 icmp_seq=1 Destination Host Unreachable
From 10.0.0.210 icmp_seq=2 Destination Host Unreachable
From 10.0.0.210 icmp_seq=3 Destination Host Unreachable
From 10.0.0.210 icmp_seq=4 Destination Host Unreachable
From 10.0.0.210 icmp_seq=5 Destination Host Unreachable
From 10.0.0.210 icmp_seq=6 Destination Host Unreachable
From 10.0.0.210 icmp_seq=7 Destination Host Unreachable
64 bytes from 8.8.8.8: icmp_seq=8 ttl=117 time=52.8 ms
^C
--- 8.8.8.8 ping statistics ---
8 packets transmitted, 1 received, +7 errors, 87% packet loss, time 7001ms
rtt min/avg/max/mdev = 52.839/52.839/52.839/0.000 ms, pipe 4
[root@controller-0 heat-admin]# ip netns exec snat-91f1de98-c00d-4645-a1f8-e680354e932d ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=52.1 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=117 time=52.1 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=117 time=52.1 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=117 time=52.1 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=117 time=52.1 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=117 time=52.1 ms
64 bytes from 8.8.8.8: icmp_seq=7 ttl=117 time=52.1 ms
^C
--- 8.8.8.8 ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 6007ms
rtt min/avg/max/mdev = 52.136/52.173/52.195/0.212 ms
[root@controller-0 heat-admin]# 

The problem reproduces when the qg interfaces are left disabled in OVS on the standby (slave) controllers, and does not reproduce when they are properly configured:

Not configured:

[root@controller-0 heat-admin]# ovs-vsctl show | grep -A 5 qg-2de4affb-d1                                                                                                                                          
        Port "qg-2de4affb-d1"
            tag: 4095
            Interface "qg-2de4affb-d1"
                type: internal
        Port "sg-29eff4c6-b0"
            tag: 1
            Interface "sg-29eff4c6-b0"
                type: internal
[root@controller-0 heat-admin]# 


Configured:
[root@controller-0 heat-admin]# ovs-vsctl show | grep -A 5 qg-2de4affb-d1
        Port "qg-2de4affb-d1"
            tag: 3
            Interface "qg-2de4affb-d1"
                type: internal
        Port "sg-29eff4c6-b0"
            tag: 1
            Interface "sg-29eff4c6-b0"
                type: internal

What should the state of these ports be on a controller in slave (standby) mode?
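One way to narrow down the race (a sketch, using the port name from this environment) is to watch the tag on a standby controller while forcing the switchover, to see whether and when the OVS agent ever moves the qg port off the dead tag:

# Run on a standby controller while the switchover is being forced.
watch -n 1 'ovs-vsctl get Port qg-2de4affb-d1 tag'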





ovs agent logs:

2019-03-26 10:44:57.239 261225 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Skipping ARP spoofing rules for port 'fg-2f2cf670-05' because it has port security disabled
2019-03-26 10:45:00.990 261225 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Configuration for devices up [u'2de4affb-d1ee-4bb6-b071-0b2b99f25f8a', u'0ec07e9e-9706-47bf-8b22-f0c9eea06219', u'4e5d6150-eb31-41ad-8b53-904c4e09d9eb', u'29eff4c6-b0bd-4d0a-b536-73e0d96a6091', u'e904d135-aca6-4acd-af9b-21fdc1d5a791', u'2f2cf670-05b9-41d8-beaa-f20004c341f0'] and devices down [] completed.
2019-03-26 10:49:05.025 261225 WARNING neutron.agent.rpc [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Device Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[],bindings=[PortBinding],created_at=2019-03-25T15:58:12Z,data_plane_status=<?>,description='',device_id='91f1de98-c00d-4645-a1f8-e680354e932d',device_owner='network:router_gateway',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=2de4affb-d1ee-4bb6-b071-0b2b99f25f8a,mac_address=fa:16:3e:79:c1:7d,name='',network_id=8f9bf8be-9e57-4b35-8f0c-453ab43bd8da,project_id='',qos_policy_id=None,revision_number=115,security=PortSecurity(2de4affb-d1ee-4bb6-b071-0b2b99f25f8a),security_group_ids=set([]),status='DOWN',updated_at=2019-03-26T10:49:04Z) is not bound.
2019-03-26 10:49:05.028 261225 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Device 2de4affb-d1ee-4bb6-b071-0b2b99f25f8a not defined on plugin or binding failed
2019-03-26 10:49:05.040 261225 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Port 29eff4c6-b0bd-4d0a-b536-73e0d96a6091 updated. Details: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [], 'admin_state_up': True, 'network_id': 'd04e5f2f-3ab2-4cc9-ae5e-5f11b0424c29', 'segmentation_id': 13, 'fixed_ips': [{'subnet_id': 'ac738500-d85a-4b4e-bda9-18c5a6bf472a', 'ip_address': '10.1.0.19'}], 'device_owner': u'network:router_centralized_snat', 'physical_network': None, 'mac_address': 'fa:16:3e:cd:07:34', 'device': '29eff4c6-b0bd-4d0a-b536-73e0d96a6091', 'port_security_enabled': False, 'port_id': '29eff4c6-b0bd-4d0a-b536-73e0d96a6091', 'network_type': u'vxlan', 'security_groups': []}
2019-03-26 10:49:05.046 261225 INFO neutron.agent.securitygroups_rpc [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Refresh firewall rules
2019-03-26 10:49:05.078 261225 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Skipping ARP spoofing rules for port 'sg-29eff4c6-b0' because it has port security disabled
2019-03-26 10:49:06.443 261225 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Configuration for devices up ['29eff4c6-b0bd-4d0a-b536-73e0d96a6091'] and devices down [] completed.
2019-03-26 10:49:07.021 261225 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Port 'qg-2de4affb-d1' has lost its vlan tag '3'!
2019-03-26 10:49:07.835 261225 WARNING neutron.agent.rpc [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Device Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[],bindings=[PortBinding],created_at=2019-03-25T15:58:12Z,data_plane_status=<?>,description='',device_id='91f1de98-c00d-4645-a1f8-e680354e932d',device_owner='network:router_gateway',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=2de4affb-d1ee-4bb6-b071-0b2b99f25f8a,mac_address=fa:16:3e:79:c1:7d,name='',network_id=8f9bf8be-9e57-4b35-8f0c-453ab43bd8da,project_id='',qos_policy_id=None,revision_number=120,security=PortSecurity(2de4affb-d1ee-4bb6-b071-0b2b99f25f8a),security_group_ids=set([]),status='DOWN',updated_at=2019-03-26T10:49:07Z) is not bound.
2019-03-26 10:49:07.838 261225 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Device 2de4affb-d1ee-4bb6-b071-0b2b99f25f8a not defined on plugin or binding failed
2019-03-26 10:49:07.839 261225 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Port 29eff4c6-b0bd-4d0a-b536-73e0d96a6091 updated. Details: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [], 'admin_state_up': True, 'network_id': 'd04e5f2f-3ab2-4cc9-ae5e-5f11b0424c29', 'segmentation_id': 13, 'fixed_ips': [{'subnet_id': 'ac738500-d85a-4b4e-bda9-18c5a6bf472a', 'ip_address': '10.1.0.19'}], 'device_owner': u'network:router_centralized_snat', 'physical_network': None, 'mac_address': 'fa:16:3e:cd:07:34', 'device': '29eff4c6-b0bd-4d0a-b536-73e0d96a6091', 'port_security_enabled': False, 'port_id': '29eff4c6-b0bd-4d0a-b536-73e0d96a6091', 'network_type': u'vxlan', 'security_groups': []}
2019-03-26 10:49:07.846 261225 INFO neutron.agent.securitygroups_rpc [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Refresh firewall rules
2019-03-26 10:49:07.869 261225 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9a9d52eb-953c-4f7c-84a9-d8db4332dbc3 - - - - -] Skipping ARP spoofing rules for port 'sg-29eff4c6-b0' because it has port security disabled

Comment 1 Assaf Muller 2019-03-26 15:32:54 UTC
Hi Candido,

Note that we don't support enabling L3 HA + DVR at the same time, only one or the other.

If you cannot reproduce on an environment that has either but not both, please close this bug.

Comment 2 Candido Campos 2019-03-26 15:56:46 UTC
Hi, 

 I can reproduce the same issue without HA if the controller hosting the qrouter is rebooted, but that reproduction method is affected by another bug related to RabbitMQ:

 https://bugzilla.redhat.com/show_bug.cgi?id=1661806

 This reproduction method is clearer, and the bug seems to be in the switchover mechanism with DVR. If it is OK for you, we can investigate the issue in more depth, and if it turns out to be a bug only related to DVR+HA, I will close it:

This problem is affecting the tests with DVR.
Sometimes, after the controllers are rebooted, external connectivity is lost:

The qrouter's ports are left misconfigured by the OVS agent:

[root@controller-1 heat-admin]# ip netns exec qrouter-68a2f0fb-170a-4ae9-8c4f-c154875a237d ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
18: qr-42dd88ef-f7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:04:d3:ab brd ff:ff:ff:ff:ff:ff
    inet 10.1.0.1/24 brd 10.1.0.255 scope global qr-42dd88ef-f7
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe04:d3ab/64 scope link tentative dadfailed 
       valid_lft forever preferred_lft forever
19: qg-f41b8b30-78: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:37:27:4f brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.228/24 brd 10.0.0.255 scope global qg-f41b8b30-78
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe37:274f/64 scope link 
       valid_lft forever preferred_lft forever


The qg-XX port is not connected to br-ex, and that is the reason the connectivity is lost (a tcpdump check is sketched after the ovs-vsctl output below).

[root@controller-1 heat-admin]# ovs-vsctl show 
5b1def1a-50bd-417f-97f0-e92f61d9f274
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "qg-f41b8b30-78"
            tag: 4095
            Interface "qg-f41b8b30-78"
                type: internal
        Port "qr-42dd88ef-f7"
            tag: 1
            Interface "qr-42dd88ef-f7"
                type: internal
        Port br-int
            Interface br-int
                type: internal
        Port "tapbd74ad5a-25"
            tag: 1
            Interface "tapbd74ad5a-25"
                type: internal
        Port int-br-isolated
            Interface int-br-isolated
                type: patch
                options: {peer=phy-br-isolated}
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
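
A simple way to confirm that the traffic never reaches the external bridge (a sketch, assuming br-ex exists as an internal port on this node, as in the default TripleO layout) is to capture on br-ex while pinging from the qrouter namespace; in the broken state nothing shows up, and after restarting the OVS agent the ICMP packets appear:

# In one terminal on the controller:
ip netns exec qrouter-68a2f0fb-170a-4ae9-8c4f-c154875a237d ping 8.8.8.8
# In another terminal on the same controller:
tcpdump -ni br-ex icmp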


 The problem appears when the RabbitMQ problem starts, but we are investigating whether there is some other issue in the OVS agent, because connectivity is recovered if the OVS agent is restarted:

Logs of the problem:

[root@controller-1 heat-admin]# egrep "42dd88ef-f751-4306-82e8-75fd43611ddd|68a2f0fb-170a-4ae9-8c4f-c154875a237d" openvswitch-agent.log_error | tail -20
2019-03-11 17:41:25.289 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Processing port: 42dd88ef-f751-4306-82e8-75fd43611ddd treat_devices_added_or_updated /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:1568
2019-03-11 17:41:25.289 13164 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Port 42dd88ef-f751-4306-82e8-75fd43611ddd updated. Details: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [], 'admin_state_up': True, 'network_id': 'c545f670-df99-4ce7-9089-6215d0326afa', 'segmentation_id': 1099, 'fixed_ips': [{'subnet_id': '2bfccf3a-4243-44a5-87c8-bbff345e38e0', 'ip_address': '10.1.0.1'}], 'device_owner': u'network:router_interface', 'physical_network': u'tenant', 'mac_address': 'fa:16:3e:04:d3:ab', 'device': u'42dd88ef-f751-4306-82e8-75fd43611ddd', 'port_security_enabled': False, 'port_id': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'network_type': u'vlan', 'security_groups': []}
2019-03-11 17:41:25.297 13164 DEBUG neutron.agent.l2.extensions.qos [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] QoS extension did not have information on port 42dd88ef-f751-4306-82e8-75fd43611ddd clean_by_port /usr/lib/python2.7/site-packages/neutron/agent/l2/extensions/qos.py:190
2019-03-11 17:41:25.297 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_dscp_marking was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but no port information was stored to be deleted delete_dscp_marking /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:146
2019-03-11 17:41:25.297 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_bandwidth_limit was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but port was not found. It seems that bandwidth_limit is already deleted delete_bandwidth_limit /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:74
2019-03-11 17:41:25.298 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_bandwidth_limit_ingress was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but port was not found. It seems that bandwidth_limit is already deleted delete_bandwidth_limit_ingress /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:88
2019-03-11 17:41:25.308 13164 INFO neutron.agent.securitygroups_rpc [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Preparing filters for devices set([u'42dd88ef-f751-4306-82e8-75fd43611ddd', u'f41b8b30-7825-45d6-a891-1e2889778d07'])
2019-03-11 17:41:25.358 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Setting status for 42dd88ef-f751-4306-82e8-75fd43611ddd to UP _bind_devices /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:905
2019-03-11 17:41:25.824 13164 DEBUG neutron.agent.resource_cache [req-fc0c4159-35b4-48a1-8283-bcb71aa4124a - - - - -] Resource Port 42dd88ef-f751-4306-82e8-75fd43611ddd updated (revision_number 32->33). Old fields: {'status': u'DOWN'} New fields: {'status': u'ACTIVE'} record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:185
2019-03-11 17:41:38.019 13164 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Configuration for devices up [u'42dd88ef-f751-4306-82e8-75fd43611ddd'] and devices down [] completed.
2019-03-11 17:41:38.030 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Starting to process devices in:{'current': set([u'bd74ad5a-25c1-4660-8243-5a59701bed95', u'f41b8b30-7825-45d6-a891-1e2889778d07', u'42dd88ef-f751-4306-82e8-75fd43611ddd']), 'removed': set([]), 'added': set([]), 'updated': set(['42dd88ef-f751-4306-82e8-75fd43611ddd'])} rpc_loop /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:2159
2019-03-11 17:41:38.031 13164 DEBUG neutron.agent.rpc [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Returning: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [], 'admin_state_up': True, 'network_id': 'c545f670-df99-4ce7-9089-6215d0326afa', 'segmentation_id': 1099, 'fixed_ips': [{'subnet_id': '2bfccf3a-4243-44a5-87c8-bbff345e38e0', 'ip_address': '10.1.0.1'}], 'device_owner': u'network:router_interface', 'physical_network': u'tenant', 'mac_address': 'fa:16:3e:04:d3:ab', 'device': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'port_security_enabled': False, 'port_id': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'network_type': u'vlan', 'security_groups': []} get_device_details /usr/lib/python2.7/site-packages/neutron/agent/rpc.py:328
2019-03-11 17:41:38.034 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Processing port: 42dd88ef-f751-4306-82e8-75fd43611ddd treat_devices_added_or_updated /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:1568
2019-03-11 17:41:38.034 13164 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Port 42dd88ef-f751-4306-82e8-75fd43611ddd updated. Details: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [], 'admin_state_up': True, 'network_id': 'c545f670-df99-4ce7-9089-6215d0326afa', 'segmentation_id': 1099, 'fixed_ips': [{'subnet_id': '2bfccf3a-4243-44a5-87c8-bbff345e38e0', 'ip_address': '10.1.0.1'}], 'device_owner': u'network:router_interface', 'physical_network': u'tenant', 'mac_address': 'fa:16:3e:04:d3:ab', 'device': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'port_security_enabled': False, 'port_id': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'network_type': u'vlan', 'security_groups': []}
2019-03-11 17:41:38.037 13164 DEBUG neutron.agent.l2.extensions.qos [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] QoS extension did not have information on port 42dd88ef-f751-4306-82e8-75fd43611ddd clean_by_port /usr/lib/python2.7/site-packages/neutron/agent/l2/extensions/qos.py:190
2019-03-11 17:41:38.037 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_dscp_marking was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but no port information was stored to be deleted delete_dscp_marking /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:146
2019-03-11 17:41:38.038 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_bandwidth_limit was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but port was not found. It seems that bandwidth_limit is already deleted delete_bandwidth_limit /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:74
2019-03-11 17:41:38.038 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_bandwidth_limit_ingress was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but port was not found. It seems that bandwidth_limit is already deleted delete_bandwidth_limit_ingress /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:88
2019-03-11 17:41:38.078 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Setting status for 42dd88ef-f751-4306-82e8-75fd43611ddd to UP _bind_devices /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:905
2019-03-11 17:41:38.610 13164 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Configuration for devices up ['42dd88ef-f751-4306-82e8-75fd43611ddd'] and devices down [] completed.


The qg port seems to be unknown to the agent at some point:

2019-03-11 17:41:25.282 13164 DEBUG neutron.agent.resource_cache [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Received new resource Port: Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[PortBindingLevel],bindings=[PortBinding],created_at=2019-03-11T14:11:58Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_interface',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=42dd88ef-f751-4306-82e8-75fd43611ddd,mac_address=fa:16:3e:04:d3:ab,name='',network_id=c545f670-df99-4ce7-9089-6215d0326afa,project_id='b50b9f21ca0f41d488637498aae7ffa4',qos_policy_id=None,revision_number=32,security=PortSecurity(42dd88ef-f751-4306-82e8-75fd43611ddd),security_group_ids=set([]),status='DOWN',updated_at=2019-03-11T17:41:23Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:187
2019-03-11 17:41:25.284 13164 WARNING neutron.agent.rpc [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Device Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[],bindings=[PortBinding],created_at=2019-03-11T14:12:08Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_gateway',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=f41b8b30-7825-45d6-a891-1e2889778d07,mac_address=fa:16:3e:37:27:4f,name='',network_id=11e4f8b8-b6cc-40ea-86c0-57e21d0e969a,project_id='',qos_policy_id=None,revision_number=32,security=PortSecurity(f41b8b30-7825-45d6-a891-1e2889778d07),security_group_ids=set([]),status='DOWN',updated_at=2019-03-11T17:41:19Z) is not bound.

Logs in the case of the restart:

[root@controller-1 heat-admin]# grep "68a2f0fb-170a-4ae9-8c4f-c154875a237d" openvswitch-agent.log_start | tail -20f
2019-03-11 17:48:32.150 93016 DEBUG neutron.agent.resource_cache [req-c3801057-1743-4dea-8b7f-e3b908b59c54 - - - - -] Received new resource Port: Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[PortBindingLevel],bindings=[PortBinding],created_at=2019-03-11T14:12:08Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_gateway',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=f41b8b30-7825-45d6-a891-1e2889778d07,mac_address=fa:16:3e:37:27:4f,name='',network_id=11e4f8b8-b6cc-40ea-86c0-57e21d0e969a,project_id='',qos_policy_id=None,revision_number=33,security=PortSecurity(f41b8b30-7825-45d6-a891-1e2889778d07),security_group_ids=set([]),status='DOWN',updated_at=2019-03-11T17:41:22Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:187
2019-03-11 17:48:32.355 93016 DEBUG neutron.agent.resource_cache [req-c3801057-1743-4dea-8b7f-e3b908b59c54 - - - - -] Received new resource Port: Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[PortBindingLevel],bindings=[PortBinding],created_at=2019-03-11T14:11:58Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_interface',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=42dd88ef-f751-4306-82e8-75fd43611ddd,mac_address=fa:16:3e:04:d3:ab,name='',network_id=c545f670-df99-4ce7-9089-6215d0326afa,project_id='b50b9f21ca0f41d488637498aae7ffa4',qos_policy_id=None,revision_number=33,security=PortSecurity(42dd88ef-f751-4306-82e8-75fd43611ddd),security_group_ids=set([]),status='ACTIVE',updated_at=2019-03-11T17:41:25Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:187
2019-03-11 17:49:15.493 93016 DEBUG neutron.agent.resource_cache [req-3f05f0a6-f286-4239-96c5-d7fe98434e9b - - - - -] Ignoring stale update for Port: Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[],bindings=[PortBinding],created_at=2019-03-11T14:12:08Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_gateway',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=f41b8b30-7825-45d6-a891-1e2889778d07,mac_address=fa:16:3e:37:27:4f,name='',network_id=11e4f8b8-b6cc-40ea-86c0-57e21d0e969a,project_id='',qos_policy_id=None,revision_number=32,security=PortSecurity(f41b8b30-7825-45d6-a891-1e2889778d07),security_group_ids=set([]),status='DOWN',updated_at=2019-03-11T17:41:19Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:170

(overcloud) [stack@undercloud-0 ~]$ openstack port show f41b8b30-7825-45d6-a891-1e2889778d07
+-----------------------+---------------------------------------------------------------------------+
| Field                 | Value                                                                     |
+-----------------------+---------------------------------------------------------------------------+
| admin_state_up        | UP                                                                        |

Comment 3 Nate Johnston 2019-04-11 15:42:59 UTC
Candido is testing to see if this also happens without L3 HA.

Comment 4 Brian Haley 2019-04-15 13:49:58 UTC
Have started looking, just need confirmation from Candido.

Comment 5 Ewald van Geffen 2019-06-14 14:00:19 UTC
Can confirm this bug w/ L3HA+DVR on Rocky/CentOS7 manual config. 
Is the intention to make DVR+L3HA compatible? If so, has it been assigned to a release yet?

Comment 6 Candido Campos 2019-06-14 14:16:36 UTC
No, DVR combined with L3 HA is not supported.
The bug will only move to fixed if it can be reproduced without DVR.

