Bug 1692861
Summary: | [OSP 14] DVR+L3HA: external connectivity is sometimes lost after a switchover between controllers | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Candido Campos <ccamposr> |
Component: | openstack-neutron | Assignee: | Rodolfo Alonso <ralonsoh> |
Status: | CLOSED NOTABUG | QA Contact: | Roee Agiman <ragiman> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 14.0 (Rocky) | CC: | amuller, bhaley, chrisw, ewald.vangeffen, njohnston, scohen |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-07-22 14:02:39 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Candido Campos
2019-03-26 15:17:24 UTC
Hi Candido, note that we don't support enabling L3 HA + DVR at the same time, only one or the other. If you cannot reproduce on an environment that has either but not both, please close this bug.

Hi, I can reproduce the same issue without HA if the controller hosting the qrouter is rebooted. That reproduction method, however, is affected by another bug related to RabbitMQ: https://bugzilla.redhat.com/show_bug.cgi?id=1661806. This reproduction method is clearer, and the bug seems to be in the switchover mechanism with DVR, so if it is OK with you we can investigate the issue in more depth; if it turns out to be a bug only in DVR+HA, I will close it.

The problem affects the test with DVR. Sometimes, after the controllers are rebooted, external connectivity is lost: the qrouter namespace is misconfigured by the OVS agent.

```
[root@controller-1 heat-admin]# ip netns exec qrouter-68a2f0fb-170a-4ae9-8c4f-c154875a237d ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
18: qr-42dd88ef-f7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:04:d3:ab brd ff:ff:ff:ff:ff:ff
    inet 10.1.0.1/24 brd 10.1.0.255 scope global qr-42dd88ef-f7
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe04:d3ab/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
19: qg-f41b8b30-78: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:37:27:4f brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.228/24 brd 10.0.0.255 scope global qg-f41b8b30-78
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe37:274f/64 scope link
       valid_lft forever preferred_lft forever
```

The qg-XX port is not connected to br-ex, and that is the reason for the lost connectivity.

```
[root@controller-1 heat-admin]# ovs-vsctl show
5b1def1a-50bd-417f-97f0-e92f61d9f274
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "qg-f41b8b30-78"
            tag: 4095
            Interface "qg-f41b8b30-78"
                type: internal
        Port "qr-42dd88ef-f7"
            tag: 1
            Interface "qr-42dd88ef-f7"
                type: internal
        Port br-int
            Interface br-int
                type: internal
        Port "tapbd74ad5a-25"
            tag: 1
            Interface "tapbd74ad5a-25"
                type: internal
        Port int-br-isolated
            Interface int-br-isolated
                type: patch
                options: {peer=phy-br-isolated}
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
```

The problem appears when the RabbitMQ problem starts, but we are investigating whether there is some other issue in the OVS agent, because connectivity is recovered if the agent is restarted.

Logs of the problem:

```
[root@controller-1 heat-admin]# egrep "42dd88ef-f751-4306-82e8-75fd43611ddd|68a2f0fb-170a-4ae9-8c4f-c154875a237d" openvswitch-agent.log_error | tail -20
2019-03-11 17:41:25.289 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Processing port: 42dd88ef-f751-4306-82e8-75fd43611ddd treat_devices_added_or_updated /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:1568
2019-03-11 17:41:25.289 13164 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Port 42dd88ef-f751-4306-82e8-75fd43611ddd updated. Details: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [], 'admin_state_up': True, 'network_id': 'c545f670-df99-4ce7-9089-6215d0326afa', 'segmentation_id': 1099, 'fixed_ips': [{'subnet_id': '2bfccf3a-4243-44a5-87c8-bbff345e38e0', 'ip_address': '10.1.0.1'}], 'device_owner': u'network:router_interface', 'physical_network': u'tenant', 'mac_address': 'fa:16:3e:04:d3:ab', 'device': u'42dd88ef-f751-4306-82e8-75fd43611ddd', 'port_security_enabled': False, 'port_id': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'network_type': u'vlan', 'security_groups': []}
2019-03-11 17:41:25.297 13164 DEBUG neutron.agent.l2.extensions.qos [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] QoS extension did not have information on port 42dd88ef-f751-4306-82e8-75fd43611ddd clean_by_port /usr/lib/python2.7/site-packages/neutron/agent/l2/extensions/qos.py:190
2019-03-11 17:41:25.297 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_dscp_marking was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but no port information was stored to be deleted delete_dscp_marking /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:146
2019-03-11 17:41:25.297 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_bandwidth_limit was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but port was not found. It seems that bandwidth_limit is already deleted delete_bandwidth_limit /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:74
2019-03-11 17:41:25.298 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_bandwidth_limit_ingress was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but port was not found. It seems that bandwidth_limit is already deleted delete_bandwidth_limit_ingress /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:88
2019-03-11 17:41:25.308 13164 INFO neutron.agent.securitygroups_rpc [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Preparing filters for devices set([u'42dd88ef-f751-4306-82e8-75fd43611ddd', u'f41b8b30-7825-45d6-a891-1e2889778d07'])
2019-03-11 17:41:25.358 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Setting status for 42dd88ef-f751-4306-82e8-75fd43611ddd to UP _bind_devices /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:905
2019-03-11 17:41:25.824 13164 DEBUG neutron.agent.resource_cache [req-fc0c4159-35b4-48a1-8283-bcb71aa4124a - - - - -] Resource Port 42dd88ef-f751-4306-82e8-75fd43611ddd updated (revision_number 32->33). Old fields: {'status': u'DOWN'} New fields: {'status': u'ACTIVE'} record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:185
2019-03-11 17:41:38.019 13164 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Configuration for devices up [u'42dd88ef-f751-4306-82e8-75fd43611ddd'] and devices down [] completed.
2019-03-11 17:41:38.030 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Starting to process devices in:{'current': set([u'bd74ad5a-25c1-4660-8243-5a59701bed95', u'f41b8b30-7825-45d6-a891-1e2889778d07', u'42dd88ef-f751-4306-82e8-75fd43611ddd']), 'removed': set([]), 'added': set([]), 'updated': set(['42dd88ef-f751-4306-82e8-75fd43611ddd'])} rpc_loop /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:2159
2019-03-11 17:41:38.031 13164 DEBUG neutron.agent.rpc [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Returning: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [], 'admin_state_up': True, 'network_id': 'c545f670-df99-4ce7-9089-6215d0326afa', 'segmentation_id': 1099, 'fixed_ips': [{'subnet_id': '2bfccf3a-4243-44a5-87c8-bbff345e38e0', 'ip_address': '10.1.0.1'}], 'device_owner': u'network:router_interface', 'physical_network': u'tenant', 'mac_address': 'fa:16:3e:04:d3:ab', 'device': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'port_security_enabled': False, 'port_id': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'network_type': u'vlan', 'security_groups': []} get_device_details /usr/lib/python2.7/site-packages/neutron/agent/rpc.py:328
2019-03-11 17:41:38.034 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Processing port: 42dd88ef-f751-4306-82e8-75fd43611ddd treat_devices_added_or_updated /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:1568
2019-03-11 17:41:38.034 13164 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Port 42dd88ef-f751-4306-82e8-75fd43611ddd updated. Details: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [], 'admin_state_up': True, 'network_id': 'c545f670-df99-4ce7-9089-6215d0326afa', 'segmentation_id': 1099, 'fixed_ips': [{'subnet_id': '2bfccf3a-4243-44a5-87c8-bbff345e38e0', 'ip_address': '10.1.0.1'}], 'device_owner': u'network:router_interface', 'physical_network': u'tenant', 'mac_address': 'fa:16:3e:04:d3:ab', 'device': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'port_security_enabled': False, 'port_id': '42dd88ef-f751-4306-82e8-75fd43611ddd', 'network_type': u'vlan', 'security_groups': []}
2019-03-11 17:41:38.037 13164 DEBUG neutron.agent.l2.extensions.qos [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] QoS extension did not have information on port 42dd88ef-f751-4306-82e8-75fd43611ddd clean_by_port /usr/lib/python2.7/site-packages/neutron/agent/l2/extensions/qos.py:190
2019-03-11 17:41:38.037 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_dscp_marking was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but no port information was stored to be deleted delete_dscp_marking /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:146
2019-03-11 17:41:38.038 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_bandwidth_limit was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but port was not found. It seems that bandwidth_limit is already deleted delete_bandwidth_limit /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:74
2019-03-11 17:41:38.038 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.extension_drivers.qos_driver [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] delete_bandwidth_limit_ingress was received for port 42dd88ef-f751-4306-82e8-75fd43611ddd but port was not found. It seems that bandwidth_limit is already deleted delete_bandwidth_limit_ingress /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/extension_drivers/qos_driver.py:88
2019-03-11 17:41:38.078 13164 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Setting status for 42dd88ef-f751-4306-82e8-75fd43611ddd to UP _bind_devices /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:905
2019-03-11 17:41:38.610 13164 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Configuration for devices up ['42dd88ef-f751-4306-82e8-75fd43611ddd'] and devices down [] completed.
```
The qg port seems to be unknown to the agent at some point:

```
2019-03-11 17:41:25.282 13164 DEBUG neutron.agent.resource_cache [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Received new resource Port: Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[PortBindingLevel],bindings=[PortBinding],created_at=2019-03-11T14:11:58Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_interface',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=42dd88ef-f751-4306-82e8-75fd43611ddd,mac_address=fa:16:3e:04:d3:ab,name='',network_id=c545f670-df99-4ce7-9089-6215d0326afa,project_id='b50b9f21ca0f41d488637498aae7ffa4',qos_policy_id=None,revision_number=32,security=PortSecurity(42dd88ef-f751-4306-82e8-75fd43611ddd),security_group_ids=set([]),status='DOWN',updated_at=2019-03-11T17:41:23Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:187
2019-03-11 17:41:25.284 13164 WARNING neutron.agent.rpc [req-13e9768f-bc14-4ff9-a933-910cf78481e6 - - - - -] Device Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[],bindings=[PortBinding],created_at=2019-03-11T14:12:08Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_gateway',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=f41b8b30-7825-45d6-a891-1e2889778d07,mac_address=fa:16:3e:37:27:4f,name='',network_id=11e4f8b8-b6cc-40ea-86c0-57e21d0e969a,project_id='',qos_policy_id=None,revision_number=32,security=PortSecurity(f41b8b30-7825-45d6-a891-1e2889778d07),security_group_ids=set([]),status='DOWN',updated_at=2019-03-11T17:41:19Z) is not bound.
```
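The "is not bound" warning is the key symptom here. As a sketch for extracting the affected port UUIDs from an agent log, assuming the real input would be the controller's openvswitch-agent log file; the heredoc carries a trimmed copy of the warning line from this report:

```shell
# Extract port UUIDs from "is not bound" warnings in the OVS agent log.
# On a controller, feed the real log in instead of the trimmed sample:
#   grep 'is not bound' openvswitch-agent.log | grep -o 'id=[0-9a-f-]\{36\}'
grep 'is not bound' <<'EOF' | grep -o 'id=[0-9a-f-]\{36\}'
2019-03-11 17:41:25.284 13164 WARNING neutron.agent.rpc Device Port(device_owner='network:router_gateway',id=f41b8b30-7825-45d6-a891-1e2889778d07,status='DOWN') is not bound.
EOF
# prints: id=f41b8b30-7825-45d6-a891-1e2889778d07
```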
Logs in the case of the restart:

```
[root@controller-1 heat-admin]# grep "68a2f0fb-170a-4ae9-8c4f-c154875a237d" openvswitch-agent.log_start | tail -20f
2019-03-11 17:48:32.150 93016 DEBUG neutron.agent.resource_cache [req-c3801057-1743-4dea-8b7f-e3b908b59c54 - - - - -] Received new resource Port: Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[PortBindingLevel],bindings=[PortBinding],created_at=2019-03-11T14:12:08Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_gateway',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=f41b8b30-7825-45d6-a891-1e2889778d07,mac_address=fa:16:3e:37:27:4f,name='',network_id=11e4f8b8-b6cc-40ea-86c0-57e21d0e969a,project_id='',qos_policy_id=None,revision_number=33,security=PortSecurity(f41b8b30-7825-45d6-a891-1e2889778d07),security_group_ids=set([]),status='DOWN',updated_at=2019-03-11T17:41:22Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:187
2019-03-11 17:48:32.355 93016 DEBUG neutron.agent.resource_cache [req-c3801057-1743-4dea-8b7f-e3b908b59c54 - - - - -] Received new resource Port: Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[PortBindingLevel],bindings=[PortBinding],created_at=2019-03-11T14:11:58Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_interface',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=42dd88ef-f751-4306-82e8-75fd43611ddd,mac_address=fa:16:3e:04:d3:ab,name='',network_id=c545f670-df99-4ce7-9089-6215d0326afa,project_id='b50b9f21ca0f41d488637498aae7ffa4',qos_policy_id=None,revision_number=33,security=PortSecurity(42dd88ef-f751-4306-82e8-75fd43611ddd),security_group_ids=set([]),status='ACTIVE',updated_at=2019-03-11T17:41:25Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:187
2019-03-11 17:49:15.493 93016 DEBUG neutron.agent.resource_cache [req-3f05f0a6-f286-4239-96c5-d7fe98434e9b - - - - -] Ignoring stale update for Port: Port(admin_state_up=True,allowed_address_pairs=[],binding_levels=[],bindings=[PortBinding],created_at=2019-03-11T14:12:08Z,data_plane_status=<?>,description='',device_id='68a2f0fb-170a-4ae9-8c4f-c154875a237d',device_owner='network:router_gateway',dhcp_options=[],distributed_bindings=[],dns=None,fixed_ips=[IPAllocation],id=f41b8b30-7825-45d6-a891-1e2889778d07,mac_address=fa:16:3e:37:27:4f,name='',network_id=11e4f8b8-b6cc-40ea-86c0-57e21d0e969a,project_id='',qos_policy_id=None,revision_number=32,security=PortSecurity(f41b8b30-7825-45d6-a891-1e2889778d07),security_group_ids=set([]),status='DOWN',updated_at=2019-03-11T17:41:19Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:170
```

```
(overcloud) [stack@undercloud-0 ~]$ openstack port show f41b8b30-7825-45d6-a891-1e2889778d07
+-----------------------+---------------------------------------------------------------------------+
| Field                 | Value                                                                     |
+-----------------------+---------------------------------------------------------------------------+
| admin_state_up        | UP                                                                        |
```

Candido is testing to see if this also happens without L3HA.

Have started looking; just need confirmation from Candido.

Can confirm this bug with L3HA+DVR on Rocky/CentOS 7 manual config. Is the intention to make DVR+L3HA compatible? If so, has it been assigned a release yet?

No, DVR combined with HA is not supported. And the bug only moves to fixed if it can be reproduced without DVR.
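For scripted checks of the Neutron-side port state after a failover, `openstack port show -f value -c admin_state_up <port>` gives the field directly; as a rough sketch, the table form shown above can also be parsed. The heredoc below carries the sample row from this report:

```shell
# Pull a single field out of `openstack port show` table output.
# In practice, prefer machine-readable output:
#   openstack port show -f value -c admin_state_up <port-id>
awk -F'|' '$2 ~ /admin_state_up/ {gsub(/ /, "", $3); print $3}' <<'EOF'
| admin_state_up        | UP                                                                        |
EOF
# prints: UP
```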