Summary: [Neutron][OVN] - VLAN-aware instances - sub-ports are taking 15-30 min to be available
Product: Red Hat OpenStack
Component: python-networking-ovn
Version: 16.1 (Train)
Status: CLOSED EOL
Severity: high
Priority: high
Reporter: Matt Flusche <mflusche>
Assignee: Slawek Kaplonski <skaplons>
QA Contact: Eran Kuris <ekuris>
CC: apevec, bcafarel, chrisw, dhill, dhruv, fesilva, froyo, gthiemon, lhh, majopela, mburns, nalmond, ralonsoh, rpawlik, scohen, skaplons
Target Milestone: async
Target Release: 16.1 (Train on RHEL 8.2)
Keywords: Triaged
Hardware: x86_64
OS: Linux
Fixed In Version: python-networking-ovn-7.3.1-1.20230331143541.4e24f4c.el8ost
Doc Type: No Doc Update
Clone Of:
: 2169673
Last Closed: 2023-07-24 09:59:58 UTC
Type: Bug
Bug Blocks: 2169673, 2169676

Description of problem:

This issue only occurs in larger environments; I am not able to reproduce it in a lab or in smaller deployments. Standard OVN ports seem to work fine and are available immediately. VLAN sub-ports take up to 30 min to become available.

This setup is deployed via Heat. I'll provide more details, debug logs, and an example Heat template in private comments.

From the neutron server debug logs, I don't see an obvious error; however, we see the neutron maintenance task fix the sub-port, and that fix coincides with the port starting to function (ping test works).

Here is an example of this occurrence:

Main port: da4b8088-0f94-48d2-ad7b-ee86a524eeb9
Sub-ports:
  339c394d-f133-4d68-988b-83ec39cc165d (vlan 10)
  c5d7e07f-fc6f-4881-bbcd-4ccfc2590183 (vlan 20)
Trunk port: dc394e91-41ca-4d27-9c0d-44952973569a

From the Heat stack, note the timestamps for the trunk and ports:

| servera-1_trunk | dc394e91-41ca-4d27-9c0d-44952973569a | OS::Neutron::Trunk | CREATE_COMPLETE | 2023-01-30T23:08:17Z | novello-admins-0001 |
| servera-1_vlan20_port | c5d7e07f-fc6f-4881-bbcd-4ccfc2590183 | OS::Neutron::Port | CREATE_COMPLETE | 2023-01-30T23:08:17Z | novello-admins-0001 |
| servera-1_vlan10_port | 339c394d-f133-4d68-988b-83ec39cc165d | OS::Neutron::Port | CREATE_COMPLETE | 2023-01-30T23:08:17Z | novello-admins-0001 |
| servera-1_port | da4b8088-0f94-48d2-ad7b-ee86a524eeb9 | OS::Neutron::Port | CREATE_COMPLETE | 2023-01-30T23:08:18Z | novello-admins-0001 |

From the neutron debug log we see the port is ACTIVE with VLAN sub-ports:

2023-01-30 23:09:30.669 32 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): CheckRevisionNumberCommand(name=da4b8088-0f94-48d2-ad7b-ee86a524eeb9, resource={'id': 'da4b8088-0f94-48d2-ad7b-ee86a524eeb9', 'name': 'novello-admins-0001-servera-1_port-qkmeent3pb5c', 'network_id': '688bc77f-5a59-4288-9583-65a9cee9bdd8', 'tenant_id': '5419d14db9aa45599c0458b71b237046', 'mac_address': '52:54:00:00:FA:FA', 'admin_state_up': True, 'status': 'ACTIVE', 'device_id': '1b93b197-a200-47d5-b864-f7f7d4fccc5b', 'device_owner': 'compute:FAIL', 'fixed_ips': [{'subnet_id': '800b1dd1-8efd-4c2a-8db0-7ddb7bf7de07', 'ip_address': '172.25.252.115'}], 'allowed_address_pairs': [], 'extra_dhcp_opts': [], 'security_groups': [], 'description': '', 'binding:vnic_type': 'normal', 'binding:profile': {}, 'binding:host_id': 'REMOVED', 'binding:vif_type': 'ovs', 'binding:vif_details': {'port_filter': True}, 'port_security_enabled': False, 'dns_name': 'servera', 'dns_assignment': [{'ip_address': '172.25.252.115', 'hostname': 'servera', 'fqdn': 'servera.example.com.'}], 'trunk_details': {'trunk_id': 'dc394e91-41ca-4d27-9c0d-44952973569a', 'sub_ports': [{'segmentation_id': 10, 'segmentation_type': 'vlan', 'port_id': '339c394d-f133-4d68-988b-83ec39cc165d', 'mac_address': '52:54:00:00:FA:FA'}, {'segmentation_id': 20, 'segmentation_type': 'vlan', 'port_id': 'c5d7e07f-fc6f-4881-bbcd-4ccfc2590183', 'mac_address': '52:54:00:00:FA:FA'}]}, 'ip_allocation': 'immediate', 'tags': [], 'created_at': '2023-01-30T23:08:23Z', 'updated_at': '2023-01-30T23:09:30Z', 'revision_number': 4, 'project_id': '5419d14db9aa45599c0458b71b237046', 'network': {'id': '688bc77f-5a59-4288-9583-65a9cee9bdd8', 'name': 'classroom_network', 'tenant_id': '5419d14db9aa45599c0458b71b237046', 'admin_state_up': True, 'mtu': 8942, 'status': 'ACTIVE', 'subnets': ['800b1dd1-8efd-4c2a-8db0-7ddb7bf7de07'], 'shared': False, 'availability_zone_hints': [], 'availability_zones': [], 'ipv4_address_scope': None, 'ipv6_address_scope': None, 'router:external': False, 'vlan_transparent': None, 'description': '', 'port_security_enabled': True, 'dns_domain': '', 'l2_adjacency': True, 'tags': [], 'created_at': '2023-01-30T23:08:21Z', 'updated_at': '2023-01-30T23:08:22Z', 'revision_number': 2, 'project_id': '5419d14db9aa45599c0458b71b237046', 'provider:network_type': 'geneve', 'provider:physical_network': None, 'provider:segmentation_id': 171}}, resource_type=ports, if_exists=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84

However, the sub-ports don't work until this maintenance task completes about 15 min later:

2023-01-30 23:24:29.476 39 DEBUG networking_ovn.common.maintenance [req-fe4556bb-84bf-41ba-afbd-434888fe8730 - - - - -] Maintenance task: Fixing resource 339c394d-f133-4d68-988b-83ec39cc165d (type: ports) at create/update check_for_inconsistencies /usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py:353

2023-01-30 23:24:29.558 31 DEBUG neutron.wsgi [-] (31) accepted ('10.212.200.5', 57878) server /usr/lib/python3.6/site-packages/eventlet/wsgi.py:985

2023-01-30 23:24:29.602 39 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): CheckRevisionNumberCommand(name=339c394d-f133-4d68-988b-83ec39cc165d, resource={'id': '339c394d-f133-4d68-988b-83ec39cc165d', 'name': 'REMOVED-admins-0001-servera-1_vlan10_port-aeisn2s5skp6', 'network_id': '389d732f-a6b9-49eb-a51f-d450f9a7b37e', 'tenant_id': '5419d14db9aa45599c0458b71b237046', 'mac_address': '52:54:00:00:FA:FA', 'admin_state_up': True, 'status': 'ACTIVE', 'device_id': '', 'device_owner': 'trunk:subport', 'fixed_ips': [], 'allowed_address_pairs': [], 'extra_dhcp_opts': [], 'security_groups': [], 'description': '', 'binding:vnic_type': 'normal', 'binding:profile': {'parent_name': 'da4b8088-0f94-48d2-ad7b-ee86a524eeb9', 'tag': 10}, 'binding:host_id': '', 'binding:vif_type': 'ovs', 'binding:vif_details': {}, 'port_security_enabled': False, 'dns_name': '', 'dns_assignment': [], 'ip_allocation': 'immediate', 'tags': [], 'created_at': '2023-01-30T23:08:27Z', 'updated_at': '2023-01-30T23:08:33Z', 'revision_number': 2, 'project_id': '5419d14db9aa45599c0458b71b237046'}, resource_type=ports, if_exists=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84

2023-01-30 23:24:29.602 39 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=1): SetLSwitchPortCommand(lport=339c394d-f133-4d68-988b-83ec39cc165d, columns={'external_ids': {'neutron:port_name': 'novello-admins-0001-servera-1_vlan10_port-aeisn2s5skp6', 'neutron:device_id': '', 'neutron:project_id': '5419d14db9aa45599c0458b71b237046', 'neutron:cidrs': '', 'neutron:device_owner': 'trunk:subport', 'neutron:network_name': 'neutron-389d732f-a6b9-49eb-a51f-d450f9a7b37e', 'neutron:security_group_ids': '', 'neutron:revision_number': '2'}, 'parent_name': 'da4b8088-0f94-48d2-ad7b-ee86a524eeb9', 'tag': 10, 'options': {'requested-chassis': '', 'mcast_flood_reports': 'true'}, 'enabled': True, 'port_security': [], 'dhcpv4_options': [], 'dhcpv6_options': [], 'type': '', 'addresses': ['52:54:00:00:FA:FA', 'unknown'], 'ha_chassis_group': []}, if_exists=False) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84

I can't pinpoint an obvious issue specific to the sub-ports; however, there are a lot of failing maintenance tasks in the logs, for example:

ERROR networking_ovn.common.maintenance [req-UUID - - - - -] Maintenance task: Failed to fix deleted resource UUID (type: subnets): KeyError: 'uuid'

Version-Release number of selected component (if applicable):
16.1.8

How reproducible:
These environments

Steps to Reproduce:
1. heat template attached in these specific environments
2.
3.

Additional info:
Provided in additional comments
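
To quantify the delay across many ports in a large log, something like the following sketch might help. It scans log lines for the "Maintenance task: Fixing resource" DEBUG entries quoted above and reports how long after a reference time each resource was fixed. The helper names (`parse_ts`, `find_fixes`) are hypothetical and not part of networking-ovn, and using the parent port's 23:09:30.669 ACTIVE log entry as the reference point is an assumption made for illustration.

```python
import re
from datetime import datetime

# Matches the maintenance-task DEBUG line format quoted in this report.
FIX_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"Maintenance task: Fixing resource (?P<res_id>[0-9a-f-]{36}) "
    r"\(type: (?P<res_type>\w+)\)"
)


def parse_ts(text):
    """Parse a neutron log timestamp such as '2023-01-30 23:24:29.476'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def find_fixes(lines):
    """Yield (timestamp, resource_id, resource_type) for each fix entry."""
    for line in lines:
        m = FIX_RE.search(line)
        if m:
            yield parse_ts(m["ts"]), m["res_id"], m["res_type"]


# Sample input: the maintenance line from this report (path suffix omitted).
sample = [
    "2023-01-30 23:24:29.476 39 DEBUG networking_ovn.common.maintenance "
    "[req-fe4556bb-84bf-41ba-afbd-434888fe8730 - - - - -] Maintenance task: "
    "Fixing resource 339c394d-f133-4d68-988b-83ec39cc165d (type: ports) "
    "at create/update check_for_inconsistencies",
]

# Assumed reference point: the parent port's ACTIVE log entry.
parent_active = parse_ts("2023-01-30 23:09:30.669")

fixes = list(find_fixes(sample))
for fixed_at, res_id, res_type in fixes:
    delay_min = (fixed_at - parent_active).total_seconds() / 60
    print(f"{res_type} {res_id}: fixed {delay_min:.1f} min after reference")
```

In practice `sample` would be replaced by the lines of the neutron server log, and the reference time would be looked up per trunk rather than hard-coded.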