Bug 2166012 - [Neutron][OVN] - VLAN-aware instances - sub-ports are taking 15-30 min to be available
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 16.1 (Train)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: async
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Slawek Kaplonski
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks: 2169673 2169676
 
Reported: 2023-01-31 17:09 UTC by Matt Flusche
Modified: 2023-07-24 09:59 UTC (History)

Fixed In Version: python-networking-ovn-7.3.1-1.20230331143541.4e24f4c.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 2169673 (view as bug list)
Environment:
Last Closed: 2023-07-24 09:59:58 UTC
Target Upstream Version:
Embargoed:




Links
| System                 | ID        | Private | Priority | Status | Summary                                               | Last Updated            |
| Launchpad              | 2006734   | 0       | None     | None   | None                                                  | 2023-02-09 11:42:42 UTC |
| Launchpad              | 2006735   | 0       | None     | None   | None                                                  | 2023-02-09 11:42:42 UTC |
| OpenStack gerrit       | 832759    | 0       | None     | MERGED | Check subnet in "_remove_subnet_dhcp_options"         | 2023-02-09 11:42:42 UTC |
| OpenStack gerrit       | 853779    | 0       | None     | MERGED | [Trunk] Update the trunk status with the parent status | 2023-02-09 11:42:42 UTC |
| Red Hat Issue Tracker  | OSP-21928 | 0       | None     | None   | None                                                  | 2023-01-31 17:11:59 UTC |

Description Matt Flusche 2023-01-31 17:09:04 UTC
Description of problem:

This issue only occurs in larger environments. We have not been able to reproduce it in a lab or in smaller deployments.

Standard OVN ports seem to work fine and are available immediately.

VLAN sub-ports take up to 30 min to be available.

This setup is deployed via Heat. I'll provide more details, debug logs, and an example Heat template in private comments.

In the neutron server debug logs, I don't see an obvious error; however, we do see the neutron maintenance task fix the sub-port, and that fix correlates with the port starting to function (ping test works).

Here is an example of this occurrence:

Main port: da4b8088-0f94-48d2-ad7b-ee86a524eeb9

sub-ports: 339c394d-f133-4d68-988b-83ec39cc165d (vlan 10)
           c5d7e07f-fc6f-4881-bbcd-4ccfc2590183 (vlan 20)

Trunk port: dc394e91-41ca-4d27-9c0d-44952973569a

From the Heat stack, note the timestamps for the trunk & ports:

| servera-1_trunk        | dc394e91-41ca-4d27-9c0d-44952973569a | OS::Neutron::Trunk    | CREATE_COMPLETE | 2023-01-30T23:08:17Z | novello-admins-0001 |
| servera-1_vlan20_port  | c5d7e07f-fc6f-4881-bbcd-4ccfc2590183 | OS::Neutron::Port     | CREATE_COMPLETE | 2023-01-30T23:08:17Z | novello-admins-0001 |
| servera-1_vlan10_port  | 339c394d-f133-4d68-988b-83ec39cc165d | OS::Neutron::Port     | CREATE_COMPLETE | 2023-01-30T23:08:17Z | novello-admins-0001 |
| servera-1_port         | da4b8088-0f94-48d2-ad7b-ee86a524eeb9 | OS::Neutron::Port     | CREATE_COMPLETE | 2023-01-30T23:08:18Z | novello-admins-0001 |

From the neutron debug log, we see that the main port is ACTIVE with its VLAN sub-ports listed:

2023-01-30 23:09:30.669 32 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): CheckRevisionNumberCommand(name=da4b8088-0f94-48d2-ad7b-ee86a524eeb9, resource={'id': 'da4b8088-0f94-48d2-ad7b-ee86a524eeb9', 'name': 'novello-admins-0001-servera-1_port-qkmeent3pb5c', 'network_id': '688bc77f-5a59-4288-9583-65a9cee9bdd8', 'tenant_id': '5419d14db9aa45599c0458b71b237046', 'mac_address': '52:54:00:00:FA:FA', 'admin_state_up': True, 'status': 'ACTIVE', 'device_id': '1b93b197-a200-47d5-b864-f7f7d4fccc5b', 'device_owner': 'compute:FAIL', 'fixed_ips': [{'subnet_id': '800b1dd1-8efd-4c2a-8db0-7ddb7bf7de07', 'ip_address': '172.25.252.115'}], 'allowed_address_pairs': [], 'extra_dhcp_opts': [], 'security_groups': [], 'description': '', 'binding:vnic_type': 'normal', 'binding:profile': {}, 'binding:host_id': 'REMOVED', 'binding:vif_type': 'ovs', 'binding:vif_details': {'port_filter': True}, 'port_security_enabled': False, 'dns_name': 'servera', 'dns_assignment': [{'ip_address': '172.25.252.115', 'hostname': 'servera', 'fqdn': 'servera.example.com.'}], 'trunk_details': {'trunk_id': 'dc394e91-41ca-4d27-9c0d-44952973569a', 'sub_ports': [{'segmentation_id': 10, 'segmentation_type': 'vlan', 'port_id': '339c394d-f133-4d68-988b-83ec39cc165d', 'mac_address': '52:54:00:00:FA:FA'}, {'segmentation_id': 20, 'segmentation_type': 'vlan', 'port_id': 'c5d7e07f-fc6f-4881-bbcd-4ccfc2590183', 'mac_address': '52:54:00:00:FA:FA'}]}, 'ip_allocation': 'immediate', 'tags': [], 'created_at': '2023-01-30T23:08:23Z', 'updated_at': '2023-01-30T23:09:30Z', 'revision_number': 4, 'project_id': '5419d14db9aa45599c0458b71b237046', 'network': {'id': '688bc77f-5a59-4288-9583-65a9cee9bdd8', 'name': 'classroom_network', 'tenant_id': '5419d14db9aa45599c0458b71b237046', 'admin_state_up': True, 'mtu': 8942, 'status': 'ACTIVE', 'subnets': ['800b1dd1-8efd-4c2a-8db0-7ddb7bf7de07'], 'shared': False, 'availability_zone_hints': [], 'availability_zones': [], 'ipv4_address_scope': None, 
'ipv6_address_scope': None, 'router:external': False, 'vlan_transparent': None, 'description': '', 'port_security_enabled': True, 'dns_domain': '', 'l2_adjacency': True, 'tags': [], 'created_at': '2023-01-30T23:08:21Z', 'updated_at': '2023-01-30T23:08:22Z', 'revision_number': 2, 'project_id': '5419d14db9aa45599c0458b71b237046', 'provider:network_type': 'geneve', 'provider:physical_network': None, 'provider:segmentation_id': 171}}, resource_type=ports, if_exists=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84 

However, the sub-ports don't work until this maintenance task completes about 15 min later:

2023-01-30 23:24:29.476 39 DEBUG networking_ovn.common.maintenance [req-fe4556bb-84bf-41ba-afbd-434888fe8730 - - - - -] Maintenance task: Fixing resource 339c394d-f133-4d68-988b-83ec39cc165d (type: ports) at create/update check_for_inconsistencies /usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py:353
2023-01-30 23:24:29.558 31 DEBUG neutron.wsgi [-] (31) accepted ('10.212.200.5', 57878) server /usr/lib/python3.6/site-packages/eventlet/wsgi.py:985
2023-01-30 23:24:29.602 39 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): CheckRevisionNumberCommand(name=339c394d-f133-4d68-988b-83ec39cc165d, resource={'id': '339c394d-f133-4d68-988b-83ec39cc165d', 'name': 'REMOVED-admins-0001-servera-1_vlan10_port-aeisn2s5skp6', 'network_id': '389d732f-a6b9-49eb-a51f-d450f9a7b37e', 'tenant_id': '5419d14db9aa45599c0458b71b237046', 'mac_address': '52:54:00:00:FA:FA', 'admin_state_up': True, 'status': 'ACTIVE', 'device_id': '', 'device_owner': 'trunk:subport', 'fixed_ips': [], 'allowed_address_pairs': [], 'extra_dhcp_opts': [], 'security_groups': [], 'description': '', 'binding:vnic_type': 'normal', 'binding:profile': {'parent_name': 'da4b8088-0f94-48d2-ad7b-ee86a524eeb9', 'tag': 10}, 'binding:host_id': '', 'binding:vif_type': 'ovs', 'binding:vif_details': {}, 'port_security_enabled': False, 'dns_name': '', 'dns_assignment': [], 'ip_allocation': 'immediate', 'tags': [], 'created_at': '2023-01-30T23:08:27Z', 'updated_at': '2023-01-30T23:08:33Z', 'revision_number': 2, 'project_id': '5419d14db9aa45599c0458b71b237046'}, resource_type=ports, if_exists=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
2023-01-30 23:24:29.602 39 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=1): SetLSwitchPortCommand(lport=339c394d-f133-4d68-988b-83ec39cc165d, columns={'external_ids': {'neutron:port_name': 'novello-admins-0001-servera-1_vlan10_port-aeisn2s5skp6', 'neutron:device_id': '', 'neutron:project_id': '5419d14db9aa45599c0458b71b237046', 'neutron:cidrs': '', 'neutron:device_owner': 'trunk:subport', 'neutron:network_name': 'neutron-389d732f-a6b9-49eb-a51f-d450f9a7b37e', 'neutron:security_group_ids': '', 'neutron:revision_number': '2'}, 'parent_name': 'da4b8088-0f94-48d2-ad7b-ee86a524eeb9', 'tag': 10, 'options': {'requested-chassis': '', 'mcast_flood_reports': 'true'}, 'enabled': True, 'port_security': [], 'dhcpv4_options': [], 'dhcpv6_options': [], 'type': '', 'addresses': ['52:54:00:00:FA:FA', 'unknown'], 'ha_chassis_group': []}, if_exists=False) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
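
For reference, the delay for the vlan 10 sub-port can be worked out from the two timestamps quoted above: the sub-port's 'created_at' value and the time of the "Fixing resource" maintenance message. The following is only a rough sketch; the helper name gap_minutes is made up for illustration, and it assumes both timestamps are in the same timezone (UTC).

```python
from datetime import datetime


def gap_minutes(created_at: str, log_timestamp: str) -> float:
    """Minutes between a Neutron 'created_at' value and a log-line timestamp.

    Hypothetical helper: assumes both values are in the same timezone (UTC).
    """
    created = datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%SZ")
    logged = datetime.strptime(log_timestamp, "%Y-%m-%d %H:%M:%S.%f")
    return (logged - created).total_seconds() / 60


# Values copied from the log excerpts for sub-port 339c394d-f133-4d68-988b-83ec39cc165d.
delay = gap_minutes("2023-01-30T23:08:27Z", "2023-01-30 23:24:29.476")
print(f"sub-port fixed {delay:.1f} min after creation")
```

For this sub-port the gap comes out at roughly 16 minutes, which is consistent with the lower end of the 15-30 min range in the summary.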

I can't pinpoint an obvious issue specific to the sub-ports; however, there are a lot of failing maintenance tasks in the logs, for example:

ERROR networking_ovn.common.maintenance [req-UUID - - - - -] Maintenance task: Failed to fix deleted resource UUID (type: subnets): KeyError: 'uuid'
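
To get a feel for how many of these failures there are and which resource types they hit, the neutron server log can be tallied with a small script. This is a minimal sketch, not a supported tool: the function name tally_failures is hypothetical, and the regular expression assumes only the message format shown in the example line above.

```python
import re
from collections import Counter

# Matches the failure message format shown in the example line above.
FAILED = re.compile(
    r"Maintenance task: Failed to fix deleted resource (\S+)\s+"
    r"\(type: (\w+)\): (.+)"
)


def tally_failures(log_text: str) -> Counter:
    """Count maintenance failures per (resource type, error) pair."""
    counts = Counter()
    for match in FAILED.finditer(log_text):
        _resource_id, resource_type, error = match.groups()
        counts[(resource_type, error.strip())] += 1
    return counts


sample = (
    "ERROR networking_ovn.common.maintenance [req-UUID - - - - -] "
    "Maintenance task: Failed to fix deleted resource UUID "
    "(type: subnets): KeyError: 'uuid'\n"
)
for (resource_type, error), count in tally_failures(sample * 3).items():
    print(resource_type, error, count)
```

Reading the whole log file into a string and passing it to tally_failures would give one count per resource type and error.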


Version-Release number of selected component (if applicable):
16.1.8

How reproducible:
These environments

Steps to Reproduce:
1. Deploy the attached Heat template in one of these specific environments.

Additional info:
Provided in additional comments

