Bug 2120141

Summary: Unable to delete a tenant stack, with delete failing on trunk deletion
Product: Red Hat OpenStack Reporter: Jacob Ansari <jansari>
Component: openstack-neutron Assignee: Rodolfo Alonso <ralonsoh>
Status: ASSIGNED --- QA Contact: Eran Kuris <ekuris>
Severity: medium Docs Contact:
Priority: medium    
Version: 16.2 (Train) CC: chrisw, coldford, fpalin, froyo, jansari, ralonsoh, rosingh, scohen, ykarel
Target Milestone: z5 Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4) Flags: coldford: needinfo? (jansari)
ralonsoh: needinfo? (jansari)
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---

Description Jacob Ansari 2022-08-22 00:02:51 UTC
Description of problem:

If a VM is attached to the parent network port of a trunk and the VM is deleted, the trunk and port become undeletable through normal OpenStack commands. When these are part of a stack, attempting to delete the stack consequently fails as well.

Running the following query against the overcloud DB has proven useful for quickly identifying such situations, where the VM has previously been deleted.

SELECT ports.id AS port_id,
       ports.device_id AS instance_id,
       ports.status AS port_status,
       ml2pb.status AS binding_status,
       ml2pb.vif_type,
       ml2pb.vnic_type,
       ont.id AS trunk_id,
       ont.status AS trunk_status,
       ml2pb.vif_details
FROM ovs_neutron.ports ports,
     ovs_neutron.ml2_port_bindings ml2pb,
     ovs_neutron.trunks ont
WHERE ports.id = ml2pb.port_id
  AND ml2pb.port_id = ont.port_id
  AND ports.device_id != ''
  AND (ports.device_id NOT IN (SELECT uuid FROM nova.instances)
       OR ports.device_id IN (SELECT uuid FROM nova.instances WHERE vm_state = 'deleted')) \G
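
The selection logic of the query can be restated in a few lines of Python, which may help when reading it. This is only a sketch over in-memory rows, not a replacement for the query: the function name `find_orphaned_trunk_parents` and the dict-based row layout are assumptions made for illustration, with keys named after the columns the query uses. The join against ml2_port_bindings, which only adds binding details to the output, is left out.

```python
def find_orphaned_trunk_parents(ports, trunks, instances):
    """Return (port_id, trunk_id, instance_id) for trunk parent ports whose
    device_id points at an instance that is missing or marked deleted.

    All three arguments are lists of dicts. This layout is hypothetical and
    only mirrors the columns referenced by the SQL query.
    """
    # Map instance uuid -> vm_state for every row nova still has.
    vm_state = {inst["uuid"]: inst["vm_state"] for inst in instances}
    # Map parent port id -> trunk id (trunks.port_id is the parent port).
    trunk_by_port = {trunk["port_id"]: trunk["id"] for trunk in trunks}

    hits = []
    for port in ports:
        device_id = port["device_id"]
        # Skip ports that are not trunk parents or have no device attached.
        if port["id"] not in trunk_by_port or device_id == "":
            continue
        # Keep the port if its instance row is absent or flagged as deleted.
        if device_id not in vm_state or vm_state[device_id] == "deleted":
            hits.append((port["id"], trunk_by_port[port["id"]], device_id))
    return hits
```

Each returned tuple corresponds to one row the query would print: a trunk whose parent port still references a VM that no longer exists.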


Version-Release number of selected component (if applicable):

OpenStack 16.1 and 16.2. Similar issues for OSP 13 and 16.x are reported in BZ 1812433.


How reproducible:
If the steps below are followed, the issue should be consistently reproducible. Although all recent customer cases were opened with the VM already deleted (step 2 below), it is not certain that step 2 will always succeed, based on previous similar cases referenced in BZ 1812433.

Steps to Reproduce:
1. Spawn a tenant stack with a trunk whose parent port has a VM attached to it
2. Delete the aforementioned VM
3. Try to delete the stack
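
The three steps above can be sketched as OpenStack CLI invocations driven from Python. This is a minimal sketch under stated assumptions: the stack name, server name, and template path are placeholders, the template is assumed to define a trunk whose parent port is attached to the server, and the helper names `reproduction_commands` and `run_all` are made up for illustration.

```python
import subprocess


def reproduction_commands(stack_name, server_name, template_path):
    """Build the OpenStack CLI calls for steps 1 to 3 as argv lists.

    All three arguments are placeholders chosen by the caller.
    """
    return [
        # Step 1: create the stack and wait for creation to finish.
        ["openstack", "stack", "create", "--wait", "-t", template_path, stack_name],
        # Step 2: delete the VM attached to the trunk's parent port.
        ["openstack", "server", "delete", "--wait", server_name],
        # Step 3: try to delete the stack; this is the call expected to fail.
        ["openstack", "stack", "delete", "--yes", "--wait", stack_name],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=False so the expected failure in step 3 does not raise.
            subprocess.run(argv, check=False)
```

Calling `run_all(reproduction_commands("demo-stack", "demo-vm", "trunk.yaml"))` only prints the commands; passing `dry_run=False` would run them against whatever cloud the current credentials point to.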

Actual results:

While attempting stack deletion, we get a failure for reason [1], which correlates with [2] in the Neutron logs.

Manually deleting the port via OpenStack CLI fails with [3].

Manually deleting the trunk via OpenStack CLI fails with [4].

[1]
" Conflict: resources.[obfuscated]-L-HP-X-01-trunk-parent-TRUNK-VIO-v6-instance02_Provider_Trunk: Trunk [obfuscated] is currently in use "

[2]
"Trunk driver does not consider trunk [obfuscated] untrunkable"

[3]
Failed to delete port with name or ID [obfuscated]: ConflictException: 409: Client Error for url: http://[obfuscated] is currently a parent port for trunk [obfuscated].
1 of 1 ports failed to delete.

[4]
Failed to delete trunk with name or ID [obfuscated]: Trunk [obfuscated] is currently in use.
Neutron server returns request_ids: [[obfuscated]]
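
When triaging similar cases, messages [1] to [4] can be told apart by fixed substrings. The following is a small sketch of such a matcher; the substrings are copied from the messages above, while the label names and the `classify` helper are hypothetical and not part of any OpenStack tooling.

```python
# (substring, label) pairs; substrings come from messages [1] to [4],
# labels are made-up names for this sketch.
SIGNATURES = [
    ("is currently a parent port for trunk", "port_is_trunk_parent"),
    ("Trunk driver does not consider trunk", "trunk_not_untrunkable"),
    ("is currently in use", "trunk_in_use"),
]


def classify(message):
    """Return the label of the first matching signature, or None."""
    for needle, label in SIGNATURES:
        if needle in message:
            return label
    return None
```

For example, `classify("Trunk abc is currently in use.")` yields `"trunk_in_use"`, which covers both the Heat conflict in [1] and the CLI error in [4].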


Expected results:

The stack and its associated trunks and ports are deleted successfully.


Additional info:

Multiple customer cases have been reported very recently and fixed through article https://access.redhat.com/solutions/6826221
and previously through
https://access.redhat.com/solutions/4993961

See also previous BZ https://bugzilla.redhat.com/show_bug.cgi?id=1812433