Bug 2135363

Summary: Creating multiple VMs via templates fails with error `nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed`
Product: Red Hat OpenStack
Reporter: Shailesh Chhabdiya <schhabdi>
Component: openstack-neutron
Assignee: Slawek Kaplonski <skaplons>
Status: CLOSED CURRENTRELEASE
QA Contact: Eran Kuris <ekuris>
Severity: high
Priority: urgent
Version: 16.2 (Train)
CC: averdagu, bshephar, chrisw, ekuris, jlibosva, mtomaska, ralonsoh, rdiwakar, scohen, skaplons
Keywords: Triaged
Target Milestone: ---
Target Release: ---
Hardware: All
OS: All
Last Closed: 2023-06-29 06:11:10 UTC
Type: Bug

Description Shailesh Chhabdiya 2022-10-17 11:37:02 UTC
Description of problem:
Creating multiple VMs in a single go fails with the error mentioned in the summary.

The failure is not consistent; it occurs at random.

Version-Release number of selected component (if applicable):
RHOSP16.2.2
CISCO APIC SDN

How reproducible:
Always

Steps to Reproduce:
1. Create multiple VMs via templates.
2. At least 2 VM creations always fail.

Actual results:
VM creation fails

Expected results:
VM creation should be completed successfully

Comment 5 Brendan Shephard 2022-10-18 06:34:00 UTC
To summarise my understanding: from Cisco's perspective, their investigation shows that the port was configured and added to the DB, and that processing then reached the point at which a message should have been published on the message bus (RabbitMQ):
https://github.com/openstack/neutron/blob/stable/train/neutron/db/provisioning_blocks.py#L140-L145

Which is done by neutron-lib:
https://github.com/openstack/neutron-lib/blob/stable/train/neutron_lib/callbacks/registry.py#L59-L60

So, assuming everything worked up to that point, there could have been an issue with RabbitMQ at that time that prevented the message from being published or from being received by Nova.
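
For anyone following along, here is a minimal sketch of the provisioning-block idea behind the code linked above. It is a toy, in-memory model and not neutron's implementation: the class `ProvisioningBlocks`, its methods, and the entity names `"L2"` and `"DHCP"` are illustrative assumptions. The only point it makes is that the completion event fires once, when the last outstanding block for a port is cleared. If that step never runs, or its result never reaches Nova, the instance would keep waiting for the port, which would be consistent with the error in the summary.

```python
from collections import defaultdict


class ProvisioningBlocks:
    """Toy in-memory model of provisioning blocks (hypothetical, not neutron code)."""

    def __init__(self):
        self._blocks = defaultdict(set)  # port_id -> entities still provisioning
        self._subscribers = []           # callbacks to run when a port is complete

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def add_block(self, port_id, entity):
        self._blocks[port_id].add(entity)

    def provisioning_complete(self, port_id, entity):
        """Clear one entity's block; notify subscribers if it was the last one."""
        if entity not in self._blocks[port_id]:
            return False  # nothing to clear, so nothing to announce
        self._blocks[port_id].remove(entity)
        if self._blocks[port_id]:
            return False  # another entity is still provisioning this port
        for callback in self._subscribers:
            callback(port_id)
        return True


events = []
blocks = ProvisioningBlocks()
blocks.subscribe(events.append)
blocks.add_block("port-1", "L2")
blocks.add_block("port-1", "DHCP")
blocks.provisioning_complete("port-1", "L2")    # DHCP still pending, nothing fires
blocks.provisioning_complete("port-1", "DHCP")  # last block cleared, subscribers run
```

In this model a single entity that never reports completion is enough to hold the port back indefinitely, so it is worth confirming on the neutron side that every block for the failing ports was actually cleared.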

Can you verify whether any RabbitMQ issues were seen around the time of the error? The last time this issue came up, I called out errors seen with RMQ as well, so that could be the case here too:
https://bugzilla.redhat.com/show_bug.cgi?id=2082009#c4
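
One rough way to do that check is to filter service logs for messaging-related lines close to the failure time. The sketch below is only a starting point under stated assumptions: the function name `messaging_lines_near`, the keyword pattern, and the `YYYY-MM-DD HH:MM:SS` timestamp prefix are my own choices and may need adjusting to the actual log format in this environment.

```python
import re
from datetime import datetime, timedelta

# Assumed keywords; adjust to whatever the logs in this environment contain.
PATTERNS = re.compile(r"AMQP|rabbit|oslo\.messaging", re.IGNORECASE)
# Assumed timestamp prefix, e.g. "2022-10-17 11:37:02".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def messaging_lines_near(lines, failure_time, window_minutes=5):
    """Return messaging-related lines within +/- window_minutes of failure_time."""
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a leading timestamp (e.g. tracebacks)
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if abs(stamp - failure_time) <= window and PATTERNS.search(line):
            hits.append(line.rstrip("\n"))
    return hits
```

Usage would look something like `messaging_lines_near(open(path), datetime(...))`, where `path` is whichever nova or neutron log file is being inspected and the datetime is the moment the VM creation failed.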


Adding in PIDONE as FYI if we need to go down the RMQ route.