Bug 2083244 - Unable to reliably create just one resource
Summary: Unable to reliably create just one resource
Keywords:
Status: CLOSED DUPLICATE of bug 2083245
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: OSP Team
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-05-09 14:43 UTC by Michał Dulko
Modified: 2022-05-09 15:06 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-09 15:06:42 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-15083 0 None None None 2022-05-09 14:45:52 UTC

Description Michał Dulko 2022-05-09 14:43:01 UTC
Description of problem:
This is more or less an iteration of [1]. We've run into similar problems with HAProxy returning 504 after neutron-server has spent 2 minutes processing the request, but now on calls to create networks or subnets. The main issue is that Kuryr then attempts to retry the creation. To do that, it queries Neutron to check whether a network or subnet with the specified name already exists. If it does not, Kuryr proceeds with recreation. It has happened multiple times that no network or subnet appeared in the list, yet we ended up with a duplicate. The only possibility is that the 504'd request was still being processed in the background.
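
The race described above can be illustrated with a minimal, self-contained sketch. It is not Kuryr's actual code and does not use the real Neutron client: `FakeNeutron`, `GatewayTimeout`, `ensure_network` and `demo` are hypothetical names, and the timings are scaled down from minutes to fractions of a second. The sketch only assumes what the report states: the proxy gives up before the backend finishes, and the backend keeps processing the request anyway.

```python
import threading
import time


class GatewayTimeout(Exception):
    """Stands in for the HTTP 504 returned by the proxy (hypothetical)."""


class FakeNeutron:
    """Hypothetical in-memory stand-in for the Neutron API behind a proxy."""

    def __init__(self, proxy_timeout, processing_time):
        self.proxy_timeout = proxy_timeout      # how long the proxy waits
        self.processing_time = processing_time  # how long the backend needs
        self.networks = []
        self._lock = threading.Lock()
        self._workers = []

    def _do_create(self, name):
        # The backend keeps working even if the proxy has already given up.
        time.sleep(self.processing_time)
        with self._lock:
            self.networks.append({"id": len(self.networks) + 1, "name": name})

    def create_network(self, name):
        worker = threading.Thread(target=self._do_create, args=(name,))
        worker.start()
        self._workers.append(worker)
        worker.join(self.proxy_timeout)
        if worker.is_alive():
            # The caller sees an error, but the request is still in flight.
            raise GatewayTimeout(name)
        return self.list_networks(name)[-1]

    def list_networks(self, name):
        with self._lock:
            return [n for n in self.networks if n["name"] == name]

    def wait_idle(self):
        # Wait for all in-flight backend requests to finish.
        for worker in self._workers:
            worker.join()


def ensure_network(client, name, retries=2):
    """Check-then-create retry loop, as described in the report."""
    for _ in range(retries):
        existing = client.list_networks(name)
        if existing:
            return existing[0]
        try:
            return client.create_network(name)
        except GatewayTimeout:
            continue  # nothing is listed yet, so the next pass recreates
    return None


def demo():
    client = FakeNeutron(proxy_timeout=0.05, processing_time=0.5)
    ensure_network(client, "ns-net", retries=2)
    client.wait_idle()
    # Number of networks that ended up sharing the same name.
    return len(client.list_networks("ns-net"))


if __name__ == "__main__":
    print("networks named ns-net:", demo())
```

In this sketch every timed-out attempt still produces a network once the backend finishes, so the existence check before each retry cannot prevent the duplicate.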

At this point I think request timeouts should be synchronized between the HAProxy and the neutron-servers behind it. Otherwise we can't really know if the request is being processed or not.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2024690


Version-Release number of selected component (if applicable):


How reproducible:
Fairly easily, once Neutron is under some load.

Steps to Reproduce:
1. Install OpenShift with Kuryr.
2. Create several namespaces with several pods in each, so that Kuryr starts making multiple concurrent requests to the Neutron API.

Actual results:
Some calls end up with 504, yet they're still being processed internally by neutron-server. Kuryr will never get back the IDs of the created resources.

Expected results:
If we get an error from the API, we should be guaranteed that no resource was created.

Additional info:
I'm happy to assist with reproducing this; it should be fairly easy.

Comment 1 Michał Dulko 2022-05-09 15:06:42 UTC

*** This bug has been marked as a duplicate of bug 2083245 ***

