Bug 1943599 - CAPO Port Re-Use logic lacks basic sanity checks, and prevents multiple NICs from being created on the same network
Keywords:
Status: CLOSED DUPLICATE of bug 1955969
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Matthew Booth
QA Contact: Jon Uriarte
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-03-26 14:49 UTC by egarcia
Modified: 2021-12-02 15:13 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-02 15:13:08 UTC
Target Upstream Version:
Embargoed:


Links
System ID: Github openshift cluster-api-provider-openstack pull 175
Private: 0
Priority: None
Status: closed
Summary: Bug 1948546: Port create bugs
Last Updated: 2021-05-25 14:32:08 UTC

Description egarcia 2021-03-26 14:49:26 UTC
https://github.com/openshift/cluster-api-provider-openstack/commit/42ae205f8c9046f6b95490d39fb531221bddd274#diff-74426b0daa2349d4dffb9633a1a3ef807c0136fc398d0275213a4734c336f067R290-R313

There isn't a good way for CAPO to prevent other asynchronous processes or other CAPO threads from taking the same ports. This could lead to a race condition when two of them try to claim the same port, which would result in a failed deployment. Instead, CAPO should focus on creating and destroying the resources it needs for a given machine.

Comment 1 egarcia 2021-03-26 14:53:25 UTC
ACKed by Kuryr, not a blocker for them.

Comment 3 egarcia 2021-03-29 15:15:12 UTC
2 new pieces of info:

1. CAPO and CAPI are completely sequential.
2. The pattern of looking up a resource by name and using it if found is used throughout the library, and even now in upstream.

I think the chance of a race is very low. It might be better to leave this for now and engage with upstream to figure this out if we want to change it.
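
A minimal Python sketch of the lookup-by-name pattern discussed in this comment. It is a model only: CAPO is written in Go, and `FakePortStore`, `get_or_create_port`, and the port name `worker-0-port` are hypothetical names invented for illustration, not CAPO or Neutron code. The stand-in store does not enforce unique names. The sketch suggests why the pattern behaves well when callers are sequential and why it would not if two callers interleaved, since the lookup and the create are separate steps.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakePortStore:
    """Hypothetical in-memory stand-in for a port API (not Neutron or CAPO code)."""
    ports: list = field(default_factory=list)

    def find_by_name(self, name: str) -> Optional[dict]:
        # Return the first port with this name, or None if there is none.
        return next((p for p in self.ports if p["name"] == name), None)

    def create(self, name: str) -> dict:
        # Names are not checked for uniqueness in this stand-in.
        port = {"id": len(self.ports) + 1, "name": name}
        self.ports.append(port)
        return port


def get_or_create_port(store: FakePortStore, name: str) -> dict:
    """Look up a port by name and reuse it if found, else create it."""
    existing = store.find_by_name(name)
    if existing is not None:
        return existing
    return store.create(name)


def sequential_callers(name: str) -> int:
    """Two callers run one after the other; returns how many ports exist."""
    store = FakePortStore()
    get_or_create_port(store, name)
    get_or_create_port(store, name)  # finds and reuses the first port
    return len(store.ports)


def interleaved_callers(name: str) -> int:
    """Both callers finish their lookup before either one creates."""
    store = FakePortStore()
    seen_by_a = store.find_by_name(name)
    seen_by_b = store.find_by_name(name)
    if seen_by_a is None:
        store.create(name)
    if seen_by_b is None:
        store.create(name)  # a second port with the same name
    return len(store.ports)


print(sequential_callers("worker-0-port"))   # 1
print(interleaved_callers("worker-0-port"))  # 2
```

With strictly sequential callers the second lookup always sees the first caller's port, which matches the low race risk described above.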

Comment 4 egarcia 2021-04-01 15:32:59 UTC
Re-use logic caused the same port to be attached as an interface twice in a customer's system:

Duplicate entry 'fa:16:3e:3c:a7:ff/9291d07d-69d3-4f61-a6a3-e105dd5663e0-0' for key 'uniq_virtual_interfaces0address0deleted so Failed to allocate the network(s)

Comment 6 egarcia 2021-04-07 16:00:10 UTC
Steps to reproduce: Define a machine spec with 2 subnets that are in the same network.
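
The reproduction can be modeled with a short Python sketch. It rests on one assumption that this report does not spell out: that the port name used for re-use depends on the machine and the network but not on the subnet. All names here (`port_name`, `ports_with_reuse`, `duplicate_ports`, `worker-0`, `net-1`) are hypothetical, and this is not the CAPO implementation, which is written in Go. Under that assumption, two subnets in the same network resolve to the same port, and a simple duplicate check of the kind the bug title asks for would catch it before the server is created.

```python
from itertools import count


def port_name(machine: str, network_id: str) -> str:
    # Assumption: the name ignores the subnet, so subnets that share a
    # network map to the same port name.
    return f"{machine}-{network_id}"


def ports_with_reuse(machine: str, subnets: list) -> list:
    """Return the port attached for each (subnet_id, network_id) pair,
    reusing any port that already exists under the same name."""
    by_name = {}
    ids = count(1)
    attached = []
    for _subnet_id, network_id in subnets:
        name = port_name(machine, network_id)
        if name not in by_name:
            by_name[name] = f"port-{next(ids)}"
        attached.append(by_name[name])
    return attached


def duplicate_ports(attached: list) -> list:
    """Sanity check: list every port that appears more than once."""
    seen = set()
    duplicates = []
    for port in attached:
        if port in seen and port not in duplicates:
            duplicates.append(port)
        seen.add(port)
    return duplicates


same_network = [("subnet-a", "net-1"), ("subnet-b", "net-1")]
attached = ports_with_reuse("worker-0", same_network)
print(attached)                   # ['port-1', 'port-1']
print(duplicate_ports(attached))  # ['port-1']
```

When the two subnets belong to different networks, the same functions return two distinct ports and an empty duplicate list.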

Comment 8 egarcia 2021-04-07 16:01:26 UTC
Issue has been filed upstream as well: https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/834

Comment 10 egarcia 2021-06-07 13:55:06 UTC
The naming duplication issues have been handled downstream by a separate patch. I will link it when I find it. This is not a blocker for 4.8.

Comment 13 ShiftStack Bugwatcher 2021-11-25 16:11:27 UTC
Removing the Triaged keyword because:
* the target release value is missing
* the QE automation assessment (flag qe_test_coverage) is missing

Comment 14 Matthew Booth 2021-12-02 15:13:08 UTC

*** This bug has been marked as a duplicate of bug 1955969 ***

