Bug 1955969

Summary: Workers cannot be deployed attached to multiple networks.
Product: OpenShift Container Platform Reporter: Adolfo Duarte <adduarte>
Component: Cloud Compute Assignee: Adolfo Duarte <adduarte>
Cloud Compute sub component: OpenStack Provider QA Contact: Udi Shkalim <ushkalim>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: high    
Priority: urgent CC: adduarte, egarcia, m.andre, mfedosin, pprinett, ushkalim
Version: 4.8 Keywords: TestBlocker, Triaged, UpcomingSprint
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
No doc needed
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-07 10:01:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 3 egarcia 2021-05-05 15:37:51 UTC
*** Bug 1956251 has been marked as a duplicate of this bug. ***

Comment 4 Adolfo Duarte 2021-05-05 15:57:59 UTC
In this section of the code [1], ports are created with the name of the node. When two or more ports are created for the same node, the second one matches the name lookup in this part of the code [2], so it is not created; the first port created with that name is returned instead.

So the problem seems to be related to ports sharing the same name.

This is also supported by the fact that using the "port" entry in the machineset twice, without providing a namesuffix here [3], causes similar errors.

These errors take the form of:

" oslo_db.exception.DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, "Duplicate entry 'fa:16:3e:14:5e:07/8effd769-18b5-443b-9c8c-bd4210b1d909-0' for key 'uniq_virtual_interfaces0address0deleted'")
[SQL: INSERT INTO virtual_interfaces (created_at, updated_at, deleted_at, deleted, address, network_id, instance_uuid, uuid, tag) VALUES (%(created_at)s, %(updated_at)s, %(deleted_at)s, %(deleted)s, %(address)s, %(network_id)s, %(instance_uuid)s, %(uuid)s, %(tag)s)]
"

[1] https://github.com/openshift/cluster-api-provider-openstack/blob/master/pkg/cloud/openstack/clients/machineservice.go#L337
[2] https://github.com/openshift/cluster-api-provider-openstack/blob/master/pkg/cloud/openstack/clients/machineservice.go#L330
[3] https://github.com/openshift/cluster-api-provider-openstack/blob/master/pkg/cloud/openstack/clients/machineservice.go#L619
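
The name collision described above can be illustrated with a minimal sketch. This is not the CAPO code (which is written in Go); `FakeNetworking`, `get_or_create_port`, and both `attach_*` helpers are hypothetical names, and the suffix scheme (appending the network ID to the node name) is an assumption chosen only to show why unique names avoid the problem.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Port:
    name: str
    network_id: str


@dataclass
class FakeNetworking:
    """In-memory stand-in for the networking service (hypothetical)."""
    ports: List[Port] = field(default_factory=list)

    def get_or_create_port(self, name: str, network_id: str) -> Port:
        # The lookup is by name only, so an existing port with this name
        # is returned even if it belongs to a different network.
        for port in self.ports:
            if port.name == name:
                return port
        port = Port(name, network_id)
        self.ports.append(port)
        return port


def attach_by_node_name(api: FakeNetworking, node: str, network_ids: List[str]) -> List[Port]:
    # Every port is named after the node: the second request collides.
    return [api.get_or_create_port(node, net) for net in network_ids]


def attach_with_suffix(api: FakeNetworking, node: str, network_ids: List[str]) -> List[Port]:
    # Assumed naming scheme: node name plus network ID, unique per network.
    return [api.get_or_create_port(f"{node}-{net}", net) for net in network_ids]


if __name__ == "__main__":
    networks = ["net-a", "net-b"]
    colliding = FakeNetworking()
    attach_by_node_name(colliding, "worker-0", networks)
    print(len(colliding.ports))  # 1: the second network reuses the first port

    unique = FakeNetworking()
    attach_with_suffix(unique, "worker-0", networks)
    print(len(unique.ports))  # 2: one port per network
```

With identical names, the same port is handed back for both networks, which is consistent with the duplicate-entry error quoted above.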

Comment 5 Adolfo Duarte 2021-05-07 01:03:48 UTC
Patch is ready for this: https://github.com/openshift/cluster-api-provider-openstack/pull/181

Comment 7 Pierre Prinetti 2021-05-07 14:52:37 UTC
Is there a plan to modify the way Terraform names the Control Plane ports at install time, to match what would be done by CAPO in day 2?

Comment 9 Adolfo Duarte 2021-05-21 05:12:06 UTC
*** Bug 1936511 has been marked as a duplicate of this bug. ***

Comment 10 Pierre Prinetti 2021-05-21 12:26:11 UTC
Master scaling is not officially supported in OCP as far as I know. However, a failed master machine is expected to be replaced by CAPO. At that point, there may be an inconsistency between the names of the ports created at install time and those created on day 2 as replacements for failed machines.

This surprising but not immediately problematic behaviour can be fixed down the road; I don't want to block this patch for such a minor issue.

Comment 11 Adolfo Duarte 2021-05-21 17:56:21 UTC
If the masters are replaced by CAPO, the new machines will come up with a port that has a new name. Port name uniqueness is thus maintained; the lack of it is the root cause that this patch addresses.

I think correcting the names in Terraform is just a cosmetic change, and you could argue that flagging Terraform-created ports by NOT giving them the uuid of the subnet might be a useful piece of information.
The only reason I would backport this fix to Terraform would be a scenario where port name uniqueness is not assured.

Comment 12 Udi Shkalim 2021-05-23 14:12:56 UTC
Verified on:
(shiftstack) [stack@undercloud-0 ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-05-21-233425   True        False         138m    Cluster version is 4.8.0-0.nightly-2021-05-21-233425

snippet from install-config.yaml:
compute:
- name: worker
  platform:
    openstack:
      zones: []
      additionalNetworkIDs: ['cc3ca09f-6ac3-4e36-bf8a-4ecce205469a']

(shiftstack) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------------------------+--------------------+--------+
| ID                                   | Name                        | Status | Networks                                                                 | Image              | Flavor |
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------------------------+--------------------+--------+
| 95856884-1716-40db-8832-c302a087c45c | ostest-q76wq-worker-0-snszj | ACTIVE | ostest-q76wq-openshift=10.196.3.83; provider-net-vlan128=192.168.128.105 | ostest-q76wq-rhcos |        |
| 9dff12e4-2b94-420d-b935-203faf84c043 | ostest-q76wq-worker-0-kscpc | ACTIVE | ostest-q76wq-openshift=10.196.1.74; provider-net-vlan128=192.168.128.125 | ostest-q76wq-rhcos |        |
| acad1dd0-ef98-4874-b6dc-343aab34730b | ostest-q76wq-worker-0-9f7ck | ACTIVE | ostest-q76wq-openshift=10.196.0.42; provider-net-vlan128=192.168.128.177 | ostest-q76wq-rhcos |        |
| 7209b9b8-04b6-4bbe-b548-6e852efe3eba | ostest-q76wq-master-2       | ACTIVE | ostest-q76wq-openshift=10.196.1.116                                      | ostest-q76wq-rhcos |        |
| 5dffa86c-cea7-4fd6-ab22-3fb4c98bd6f0 | ostest-q76wq-master-1       | ACTIVE | ostest-q76wq-openshift=10.196.2.124                                      | ostest-q76wq-rhcos |        |
| 2af75cec-ec12-44c0-b525-db33b6c3ff5c | ostest-q76wq-master-0       | ACTIVE | ostest-q76wq-openshift=10.196.2.51                                       | ostest-q76wq-rhcos |        |
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------------------------+--------------------+--------+
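
The check done by eye on the table above could be scripted along these lines. This is a rough sketch that assumes the `Networks` cell keeps the `name=address; name=address` layout shown in this output; `parse_networks` and `has_networks` are hypothetical helper names, not part of any OpenStack client.

```python
from typing import Dict, Iterable, List


def parse_networks(cell: str) -> Dict[str, List[str]]:
    """Parse 'net-a=10.0.0.1; net-b=192.168.0.2' into {network: [addresses]}."""
    networks: Dict[str, List[str]] = {}
    for entry in cell.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        name, _, addresses = entry.partition("=")
        # A network may carry several comma-separated addresses.
        networks[name.strip()] = [a.strip() for a in addresses.split(",") if a.strip()]
    return networks


def has_networks(cell: str, required: Iterable[str]) -> bool:
    """Return True if the server is attached to every required network."""
    attached = parse_networks(cell)
    return all(name in attached for name in required)


if __name__ == "__main__":
    worker = "ostest-q76wq-openshift=10.196.3.83; provider-net-vlan128=192.168.128.105"
    master = "ostest-q76wq-openshift=10.196.1.116"
    required = ["ostest-q76wq-openshift", "provider-net-vlan128"]
    print(has_networks(worker, required))  # True
    print(has_networks(master, required))  # False
```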

Comment 15 Matthew Booth 2021-12-02 15:13:08 UTC
*** Bug 1943599 has been marked as a duplicate of this bug. ***