Bug 1944365

Summary: openstack: missing validation for apiVIP and ingressVIP
Product: OpenShift Container Platform
Reporter: Emilien Macchi <emacchi>
Component: Installer
Assignee: Pierre Prinetti <pprinett>
Installer sub component: OpenShift on OpenStack
QA Contact: Itay Matza <imatza>
Status: CLOSED ERRATA
Docs Contact:
Severity: low
Priority: low
CC: bscott, fraziersexton7236166, m.andre, mbooth, pprinett
Version: 4.8
Keywords: Triaged
Target Milestone: ---
Target Release: 4.12.0
Hardware: All
OS: All
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
* Previously, users could manually set the API and Ingress virtual IP addresses to values that conflicted with the allocation pool of the DHCP server when installing a cluster on OpenStack. This could cause the DHCP server to assign one of the VIP addresses to a new machine, which would fail to start. In this update, the installation program validates the user-provided VIP addresses to ensure that they do not conflict with any DHCP pools. (link:https://bugzilla.redhat.com/show_bug.cgi?id=1944365[*BZ1944365*])
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-01-17 19:46:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Emilien Macchi 2021-03-29 19:23:10 UTC
By default, the API VIP takes x.x.x.5 and the Ingress VIP takes x.x.x.7 from the network’s CIDR block. To override these default values, we can set values for platform.openstack.apiVIP and platform.openstack.ingressVIP that are outside of the DHCP allocation pool.

At the moment, the installer has no validation to make sure the end user set these values correctly.
If they are wrong, the cluster deployment runs into connectivity problems with the API.
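
To illustrate the defaults mentioned above, here is a minimal Python sketch that derives the .5 and .7 addresses from a machine network CIDR. The function name `default_vips` is hypothetical and this is not the installer's code (the installer is written in Go); it only shows the arithmetic.

```python
import ipaddress
from typing import Tuple


def default_vips(machine_cidr: str) -> Tuple[str, str]:
    """Return the (apiVIP, ingressVIP) defaults for a machine network.

    Assumption: the defaults are the network address plus 5 and plus 7,
    as described in this report.
    """
    net = ipaddress.ip_network(machine_cidr)
    # Indexing a network returns the address at that offset from its base.
    return str(net[5]), str(net[7])


print(default_vips("172.16.0.0/24"))
```

If the subnet's DHCP allocation pool includes these two addresses, the defaults themselves conflict with the pool.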

Comment 1 Adolfo Duarte 2021-04-20 04:50:43 UTC
Currently there is some sort of check.
Using:
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.128.0/17
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  openstack:
    apiFloatingIP: 192.168.25.59
    apiVIP: 10.0.18.5
    cloud: openshift
    defaultMachinePlatform:
      type: m1.xxlarge
    externalDNS: null
    externalNetwork: hostonly
    ingressVIP: 10.0.128.7

then running ./openshift-install create ...

I get
[stack@standalone fake]$ ./openshift-install create install-config
FATAL failed to fetch Install Config: failed to load asset "Install Config": invalid "install-config.yaml" file: platform.openstack.apiVIP: Invalid value: "10.0.18.5": IP is not in the machineNetwork

doing the same for ingressVIP I get: 

[stack@standalone fake]$ ./openshift-install create install-config
FATAL failed to fetch Install Config: failed to load asset "Install Config": invalid "install-config.yaml" file: platform.openstack.ingressVIP: Invalid value: "10.0.18.7": IP is not in the machineNetwork 

@emacchi, do you have an install-config.yaml where the check might be failing?

Comment 2 Emilien Macchi 2021-04-20 13:43:00 UTC
My bug description was really incomplete, I'm sorry for that.
I reproduced the issue in the context of a Bring Your Own Network, and if your subnet has an allocation pool that includes .5 and .7, it caused the error that I had. I'll try to reproduce it quickly.

Note that your error is because you have a typo: your machineNetwork is 10.0.128.0/17, not 10.0.18.0/17.
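
The existing "IP is not in the machineNetwork" check can be approximated with the short Python sketch below. `in_machine_network` is a hypothetical helper name, not an installer function; it only shows why 10.0.18.5 is rejected for a 10.0.128.0/17 network.

```python
import ipaddress


def in_machine_network(ip: str, machine_cidr: str) -> bool:
    """Return True if the address belongs to the machine network CIDR."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(machine_cidr)


# 10.0.128.0/17 spans 10.0.128.0 through 10.0.255.255.
print(in_machine_network("10.0.18.5", "10.0.128.0/17"))
print(in_machine_network("10.0.128.7", "10.0.128.0/17"))
```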

Comment 4 Martin André 2021-06-01 13:35:38 UTC
Related PR validating the API and Ingress VIPs are different: https://github.com/openshift/installer/pull/4946

We can build upon the above PR to add a check that the API and Ingress VIPs are outside of the DHCP allocation pool; however, I wouldn't be comfortable with the new check being fatal, as there are probably a few corner cases. For example, it's possible for the VIPs to be in the allocation pool when the DHCP server is configured in such a way that it does not allocate those IPs other than to the API and Ingress ports.

Comment 6 ShiftStack Bugwatcher 2021-11-25 16:11:28 UTC
Removing the Triaged keyword because:
* the target release value is missing

* the QE automation assessment (flag qe_test_coverage) is missing

Comment 10 Pierre Prinetti 2022-07-07 06:29:32 UTC
Note for the verifier:

this is how I tested my PR --> https://github.com/openshift/installer/pull/6069#issuecomment-1174872791

Comment 12 Itay Matza 2022-08-04 11:25:21 UTC
Verified with openshift-install 4.12.0-0.nightly-2022-08-01-151317 on top of RHOS-16.1-RHEL-8-20220329.n.1:

1. Configured an OpenStack subnet with the following allocation pool:
```
	$ openstack subnet show restricted_subnet -c allocation_pools
	+------------------+-------------------------+
	| Field            | Value                   |
	+------------------+-------------------------+
	| allocation_pools | 172.16.0.4-172.16.0.254 |
	+------------------+-------------------------+

```

1.1. Configured an ingressVIP outside the machineNetwork:
```
	$ grep -A 1 "machineNetwork" install-config.yaml && grep "VIP" install-config.yaml
	  machineNetwork:
		- cidr: "172.16.0.0/25"
		apiVIP: "172.16.0.4"
		ingressVIP: "172.16.0.254"
	$ openshift-install create manifests
	ERROR failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: invalid "install-config.yaml" file: platform.openstack.ingressVIP: Invalid value: "172.16.0.254": IP is not in the machineNetwork 
```

1.2. Configured the VIPs to overlap with the allocation pool:
```
	$ grep -A 1 "machineNetwork" install-config.yaml && grep "VIP" install-config.yaml
	  machineNetwork:
		- cidr: "172.16.0.0/24"
		apiVIP: "172.16.0.4"
		ingressVIP: "172.16.0.254"
	$ openshift-install create manifests
	ERROR failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: [platform.openstack.apiVIP: Invalid value: "172.16.0.4": apiVIP can not fall in a MachineNetwork allocation pool, platform.openstack.ingressVIP: Invalid value: "172.16.0.254": ingressVIP can not fall in a MachineNetwork allocation pool]
```

1.3. Configured the ingressVIP to overlap with the allocation pool:
```
	$ grep -A 1 "machineNetwork" install-config.yaml && grep "VIP" install-config.yaml
	  machineNetwork:
		- cidr: "172.16.0.0/24"
		apiVIP: "172.16.0.3"
		ingressVIP: "172.16.0.254"
	$ openshift-install create manifests
	ERROR failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: platform.openstack.ingressVIP: Invalid value: "172.16.0.254": ingressVIP can not fall in a MachineNetwork allocation pool
```

1.4. Configured the apiVIP to overlap with the allocation pool:
```
	$ grep -A 1 "machineNetwork" install-config.yaml && grep "VIP" install-config.yaml
	  machineNetwork:
		- cidr: "172.16.0.0/24"
		apiVIP: "172.16.0.4"
		ingressVIP: "172.16.0.3"
	$ openshift-install create manifests
	ERROR failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: platform.openstack.apiVIP: Invalid value: "172.16.0.4": apiVIP can not fall in a MachineNetwork allocation pool
```

1.5. Used the default VIPs, which overlap with the allocation pool:
```
	$ grep -A 1 "machineNetwork" install-config.yaml && grep "VIP" install-config.yaml
	  machineNetwork:
		- cidr: "172.16.0.0/24"
		#apiVIP: "172.16.0.4"
		#ingressVIP: "172.16.0.3"
	$ openshift-install create manifests
	ERROR failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: [platform.openstack.apiVIP: Invalid value: "172.16.0.5": apiVIP can not fall in a MachineNetwork allocation pool, platform.openstack.ingressVIP: Invalid value: "172.16.0.7": ingressVIP can not fall in a MachineNetwork allocation pool]
```
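
The results of steps 1.2 to 1.5 can be reasoned about with a small Python sketch of a range check against the single allocation pool 172.16.0.4-172.16.0.254. `vip_in_pool` is a hypothetical name for illustration; the actual validation lives in the installer PR linked in comment 10.

```python
import ipaddress


def vip_in_pool(vip: str, start: str, end: str) -> bool:
    """Return True if vip lies inside the inclusive range [start, end]."""
    addr = ipaddress.ip_address(vip)
    return ipaddress.ip_address(start) <= addr <= ipaddress.ip_address(end)


POOL = ("172.16.0.4", "172.16.0.254")

# Addresses used in steps 1.2 to 1.5, including the defaults .5 and .7.
for vip in ("172.16.0.3", "172.16.0.4", "172.16.0.5", "172.16.0.7", "172.16.0.254"):
    print(vip, vip_in_pool(vip, *POOL))
```

Only 172.16.0.3 falls outside the pool, which matches the cases above where a VIP set to that address was not reported.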

2. Configured the VIPs outside of the allocation pool, and the manifests were created:
```
	$ openstack subnet unset restricted_subnet --allocation-pool start=172.16.0.4,end=172.16.0.254
	$ openstack subnet set restricted_subnet --allocation-pool start=172.16.0.8,end=172.16.0.254
	$ openstack subnet show restricted_subnet -c allocation_pools
	+------------------+-------------------------+
	| Field            | Value                   |
	+------------------+-------------------------+
	| allocation_pools | 172.16.0.8-172.16.0.254 |
	+------------------+-------------------------+
	$ grep -A 1 "machineNetwork" install-config.yaml && grep "VIP" install-config.yaml
	  machineNetwork:
		- cidr: "172.16.0.0/24"
		apiVIP: "172.16.0.5"
		ingressVIP: "172.16.0.7"
	$ openshift-install create manifests
	INFO Credentials loaded from file "/home/cloud-user/clouds.yaml" 
	INFO Consuming Install Config from target directory 
	INFO Manifests created in: manifests and openshift
```

3. Configured multiple allocation pools:
```
	$ openstack subnet set restricted_subnet --allocation-pool start=172.16.0.8,end=172.16.0.100
	$ openstack subnet set restricted_subnet --allocation-pool start=172.16.0.110,end=172.16.0.254
	$ openstack subnet show restricted_subnet -c allocation_pools                                                                                                                  
	+------------------+---------------------------------------------------+
	| Field            | Value                                             |
	+------------------+---------------------------------------------------+                                                                                                                                          
	| allocation_pools | 172.16.0.8-172.16.0.100,172.16.0.110-172.16.0.254 |                                                                                                                                          
	+------------------+---------------------------------------------------+
```

3.1. Configured the VIPs to overlap with the allocation pools:
```
	$ grep -A 1 "machineNetwork" install-config.yaml && grep "VIP" install-config.yaml
	  machineNetwork:
		- cidr: "172.16.0.0/24"
		apiVIP: "172.16.0.10"
		ingressVIP: "172.16.0.110"
	$ openshift-install create manifests
	ERROR failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: [platform.openstack.ingressVIP: Invalid value: "172.16.0.110": ingressVIP can not fall in a MachineNetwork allocation pool, platform.openstack.apiVIP: Invalid value: "172.16.0.10": apiVIP can not fall in a MachineNetwork allocation pool]
```

3.2. Configured the VIPs outside of the allocation pools, and the manifests were created:
```
	$ grep -A 1 "machineNetwork" install-config.yaml && grep "VIP" install-config.yaml
	  machineNetwork:
		- cidr: "172.16.0.0/24"
		apiVIP: "172.16.0.101"
		ingressVIP: "172.16.0.102"
	$ openshift-install create manifests
	WARNING Discarding the Master Machines that was provided in the target directory because its dependencies are dirty and it needs to be regenerated
	INFO Credentials loaded from file "/home/cloud-user/clouds.yaml"
	INFO Consuming Install Config from target directory
	WARNING Discarding the Worker Machines that was provided in the target directory because its dependencies are dirty and it needs to be regenerated
	WARNING Discarding the Common Manifests that was provided in the target directory because its dependencies are dirty and it needs to be regenerated
	WARNING Discarding the Openshift Manifests that was provided in the target directory because its dependencies are dirty and it needs to be regenerated
	INFO Consuming OpenShift Install (Manifests) from target directory
	INFO Manifests created in: manifests and openshift
```
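
For the multiple-pool case in step 3, the following Python sketch parses the `allocation_pools` value as displayed in the table above and reports which VIPs fall inside any pool. The names `parse_pools` and `conflicting_vips` are hypothetical, and the "start-end,start-end" string format is assumed from the CLI output shown here; this is not the installer's implementation.

```python
import ipaddress


def parse_pools(value):
    """Parse 'start-end,start-end' into a list of (start, end) addresses."""
    pools = []
    for part in value.split(","):
        start, end = part.strip().split("-")
        pools.append((ipaddress.ip_address(start), ipaddress.ip_address(end)))
    return pools


def conflicting_vips(vips, pools):
    """Return the names of the VIPs that fall inside any allocation pool."""
    conflicts = []
    for name, ip in vips.items():
        addr = ipaddress.ip_address(ip)
        if any(start <= addr <= end for start, end in pools):
            conflicts.append(name)
    return conflicts


pools = parse_pools("172.16.0.8-172.16.0.100,172.16.0.110-172.16.0.254")
print(conflicting_vips({"apiVIP": "172.16.0.10", "ingressVIP": "172.16.0.110"}, pools))
print(conflicting_vips({"apiVIP": "172.16.0.101", "ingressVIP": "172.16.0.102"}, pools))
```

The first call reports both VIPs, as in step 3.1; the second reports none, because 172.16.0.101 and 172.16.0.102 sit in the gap between the two pools, as in step 3.2.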

Comment 15 errata-xmlrpc 2023-01-17 19:46:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399
