Bug 1843587

Summary: Installation fails if user takes full provisioningDHCPRange range
Product: OpenShift Container Platform Reporter: rlopez
Component: Installer Assignee: Stephen Benjamin <stbenjam>
Installer sub component: OpenShift on Bare Metal IPI QA Contact: Shelly Miron <smiron>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: medium CC: beth.white, calfonso, pablo.iranzo, sdasu, stbenjam
Version: 4.4 Keywords: Triaged
Target Milestone: --- Flags: smiron: needinfo+
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 2020-07-31 Doc Type: Bug Fix
Doc Text:
Previously, a user could configure the DHCP range for the provisioning network to consume the entire subnet, including the cluster provisioning VIP. Now, the VIPs are validated to ensure they do not overlap with the DHCP range.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 16:04:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1831748    

Description rlopez 2020-06-03 15:29:10 UTC
Description of problem:

Currently, the installer does not validate provisioningDHCPRange when a user assigns the entire subnet to it. The default range covers IPs 10-100; however, if a user specifies 1-254 (assuming the default 172.22.0.0/24 in this example), the install fails because no address is left for the bootstrap VM or the cluster provisioning IP.

The solution would be for the installer to ensure that enough IPs remain available in the subnet to assign a bootstrap VM IP and a cluster provisioning IP.
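
As a rough illustration of the check proposed above, the following Python sketch counts the usable addresses that a DHCP range leaves free in a subnet. The function name free_addresses is hypothetical and this is not the installer's implementation. It assumes an IPv4 subnet in which the network and broadcast addresses are not usable.

```python
import ipaddress


def free_addresses(cidr, dhcp_start, dhcp_end):
    """Count usable host addresses in cidr that lie outside the DHCP range.

    Hypothetical helper for illustration only (IPv4 assumption).
    """
    network = ipaddress.ip_network(cidr)
    start = ipaddress.ip_address(dhcp_start)
    end = ipaddress.ip_address(dhcp_end)
    if start > end or start not in network or end not in network:
        raise ValueError("DHCP range must be ordered and inside the subnet")

    # Usable hosts exclude the network and broadcast addresses.
    first_host = network.network_address + 1
    last_host = network.broadcast_address - 1
    usable = int(last_host) - int(first_host) + 1

    # Clamp the DHCP range to the usable hosts before counting it.
    low = max(start, first_host)
    high = min(end, last_host)
    in_range = max(0, int(high) - int(low) + 1)
    return usable - in_range


default_free = free_addresses("172.22.0.0/24", "172.22.0.10", "172.22.0.100")
full_free = free_addresses("172.22.0.0/24", "172.22.0.1", "172.22.0.254")
```

With the default 10-100 range, default_free is 163. With 1-254, full_free is 0, so nothing is left for the bootstrap VM IP or the cluster provisioning IP, and a validator could reject the configuration whenever fewer than two addresses remain.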

Comment 5 Shelly Miron 2020-09-01 07:49:34 UTC
I modified provisioningDHCPRange in install_config.yaml as follows:

platform:
  baremetal:
    provisioningDHCPExternal: false
    externalBridge: baremetal-0
    provisioningBridge: provisioning-0
    libvirtURI: qemu+ssh://root.qe.lab.redhat.com/system
    provisioningDHCPRange: fd00:1101:0:0:0:0:0:4,fd00:1101:0:0:FFFF:FFFF:FFFF:FFFF
    provisioningNetworkInterface: enp4s0
    provisioningNetworkCIDR: fd00:1101::/64

I set the field to the range fd00:1101:0:0:0:0:0:4 to fd00:1101:0:0:FFFF:FFFF:FFFF:FFFF. I assume it is supposed to fail? It did not give any error message.

When given the range fd00:1101:0:0:0:0:0:3 to fd00:1101:0:0:FFFF:FFFF:FFFF:FFFF, it failed with this error message:

FATAL failed to fetch Cluster: failed to fetch dependency of "Cluster": failed to generate asset "Platform Provisioning Check": platform.baremetal.provisioningHostIP: 
Invalid value: "fd00:1101::3": "fd00:1101::3" overlaps with the allocated DHCP range
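
The two results above can be reproduced with a small Python sketch of an overlap check. The function check_dhcp_overlap is a hypothetical stand-in, not the installer's code. It assumes the "start,end" form of provisioningDHCPRange shown in the config and models its message on the error text quoted above.

```python
import ipaddress


def check_dhcp_overlap(field, ip, dhcp_range):
    """Return an error string if ip lies inside dhcp_range, else None.

    dhcp_range uses the comma-separated "start,end" form.
    Hypothetical helper for illustration only.
    """
    start_text, end_text = (part.strip() for part in dhcp_range.split(","))
    start = ipaddress.ip_address(start_text)
    end = ipaddress.ip_address(end_text)
    addr = ipaddress.ip_address(ip)
    if start <= addr <= end:
        return (
            f'{field}: Invalid value: "{addr}": '
            f'"{addr}" overlaps with the allocated DHCP range'
        )
    return None


RANGE_END = "fd00:1101:0:0:FFFF:FFFF:FFFF:FFFF"
FIELD = "platform.baremetal.provisioningHostIP"

# Range starting at :4 leaves fd00:1101::3 outside the DHCP range.
ok = check_dhcp_overlap(FIELD, "fd00:1101::3", "fd00:1101:0:0:0:0:0:4," + RANGE_END)
# Range starting at :3 includes fd00:1101::3, so an error is returned.
err = check_dhcp_overlap(FIELD, "fd00:1101::3", "fd00:1101:0:0:0:0:0:3," + RANGE_END)
```

Here ok is None, and err carries the "overlaps with the allocated DHCP range" message for fd00:1101::3.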

Comment 6 Stephen Benjamin 2020-09-01 13:07:36 UTC
:4 to the end should be fine if there's enough room for the VIPs.  What was your API VIP? We might not be checking for that one...

Comment 7 Shelly Miron 2020-09-01 13:38:07 UTC
(In reply to Stephen Benjamin from comment #6)
> :4 to the end should be fine if there's enough room for the VIPs.  What was
> your API VIP? We might not be checking for that one...

platform:
  baremetal:
    provisioningDHCPExternal: false
    externalBridge: baremetal-0 
    provisioningBridge: provisioning-0 
    libvirtURI: qemu+ssh://root.qe.lab.redhat.com/system
    provisioningNetworkInterface: enp4s0
    provisioningNetworkCIDR: fd00:1101::/64
    bootstrapOSImage: http://registry.ocp-edge-cluster-0.qe.lab.redhat.com:8080/images/rhcos-46.82.202008181646-0-qemu.x86_64.qcow2.gz?sha256=11b8410e74539778ea556bb80de1903414d019da4d5ac4c1ae8ac9371a07e50b
    clusterOSImage: http://registry.ocp-edge-cluster-0.qe.lab.redhat.com:8080/images/rhcos-46.82.202008181646-0-openstack.x86_64.qcow2.gz?sha256=ce56e6de236d9bfb8beb519c3b80681f8077f31db2c64f2fbffd3e097855dde4
    apiVIP: 192.168.123.5 <<<<<------
    dnsVIP: 192.168.123.6
    ingressVIP: 192.168.123.10
    hosts:
        .......

(IPv4 baremetal network, IPv6 provisioning network)

Comment 8 Stephen Benjamin 2020-09-01 13:51:20 UTC
Sorry, I hadn't had coffee yet -- this is expected, then. We give an error if the user's DHCP range includes the provisioning VIPs.  Your provisioning VIPs are :2 and :3, so a range starting at :4 is safe.  The error you get when the range includes :3 is correct. Does that make sense?
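
To make the :2, :3 and :4 reasoning concrete, here is a minimal Python sketch. It assumes the two provisioning addresses sit at offsets 2 and 3 from the start of provisioningNetworkCIDR, which is inferred from this comment rather than taken from the installer source. Both function names are hypothetical.

```python
import ipaddress


def assumed_provisioning_ips(cidr):
    """Return the two addresses assumed to be reserved in the subnet.

    Assumption: they are the network address plus 2 and plus 3.
    """
    network = ipaddress.ip_network(cidr)
    return [network.network_address + 2, network.network_address + 3]


def lowest_safe_dhcp_start(cidr):
    """Return the first address above both assumed reserved addresses."""
    return max(assumed_provisioning_ips(cidr)) + 1


reserved = [str(ip) for ip in assumed_provisioning_ips("fd00:1101::/64")]
safe_start = str(lowest_safe_dhcp_start("fd00:1101::/64"))
```

For fd00:1101::/64 this gives fd00:1101::2 and fd00:1101::3 as reserved and fd00:1101::4 as the lowest safe start, which matches the range that passed validation.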

Comment 9 Shelly Miron 2020-09-02 06:19:44 UTC
(In reply to Stephen Benjamin from comment #8)
> Sorry, I hadn't had coffee yet -- this is expected, then. We give an error
> if the user's DHCP range includes the provisioning VIPs.  Your provisioning
> VIPs are :2 and :3, so a range starting at :4 is safe.  The error you get
> when the range includes :3 is correct. Does that make sense?

Yes. Switching to verified.

Comment 11 errata-xmlrpc 2020-10-27 16:04:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196