Description of problem:
The installer is currently validating and warning for a CIDR overlap that doesn't exist:
https://github.com/openshift/installer/blob/master/pkg/validate/validate.go#L138

After confirming with SDN, this Docker Bridge subnet does not exist, nor is it used, in OCP v4 on CRI-O.

How reproducible:
always

Steps to Reproduce:
1. Provide a Machine CIDR of "172.17.0.0/16" in the installer
2. Try to install your cluster

Actual results:
"time="2020-04-21T10:45:21Z" level=debug msg="installer console log: level=fatal msg=\"failed to fetch Master Machines: failed to load asset \\\"Install Config\\\": invalid \\\"install-config.yaml\\\" file: networking.clusterNetwork[0].cidr: Invalid value: \\\"172.16.0.0/15\\\": overlaps with default Docker Bridge subnet (172.16.0.0/15)\"\n" installID=qn46zp8f"

Expected results:
This validation is invalid and should be removed.

Additional info:
Validation appears to have been originally introduced in https://github.com/openshift/installer/pull/342/files#diff-474a68b8f7d6552a4f35f1d003d86e8bR463. This *was* a valid validation in OCP v3, but is no longer valid in OCP v4.
Docker will not be present on any OCP 4.x clusters, as docker is not installed by RHCOS and should not even be used on BYOH. As such https://github.com/openshift/installer/blob/master/pkg/validate/validate.go#L137 is wrong and should be removed. The code in question validates whether the ClusterCIDR, ServiceCIDR, and MachineNetworks overlap with "special" subnets. Docker won't be one of those in 4.x. The only thing I can think of to check is the podman/crio default bridge subnet, but normally nodes should not be running non-host-network containers in an OpenShift cluster outside of kubelet and the CNI plugin. That subnet is 10.88.0.0/16
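For readers unfamiliar with the check under discussion, here is a minimal sketch of a CIDR overlap test of this kind. It is not the installer's code: the `overlaps` and `mustParse` helpers are hypothetical names, and 172.17.0.0/16 is assumed to be the Docker default bridge subnet the validation compares against.

```go
package main

import (
	"fmt"
	"net"
)

// overlaps reports whether two CIDR blocks share any address.
// Two CIDR blocks overlap exactly when one contains the other's network address.
func overlaps(a, b *net.IPNet) bool {
	return a.Contains(b.IP) || b.Contains(a.IP)
}

// mustParse is a hypothetical helper for this sketch; it panics on bad input.
func mustParse(cidr string) *net.IPNet {
	_, n, err := net.ParseCIDR(cidr)
	if err != nil {
		panic(err)
	}
	return n
}

func main() {
	// Assumption: Docker's customary default bridge subnet.
	dockerBridge := mustParse("172.17.0.0/16")

	// CIDRs taken from the install configs quoted in this bug.
	for _, c := range []string{"172.16.0.0/14", "172.16.0.0/12", "10.128.0.0/14", "172.30.0.0/16"} {
		fmt.Printf("%s overlaps %s: %v\n", c, dockerBridge, overlaps(mustParse(c), dockerBridge))
	}
}
```

The same comparison could be pointed at 10.88.0.0/16 if a podman/crio bridge check were ever wanted.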
> "time="2020-04-21T10:45:21Z" level=debug msg="installer console log: level=fatal msg=\"failed to fetch Master Machines: failed to load asset \\\"Install Config\\\": invalid \\\"install-config.yaml\\\" file: networking.clusterNetwork[0].cidr: Invalid value: \\\"172.16.0.0/15\\\": overlaps with default Docker Bridge subnet (172.16.0.0/15)\"\n" installID=qn46zp8f"

The validation message is clear that the installer does not support networks that overlap with the Docker bridge subnet (172.16.0.0/15). This is not a bug.

If the user requirement is to remove this restriction, please track the feature work in JIRA by creating a story in
- https://issues.redhat.com/projects/RFE or,
- https://issues.redhat.com/projects/CORS
@abhinav it is a bug, because docker has not been present on *any* OpenShift 4.x cluster ever. It is a bug that a user of OpenShift 4.x cannot assign a ClusterCIDR, ServiceCIDR, or MachineNetwork that overlaps with a subnet that is not, and will never be, used by a docker bridge in OpenShift or RHEL.
Reproducing it with openshift-install 4.5.0-rc.5.

Install a GCP cluster with the network set as below:

networking:
  clusterNetwork:
  - cidr: 172.16.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.211.66.0/23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16

# openshift-install create cluster --dir bz
FATAL failed to fetch Metadata: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.clusterNetwork[0].cidr: Invalid value: "172.16.0.0/14": overlaps with default Docker Bridge subnet (172.16.0.0/14)

Verifying it with openshift-install 4.6.0-0.nightly-2020-08-02-091622 on GCP:

# openshift-install create cluster --dir bz
WARNING networking.clusterNetwork[0]: 172.16.0.0/14 overlaps with default Docker Bridge subnet
INFO Credentials loaded from file "/root/.gcp/osServiceAccount.json"
INFO Consuming Install Config from target directory
INFO Creating infrastructure resources...
...
INFO Waiting up to 10m0s for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/root/build/bz/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.yybz.qe.gcp.devcluster.openshift.com
INFO Login to the console with user: "kubeadmin", and password: "4SZZN-zSrQz-Fwza4-uoAiN"
INFO Time elapsed: 35m6s

A warning is shown for the CIDR, and the installation succeeds.
Verifying it on libvirt with the setting below:

networking:
  clusterNetwork:
  - cidr: 172.16.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.211.66.0/23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  libvirt:
    URI: qemu+tcp://192.168.122.1/system
    network:
      if: mybridge0

# openshift-install create cluster --dir bz1
FATAL failed to fetch Metadata: failed to load asset "Install Config": invalid "install-config.yaml" file: [networking.clusterNetwork[0]: Invalid value: "172.16.0.0/14": overlaps with default Docker Bridge subnet, platform: Invalid value: "libvirt": must specify one of the platforms (aws, azure, baremetal, gcp, none, openstack, ovirt, vsphere)]

An error is reported for the CIDR.
Verified on AWS with the networking below:

networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 172.17.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16

+ ./openshift-install create manifests --dir '/home/jenkins/workspace/Launch Environment Flexy/workdir/install-dir'
level=warning msg="networking.machineNetwork[0]: 172.17.0.0/16 overlaps with default Docker Bridge subnet"

A warning is shown for the CIDR, and the installation succeeds.
Verified on vSphere with the networking below:

networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.17.0.0/14

+ ./openshift-install create manifests --dir '/home/installer4/workspace/Launch Environment Flexy/workdir/install-dir'
level=warning msg="networking.serviceNetwork[0]: 172.17.0.0/14 overlaps with default Docker Bridge subnet"

A warning is shown for the serviceNetwork.

In short, on the libvirt platform an overlap with the Docker bridge is forbidden, and on the other platforms an overlap with the Docker bridge is allowed and a warning is shown to the user. Moving it to verified state.
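The behaviour observed in the verification comments above (a warning on most platforms, a hard error on libvirt) can be modelled roughly as follows. This is only an illustrative sketch, not the installer's implementation: `checkDockerBridge` is a hypothetical function, and 172.17.0.0/16 is assumed to be the Docker default bridge subnet being compared against.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// checkDockerBridge is a hypothetical model of the verified behaviour:
// an overlap yields a warning string, except on libvirt where it is an error.
func checkDockerBridge(field, cidr, platform string) (warning string, err error) {
	// Assumption: Docker's customary default bridge subnet.
	_, bridge, _ := net.ParseCIDR("172.17.0.0/16")

	_, n, err := net.ParseCIDR(cidr)
	if err != nil {
		return "", fmt.Errorf("%s: invalid CIDR %q: %w", field, cidr, err)
	}
	// CIDR blocks overlap when one contains the other's network address.
	if !n.Contains(bridge.IP) && !bridge.Contains(n.IP) {
		return "", nil
	}
	msg := fmt.Sprintf("%s: %s overlaps with default Docker Bridge subnet", field, cidr)
	if platform == "libvirt" {
		return "", errors.New(msg)
	}
	return msg, nil
}

func main() {
	for _, platform := range []string{"vsphere", "libvirt"} {
		w, err := checkDockerBridge("networking.serviceNetwork[0]", "172.17.0.0/14", platform)
		fmt.Printf("platform=%s warning=%q err=%v\n", platform, w, err)
	}
}
```

Returning the warning separately from the error keeps the decision about how to surface it (log line versus fatal exit) with the caller.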
(In reply to Yang Yang from comment #18)
> Verified on vSphere with the networking below:
> networking:
>   clusterNetwork:
>   - cidr: 10.128.0.0/14
>     hostPrefix: 23
>   machineNetwork:
>   - cidr: 10.0.0.0/16
>   networkType: OpenShiftSDN
>   serviceNetwork:
>   - 172.17.0.0/14
>
> + ./openshift-install create manifests --dir '/home/installer4/workspace/Launch Environment Flexy/workdir/install-dir'
> level=warning msg="networking.serviceNetwork[0]: 172.17.0.0/14 overlaps with default Docker Bridge subnet"
>
> A warning is shown for the serviceNetwork.
>
> In short, on the libvirt platform an overlap with the Docker bridge is forbidden, and
> on the other platforms an overlap with the Docker bridge is allowed and a
> warning is shown to the user. Moving it to verified state.

#########

I'm testing with a vSphere install-config and this is what I get:

❯ openshift-install create manifests --dir=dev-ocp/ --log-level=debug
DEBUG OpenShift Installer 4.5.15
DEBUG Built from commit 9893a482f310ee72089872f1a4caea3dbec34f28
DEBUG Fetching Master Machines...
DEBUG Loading Master Machines...
DEBUG Loading Cluster ID...
DEBUG Loading Install Config...
DEBUG Loading SSH Key...
DEBUG Loading Base Domain...
DEBUG Loading Platform...
DEBUG Loading Cluster Name...
DEBUG Loading Base Domain...
DEBUG Loading Platform...
DEBUG Loading Pull Secret...
DEBUG Loading Platform...
FATAL failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.machineNetwork[0]: Invalid value: "172.16.0.0/12": overlaps with default Docker Bridge subnet (172.16.0.0/12)

The install-config.yml is as below:

networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 172.16.0.0/12
  networkType: OpenShiftSDN
  serviceNetwork:
  - 10.132.0.0/16
Adding to comment 20, I'm using openshift-installer on Mac
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196
(In reply to Choo Pui kun from comment #20)
> (In reply to Yang Yang from comment #18)
> > Verified on vSphere with the networking below:
> > networking:
> >   clusterNetwork:
> >   - cidr: 10.128.0.0/14
> >     hostPrefix: 23
> >   machineNetwork:
> >   - cidr: 10.0.0.0/16
> >   networkType: OpenShiftSDN
> >   serviceNetwork:
> >   - 172.17.0.0/14
> >
> > + ./openshift-install create manifests --dir '/home/installer4/workspace/Launch Environment Flexy/workdir/install-dir'
> > level=warning msg="networking.serviceNetwork[0]: 172.17.0.0/14 overlaps with default Docker Bridge subnet"
> >
> > A warning is shown for the serviceNetwork.
> >
> > In short, on the libvirt platform an overlap with the Docker bridge is forbidden, and
> > on the other platforms an overlap with the Docker bridge is allowed and a
> > warning is shown to the user. Moving it to verified state.
>
> #########
>
> I'm testing with a vSphere install-config and this is what I get:
>
> ❯ openshift-install create manifests --dir=dev-ocp/ --log-level=debug
> DEBUG OpenShift Installer 4.5.15
> DEBUG Built from commit 9893a482f310ee72089872f1a4caea3dbec34f28
> DEBUG Fetching Master Machines...
> DEBUG Loading Master Machines...
> DEBUG Loading Cluster ID...
> DEBUG Loading Install Config...
> DEBUG Loading SSH Key...
> DEBUG Loading Base Domain...
> DEBUG Loading Platform...
> DEBUG Loading Cluster Name...
> DEBUG Loading Base Domain...
> DEBUG Loading Platform...
> DEBUG Loading Pull Secret...
> DEBUG Loading Platform...
> FATAL failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.machineNetwork[0]: Invalid value: "172.16.0.0/12": overlaps with default Docker Bridge subnet (172.16.0.0/12)
>
> The install-config.yml is as below:
> networking:
>   clusterNetwork:
>   - cidr: 10.128.0.0/14
>     hostPrefix: 23
>   machineNetwork:
>   - cidr: 172.16.0.0/12
>   networkType: OpenShiftSDN
>   serviceNetwork:
>   - 10.132.0.0/16

The issue is fixed in 4.6, so please try again with a 4.6 payload.