Bug 1678824

Summary: Installer not linting all the address fields in the config
Product: OpenShift Container Platform
Reporter: Siva Reddy <schituku>
Component: Installer
Assignee: Matthew Staebler <mstaeble>
Installer sub component: openshift-installer
QA Contact: Roshni <rpattath>
Status: CLOSED ERRATA
Docs Contact:
Severity: medium
Priority: medium
CC: adahiya, mstaeble, rpattath, schituku, sponnaga, wking
Version: 4.1.0
Target Milestone: ---
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-06-04 10:44:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  The install config with invalid network address value in -cidr field (flags: none)
  yamls after manifest creation (flags: none)

Description Siva Reddy 2019-02-19 16:42:50 UTC
Created attachment 1536414 [details]
The install config with invalid network address value in -cidr field

Description of problem: The installer does not check whether the address in the clusterNetworks cidr field is a network address, although it does validate the values in the machineCIDR and serviceCIDR fields.

Version-Release number of the following components:
# ./openshift-install version
./openshift-install v4.0.0-0.176.0.0-dirty

How reproducible:
Always

Steps to Reproduce:
1. Update the attached install-config.yaml with your pull secret and ssh-key.
2. Create a test dir and copy the updated install-config.yaml into it.
   #mkdir test
   #cp install-config.yaml test/
3. Create manifests using the install config.
   ./openshift-install create manifests --dir 1/
4. The installer should validate that the value in the clusterNetworks cidr field is a valid network address.

networking:
  clusterNetworks:
  - cidr: 10.0.128.0/16
    hostSubnetLength: 9
  machineCIDR: 10.0.0.0/16
  serviceCIDR: 172.30.0.0/16
  type: OpenShiftSDN   

Actual results:
   The installer does not validate the clusterNetworks cidr field.

Expected results:
   The installer should check the field and report an error stating that the value is not a valid network address, as it already does for machineCIDR.

Comment 1 Matthew Staebler 2019-02-19 17:11:57 UTC
10.0.128.0/16 is a valid CIDR. Are you expecting that 10.0.128.0/16 should be rejected as a CIDR?

Comment 2 Matthew Staebler 2019-02-19 17:25:37 UTC
Cluster network CIDR validation added in https://github.com/openshift/installer/pull/1276.

Comment 3 Siva Reddy 2019-02-19 17:52:55 UTC
(In reply to Matthew Staebler from comment #1)
> 10.0.128.0/16 is a valid CIDR. Are you expecting that 10.0.128.0/16 should be
> rejected as a CIDR?

I was expecting the following, which is the error message for machineCIDR:
FATAL failed to fetch Install Config: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.machineCIDR: Invalid value: "10.0.128.0/16": invalid network address. got 10.0.128.0/16, expecting 10.0.0.0/16
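The "got ..., expecting ..." part of that message can be reproduced with a minimal sketch of a network-address check. This is not the installer's code (the installer is not written in Python). The function name check_network_address is hypothetical, and the message format is only copied from the log line above for illustration.

```python
import ipaddress


def check_network_address(field, value):
    """Return an error string if value has host bits set, else None.

    Hypothetical helper: the message layout imitates the installer log
    quoted in this bug, it is not taken from the installer source.
    """
    iface = ipaddress.ip_interface(value)
    # iface.network is the same prefix with all host bits cleared.
    expected = iface.network
    if iface.ip != expected.network_address:
        return (
            f'{field}: Invalid value: "{value}": invalid network address. '
            f"got {value}, expecting {expected}"
        )
    return None


print(check_network_address("networking.clusterNetworks[0].cidr", "10.0.128.0/16"))
print(check_network_address("networking.machineCIDR", "10.0.0.0/16"))
```

With a /16 prefix only the first two octets belong to the network, so the 128 in the third octet is a host bit and the expected network address is 10.0.0.0/16.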

Comment 6 Roshni 2019-02-28 21:01:24 UTC
root@ip-172-31-31-94: ~/installer # ./openshift-install version
./openshift-install v4.0.5-1-dirty

Build: 4.0.0-0.nightly-2019-02-27-213933

RHCOS: ami-0a0438a5e62dc0f9f

I do not see the expected failure when creating manifests, but I do see the failure when creating the cluster. I noticed that during manifest creation the info in the updated install-config.yaml is not picked up; it is using the old information. I am attaching the relevant yamls.

Comment 7 Roshni 2019-02-28 21:02:31 UTC
Created attachment 1539637 [details]
yamls after manifest creation

Comment 8 W. Trevor King 2019-03-01 10:33:40 UTC
Trying to reproduce:

$ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-27-213933 | grep installer
  installer                                     https://github.com/openshift/installer                                     563f71fdfb75f96177912ca9b1d4285c7f03cea1
$ git checkout 563f71fdfb75f96177912ca9b1d4285c7f03cea1
$ hack/build.sh
$ rm -rf wking
$ mkdir wking
$ cat >wking/install-config.yaml <<EOF
> apiVersion: v1beta3
> baseDomain: qe.devcluster.openshift.com
> compute:
> - name: worker
>   platform: {}
>   replicas: 1
> controlPlane:
>   name: master
>   platform: {}
>   replicas: 1
> metadata:
>   creationTimestamp: null
>   name: rpattath-test3
> networking:
>   clusterNetworks:
>   - cidr: 10.0.128.0/14
>     hostSubnetLength: 9
>   machineCIDR: 10.0.0.0/16
>   serviceCIDR: 172.30.0.0/16
>   type: OpenShiftSDN
> platform:
>   aws:
>     region: us-east-2
> pullSecret: '{"auths":{"example.com":{"auth":"testing"}}}'
> EOF
$ openshift-install --dir=wking create manifests
FATAL failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.clusterNetworks[0].cidr: Invalid value: "10.0.128.0/14": invalid network address. got 10.0.128.0/14, expecting 10.0.0.0/14 

Which looks good to me.  Taking a closer look at your attempt:

$ grep 'openshift-install\|install-config' bug1678824-cidr 
root@ip-172-31-31-94: ~/installer # cat install-config.yaml 
root@ip-172-31-31-94: ~/installer # ./openshift-install create manifests --dir 1/
  install-config: |
root@ip-172-31-31-94: ~/installer # ./openshift-install create cluster
FATAL failed to fetch Terraform Variables: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.clusterNetworks[0].cidr: Invalid value: "10.0.128.0/14": invalid network address. got 10.0.128.0/14, expecting 10.0.0.0/14

The reason you see default CIDRs and no error in your 'create manifest' call is that you set --dir to point it at a different asset directory.  That 'create manifests' isn't seeing your install-config.yaml, which is why it goes through the wizard to help you configure from scratch.  Then your subsequent 'create cluster' leaves off the --dir, so it does slurp up your install-config.yaml and correctly errors out on the invalid CIDR.  That sounds like VERIFIED to me (and with the fix out in both 0.13.0 and 0.13.1, possibly CLOSED), but I'll leave it up to you in case my argument here is not convincing ;).
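The asset-directory behavior described in this comment can be sketched as a simple file lookup. This is an illustration only: locate_install_config is a hypothetical helper, not part of openshift-install. It assumes, as the comment implies, that the config is read from the directory given by --dir and that the current directory is used when --dir is omitted.

```python
import tempfile
from pathlib import Path


def locate_install_config(asset_dir="."):
    """Return the install-config.yaml path inside asset_dir, or None.

    Hypothetical helper: it only models which directory is searched.
    """
    candidate = Path(asset_dir) / "install-config.yaml"
    return candidate if candidate.is_file() else None


# Recreate the layout from the failed attempt: the config sits in the
# working directory, while --dir points at an empty subdirectory "1".
root = Path(tempfile.mkdtemp())
(root / "install-config.yaml").write_text("apiVersion: v1beta3\n")
(root / "1").mkdir()

print(locate_install_config(root / "1"))  # None: nothing to validate here
print(locate_install_config(root))        # path to the config that gets validated
```

In the first case there is no config to load, which matches the wizard starting from scratch with default CIDRs. In the second case the edited file is found, so its invalid CIDR can be reported.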

Comment 9 Roshni 2019-03-01 19:15:20 UTC
Trevor,

Agree with your argument now. I am still getting used to installations in multiple dirs. So from the below findings, the bug looks fixed. Please move the bug to ON_QA and I can mark it verified. Apologies for the Failed_QA.

root@ip-172-31-31-94: ~/installer # ll 1
total 4
-rw-r--r--. 1 root root 3856 Mar  1 18:46 install-config.yaml

When cidr is invalid:

root@ip-172-31-31-94: ~/installer # ./openshift-install create manifests --dir 1/
FATAL failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.clusterNetworks[0].cidr: Invalid value: "10.0.128.0/16": invalid network address. got 10.0.128.0/16, expecting 10.0.0.0/16 
root@ip-172-31-31-94: ~/installer # ./openshift-install create cluster --dir 1/
FATAL failed to fetch Terraform Variables: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.clusterNetworks[0].cidr: Invalid value: "10.0.128.0/16": invalid network address. got 10.0.128.0/16, expecting 10.0.0.0/16 

When machine cidr is invalid:
 
root@ip-172-31-31-94: ~/installer # ./openshift-install create manifests --dir 1/
FATAL failed to fetch Master Machines: failed to load asset "Install Config": failed to unmarshal: error unmarshaling JSON: failed to Parse cidr string to net.IPNet: invalid CIDR address: 10.300.0.0/16 
root@ip-172-31-31-94: ~/installer # ./openshift-install create cluster --dir 1/
FATAL failed to fetch Terraform Variables: failed to load asset "Install Config": failed to unmarshal: error unmarshaling JSON: failed to Parse cidr string to net.IPNet: invalid CIDR address: 10.300.0.0/16 

When service cidr is invalid:

root@ip-172-31-31-94: ~/installer # ./openshift-install create manifests --dir 1/
FATAL failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.serviceCIDR: Invalid value: "172.30.128.0/16": invalid network address. got 172.30.128.0/16, expecting 172.30.0.0/16 
root@ip-172-31-31-94: ~/installer # ./openshift-install create cluster --dir 1/
FATAL failed to fetch Terraform Variables: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.serviceCIDR: Invalid value: "172.30.128.0/16": invalid network address. got 172.30.128.0/16, expecting 172.30.0.0/16
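The outputs above show two different failure classes: 10.300.0.0/16 cannot be parsed as a CIDR at all, while 10.0.128.0/16 and 172.30.128.0/16 parse but are not network addresses. A minimal sketch of that distinction, using Python's standard ipaddress module rather than the installer's own Go code (classify_cidr and its return strings are hypothetical):

```python
import ipaddress


def classify_cidr(value):
    """Classify a CIDR string as unparseable, not a network address, or ok."""
    try:
        # strict=False accepts host bits, so only malformed input fails here.
        ipaddress.ip_network(value, strict=False)
    except ValueError:
        return "unparseable"
    try:
        # strict=True additionally rejects addresses with host bits set.
        ipaddress.ip_network(value, strict=True)
    except ValueError:
        return "not a network address"
    return "ok"


for cidr in ("10.300.0.0/16", "172.30.128.0/16", "172.30.0.0/16"):
    print(cidr, "->", classify_cidr(cidr))
```

The first class corresponds to the "failed to unmarshal" errors, which occur while the file is being loaded. The second corresponds to the "invalid network address" errors, which are reported after the file has loaded.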

Comment 10 W. Trevor King 2019-03-01 20:15:42 UTC
> Please move the bug to ON_QA...

Done :)

Comment 12 Siva Reddy 2019-03-12 15:24:49 UTC
Verified that the installer is validating the network addresses as expected.

Version-Release number of the following components:
# ./openshift-install version
./openshift-install v4.0.16-1-dirty


Steps to Verify:
1. Update the attached install-config.yaml with your pull secret and ssh-key.
2. Create a test dir and copy the updated install-config.yaml into it.
   #mkdir test
   #cp install-config.yaml test/
3. Create manifests using the install config.
   ./openshift-install create manifests --dir 1/
4. The installer should validate that the value in the clusterNetworks cidr field is a valid network address.
5. Repeat the steps for the machine and service cidr values as well.


  The installer is validating the addresses. The output with the error messages is as follows:

# ./openshift-install create manifests
FATAL failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.clusterNetworks[0].cidr: Invalid value: "10.0.128.0/16": invalid network address. got 10.0.128.0/16, expecting 10.0.0.0/16 

# ./openshift-install create manifests
FATAL failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.machineCIDR: Invalid value: "10.0.128.0/16": invalid network address. got 10.0.128.0/16, expecting 10.0.0.0/16 

# ./openshift-install create manifests
FATAL failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: networking.serviceCIDR: Invalid value: "172.30.128.0/16": invalid network address. got 172.30.128.0/16, expecting 172.30.0.0/16

Comment 14 errata-xmlrpc 2019-06-04 10:44:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758