Bug 1899161 - OpenShift 4.6/OSP install fails when node flavor has less than 25GB, even with dedicated storage
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.6.z
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.6.z
Assignee: Pierre Prinetti
QA Contact: weiwei jiang
URL:
Whiteboard:
Depends On: 1891543
Blocks:
Reported: 2020-11-18 16:09 UTC by Emilien Macchi
Modified: 2021-01-25 20:02 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The installer checks that each node has at least 25 GB of disk. However, this validation only looked at the OpenStack flavor and did not consider whether a separate root volume (rootVolume) was attached from dedicated storage.
Consequence: When a small flavor was combined with a sufficiently large root volume, the installer refused to install.
Fix: With this patch, the root volume is taken into account when validating the required disk space.
Result: An install can succeed using a root volume in combination with a flavor that has a small disk.
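
The rule described in the Doc Text can be illustrated with a minimal sketch. This is not the installer's code (the installer is written in Go); the function name check_node_storage and the message wording are hypothetical, and the sketch assumes that a root volume, when set, is the only disk that counts. The 25 GB threshold and the flavor sizes come from this report.

```python
from typing import List, Optional

MIN_DISK_GB = 25  # minimum per-node disk size quoted in this report


def check_node_storage(flavor_name: str, flavor_disk_gb: int,
                       root_volume_gb: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the check passes.

    Hypothetical helper, for illustration only.
    """
    if root_volume_gb is not None:
        # Assumption: the root volume replaces the flavor's disk,
        # so only the volume size is compared against the minimum.
        if root_volume_gb < MIN_DISK_GB:
            return [f"root volume must be at least {MIN_DISK_GB} GB, "
                    f"had {root_volume_gb} GB"]
        return []
    if flavor_disk_gb < MIN_DISK_GB:
        return [f"flavor {flavor_name!r} must have at least {MIN_DISK_GB} GB "
                f"disk, had {flavor_disk_gb} GB"]
    return []


if __name__ == "__main__":
    # Flavor sizes taken from the "openstack flavor show" output in Comment 9.
    print(check_node_storage("ci.usm-qe-02", 20))      # small flavor alone
    print(check_node_storage("ci.usm-qe-02", 20, 25))  # small flavor, 25 GB root volume
    print(check_node_storage("m1.large", 60, 20))      # large flavor, 20 GB root volume
```

Under these assumptions the small flavor alone is rejected, the small flavor with a 25 GB root volume is accepted, and a 20 GB root volume is rejected even on a large flavor, which matches the verification results in Comment 18.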
Clone Of: 1891543
Environment:
Last Closed: 2021-01-25 20:02:12 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 4394 | 0 | None | closed | Bug 1899161: Cherry-Pick: openstack: remove platform flavor validation | 2021-01-29 18:09:24 UTC
Github | openshift installer pull 4508 | 0 | None | closed | Bug 1899161: openstack: consider volumes for storage requirements checks | 2021-01-29 18:09:24 UTC
Red Hat Product Errata | RHSA-2021:0171 | 0 | None | None | None | 2021-01-25 20:02:34 UTC

Comment 9 Johnny Liu 2020-12-17 12:01:52 UTC
Retested this bug with 4.6.0-0.nightly-2020-12-16-201440.

[root@preserve-jialiu-ansible ~]# openshift-install-46 version
openshift-install-46 4.6.0-0.nightly-2020-12-16-201440
built from commit a48ad4a15b42102d1747d2f5f3b635deffb950b5
release image registry.svc.ci.openshift.org/ocp/release@sha256:faf7c37ca625da7ddf71be3d492a4a245536d600ff99595e6893b1dab026fea1

[root@preserve-jialiu-ansible ~]# openstack flavor show m1.large
+----------------------------+----------------------------------------------------------------+
| Field                      | Value                                                          |
+----------------------------+----------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                          |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                              |
| access_project_ids         | None                                                           |
| disk                       | 60                                                             |
| id                         | a9acc2de-39d7-4148-8d16-413c3b696e9d                           |
| name                       | m1.large                                                       |
| os-flavor-access:is_public | True                                                           |
| properties                 | aggregate_instance_extra_specs:server_type:aggregate='general' |
| ram                        | 8192                                                           |
| rxtx_factor                | 1.0                                                            |
| swap                       |                                                                |
| vcpus                      | 4                                                              |
+----------------------------+----------------------------------------------------------------+
[root@preserve-jialiu-ansible ~]# openstack flavor show ci.usm-qe-02
+----------------------------+-----------------------------------------------------------+
| Field                      | Value                                                     |
+----------------------------+-----------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                     |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                         |
| access_project_ids         | None                                                      |
| disk                       | 20                                                        |
| id                         | 68f44283-1ebe-43b0-abdc-12e15b17348d                      |
| name                       | ci.usm-qe-02                                              |
| os-flavor-access:is_public | True                                                      |
| properties                 | aggregate_instance_extra_specs:server_type:aggregate='ci' |
| ram                        | 4096                                                      |
| rxtx_factor                | 1.0                                                       |
| swap                       |                                                           |
| vcpus                      | 4                                                         |
+----------------------------+-----------------------------------------------------------+



Scenario 1:
- Set platform.openstack.computeFlavor to a flavor that does not meet the minimum disk requirement.
- Leave controlPlane.platform.openstack and compute[0].platform.openstack empty.

[root@preserve-jialiu-ansible ~]# cat demo2/install-config.yaml 
apiVersion: v1
baseDomain: 10.0.103.240.nip.io
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    openstack: {}
  replicas: 2
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    openstack: {}
  replicas: 3
metadata:
  name: wj45ios1019a
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.0.0/18
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  openstack:
    cloud: upshift
    computeFlavor: ci.usm-qe-02
    externalNetwork: provider_net_cci_8
    octaviaSupport: '0'
    region: regionOne
    trunkSupport: '1'
publish: External
pullSecret: '{"auths":{"cloud.openshift.com":{"auth":"aaaa"}}}'

[root@preserve-jialiu-ansible ~]# openshift-install-47 create manifests --dir demo2
FATAL failed to fetch Master Machines: failed to load asset "Install Config": platform.openstack.defaultMachinePlatform.type: Invalid value: "ci.usm-qe-02": Flavor did not meet the following minimum requirements: Must have minimum of 25 GB Disk, had 20 GB 
[root@preserve-jialiu-ansible ~]# openshift-install-47 version
openshift-install-47 4.7.0-0.nightly-2020-12-17-001141
built from commit 018b03ac6d96edc4bc91e84c894715132680964f
release image registry.svc.ci.openshift.org/ocp/release@sha256:122eb597f788d3c666cdb121928fa7c8f3d5f8147ad102d4a64b705d0ecdaa31
[root@preserve-jialiu-ansible ~]# openshift-install-46 create manifests --dir demo2
INFO Credentials loaded from file "/root/clouds.yaml" 
INFO Consuming Install Config from target directory 
INFO Manifests created in: demo2/manifests and demo2/openshift

In this scenario, 4.7 reports an error, but 4.6 does not. As I understand it, the 4.7 installer's behaviour is the expected and reasonable one.



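Scenario 2:
- Set platform.openstack.computeFlavor to a flavor that does not meet the minimum disk requirement.
- Set controlPlane.platform.openstack.type and compute[0].platform.openstack.type to m1.large, which meets the minimum disk requirement.
- Set controlPlane.platform.openstack.rootVolume and compute[0].platform.openstack.rootVolume to 20G, which does not meet the minimum disk requirement.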

[root@preserve-jialiu-ansible ~]# cat demo4/install-config.yaml
apiVersion: v1
baseDomain: 10.0.103.240.nip.io
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    openstack:
      rootVolume:
        size: 20
        type: tripleo
      type: m1.large
  replicas: 2
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    openstack:
      rootVolume:
        size: 20
        type: tripleo
      type: m1.large
  replicas: 3
metadata:
  name: wj45ios1019a
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.0.0/18
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  openstack:
    cloud: upshift
    computeFlavor: ci.usm-qe-02
    externalNetwork: provider_net_cci_8
    octaviaSupport: '0'
    region: regionOne
    trunkSupport: '1'
publish: External
pullSecret: '{"auths":{"cloud.openshift.com":{"auth":"aaaa"}}}'

[root@preserve-jialiu-ansible ~]# openshift-install-47 create manifests --dir demo4
FATAL failed to fetch Master Machines: failed to load asset "Install Config": [platform.openstack.defaultMachinePlatform.type: Invalid value: "ci.usm-qe-02": Flavor did not meet the following minimum requirements: Must have minimum of 25 GB Disk, had 20 GB, controlPlane.platform.openstack.rootVolume.size: Invalid value: 20: Volume size must be greater than 25 to use root volumes, had 20, compute[0].platform.openstack.rootVolume.size: Invalid value: 20: Volume size must be greater than 25 to use root volumes, had 20] 
[root@preserve-jialiu-ansible ~]# openshift-install-46 create manifests --dir demo4
INFO Credentials loaded from file "/root/clouds.yaml" 
INFO Consuming Install Config from target directory 
INFO Manifests created in: demo4/manifests and demo4/openshift 


In this scenario, 4.7 reports an error, but 4.6 does not. As I understand it, the 4.7 installer's behaviour is the expected and reasonable one.


Scenario 3:
- Set platform.openstack.computeFlavor to a flavor that does not meet the minimum disk requirement.
- Set controlPlane.platform.openstack.type and compute[0].platform.openstack.type to a flavor that does not meet the minimum disk requirement.
- Set controlPlane.platform.openstack.rootVolume and compute[0].platform.openstack.rootVolume to 25G, which meets the minimum disk requirement.

[root@preserve-jialiu-ansible ~]# cat demo6/install-config.yaml 
apiVersion: v1
baseDomain: 10.0.103.240.nip.io
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    openstack:
      rootVolume:
        size: 25
        type: tripleo
      type: ci.usm-qe-02
  replicas: 2
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    openstack:
      rootVolume:
        size: 25
        type: tripleo
      type: ci.usm-qe-02
  replicas: 3
metadata:
  name: wj45ios1019a
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.0.0/18
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  openstack:
    cloud: upshift
    computeFlavor: ci.usm-qe-02
    externalNetwork: provider_net_cci_8
    octaviaSupport: '0'
    region: regionOne
    trunkSupport: '1'
publish: External
pullSecret: '{"auths":{"cloud.openshift.com":{"auth":"aaaa"}}}'

[root@preserve-jialiu-ansible ~]# openshift-install-47 create manifests --dir demo6
FATAL failed to fetch Master Machines: failed to load asset "Install Config": platform.openstack.defaultMachinePlatform.type: Invalid value: "ci.usm-qe-02": Flavor did not meet the following minimum requirements: Must have minimum of 25 GB Disk, had 20 GB 
[root@preserve-jialiu-ansible ~]# openshift-install-46 create manifests --dir demo6
FATAL failed to fetch Master Machines: failed to load asset "Install Config": [controlPlane.platform.openstack.flavorName: Invalid value: "ci.usm-qe-02": Flavor did not meet the following minimum requirements: Must have minimum of 25 GB Disk, had 20 GB, compute[0].platform.openstack.flavorName: Invalid value: "ci.usm-qe-02": Flavor did not meet the following minimum requirements: Must have minimum of 25 GB Disk, had 20 GB] 

In this scenario, 4.7 reports an error indicating that the default machine type does not meet the minimum requirement, and does not complain about the controlPlane and compute specific settings. 4.6 complains that both controlPlane and compute set an undersized flavor, but ignores the user's root volume setting.

Scenario 4:
- Do not set platform.openstack.computeFlavor.
- Set controlPlane.platform.openstack.type and compute[0].platform.openstack.type to a flavor that does not meet the minimum disk requirement.
- Set controlPlane.platform.openstack.rootVolume and compute[0].platform.openstack.rootVolume to 25G, which meets the minimum disk requirement.

[root@preserve-jialiu-ansible ~]# cat demo10/install-config.yaml 
apiVersion: v1
baseDomain: 10.0.103.240.nip.io
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    openstack:
      rootVolume:
        size: 25
        type: tripleo
      type: ci.usm-qe-02
  replicas: 2
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    openstack:
      rootVolume:
        size: 25
        type: tripleo
      type: ci.usm-qe-02
  replicas: 3
metadata:
  name: wj45ios1019a
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.0.0/18
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  openstack:
    cloud: upshift
    externalNetwork: provider_net_cci_8
    octaviaSupport: '0'
    region: regionOne
    trunkSupport: '1'
publish: External
pullSecret: '{"auths":{"cloud.openshift.com":{"auth":"aaaa"}}}'


[root@preserve-jialiu-ansible ~]# openshift-install-46 create manifests --dir demo10
FATAL failed to fetch Master Machines: failed to load asset "Install Config": [controlPlane.platform.openstack.flavorName: Invalid value: "ci.usm-qe-02": Flavor did not meet the following minimum requirements: Must have minimum of 25 GB Disk, had 20 GB, compute[0].platform.openstack.flavorName: Invalid value: "ci.usm-qe-02": Flavor did not meet the following minimum requirements: Must have minimum of 25 GB Disk, had 20 GB] 

[root@preserve-jialiu-ansible ~]# openshift-install-47 create manifests --dir demo10
INFO Credentials loaded from file "/root/clouds.yaml" 
INFO Consuming Install Config from target directory 
INFO Manifests created in: demo10/manifests and demo10/openshift 

In this scenario, 4.7 succeeds and does not complain about the controlPlane and compute type settings. 4.6 complains that both controlPlane and compute set an undersized flavor, but ignores the user's root volume setting.


In all 4 scenarios above, from the user's perspective, the 4.7 installer behaves correctly and reasonably, but 4.6 does not.
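
The scenarios above differ in whether the flavor comes from the platform-wide setting (computeFlavor, reported by 4.7 as defaultMachinePlatform.type) or from a per-pool type. A minimal sketch of that lookup follows, assuming a pool-level type overrides the platform-wide default. The helper name effective_flavor is hypothetical and this is not the installer's own code.

```python
from typing import Optional


def effective_flavor(pool_platform: dict, platform: dict) -> Optional[str]:
    """Pick the flavor a machine pool would use (hypothetical helper)."""
    # Assumption: a pool-level type, when present, overrides the default.
    if pool_platform.get("type"):
        return pool_platform["type"]
    # Otherwise fall back to the platform-wide default. The configs in this
    # report spell it two ways: defaultMachinePlatform.type and computeFlavor.
    default = (platform.get("defaultMachinePlatform") or {}).get("type")
    return default or platform.get("computeFlavor")


if __name__ == "__main__":
    platform = {"computeFlavor": "ci.usm-qe-02"}
    # Scenario 1: empty pool settings fall back to the platform-wide flavor.
    print(effective_flavor({}, platform))
    # Scenario 2: the pool-level type takes precedence.
    print(effective_flavor({"type": "m1.large"}, platform))
    # Scenario 4: no platform-wide flavor, only the pool-level type.
    print(effective_flavor({"type": "ci.usm-qe-02"}, {}))
```

Read this way, Scenario 3 is the case where the platform-wide flavor is never the effective one for any pool, yet 4.7 still validates it.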

Comment 11 Emilien Macchi 2020-12-21 19:24:21 UTC
I need to backport https://github.com/openshift/installer/pull/4323, which is a bit tricky because of a big merge conflict.

Comment 12 Emilien Macchi 2020-12-21 19:57:54 UTC
We depend on this backport: https://github.com/openshift/installer/pull/4344
So once this one merges, we can cherry-pick https://github.com/openshift/installer/pull/4323.

Comment 18 weiwei jiang 2021-01-19 03:25:48 UTC
Verified with 4.6.0-0.nightly-2021-01-18-070340

./openshift-install-4.6 4.6.0-0.nightly-2021-01-18-070340
built from commit a8eaa59310e2513d607f5873ca70211617dfebc7
release image registry.ci.openshift.org/ocp/release@sha256:21707e67d495d43925016e18a36620f05775e9cd48bee4abe99badf1bd6c0f0e


---
apiVersion: v1
baseDomain: 10.0.101.66.nip.io
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    openstack:
      rootVolume:
        size: 25
        type: tripleo
      type: ci.usm-qe-02
  replicas: 2
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    openstack:
      rootVolume:
        size: 25
        type: tripleo
      type: ci.usm-qe-02
  replicas: 3
metadata:
  name: wj45ios1019a
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.0.0/18
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  openstack:
    apiFloatingIP: 10.0.101.66
    cloud: upshift
    computeFlavor: ci.usm-qe-02
    externalNetwork: provider_net_cci_8
publish: External
pullSecret: Hidden
sshKey: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCWkwurd8TNAi+D7ffvyDdhGBSQtJx3/Yedlwvvha0q772vLlOAGlKCw4dajKy6qty1/GGQDgTJ17h3C9TEArI8ZqILnyydeY56DL+ELN3dtGBVof/N2qtW0+SmEnd1Mi7Qy5Tx4e/GVmB3NgX9szwNOVXhebzgBsXc9x+RtCVLPLC8J+qqSdTUZ0UfJsh2ptlQLGHmmTpF//QlJ1tngvAFeCOxJUhrLAa37P9MtFsiNk31EfKyBk3eIdZljTERmqFaoJCohsFFEdO7tVgU6p5NwniAyBGZVjZBzjELoI1aZ+/g9yReIScxl1R6PWqEzcU6lGo2hInnb6nuZFGb+90D
  openshift-qe@redhat.com
Running: ./openshift-install-4.6 create manifests --dir /tmp/tmp.wdbhFtUAQy 2>&1
level=info msg="Credentials loaded from file \"/home/wjiang/osp_remover/clouds.yaml\""
level=info msg="Consuming Install Config from target directory"
level=info msg="Manifests created in: /tmp/tmp.wdbhFtUAQy/manifests and /tmp/tmp.wdbhFtUAQy/openshift"


---
apiVersion: v1
baseDomain: 10.0.101.66.nip.io
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    openstack:
      rootVolume:
        size: 20
        type: tripleo
      type: m1.large
  replicas: 2
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    openstack:
      rootVolume:
        size: 20
        type: tripleo
      type: m1.large
  replicas: 3
metadata:
  name: wj45ios1019a
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.0.0/18
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  openstack:
    apiFloatingIP: 10.0.101.66
    cloud: upshift
    defaultMachinePlatform:
      type: ci.usm-qe-02
    externalNetwork: provider_net_cci_8
publish: External
pullSecret: Hidden
sshKey: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCWkwurd8TNAi+D7ffvyDdhGBSQtJx3/Yedlwvvha0q772vLlOAGlKCw4dajKy6qty1/GGQDgTJ17h3C9TEArI8ZqILnyydeY56DL+ELN3dtGBVof/N2qtW0+SmEnd1Mi7Qy5Tx4e/GVmB3NgX9szwNOVXhebzgBsXc9x+RtCVLPLC8J+qqSdTUZ0UfJsh2ptlQLGHmmTpF//QlJ1tngvAFeCOxJUhrLAa37P9MtFsiNk31EfKyBk3eIdZljTERmqFaoJCohsFFEdO7tVgU6p5NwniAyBGZVjZBzjELoI1aZ+/g9yReIScxl1R6PWqEzcU6lGo2hInnb6nuZFGb+90D
  openshift-qe@redhat.com
Running: ./openshift-install-4.6 create manifests --dir /tmp/tmp.pj1fO7z0eB 2>&1
level=fatal msg="failed to fetch Master Machines: failed to load asset \"Install Config\": [platform.openstack.defaultMachinePlatform.type: Not found: \"ci.usm-qe-02\", controlPlane.platform.openstack.rootVolume.size: Invalid value: 20: Volume size must be greater than 25 to use root volumes, had 20, compute[0].platform.openstack.rootVolume.size: Invalid value: 20: Volume size must be greater than 25 to use root volumes, had 20]"

Comment 20 errata-xmlrpc 2021-01-25 20:02:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.6.13 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0171

