Bug 1955544
Summary: | [IPI][OSP] dense master-only installation with 0 workers fails due to missing worker security group on masters | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Joachim von Thadden <j.thadden>
Component: | Installer | Assignee: | Martin André <m.andre>
Installer sub component: | OpenShift on OpenStack | QA Contact: | Itay Matza <imatza>
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | low | |
Priority: | low | CC: | imatza, lmadsen, m.andre, mrunge, pprinett, swilber
Version: | 4.7 | |
Target Milestone: | --- | |
Target Release: | 4.10.0 | |
Hardware: | All | |
OS: | All | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | Cause: OCP master nodes were missing ingress security group rules when deploying on OpenStack with schedulable masters. Consequence: OCP deployments on the OpenStack platform fail for compact clusters with no dedicated workers. Fix: Ingress security group rules are now added on OpenStack when masters are schedulable. Result: It is possible to deploy compact three-node clusters on the OpenStack platform. | |
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2022-03-10 16:03:38 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 2016267 | |
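The compact-cluster scenario in this bug corresponds to an install-config.yaml with zero compute replicas, which makes the installer set MastersSchedulable to true. A minimal sketch of the relevant fields follows; the cluster name, base domain, and flavor are taken from the verification log in this report, while the network and cloud names are illustrative placeholders:

```yaml
# Sketch of an install-config.yaml for a compact 3-node cluster on OpenStack.
# externalNetwork and cloud are illustrative placeholders; adjust to your environment.
apiVersion: v1
baseDomain: shiftstack.com
metadata:
  name: ostest
controlPlane:
  name: master
  replicas: 3
compute:
- name: worker
  replicas: 0          # no dedicated workers: the masters become schedulable
platform:
  openstack:
    cloud: openstack
    externalNetwork: external
    computeFlavor: m4.xlarge
pullSecret: ...
sshKey: ...
```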
Description
Joachim von Thadden
2021-04-30 10:59:24 UTC
I could reproduce the issue fine. I tried to set the master node as "Schedulable", to let OCP deploy worker pods onto a master node, but it didn't seem to work for me. I'll keep looking. Since there is an easy workaround (with the security group), I'll set the bug as LOW. Please let me know if you think otherwise.

This would be incredibly useful for Service Telemetry Framework testing, which uses a 3-node reference architecture for OCP. I'd like to request this be bumped in priority to maybe medium. Our intent would be to use this for testing the Shift on Stack scenario to provide OpenStack monitoring without the need for an external cluster.

CC: @pkilambi @swilber

Verified in OCP 4.10.0-0.nightly-2021-10-16-173656 on top of OSP RHOS-16.1-RHEL-8-20210916.n.0.

Verification steps:

1) Installation of OCP with 3 masters and 0 workers finished successfully:

>$ openshift-install create cluster --dir ostest/
>INFO Credentials loaded from file "/home/stack/clouds.yaml"
>WARNING Making control-plane schedulable by setting MastersSchedulable to true for Scheduler cluster settings
>INFO Consuming Install Config from target directory
>INFO Obtaining RHCOS image file from 'https://releases-art-rhcos.svc.ci.openshift.org/art/storage/releases/rhcos-4.9/49.84.202107010027-0/x86_64/rhcos-49.84.202107010027-0-openstack.x86_64.qcow2.gz?sha256=00cb56c8711686255744646394e22a8ca5f27e059016f6758f14388e5a0a14cb'
>INFO The file was found in cache: /home/stack/.cache/openshift-installer/image_cache/rhcos-49.84.202107010027-0-openstack.x86_64.qcow2. Reusing...
>WARNING Following quotas Subnet, SecurityGroup, Port, Network, SecurityGroupRule are available but will be completely used pretty soon.
>INFO Creating infrastructure resources...
>INFO Waiting up to 20m0s for the Kubernetes API at https://api.ostest.shiftstack.com:6443...
>INFO API v1.22.1+9312243 up
>INFO Waiting up to 30m0s for bootstrapping to complete...
>INFO Destroying the bootstrap resources...
>INFO Waiting up to 40m0s for the cluster at https://api.ostest.shiftstack.com:6443 to initialize...
>INFO Waiting up to 10m0s for the openshift-console route to be created...
>INFO Install complete!
>INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/stack/ostest/auth/kubeconfig'
>INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ostest.shiftstack.com
>INFO Login to the console with user: "kubeadmin", and password: "5Ph7w-6NQQC-kNeu8-AZNny"
>INFO Time elapsed: 23m56s

2) Make sure the OCP cluster is operational:

>$ oc get machineset -A
>NAMESPACE              NAME                    DESIRED  CURRENT  READY  AVAILABLE  AGE
>openshift-machine-api  ostest-sccdp-worker-0   0        0                           120m
>$ oc get machines -A
>NAMESPACE              NAME                    PHASE    TYPE       REGION     ZONE  AGE
>openshift-machine-api  ostest-sccdp-master-0   Running  m4.xlarge  regionOne  nova  120m
>openshift-machine-api  ostest-sccdp-master-1   Running  m4.xlarge  regionOne  nova  120m
>openshift-machine-api  ostest-sccdp-master-2   Running  m4.xlarge  regionOne  nova  120m
>$ oc get nodes
>NAME                    STATUS  ROLES          AGE   VERSION
>ostest-sccdp-master-0   Ready   master,worker  119m  v1.22.1+9312243
>ostest-sccdp-master-1   Ready   master,worker  119m  v1.22.1+9312243
>ostest-sccdp-master-2   Ready   master,worker  119m  v1.22.1+9312243
>$ openstack server list
>+--------------------------------------+-----------------------+--------+-------------------------------------+--------------------+--------+
>| ID                                   | Name                  | Status | Networks                            | Image              | Flavor |
>+--------------------------------------+-----------------------+--------+-------------------------------------+--------------------+--------+
>| fa32d58d-4d3d-45f5-aadc-3fc5774860a6 | ostest-sccdp-master-2 | ACTIVE | ostest-sccdp-openshift=10.196.0.111 | ostest-sccdp-rhcos |        |
>| c2194eb3-7d23-4655-b899-991b887b8eea | ostest-sccdp-master-1 | ACTIVE | ostest-sccdp-openshift=10.196.1.213 | ostest-sccdp-rhcos |        |
>| 9835f5c0-2441-4238-a8e1-41c1f6877723 | ostest-sccdp-master-0 | ACTIVE | ostest-sccdp-openshift=10.196.2.171 | ostest-sccdp-rhcos |        |
>+--------------------------------------+-----------------------+--------+-------------------------------------+--------------------+--------+
>$ oc get clusteroperators
>NAME                                       VERSION                              AVAILABLE  PROGRESSING  DEGRADED  SINCE  MESSAGE
>authentication                             4.10.0-0.nightly-2021-10-16-173656   True       False        False     103m
>baremetal                                  4.10.0-0.nightly-2021-10-16-173656   True       False        False     110m
>cloud-controller-manager                   4.10.0-0.nightly-2021-10-16-173656   True       False        False     119m
>cloud-credential                           4.10.0-0.nightly-2021-10-16-173656   True       False        False     120m
>cluster-autoscaler                         4.10.0-0.nightly-2021-10-16-173656   True       False        False     113m
>config-operator                            4.10.0-0.nightly-2021-10-16-173656   True       False        False     117m
>console                                    4.10.0-0.nightly-2021-10-16-173656   True       False        False     107m
>csi-snapshot-controller                    4.10.0-0.nightly-2021-10-16-173656   True       False        False     116m
>dns                                        4.10.0-0.nightly-2021-10-16-173656   True       False        False     113m
>etcd                                       4.10.0-0.nightly-2021-10-16-173656   True       False        False     115m
>image-registry                             4.10.0-0.nightly-2021-10-16-173656   True       False        False     111m
>ingress                                    4.10.0-0.nightly-2021-10-16-173656   True       False        False     110m
>insights                                   4.10.0-0.nightly-2021-10-16-173656   True       False        False     110m
>kube-apiserver                             4.10.0-0.nightly-2021-10-16-173656   True       False        False     113m
>kube-controller-manager                    4.10.0-0.nightly-2021-10-16-173656   True       False        False     114m
>kube-scheduler                             4.10.0-0.nightly-2021-10-16-173656   True       False        False     114m
>kube-storage-version-migrator              4.10.0-0.nightly-2021-10-16-173656   True       False        False     117m
>machine-api                                4.10.0-0.nightly-2021-10-16-173656   True       False        False     110m
>machine-approver                           4.10.0-0.nightly-2021-10-16-173656   True       False        False     116m
>machine-config                             4.10.0-0.nightly-2021-10-16-173656   True       False        False     115m
>marketplace                                4.10.0-0.nightly-2021-10-16-173656   True       False        False     116m
>monitoring                                 4.10.0-0.nightly-2021-10-16-173656   True       False        False     108m
>network                                    4.10.0-0.nightly-2021-10-16-173656   True       False        False     118m
>node-tuning                                4.10.0-0.nightly-2021-10-16-173656   True       False        False     116m
>openshift-apiserver                        4.10.0-0.nightly-2021-10-16-173656   True       False        False     112m
>openshift-controller-manager               4.10.0-0.nightly-2021-10-16-173656   True       False        False     109m
>openshift-samples                          4.10.0-0.nightly-2021-10-16-173656   True       False        False     110m
>operator-lifecycle-manager                 4.10.0-0.nightly-2021-10-16-173656   True       False        False     116m
>operator-lifecycle-manager-catalog         4.10.0-0.nightly-2021-10-16-173656   True       False        False     116m
>operator-lifecycle-manager-packageserver   4.10.0-0.nightly-2021-10-16-173656   True       False        False     111m
>service-ca                                 4.10.0-0.nightly-2021-10-16-173656   True       False        False     117m
>storage                                    4.10.0-0.nightly-2021-10-16-173656   True       False        False     114m

3) Create a new project with three pods. The pods are running on the master nodes:

>$ oc get pods -n demo -o wide
>NAME                    READY  STATUS   RESTARTS  AGE  IP              NODE                   NOMINATED NODE  READINESS GATES
>demo-7897db69cc-9zhh4   1/1    Running  0         18h  10.128.130.201  ostest-sccdp-master-1  <none>          <none>
>demo-7897db69cc-fmrdc   1/1    Running  0         18h  10.128.131.216  ostest-sccdp-master-2  <none>          <none>
>demo-7897db69cc-mk22w   1/1    Running  0         18h  10.128.130.46   ostest-sccdp-master-0  <none>          <none>

4) Creating two workers: changed the replica value from 0 to 2. The two instances and the clusteroperators are up and running.

(In reply to Itay Matza from comment #10)
> Verified in OCP 4.10.0-0.nightly-2021-10-16-173656 on top of OSP RHOS-16.1-RHEL-8-20210916.n.0.
^ Verified with Kuryr network type.

Removing the Triaged keyword because:
* the QE automation assessment (flag qe_test_coverage) is missing

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days
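The "easy workaround" mentioned in the comments above (attaching the worker security group to the master instances so schedulable masters accept ingress traffic) can be sketched with the OpenStack CLI. The server and security-group names below follow this report's `ostest-sccdp-*` naming and are illustrative; adjust them to your cluster's infrastructure ID:

```
# Workaround sketch: attach the worker security group to each master instance.
# Server and group names follow the ostest-sccdp-* pattern from this report
# and are assumptions; substitute your own cluster's names.
for server in ostest-sccdp-master-0 ostest-sccdp-master-1 ostest-sccdp-master-2; do
    openstack server add security group "$server" ostest-sccdp-worker
done
```

The fix shipped in 4.10 makes this manual step unnecessary by adding the ingress security group rules during installation when masters are schedulable.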