Bug 1927317 - [Arbiter] Storage Cluster installation did not started because ocs-operator was Expecting 8 node found 4
Summary: [Arbiter] Storage Cluster installation did not started because ocs-operator w...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
unspecified
urgent
Target Milestone: ---
: OCS 4.7.0
Assignee: Raghavendra Talur
QA Contact: Pratik Surve
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-02-10 14:06 UTC by Pratik Surve
Modified: 2021-09-09 09:58 UTC (History)
11 users (show)

Fixed In Version: 4.7.0-262.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-19 09:19:24 UTC
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift ocs-operator pull 1073 0 None open Bug 1927317: [release-4.7] StorageCluster: fix the logic for the minimum number of nodes 2021-02-12 23:09:06 UTC
Red Hat Product Errata RHSA-2021:2041 0 None None None 2021-05-19 09:19:48 UTC

Description Pratik Surve 2021-02-10 14:06:44 UTC
Description of problem (please be detailed as possible and provide log
snippests):

Storage Cluster installation did not start because ocs-operator was Expecting 8 node found 4

A version of all relevant components (if applicable):

OCP version:- 4.7.0-0.nightly-2021-02-09-224509
OCS version:- ocs-operator.v4.7.0-257.ci

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
yes

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Can this issue reproducible?
yes

Can this issue reproduce from the UI?
yes

If this is a regression, please provide more details to justify this:
yes

Steps to Reproduce:
1. Deploy OCP 4.7 
2. Deploy OCS 4.7 with arbiter enable 
3. Check for ocs-operator logs


Actual results:
# Snippet from OCS-operator logs

{"level":"error","ts":1612962614.143765,"logger":"controllers.StorageCluster","msg":"Failed to set node topology map","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","error":"Not enough nodes found: Expected 8, found 4","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/remote-source/deps/gomod/pkg/mod/github.com/go-logr/zapr.0/zapr.go:132\ngithub.com/openshift/ocs-operator/controllers/storagecluster.(*StorageClusterReconciler).reconcilePhases\n\t/remote-source/app/controllers/storagecluster/reconcile.go:277\ngithub.com/openshift/ocs-operator/controllers/storagecluster.(*StorageClusterReconciler).Reconcile\n\t/remote-source/app/controllers/storagecluster/reconcile.go:132\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.3/pkg/internal/controller/controller.go:244\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.3/pkg/internal/controller/controller.go:218\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.3/pkg/internal/controller/controller.go:197\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.2/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.2/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.2/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.2/pkg/util/wait/wait.go:90"}



Expected results:
 

Additional info:

# oc get storagecluster 
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   77m   Progressing              2021-02-10T12:48:16Z   4.7.0

# oc get pods                
NAME                                    READY   STATUS    RESTARTS   AGE
noobaa-operator-56965c6578-k2hxj        1/1     Running   0          87m
ocs-metrics-exporter-75499b5ffc-jhsql   1/1     Running   0          87m
ocs-operator-577bfccff7-b24tl           1/1     Running   0          87m
rook-ceph-operator-688f677848-8pfvj     1/1     Running   0          87m


#  oc get nodes --show-labels           
NAME                                         STATUS   ROLES    AGE    VERSION           LABELS
ip-10-0-132-213.us-east-2.compute.internal   Ready    master   153m   v1.20.0+ba45583   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-132-213,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=m4.xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a
ip-10-0-135-243.us-east-2.compute.internal   Ready    worker   143m   v1.20.0+ba45583   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.4xlarge,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-135-243,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.4xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a
ip-10-0-150-194.us-east-2.compute.internal   Ready    worker   143m   v1.20.0+ba45583   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.4xlarge,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-150-194,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.4xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a
ip-10-0-167-133.us-east-2.compute.internal   Ready    master   152m   v1.20.0+ba45583   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2b,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-167-133,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=m4.xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2b,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2b
ip-10-0-167-186.us-east-2.compute.internal   Ready    worker   143m   v1.20.0+ba45583   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.4xlarge,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2b,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-167-186,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.4xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2b,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2b
ip-10-0-177-73.us-east-2.compute.internal    Ready    worker   143m   v1.20.0+ba45583   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.4xlarge,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2b,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-177-73,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.4xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2b,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2b
ip-10-0-201-133.us-east-2.compute.internal   Ready    worker   143m   v1.20.0+ba45583   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.4xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2c,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-201-133,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.4xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2c,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2c
ip-10-0-214-29.us-east-2.compute.internal    Ready    master   153m   v1.20.0+ba45583   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2c,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-214-29,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=m4.xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2c,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2c
ip-10-0-216-25.us-east-2.compute.internal    Ready    worker   143m   v1.20.0+ba45583   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.4xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2c,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-216-25,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.4xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2c,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2c

Comment 8 Michael Adam 2021-02-12 18:10:43 UTC
This is in fact a regression introduced by one of the patches for the arbiter fixes.
Acking for 4.7.
Patch is being worked on.

Comment 9 Michael Adam 2021-02-12 19:52:53 UTC
https://github.com/openshift/ocs-operator/pull/1070

Comment 16 errata-xmlrpc 2021-05-19 09:19:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

Comment 17 Jilju Joy 2021-09-09 09:57:54 UTC
This will be covered in deployment test. Deployment yaml conf/deployment/vsphere/upi_1az_rhcos_vsan_lso_vmdk_3m_6w_arbiter.yaml.


Note You need to log in before you can comment on or make changes to this bug.