Bug 1946595 - ocs-storagecluster phase is "Ready" when flexible scaling and arbiter are both enabled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: OCS 4.8.0
Assignee: Nitin Goyal
QA Contact: Oded
URL:
Whiteboard:
Depends On: 1913357
Blocks: 1938134
Reported: 2021-04-06 13:23 UTC by Oded
Modified: 2023-09-15 01:04 UTC (History)
CC List: 14 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
.Arbiter and flexible scaling can't be enabled at the same time
When arbiter and flexible scaling were both enabled, the storage cluster was shown in `READY` state even though there were logs or messages with the error `arbiter and flexibleScaling both can't be enabled`.
This happened because of the incorrect specs of the storage cluster CR.
With this update, the storage cluster is in `ERROR` state with the correct error message.
Clone Of: 1913357
Environment:
Last Closed: 2021-08-03 18:15:56 UTC
Embargoed:


Links
System                  ID                                Private  Priority  Status  Summary                                                   Last Updated
Github                  openshift ocs-operator pull 1146  0        None      closed  storagecluster: Fix Update in validateStorageClusterSpec  2021-05-24 11:40:25 UTC
Red Hat Product Errata  RHBA-2021:3003                    0        None      None    None                                                      2021-08-03 18:16:39 UTC

Comment 3 Mudit Agarwal 2021-04-06 14:03:07 UTC
Nitin, please add doc text for this.

Comment 9 Mudit Agarwal 2021-06-01 11:02:26 UTC
Doc text needs to be modified as we have fixed this issue now.

Comment 11 Oded 2021-06-10 12:28:55 UTC
Need to test it again because of a monitoring issue on my cluster.

Setup:
OCP Version: 4.8.0-0.nightly-2021-06-09-065137
OCS Version: ocs-operator.v4.8.0-413.ci
LSO Version: 4.7.0-202102110027.p0
Provider: VMware


Test Procedure:
1. Install an OCS 4.8 cluster (LSO).

2. Check the storage cluster status:
$ oc get storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   18h   Ready              2021-06-09T16:35:09Z   4.8.0

$ oc get storagecluster -o yaml | grep flex
          f:flexibleScaling: {}
    flexibleScaling: true

$ oc get storagecluster -o yaml | grep arbiter
          f:arbiter: {}
    arbiter: {}

3. Enable arbiter:
spec:
  arbiter:
    enable: true

4. Check the ocs-operator log:
$ oc logs ocs-operator-dd57fd889-6zj8j
{"level":"error","ts":1623325972.242787,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Reconciler error","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster","name":"ocs-storagecluster","namespace":"openshift-storage","error":"arbiter and flexibleScaling both can't be enabled","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/remote-source/app/vendor/github.com/go-logr/zapr/zapr.go:132\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:297\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:248\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99"}

5. A new warning appears on the console: "arbiter and flexibleScaling both can't be enabled"

6. Check the storagecluster status:
$ oc get storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   19h   Error              2021-06-09T16:35:09Z   4.8.0

$ oc describe storagecluster
Events:
  Type     Reason            Age                   From                       Message
  ----     ------            ----                  ----                       -------
  Warning  FailedValidation  53s (x24 over 9m17s)  controller_storagecluster  arbiter and flexibleScaling both can't be enabled
  
$ oc get csv -A
NAMESPACE                              NAME                                           DISPLAY                       VERSION                 REPLACES                     PHASE
openshift-local-storage                local-storage-operator.4.7.0-202102110027.p0   Local Storage                 4.7.0-202102110027.p0                                Succeeded
openshift-operator-lifecycle-manager   packageserver                                  Package Server                0.17.0                                               Succeeded
openshift-storage                      ocs-operator.v4.8.0-413.ci                     OpenShift Container Storage   4.8.0-413.ci            ocs-operator.v4.8.0-411.ci   Succeeded

$ oc get cephcluster
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL
ocs-storagecluster-cephcluster   /var/lib/rook     3          19h   Ready   Cluster created successfully   HEALTH_OK   

7. Disable arbiter in the storagecluster YAML:
$ oc edit storagecluster -n openshift-storage
spec:
  arbiter: {}

8. Check the storagecluster status:
$ oc get storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   19h   Ready              2021-06-09T16:35:09Z   4.8.0


For more details:
https://docs.google.com/document/d/1Ahu6qEIbaYOij3KO0fyHKrAAAmunjQZ-WuPsC8oULOE/edit
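
The phase checks in steps 2, 6, and 8 above can be scripted. Below is a minimal Python sketch that reads the PHASE column out of the tabular `oc get storagecluster` output pasted above. The function name `parse_phase` and the sample constants are hypothetical, introduced only for illustration. The sketch relies on PHASE appearing before the (empty) EXTERNAL column, so plain whitespace splitting still lines up; a structured query of the resource status would likely be more robust on a live cluster.

```python
# Sample outputs copied from steps 2 and 6 of the test procedure.
READY_OUTPUT = """\
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   18h   Ready              2021-06-09T16:35:09Z   4.8.0
"""

ERROR_OUTPUT = """\
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   19h   Error              2021-06-09T16:35:09Z   4.8.0
"""


def parse_phase(table_text, name="ocs-storagecluster"):
    """Return the PHASE value for the named StorageCluster, or None.

    Hypothetical helper: it splits on whitespace, which is only safe
    because every column up to and including PHASE is non-empty.
    """
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    if not lines:
        return None
    header = lines[0].split()
    if "PHASE" not in header:
        return None
    phase_idx = header.index("PHASE")
    for ln in lines[1:]:
        fields = ln.split()
        if fields and fields[0] == name and len(fields) > phase_idx:
            return fields[phase_idx]
    return None


if __name__ == "__main__":
    print(parse_phase(READY_OUTPUT))
    print(parse_phase(ERROR_OUTPUT))
```

With the fix in place, the expectation from this comment is `Ready` before arbiter is enabled, `Error` while both settings are enabled, and `Ready` again after arbiter is disabled.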

Comment 12 Oded 2021-06-10 14:58:42 UTC
All error messages are visible in the UI. [arbiter and flexibleScaling both can't be enabled]

For more details:
https://docs.google.com/document/d/1Ahu6qEIbaYOij3KO0fyHKrAAAmunjQZ-WuPsC8oULOE/edit
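
The same message can also be picked out of the operator log from step 4 of Comment 11, which is one JSON object per line. The following is a rough Python sketch, not part of the operator; `reconcile_error` is a hypothetical helper name, and the sample line is the log entry from Comment 11 with the `stacktrace` field left out for brevity.

```python
import json

# Log entry from Comment 11, abbreviated: the "stacktrace" field is omitted.
SAMPLE_LINE = (
    '{"level":"error","ts":1623325972.242787,'
    '"logger":"controller-runtime.manager.controller.storagecluster",'
    '"msg":"Reconciler error","reconciler group":"ocs.openshift.io",'
    '"reconciler kind":"StorageCluster","name":"ocs-storagecluster",'
    '"namespace":"openshift-storage",'
    '"error":"arbiter and flexibleScaling both can\'t be enabled"}'
)


def reconcile_error(log_line):
    """Return the "error" field of a reconciler error entry, else None.

    Hypothetical helper: lines that are not JSON objects, or that are
    not error-level "Reconciler error" entries, yield None.
    """
    try:
        entry = json.loads(log_line)
    except json.JSONDecodeError:
        return None
    if not isinstance(entry, dict):
        return None
    if entry.get("level") == "error" and entry.get("msg") == "Reconciler error":
        return entry.get("error")
    return None


if __name__ == "__main__":
    print(reconcile_error(SAMPLE_LINE))
```

Feeding each line of the `oc logs` output through such a filter would surface the validation failure without reading the full stack trace.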

Comment 13 Olive Lakra 2021-07-09 05:18:36 UTC
Hi Mudit - please review the revised doc text and share feedback.

Comment 14 Mudit Agarwal 2021-07-09 07:43:26 UTC
This needs to be changed to:

.Arbiter and flexible scaling can't be enabled at the same time.
When arbiter and flexible scaling both are enabled, the storage cluster was shown in `READY` state even though there were logs or messages with the error `arbiter and flexibleScaling both can't be enabled`.
This was happening because of the incorrect specs of the storage cluster CR.
With this update, storage cluster is in "ERROR" state with the correct error message.
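
The rule described in this doc text can be illustrated with a small Python sketch. This is not the operator's code (ocs-operator is written in Go, and the linked pull request touches `validateStorageClusterSpec`); the functions `validate_spec` and `reconcile_phase` below are hypothetical stand-ins that only model the behavior reported in this bug, using the `arbiter.enable` and `flexibleScaling` spec fields shown in Comment 11.

```python
ERR_BOTH = "arbiter and flexibleScaling both can't be enabled"


def validate_spec(spec):
    """Return an error message if the spec is invalid, else None.

    Hypothetical model of the check: the spec is rejected when
    arbiter.enable and flexibleScaling are both true.
    """
    arbiter = spec.get("arbiter") or {}
    if arbiter.get("enable", False) and spec.get("flexibleScaling", False):
        return ERR_BOTH
    return None


def reconcile_phase(spec):
    """Return (phase, message) as the fixed behavior reports it.

    Assumption: nothing else fails, so a valid spec ends up "Ready".
    """
    err = validate_spec(spec)
    if err is not None:
        return ("Error", err)
    return ("Ready", None)


if __name__ == "__main__":
    # Spec from step 3 of Comment 11: both settings enabled.
    print(reconcile_phase({"flexibleScaling": True, "arbiter": {"enable": True}}))
    # Spec from step 7 of Comment 11: arbiter reset to an empty object.
    print(reconcile_phase({"flexibleScaling": True, "arbiter": {}}))
```

Before the fix, the error was logged but the phase stayed `Ready`; the sketch models the corrected behavior, where a failed validation is reflected in the phase.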

Comment 16 errata-xmlrpc 2021-08-03 18:15:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.8.0 container images bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3003

Comment 17 Red Hat Bugzilla 2023-09-15 01:04:42 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

