Bug 1946595
| Summary: | ocs-storagecluster phase is "Ready" when flexible scaling and arbiter are both enabled | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Oded <oviner> |
| Component: | ocs-operator | Assignee: | Nitin Goyal <nigoyal> |
| Status: | CLOSED ERRATA | QA Contact: | Oded <oviner> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.7 | CC: | ebenahar, edonnell, jarrpa, madam, mbukatov, muagarwa, nberry, nigoyal, ocs-bugs, olakra, oviner, rtalur, sostapov, uchapaga |
| Target Milestone: | --- | Keywords: | AutomationBackLog |
| Target Release: | OCS 4.8.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
.Arbiter and flexible scaling can't be enabled at the same time
When both arbiter and flexible scaling were enabled, the storage cluster was shown in the `READY` state even though there were logs or messages with the error `arbiter and flexibleScaling both can't be enabled`. This happened because of incorrect specs in the storage cluster CR. With this update, the storage cluster is in the `ERROR` state with the correct error message.
|
Story Points: | --- |
| Clone Of: | 1913357 | Environment: | |
| Last Closed: | 2021-08-03 18:15:56 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1913357 | ||
| Bug Blocks: | 1938134 | ||
|
Comment 3
Mudit Agarwal
2021-04-06 14:03:07 UTC
Doc text needs to be modified as we have fixed this issue now.
Need to test it again because of a monitoring issue on my cluster.
Setup:
OCP Version: 4.8.0-0.nightly-2021-06-09-065137
OCS Version: ocs-operator.v4.8.0-413.ci
LSO Version: 4.7.0-202102110027.p0
Provider: VMware
Test Procedure:
1. Install OCS 4.8 cluster (LSO)
2. Check storage cluster status
$ oc get storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   18h   Ready              2021-06-09T16:35:09Z   4.8.0
$ oc get storagecluster -o yaml | grep flex
f:flexibleScaling: {}
flexibleScaling: true
$ oc get storagecluster -o yaml | grep arbiter
f:arbiter: {}
arbiter: {}
3. Enable arbiter:
spec:
  arbiter:
    enable: true
4. Check ocs-operator log:
$ oc logs ocs-operator-dd57fd889-6zj8j
{"level":"error","ts":1623325972.242787,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Reconciler error","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster","name":"ocs-storagecluster","namespace":"openshift-storage","error":"arbiter and flexibleScaling both can't be enabled","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/remote-source/app/vendor/github.com/go-logr/zapr/zapr.go:132\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:297\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:248\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99"}
5. A new warning appears on the console: "arbiter and flexibleScaling both can't be enabled"
6. Check storagecluster status:
$ oc get storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   19h   Error              2021-06-09T16:35:09Z   4.8.0
$ oc describe storagecluster
Events:
Type     Reason            Age                   From                       Message
----     ------            ----                  ----                       -------
Warning  FailedValidation  53s (x24 over 9m17s)  controller_storagecluster  arbiter and flexibleScaling both can't be enabled
$ oc get csv -A
NAMESPACE                              NAME                                           DISPLAY                       VERSION                 REPLACES                     PHASE
openshift-local-storage                local-storage-operator.4.7.0-202102110027.p0   Local Storage                 4.7.0-202102110027.p0                                Succeeded
openshift-operator-lifecycle-manager   packageserver                                  Package Server                0.17.0                                               Succeeded
openshift-storage                      ocs-operator.v4.8.0-413.ci                     OpenShift Container Storage   4.8.0-413.ci            ocs-operator.v4.8.0-411.ci   Succeeded
$ oc get cephcluster
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL
ocs-storagecluster-cephcluster   /var/lib/rook     3          19h   Ready   Cluster created successfully   HEALTH_OK
7. Disable arbiter in the storagecluster YAML:
$ oc edit storagecluster -n openshift-storage
spec:
  arbiter: {}
8. Check storagecluster status:
$ oc get storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   19h   Ready              2021-06-09T16:35:09Z   4.8.0
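Steps 3 and 6 to 8 above could be scripted roughly as follows. This is a minimal sketch and was not part of the original verification: `arbiter_patch` and `phase_from_table` are hypothetical helper names, and the resource name `ocs-storagecluster` and namespace `openshift-storage` are assumed from the output above. The `oc` calls are left as comments because they need a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: build a JSON merge patch that sets spec.arbiter.enable.
# $1 must be "true" or "false".
arbiter_patch() {
  case "$1" in
    true|false) printf '{"spec":{"arbiter":{"enable":%s}}}\n' "$1" ;;
    *) echo "usage: arbiter_patch true|false" >&2; return 1 ;;
  esac
}

# Hypothetical helper: read `oc get storagecluster` table output on stdin and
# print the PHASE value of the resource named in $1.
# Assumption: the EXTERNAL column is empty (as in this report), so PHASE is
# the third whitespace-separated field of the row.
phase_from_table() {
  awk -v name="$1" '$1 == name { print $3 }'
}

# Usage against a live cluster (assumed names, not run here):
#   oc patch storagecluster ocs-storagecluster -n openshift-storage \
#     --type merge -p "$(arbiter_patch true)"
#   oc get storagecluster -n openshift-storage | phase_from_table ocs-storagecluster
```

Note that `arbiter_patch false` sets `enable: false` explicitly rather than restoring the empty `arbiter: {}` shown in step 7. Whether the operator treats the two identically is an assumption here.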
For more details:
https://docs.google.com/document/d/1Ahu6qEIbaYOij3KO0fyHKrAAAmunjQZ-WuPsC8oULOE/edit
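The reconciler error checked in step 4 can be pulled out of the operator log without reading the full stack trace. The following is a minimal sketch, assuming the single-line JSON log format shown above; `reconcile_errors` is a hypothetical helper name, and the text-based matching assumes the error message contains no double quotes or escaped characters, which holds for the message in this report.

```shell
#!/bin/sh
# Hypothetical helper: read JSON log lines on stdin and print the value of the
# "error" field for every line logged at level "error".
# This is plain text matching, not JSON parsing; a JSON-aware tool would be
# more robust for messages that contain escaped quotes.
reconcile_errors() {
  grep '"level":"error"' | sed -n 's/.*"error":"\([^"]*\)".*/\1/p'
}

# Usage against a live cluster (pod name taken from this report, not run here):
#   oc logs ocs-operator-dd57fd889-6zj8j | reconcile_errors | sort -u
```

Piping through `sort -u` collapses the repeated entries that the reconcile loop produces on each retry.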
All error messages are visible in the UI: "arbiter and flexibleScaling both can't be enabled".
For more details:
https://docs.google.com/document/d/1Ahu6qEIbaYOij3KO0fyHKrAAAmunjQZ-WuPsC8oULOE/edit

Hi Mudit - please review the revised doc text and share feedback.

This needs to be changed to:
.Arbiter and flexible scaling can't be enabled at the same time.
When arbiter and flexible scaling both are enabled, the storage cluster was shown in `READY` state even though there were logs or messages with the error `arbiter and flexibleScaling both can't be enabled`. This was happening because of the incorrect specs of the storage cluster CR. With this update, storage cluster is in "ERROR" state with the correct error message.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.8.0 container images bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:3003

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.