Created attachment 1751239 [details]
UI screenshot for current implementation of flexible scaling in Internal mode

Description of problem:
==============================
After a long round of discussion in email threads, the following was decided: for Internal mode (3 AZ or 1 AZ), do not enable N+1 (flexible) scaling by default. However, in the current OCS 4.7 + OCP 4.7 deployment, this feature gets enabled by default even for clusters with zone=0 (e.g. VMware dynamic). Hence, we need to disable this feature in the UI for dynamic (Internal) mode.

Eran's summary
-----------------
1. When having 3 zones on any platform, we will keep the current OCS 4.6 behavior.
2. When having a single zone on any platform, we will differentiate between the two options:
   - Using LSO: we will automatically apply flexible scaling, using count=3, replica=1, and portable=false.
   - Using dynamic provisioning storage: we will keep the current OCS 4.6 behavior.

Nithya's summary
--------------------------------
Summing up the discussion:
1. Internal mode:
   a) No flexible scaling.
   b) count=1, replica=3.
   c) Use the current 4.6 behaviour of racks if zones < 3.
2. Internal Attached:
   a) Flexible scaling = true.
   b) count=3, replica=1.
   c) When expanding via the UI, expand in sets of 3 OSDs; increment count in this case.
   d) The customer can use the CLI to expand in increments of 1.

Version-Release number of selected component (if applicable):
===============================
OCP 4.7
OCS 4.7

How reproducible:
==================
Always

Steps to Reproduce:
===================
1. Install OCS 4.7 on a VMware setup where zone=0.
2. Flexible scaling is enabled by default.

Expected results:
==================
After the fix is reverted for Internal mode, flexible scaling will not be enabled for dynamic mode clusters even if zone=0.
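The two summaries above boil down to a small decision table. The sketch below (Python; `device_set_defaults` is a hypothetical helper, not actual ocs-operator code) encodes the agreed defaults, under the assumption that the 4.6 behavior means count=1, replica=3, portable=true with no flexible scaling:

```python
# Hypothetical helper encoding the agreed StorageCluster device-set defaults.
# Inputs: number of availability zones, and whether LSO (local storage) is used.

def device_set_defaults(zones: int, uses_lso: bool) -> dict:
    """Return the expected storageDeviceSets defaults per the summaries above."""
    if zones >= 3:
        # 3 zones on any platform: keep the OCS 4.6 behavior.
        return {"flexibleScaling": False, "count": 1, "replica": 3, "portable": True}
    if uses_lso:
        # Single zone + LSO (Internal Attached): flexible scaling on.
        return {"flexibleScaling": True, "count": 3, "replica": 1, "portable": False}
    # Single zone + dynamic provisioning (Internal mode): keep the 4.6
    # behavior, falling back to rack failure domains when zones < 3.
    return {"flexibleScaling": False, "count": 1, "replica": 3, "portable": True}

# The reported bug: dynamic (Internal) mode with zone=0 was getting
# flexible scaling enabled instead of the 4.6 behavior.
print(device_set_defaults(0, False))
```

The bottom branch is the one this bug is about: zone=0 with dynamic provisioning must not take the LSO path.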
Flexible scaling is not enabled for an Internal mode cluster.

1. Install an OCP cluster on VMware:
   Provider: VSphere
   OCP Version: 4.7.0-0.nightly-2021-02-05-053408
2. Install the Internal mode OCS 4.7 operator via the UI:
   OCS Version: ocs-operator.v4.7.0-251.ci
3. Check ceph status:
   sh-4.4# ceph health
   HEALTH_OK
4. Check failureDomain:
   $ oc get cephblockpool -n openshift-storage -o yaml | grep failureDomain
         f:failureDomain: {}
     failureDomain: rack
5. Get storagecluster:
   $ oc get storagecluster -n openshift-storage -o yaml | grep -i flex
   $ -> flexibleScaling does not exist

$ oc get storagecluster -n openshift-storage -o yaml
apiVersion: v1
items:
- apiVersion: ocs.openshift.io/v1
  kind: StorageCluster
  metadata:
    annotations:
      uninstall.ocs.openshift.io/cleanup-policy: delete
      uninstall.ocs.openshift.io/mode: graceful
    creationTimestamp: "2021-02-05T14:00:40Z"
    finalizers:
    - storagecluster.ocs.openshift.io
    generation: 2
    managedFields:
    - apiVersion: ocs.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:arbiter: {}
          f:encryption:
            .: {}
            f:kms: {}
          f:nodeTopologies: {}
      manager: Mozilla
      operation: Update
      time: "2021-02-05T14:00:40Z"
    - apiVersion: ocs.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:uninstall.ocs.openshift.io/cleanup-policy: {}
            f:uninstall.ocs.openshift.io/mode: {}
          f:finalizers: {}
        f:spec:
          f:externalStorage: {}
          f:managedResources:
            .: {}
            f:cephBlockPools: {}
            f:cephFilesystems: {}
            f:cephObjectStoreUsers: {}
            f:cephObjectStores: {}
          f:storageDeviceSets: {}
          f:version: {}
        f:status:
          .: {}
          f:conditions: {}
          f:failureDomain: {}
          f:images:
            .: {}
            f:ceph:
              .: {}
              f:actualImage: {}
              f:desiredImage: {}
            f:noobaaCore:
              .: {}
              f:actualImage: {}
              f:desiredImage: {}
            f:noobaaDB:
              .: {}
              f:actualImage: {}
              f:desiredImage: {}
          f:nodeTopologies:
            .: {}
            f:labels:
              .: {}
              f:kubernetes.io/hostname: {}
              f:topology.rook.io/rack: {}
          f:phase: {}
          f:relatedObjects: {}
      manager: ocs-operator
      operation: Update
      time: "2021-02-05T14:03:43Z"
    name: ocs-storagecluster
    namespace: openshift-storage
    resourceVersion: "70736"
    selfLink: /apis/ocs.openshift.io/v1/namespaces/openshift-storage/storageclusters/ocs-storagecluster
    uid: 0155585e-c490-433e-be03-b9b731c85604
  spec:
    arbiter: {}
    encryption:
      kms: {}
    externalStorage: {}
    managedResources:
      cephBlockPools: {}
      cephFilesystems: {}
      cephObjectStoreUsers: {}
      cephObjectStores: {}
    nodeTopologies: {}
    storageDeviceSets:
    - config: {}
      count: 1
      dataPVCTemplate:
        metadata: {}
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 512Gi
          storageClassName: thin
          volumeMode: Block
        status: {}
      name: ocs-deviceset-thin
      placement: {}
      portable: true
      preparePlacement: {}
      replica: 3
      resources: {}
    version: 4.7.0
  status:
    conditions:
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:00:42Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: ReconcileComplete
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:05:58Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: Available
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:05:58Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "False"
      type: Progressing
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:00:41Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "False"
      type: Degraded
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:05:58Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: Upgradeable
    failureDomain: rack
    images:
      ceph:
        actualImage: quay.io/rhceph-dev/rhceph@sha256:35e13c86bf5891b6db3386e74fc2be728906173a7aabb5d1aa11452a62d136e9
        desiredImage: quay.io/rhceph-dev/rhceph@sha256:35e13c86bf5891b6db3386e74fc2be728906173a7aabb5d1aa11452a62d136e9
      noobaaCore:
        actualImage: quay.io/rhceph-dev/mcg-core@sha256:6462b82d9e0d90b5312cabd8a5e9701dd8f550104667f66ad1faa8a826ea79ce
        desiredImage: quay.io/rhceph-dev/rhceph@sha256:35e13c86bf5891b6db3386e74fc2be728906173a7aabb5d1aa11452a62d136e9
      noobaaDB:
        actualImage: registry.redhat.io/rhel8/postgresql-12@sha256:c6b6da4f762c2f68bfe558efe954739438ffa2e9aae1c617b50011fb0eed8347
        desiredImage: registry.redhat.io/rhel8/postgresql-12@sha256:c6b6da4f762c2f68bfe558efe954739438ffa2e9aae1c617b50011fb0eed8347
    nodeTopologies:
      labels:
        kubernetes.io/hostname:
        - compute-0
        - compute-1
        - compute-2
        topology.rook.io/rack:
        - rack0
        - rack1
        - rack2
    phase: Ready
    relatedObjects:
    - apiVersion: ceph.rook.io/v1
      kind: CephCluster
      name: ocs-storagecluster-cephcluster
      namespace: openshift-storage
      resourceVersion: "70716"
      uid: f0138b4f-f68b-487a-bb04-7da6663cc891
    - apiVersion: noobaa.io/v1alpha1
      kind: NooBaa
      name: noobaa
      namespace: openshift-storage
      resourceVersion: "70735"
      uid: 1ab320df-3258-4d6e-b6d6-6fb036f22c21
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
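The grep-based check in step 5 can also be done as a structural assertion on the device set. A minimal sketch (Python; the dict below mirrors the `storageDeviceSets` entry from the `oc get storagecluster` output above, it is not fetched live from a cluster):

```python
# Verify the observed storageDeviceSets entry matches the expected Internal
# (dynamic) mode defaults: count=1, replica=3, and no flexibleScaling field.
# Values copied from the StorageCluster dump above.

observed_device_set = {
    "name": "ocs-deviceset-thin",
    "count": 1,
    "replica": 3,
    "portable": True,
    # note: no "flexibleScaling" key is present in the spec
}

assert "flexibleScaling" not in observed_device_set
assert observed_device_set["count"] == 1 and observed_device_set["replica"] == 3
print("Internal mode defaults verified: flexible scaling not enabled")
```

This matches the expected result: for a dynamic (Internal) mode cluster, even with zone=0, the spec keeps the 4.6-style count/replica settings and `flexibleScaling` does not exist.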
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633