Bug 1921023

Summary: Do not set Flexible Scaling to true for Internal mode clusters (revert to 4.6 behavior)
Product: OpenShift Container Platform Reporter: Neha Berry <nberry>
Component: Console Storage Plugin Assignee: Afreen <afrahman>
Status: CLOSED ERRATA QA Contact: Oded <oviner>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.7 CC: aos-bugs, nthomas
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-24 15:56:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
UI screenshot for current implementation of flexible scaling in Internal mode none

Description Neha Berry 2021-01-27 12:56:42 UTC
Created attachment 1751239 [details]
UI screenshot for current implementation of flexible scaling in Internal mode

Description of problem:
==============================
After a long round of discussion in email threads, the following was decided:

Internal mode (3 AZ or 1 AZ) -> do not enable N+1 scaling (flexible scaling) by default.

However, in the current OCS 4.7 + OCP 4.7 deployment, this feature is enabled by default even for clusters with zone=0 (e.g. VMware with dynamic provisioning).

Hence, we need to disable this feature in the UI for dynamic (Internal) mode.

Eran's summary
-----------------
1. When having 3 zones on any platform, we will keep the current OCS 4.6 behavior.
2. When having a single zone on any platform, we will differentiate between the two options:
   - Using LSO: we will automatically apply flexible scaling, using count 3, replica 1, and portable = false.
   - Using dynamic provisioning storage: we will keep the current OCS 4.6 behavior.

Nithya's summary


Summing up the discussion:
--------------------------------

>> 1. Internal mode:
a) No flexible scaling.
b)count = 1, replica = 3
c) Use current 4.6 behaviour of racks if zones < 3

>> 2. Internal Attached:
a) Flexible scaling = true
b) count = 3, replica = 1
c) When expanding via UI, expand in sets of 3 OSDs. Increment count in this case.
d) The customer can use CLI to expand in increments of 1.
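
The decision table above can be sketched in code. This is only an illustration of the agreed defaults, not the console plugin's implementation: the mode labels, the class and the function names are hypothetical. It also assumes that the number of OSDs in a device set equals count x replica, and that "current 4.6 behaviour" means a zone failure domain when 3 or more zones exist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceSetDefaults:
    """Defaults agreed in the discussion (hypothetical container type)."""
    flexible_scaling: bool
    count: int
    replica: int


def deviceset_defaults(mode: str) -> DeviceSetDefaults:
    """Return the agreed defaults for a deployment mode.

    `mode` is a hypothetical label: "internal" (dynamic provisioning)
    or "internal-attached" (attached devices, e.g. via LSO).
    """
    if mode == "internal":
        # 1. Internal mode: no flexible scaling, count = 1, replica = 3
        return DeviceSetDefaults(flexible_scaling=False, count=1, replica=3)
    if mode == "internal-attached":
        # 2. Internal Attached: flexible scaling, count = 3, replica = 1
        return DeviceSetDefaults(flexible_scaling=True, count=3, replica=1)
    raise ValueError(f"unknown mode: {mode!r}")


def internal_failure_domain(zone_count: int) -> str:
    """Internal mode keeps the 4.6 behaviour: racks if zones < 3.

    Returning "zone" for 3 or more zones is an assumption.
    """
    return "zone" if zone_count >= 3 else "rack"


def expand_count(count: int, replica: int, osds_to_add: int) -> int:
    """Return the new `count` after adding `osds_to_add` OSDs.

    Assumes OSDs = count * replica, so OSDs can only be added in
    multiples of `replica`.
    """
    if osds_to_add <= 0 or osds_to_add % replica != 0:
        raise ValueError("OSDs must be added in multiples of replica")
    return count + osds_to_add // replica


if __name__ == "__main__":
    attached = deviceset_defaults("internal-attached")
    # UI expansion adds a set of 3 OSDs; CLI expansion may add 1.
    print(expand_count(attached.count, attached.replica, 3))
    print(expand_count(attached.count, attached.replica, 1))
    print(internal_failure_domain(0))
```

Under these assumptions, an Internal mode cluster (replica = 3) can only grow in steps of 3 OSDs, while an Internal Attached cluster (replica = 1) can grow by a single OSD, which matches points 2c and 2d.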





Version-Release number of selected component (if applicable):
===============================
OCP 4.7
OCS 4.7

How reproducible:
==================
Always

Steps to Reproduce:
===================
1. Install OCS 4.7 on a VMware setup where zone=0
2. Observe that flexible scaling is enabled by default


Expected results:
==================
Once the change is reverted for Internal mode, flexible scaling is not enabled for dynamic (Internal) mode clusters, even if zone=0.

Comment 4 Oded 2021-02-05 14:29:30 UTC
Flexible Scaling is not set to true for an Internal mode cluster.

1. Install an OCP cluster on VMware:
Provider: vSphere
OCP Version: 4.7.0-0.nightly-2021-02-05-053408

2. Install the Internal OCS 4.7 Operator via the UI:
OCS Version: ocs-operator.v4.7.0-251.ci

3. Check ceph status:
sh-4.4# ceph health
HEALTH_OK


4. Check failureDomain:
$ oc get cephblockpool -n openshift-storage -o yaml | grep failureDomain
          f:failureDomain: {}
    failureDomain: rack


5. Get storagecluster:
$ oc get storagecluster -n openshift-storage -o yaml | grep -i flex
$
-> No output: the flexibleScaling field does not exist in the StorageCluster.


$ oc get storagecluster -n openshift-storage -o yaml 
apiVersion: v1
items:
- apiVersion: ocs.openshift.io/v1
  kind: StorageCluster
  metadata:
    annotations:
      uninstall.ocs.openshift.io/cleanup-policy: delete
      uninstall.ocs.openshift.io/mode: graceful
    creationTimestamp: "2021-02-05T14:00:40Z"
    finalizers:
    - storagecluster.ocs.openshift.io
    generation: 2
    managedFields:
    - apiVersion: ocs.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:arbiter: {}
          f:encryption:
            .: {}
            f:kms: {}
          f:nodeTopologies: {}
      manager: Mozilla
      operation: Update
      time: "2021-02-05T14:00:40Z"
    - apiVersion: ocs.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:uninstall.ocs.openshift.io/cleanup-policy: {}
            f:uninstall.ocs.openshift.io/mode: {}
          f:finalizers: {}
        f:spec:
          f:externalStorage: {}
          f:managedResources:
            .: {}
            f:cephBlockPools: {}
            f:cephFilesystems: {}
            f:cephObjectStoreUsers: {}
            f:cephObjectStores: {}
          f:storageDeviceSets: {}
          f:version: {}
        f:status:
          .: {}
          f:conditions: {}
          f:failureDomain: {}
          f:images:
            .: {}
            f:ceph:
              .: {}
              f:actualImage: {}
              f:desiredImage: {}
            f:noobaaCore:
              .: {}
              f:actualImage: {}
              f:desiredImage: {}
            f:noobaaDB:
              .: {}
              f:actualImage: {}
              f:desiredImage: {}
          f:nodeTopologies:
            .: {}
            f:labels:
              .: {}
              f:kubernetes.io/hostname: {}
              f:topology.rook.io/rack: {}
          f:phase: {}
          f:relatedObjects: {}
      manager: ocs-operator
      operation: Update
      time: "2021-02-05T14:03:43Z"
    name: ocs-storagecluster
    namespace: openshift-storage
    resourceVersion: "70736"
    selfLink: /apis/ocs.openshift.io/v1/namespaces/openshift-storage/storageclusters/ocs-storagecluster
    uid: 0155585e-c490-433e-be03-b9b731c85604
  spec:
    arbiter: {}
    encryption:
      kms: {}
    externalStorage: {}
    managedResources:
      cephBlockPools: {}
      cephFilesystems: {}
      cephObjectStoreUsers: {}
      cephObjectStores: {}
    nodeTopologies: {}
    storageDeviceSets:
    - config: {}
      count: 1
      dataPVCTemplate:
        metadata: {}
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 512Gi
          storageClassName: thin
          volumeMode: Block
        status: {}
      name: ocs-deviceset-thin
      placement: {}
      portable: true
      preparePlacement: {}
      replica: 3
      resources: {}
    version: 4.7.0
  status:
    conditions:
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:00:42Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: ReconcileComplete
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:05:58Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: Available
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:05:58Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "False"
      type: Progressing
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:00:41Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "False"
      type: Degraded
    - lastHeartbeatTime: "2021-02-05T14:10:46Z"
      lastTransitionTime: "2021-02-05T14:05:58Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: Upgradeable
    failureDomain: rack
    images:
      ceph:
        actualImage: quay.io/rhceph-dev/rhceph@sha256:35e13c86bf5891b6db3386e74fc2be728906173a7aabb5d1aa11452a62d136e9
        desiredImage: quay.io/rhceph-dev/rhceph@sha256:35e13c86bf5891b6db3386e74fc2be728906173a7aabb5d1aa11452a62d136e9
      noobaaCore:
        actualImage: quay.io/rhceph-dev/mcg-core@sha256:6462b82d9e0d90b5312cabd8a5e9701dd8f550104667f66ad1faa8a826ea79ce
        desiredImage: quay.io/rhceph-dev/rhceph@sha256:35e13c86bf5891b6db3386e74fc2be728906173a7aabb5d1aa11452a62d136e9
      noobaaDB:
        actualImage: registry.redhat.io/rhel8/postgresql-12@sha256:c6b6da4f762c2f68bfe558efe954739438ffa2e9aae1c617b50011fb0eed8347
        desiredImage: registry.redhat.io/rhel8/postgresql-12@sha256:c6b6da4f762c2f68bfe558efe954739438ffa2e9aae1c617b50011fb0eed8347
    nodeTopologies:
      labels:
        kubernetes.io/hostname:
        - compute-0
        - compute-1
        - compute-2
        topology.rook.io/rack:
        - rack0
        - rack1
        - rack2
    phase: Ready
    relatedObjects:
    - apiVersion: ceph.rook.io/v1
      kind: CephCluster
      name: ocs-storagecluster-cephcluster
      namespace: openshift-storage
      resourceVersion: "70716"
      uid: f0138b4f-f68b-487a-bb04-7da6663cc891
    - apiVersion: noobaa.io/v1alpha1
      kind: NooBaa
      name: noobaa
      namespace: openshift-storage
      resourceVersion: "70735"
      uid: 1ab320df-3258-4d6e-b6d6-6fb036f22c21
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
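
The manual checks in steps 4 and 5 could also be scripted. The sketch below is a hypothetical helper, not part of any OCS tooling. It assumes the StorageCluster has been exported as JSON (for example with `oc get storagecluster ocs-storagecluster -n openshift-storage -o json`), that flexible scaling would appear as `spec.flexibleScaling`, and that the cluster is a freshly installed, zone-less Internal mode cluster like the one above (count 1, replica 3, failure domain rack).

```python
import json
import sys
from typing import List


def check_internal_mode(storagecluster: dict) -> List[str]:
    """Return a list of deviations from the expected Internal mode state.

    An empty list means the object matches what this comment verified:
    no flexible scaling, count=1 / replica=3, and failureDomain=rack.
    """
    problems = []
    spec = storagecluster.get("spec", {})

    # Assumed field location: spec.flexibleScaling (absent or false is fine).
    if spec.get("flexibleScaling"):
        problems.append("spec.flexibleScaling is enabled")

    for device_set in spec.get("storageDeviceSets", []):
        count = device_set.get("count")
        replica = device_set.get("replica")
        if (count, replica) != (1, 3):
            name = device_set.get("name", "<unnamed>")
            problems.append(
                f"{name}: count={count}, replica={replica} (expected 1 and 3)"
            )

    failure_domain = storagecluster.get("status", {}).get("failureDomain")
    if failure_domain != "rack":
        problems.append(f"failureDomain is {failure_domain!r}, expected 'rack'")

    return problems


if __name__ == "__main__":
    # Usage: oc get storagecluster ocs-storagecluster \
    #          -n openshift-storage -o json | python check.py
    found = check_internal_mode(json.load(sys.stdin))
    for problem in found:
        print(problem)
    sys.exit(1 if found else 0)
```

The count and failure-domain expectations only hold for a fresh cluster without zones; an expanded cluster or one with 3 or more zones would need different expected values.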

Comment 7 errata-xmlrpc 2021-02-24 15:56:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633