Bug 1915111 - OCS OSD selection algorithm is making some strange choices.
Summary: OCS OSD selection algorithm is making some strange choices.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: OCS 4.7.0
Assignee: Jose A. Rivera
QA Contact: Avi Liani
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-01-12 01:34 UTC by Vincent S. Cojot
Modified: 2021-05-19 09:18 UTC
9 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-19 09:17:47 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:2041 0 None None None 2021-05-19 09:18:12 UTC

Description Vincent S. Cojot 2021-01-12 01:34:06 UTC
Description of problem (please be as detailed as possible and provide log snippets):

I'm deploying OCS + OCP in a pipeline and experimenting with the deployed OCP afterwards (lab environment). I'm currently struggling to understand how OCS or its operator decides which node a new OSD is placed on.

Version of all relevant components (if applicable):


$ oc get csv -A
NAMESPACE                              NAME                                           DISPLAY                       VERSION                 REPLACES                                  PHASE
openshift-cnv                          kubevirt-hyperconverged-operator.v2.5.2        OpenShift Virtualization      2.5.2                   kubevirt-hyperconverged-operator.v2.5.1   Succeeded
openshift-local-storage                local-storage-operator.4.6.0-202012161211.p0   Local Storage                 4.6.0-202012161211.p0                                             Succeeded
openshift-operator-lifecycle-manager   packageserver                                  Package Server                0.16.1                                                            Succeeded
openshift-storage                      ocs-operator.v4.6.0                            OpenShift Container Storage   4.6.0                                                             Succeeded

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

No

Is there any workaround available to the best of your knowledge?

No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3 (neither easy nor complex)


Is this issue reproducible?

Yes, the 'balancing' of OSDs ends up skewed every time I select an 'odd' number for count.

Can this issue be reproduced from the UI?

No idea.

If this is a regression, please provide more details to justify this:

No idea

Steps to Reproduce:

Comment 2 Vincent S. Cojot 2021-01-12 01:39:05 UTC
Here's an idea of what's going on.
I'm deploying OCP 4.6.8 with 3 masters, 3 workers and 6 infra nodes to use for ceph:
$ oc get nodes
NAME                         STATUS   ROLES          AGE    VERSION
ocp4d-gcvpn-infra-0-4w4sh    Ready    infra,worker   47m    v1.19.0+7070803
ocp4d-gcvpn-infra-0-gzmvx    Ready    infra,worker   48m    v1.19.0+7070803
ocp4d-gcvpn-infra-0-kzlnn    Ready    infra,worker   47m    v1.19.0+7070803
ocp4d-gcvpn-infra-0-vncgl    Ready    infra,worker   47m    v1.19.0+7070803
ocp4d-gcvpn-infra-0-zc2sm    Ready    infra,worker   47m    v1.19.0+7070803
ocp4d-gcvpn-infra-0-zjfdm    Ready    infra,worker   47m    v1.19.0+7070803
ocp4d-gcvpn-master-0         Ready    master         105m   v1.19.0+7070803
ocp4d-gcvpn-master-1         Ready    master         105m   v1.19.0+7070803
ocp4d-gcvpn-master-2         Ready    master         105m   v1.19.0+7070803
ocp4d-gcvpn-worker-0-64w6n   Ready    worker         100m   v1.19.0+7070803
ocp4d-gcvpn-worker-0-7242l   Ready    worker         100m   v1.19.0+7070803
ocp4d-gcvpn-worker-0-dqrjj   Ready    worker         100m   v1.19.0+7070803

Each of the 'infra' nodes has a boot disk plus 8 x 8 TB virtio-SCSI disks, each with a unique WWN.

The local storage operator YAML is quite simple:
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: localstorage-ocs-osd
  namespace: openshift-local-storage
  labels:
    app: ocs-storagecluster
spec:
  nodeSelector:
    nodeSelectorTerms:
      - matchExpressions:
          - key: cluster.ocs.openshift.io/openshift-storage
            operator: In
            values:
            - ""
  storageClassDevices:
    - storageClassName: localstorage-ocs-osd-sc
      volumeMode: Block
      devicePaths:
        - /dev/disk/by-id/wwn-0x5000c50015ea71aa
        - /dev/disk/by-id/wwn-0x5000c50015ea71ab
        - /dev/disk/by-id/wwn-0x5000c50015ea71ac
        - /dev/disk/by-id/wwn-0x5000c50015ea71ad
        - /dev/disk/by-id/wwn-0x5000c50015ea71ae
        - /dev/disk/by-id/wwn-0x5000c50015ea71b0
        - /dev/disk/by-id/wwn-0x5000c50015ea71b1
        - /dev/disk/by-id/wwn-0x5000c50015ea71b2
[....long list of computed WWNs.....]
        - /dev/disk/by-id/wwn-0x5000c50015ea721d
        - /dev/disk/by-id/wwn-0x5000c50015ea721e
        - /dev/disk/by-id/wwn-0x5000c50015ea7220
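
As a quick sanity check (assuming the usual LSO behaviour of creating one PV per matched device), the PV count in that storage class should equal nodes x disks, i.e. 6 x 8 = 48 here:

$ oc get pv | grep -c localstorage-ocs-osd-sc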

For the OCS storage cluster, I have this (and some resource limitations):
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  manageNodes: false
  monDataDirHostPath: /var/lib/rook
  storageDeviceSets:
  - count: 5   # <-- modify count to the desired value
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1
        storageClassName: localstorage-ocs-osd-sc
        volumeMode: Block

Comment 3 Vincent S. Cojot 2021-01-12 01:40:54 UTC
# https://red-hat-storage.github.io/ocs-training/training/ocs4/ocs4-install-no-ui.html#_create_cluster
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  manageNodes: false
  resources:
    mds:
      limits:
        cpu: 500m
        memory: "4Gi"
      requests:
        cpu: 500m
        memory: "4Gi"
    rgw:
      limits:
        cpu: 500m
        memory: "4Gi"
      requests:
        cpu: 500m
        memory: "4Gi"
    mon:
      limits:
        cpu: 500m
        memory: "2Gi"
      requests:
        cpu: 500m
        memory: "2Gi"
    osd:
      limits:
        cpu: 500m
        memory: "4Gi"
      requests:
        cpu: 500m
        memory: "4Gi"
    mgr:
      limits:
        cpu: 500m
        memory: "2Gi"
      requests:
        cpu: 500m
        memory: "2Gi"
    noobaa-core:
      limits:
        cpu: 500m
        memory: "2Gi"
      requests:
        cpu: 500m
        memory: "2Gi"
    noobaa-db:
      limits:
        cpu: 500m
        memory: "2Gi"
      requests:
        cpu: 500m
        memory: "2Gi"

  monDataDirHostPath: /var/lib/rook
  storageDeviceSets:
  - count: 5   # <-- modify count to the desired value
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1
        storageClassName: localstorage-ocs-osd-sc
        volumeMode: Block
    name: ocs-deviceset
    placement: {}
    portable: false
    replica: 3
    resources:
      limits:
        cpu: 500m
        memory: "4Gi"
      requests:
        cpu: 500m
        memory: "4Gi"


So far so good: each node has 8 x 8 TB disks and I'm only requesting 'count: 5' per rack.
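(For reference: the expected OSD total is storageDeviceSets count x replica, so count: 5 with replica: 3 gives 15 OSDs, i.e. 5 per rack, since each replica maps to one rack failure domain here.)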

Comment 4 Vincent S. Cojot 2021-01-12 01:42:50 UTC
On a fresh deploy and with 'count: 5', here's what I get (6 infra nodes):
$ oc rsh rook-ceph-tools-8589699f6c-l9x57 ceph osd tree
ID  CLASS WEIGHT    TYPE NAME                              STATUS REWEIGHT PRI-AFF 
 -1       120.00000 root default                                                   
 -4        40.00000     rack rack0                                                 
 -3        24.00000         host ocp4d-gcvpn-infra-0-4w4sh                         
  0   ssd   8.00000             osd.0                          up  1.00000 1.00000 
  1   ssd   8.00000             osd.1                          up  1.00000 1.00000 
  2   ssd   8.00000             osd.2                          up  1.00000 1.00000 
-11        16.00000         host ocp4d-gcvpn-infra-0-vncgl                         
  7   ssd   8.00000             osd.7                          up  1.00000 1.00000 
  8   ssd   8.00000             osd.8                          up  1.00000 1.00000 
 -8        40.00000     rack rack1                                                 
 -7        32.00000         host ocp4d-gcvpn-infra-0-gzmvx                         
  3   ssd   8.00000             osd.3                          up  1.00000 1.00000 
  4   ssd   8.00000             osd.4                          up  1.00000 1.00000 
  5   ssd   8.00000             osd.5                          up  1.00000 1.00000 
  6   ssd   8.00000             osd.6                          up  1.00000 1.00000 
-13         8.00000         host ocp4d-gcvpn-infra-0-zc2sm                         
 14   ssd   8.00000             osd.14                         up  1.00000 1.00000 
-16        40.00000     rack rack2                                                 
-15        40.00000         host ocp4d-gcvpn-infra-0-zjfdm                         
  9   ssd   8.00000             osd.9                          up  1.00000 1.00000 
 10   ssd   8.00000             osd.10                         up  1.00000 1.00000 
 11   ssd   8.00000             osd.11                         up  1.00000 1.00000 
 12   ssd   8.00000             osd.12                         up  1.00000 1.00000 
 13   ssd   8.00000             osd.13                         up  1.00000 1.00000 

So each of the racks has 5 OSDs, which is fine, but the OSDs aren't evenly balanced within each rack.
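Spelled out per node, that tree is: rack0 = 3 + 2 (4w4sh / vncgl), rack1 = 4 + 1 (gzmvx / zc2sm), rack2 = 5 + 0 (zjfdm got all five and kzlnn got none). With two eligible nodes per rack, an even spread would be 3 + 2 in every rack.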

Comment 5 Vincent S. Cojot 2021-01-12 01:46:30 UTC
Increasing 'count' to 6 (3 racks x 6 OSDs = 18 OSDs in total, 6 per rack) yields these results; the new OSDs (15, 16 and 17) are properly placed and spread across nodes:


$ oc rsh rook-ceph-tools-8589699f6c-l9x57 ceph osd tree
ID  CLASS WEIGHT    TYPE NAME                              STATUS REWEIGHT PRI-AFF 
 -1       144.00000 root default                                                   
 -4        48.00000     rack rack0                                                 
 -3        24.00000         host ocp4d-gcvpn-infra-0-4w4sh                         
  0   ssd   8.00000             osd.0                          up  1.00000 1.00000 
  1   ssd   8.00000             osd.1                          up  1.00000 1.00000 
  2   ssd   8.00000             osd.2                          up  1.00000 1.00000 
-11        24.00000         host ocp4d-gcvpn-infra-0-vncgl                         
  7   ssd   8.00000             osd.7                          up  1.00000 1.00000 
  8   ssd   8.00000             osd.8                          up  1.00000 1.00000 
 15   ssd   8.00000             osd.15                         up  1.00000 1.00000 
 -8        48.00000     rack rack1                                                 
 -7        40.00000         host ocp4d-gcvpn-infra-0-gzmvx                         
  3   ssd   8.00000             osd.3                          up  1.00000 1.00000 
  4   ssd   8.00000             osd.4                          up  1.00000 1.00000 
  5   ssd   8.00000             osd.5                          up  1.00000 1.00000 
  6   ssd   8.00000             osd.6                          up  1.00000 1.00000 
 16   ssd   8.00000             osd.16                         up  1.00000 1.00000 
-13         8.00000         host ocp4d-gcvpn-infra-0-zc2sm                         
 14   ssd   8.00000             osd.14                         up  1.00000 1.00000 
-16        48.00000     rack rack2                                                 
-19         8.00000         host ocp4d-gcvpn-infra-0-kzlnn                         
 17   ssd   8.00000             osd.17                         up  1.00000 1.00000 
-15        40.00000         host ocp4d-gcvpn-infra-0-zjfdm                         
  9   ssd   8.00000             osd.9                          up  1.00000 1.00000 
 10   ssd   8.00000             osd.10                         up  1.00000 1.00000 
 11   ssd   8.00000             osd.11                         up  1.00000 1.00000 
 12   ssd   8.00000             osd.12                         up  1.00000 1.00000 
 13   ssd   8.00000             osd.13                         up  1.00000 1.00000 

(the 6th OCS node 'ocp4d-gcvpn-infra-0-kzlnn' showed up with osd 17).

Comment 6 Vincent S. Cojot 2021-01-12 01:51:07 UTC
Increasing count to '7' (OSDs 18, 19 and 20 will come up) yields these results:

$ oc rsh rook-ceph-tools-8589699f6c-l9x57 ceph osd tree
ID  CLASS WEIGHT    TYPE NAME                              STATUS REWEIGHT PRI-AFF 
 -1       168.00000 root default                                                   
 -4        56.00000     rack rack0                                                 
 -3        24.00000         host ocp4d-gcvpn-infra-0-4w4sh                         
  0   ssd   8.00000             osd.0                          up  1.00000 1.00000 
  1   ssd   8.00000             osd.1                          up  1.00000 1.00000 
  2   ssd   8.00000             osd.2                          up  1.00000 1.00000 
-11        32.00000         host ocp4d-gcvpn-infra-0-vncgl                         
  7   ssd   8.00000             osd.7                          up  1.00000 1.00000 
  8   ssd   8.00000             osd.8                          up  1.00000 1.00000 
 15   ssd   8.00000             osd.15                         up  1.00000 1.00000 
 18   ssd   8.00000             osd.18                         up  1.00000 1.00000 
 -8        56.00000     rack rack1                                                 
 -7        48.00000         host ocp4d-gcvpn-infra-0-gzmvx                         
  3   ssd   8.00000             osd.3                          up  1.00000 1.00000 
  4   ssd   8.00000             osd.4                          up  1.00000 1.00000 
  5   ssd   8.00000             osd.5                          up  1.00000 1.00000 
  6   ssd   8.00000             osd.6                          up  1.00000 1.00000 
 16   ssd   8.00000             osd.16                         up  1.00000 1.00000 
 19   ssd   8.00000             osd.19                         up  1.00000 1.00000 
-13         8.00000         host ocp4d-gcvpn-infra-0-zc2sm                         
 14   ssd   8.00000             osd.14                         up  1.00000 1.00000 
-16        56.00000     rack rack2                                                 
-19        16.00000         host ocp4d-gcvpn-infra-0-kzlnn                         
 17   ssd   8.00000             osd.17                         up  1.00000 1.00000 
 20   ssd   8.00000             osd.20                         up  1.00000 1.00000 
-15        40.00000         host ocp4d-gcvpn-infra-0-zjfdm                         
  9   ssd   8.00000             osd.9                          up  1.00000 1.00000 
 10   ssd   8.00000             osd.10                         up  1.00000 1.00000 
 11   ssd   8.00000             osd.11                         up  1.00000 1.00000 
 12   ssd   8.00000             osd.12                         up  1.00000 1.00000 
 13   ssd   8.00000             osd.13                         up  1.00000 1.00000 

IMHO, OSD 19 should have gone to node 'ocp4d-gcvpn-infra-0-zc2sm' (it only has 1 OSD) and not to node 'ocp4d-gcvpn-infra-0-gzmvx' (bringing its total to 6 OSDs).
OSD 20 went to node 'ocp4d-gcvpn-infra-0-kzlnn', which is fine because it only had 1 OSD prior to the scale-up.

Comment 7 Vincent S. Cojot 2021-01-12 01:55:09 UTC
Increasing count to '8' (OSDs 21, 22 and 23 will come up) yields these results:

$ oc rsh rook-ceph-tools-8589699f6c-l9x57 ceph osd tree
ID  CLASS WEIGHT    TYPE NAME                              STATUS REWEIGHT PRI-AFF 
 -1       192.00000 root default                                                   
 -4        64.00000     rack rack0                                                 
 -3        24.00000         host ocp4d-gcvpn-infra-0-4w4sh                         
  0   ssd   8.00000             osd.0                          up  1.00000 1.00000 
  1   ssd   8.00000             osd.1                          up  1.00000 1.00000 
  2   ssd   8.00000             osd.2                          up  1.00000 1.00000 
-11        40.00000         host ocp4d-gcvpn-infra-0-vncgl                         
  7   ssd   8.00000             osd.7                          up  1.00000 1.00000 
  8   ssd   8.00000             osd.8                          up  1.00000 1.00000 
 15   ssd   8.00000             osd.15                         up  1.00000 1.00000 
 18   ssd   8.00000             osd.18                         up  1.00000 1.00000 
 21   ssd   8.00000             osd.21                         up  1.00000 1.00000 
 -8        64.00000     rack rack1                                                 
 -7        56.00000         host ocp4d-gcvpn-infra-0-gzmvx                         
  3   ssd   8.00000             osd.3                          up  1.00000 1.00000 
  4   ssd   8.00000             osd.4                          up  1.00000 1.00000 
  5   ssd   8.00000             osd.5                          up  1.00000 1.00000 
  6   ssd   8.00000             osd.6                          up  1.00000 1.00000 
 16   ssd   8.00000             osd.16                         up  1.00000 1.00000 
 19   ssd   8.00000             osd.19                         up  1.00000 1.00000 
 22   ssd   8.00000             osd.22                         up  1.00000 1.00000 
-13         8.00000         host ocp4d-gcvpn-infra-0-zc2sm                         
 14   ssd   8.00000             osd.14                         up  1.00000 1.00000 
-16        64.00000     rack rack2                                                 
-19        24.00000         host ocp4d-gcvpn-infra-0-kzlnn                         
 17   ssd   8.00000             osd.17                         up  1.00000 1.00000 
 20   ssd   8.00000             osd.20                         up  1.00000 1.00000 
 23   ssd   8.00000             osd.23                         up  1.00000 1.00000 
-15        40.00000         host ocp4d-gcvpn-infra-0-zjfdm                         
  9   ssd   8.00000             osd.9                          up  1.00000 1.00000 
 10   ssd   8.00000             osd.10                         up  1.00000 1.00000 
 11   ssd   8.00000             osd.11                         up  1.00000 1.00000 
 12   ssd   8.00000             osd.12                         up  1.00000 1.00000 
 13   ssd   8.00000             osd.13                         up  1.00000 1.00000 

Here, OSD 21 should have gone to 'ocp4d-gcvpn-infra-0-4w4sh' (only 3 OSDs) instead of node 'ocp4d-gcvpn-infra-0-vncgl' (which had 4 OSDs).
OSD 22 should have gone to node 'ocp4d-gcvpn-infra-0-zc2sm' (only one OSD) instead of node 'ocp4d-gcvpn-infra-0-gzmvx' (which already had 6 OSDs).
OSD 23 went correctly to the least-used node (ocp4d-gcvpn-infra-0-kzlnn).

Comment 8 Vincent S. Cojot 2021-01-12 01:58:46 UTC
Increasing count to '9' (OSDs 24, 25 and 26 will come up) yields these results:
$ oc rsh rook-ceph-tools-8589699f6c-l9x57 ceph osd tree
ID  CLASS WEIGHT    TYPE NAME                              STATUS REWEIGHT PRI-AFF 
 -1       216.00000 root default                                                   
 -4        72.00000     rack rack0                                                 
 -3        32.00000         host ocp4d-gcvpn-infra-0-4w4sh                         
  0   ssd   8.00000             osd.0                          up  1.00000 1.00000 
  1   ssd   8.00000             osd.1                          up  1.00000 1.00000 
  2   ssd   8.00000             osd.2                          up  1.00000 1.00000 
 24   ssd   8.00000             osd.24                         up  1.00000 1.00000 
-11        40.00000         host ocp4d-gcvpn-infra-0-vncgl                         
  7   ssd   8.00000             osd.7                          up  1.00000 1.00000 
  8   ssd   8.00000             osd.8                          up  1.00000 1.00000 
 15   ssd   8.00000             osd.15                         up  1.00000 1.00000 
 18   ssd   8.00000             osd.18                         up  1.00000 1.00000 
 21   ssd   8.00000             osd.21                         up  1.00000 1.00000 
 -8        72.00000     rack rack1                                                 
 -7        56.00000         host ocp4d-gcvpn-infra-0-gzmvx                         
  3   ssd   8.00000             osd.3                          up  1.00000 1.00000 
  4   ssd   8.00000             osd.4                          up  1.00000 1.00000 
  5   ssd   8.00000             osd.5                          up  1.00000 1.00000 
  6   ssd   8.00000             osd.6                          up  1.00000 1.00000 
 16   ssd   8.00000             osd.16                         up  1.00000 1.00000 
 19   ssd   8.00000             osd.19                         up  1.00000 1.00000 
 22   ssd   8.00000             osd.22                         up  1.00000 1.00000 
-13        16.00000         host ocp4d-gcvpn-infra-0-zc2sm                         
 14   ssd   8.00000             osd.14                         up  1.00000 1.00000 
 25   ssd   8.00000             osd.25                         up  1.00000 1.00000 
-16        72.00000     rack rack2                                                 
-19        24.00000         host ocp4d-gcvpn-infra-0-kzlnn                         
 17   ssd   8.00000             osd.17                         up  1.00000 1.00000 
 20   ssd   8.00000             osd.20                         up  1.00000 1.00000 
 23   ssd   8.00000             osd.23                         up  1.00000 1.00000 
-15        48.00000         host ocp4d-gcvpn-infra-0-zjfdm                         
  9   ssd   8.00000             osd.9                          up  1.00000 1.00000 
 10   ssd   8.00000             osd.10                         up  1.00000 1.00000 
 11   ssd   8.00000             osd.11                         up  1.00000 1.00000 
 12   ssd   8.00000             osd.12                         up  1.00000 1.00000 
 13   ssd   8.00000             osd.13                         up  1.00000 1.00000 
 26   ssd   8.00000             osd.26                         up  1.00000 1.00000 

OSDs 24 and 25 both went to the least-used node in their rack, but OSD 26 went to host 'ocp4d-gcvpn-infra-0-zjfdm' (which already had 5 OSDs) instead of node 'ocp4d-gcvpn-infra-0-kzlnn' (which had only 3 OSDs).

Comment 9 Vincent S. Cojot 2021-01-12 02:00:39 UTC
This goes on until the nodes in a rack run out of available disks, at which point scale-up operations start filling the remaining free disks within that rack.

Comment 10 Jose A. Rivera 2021-01-27 16:15:51 UTC
This is a known issue and indeed is being resolved by the implementation of TopologySpreadConstraints: https://bugzilla.redhat.com/show_bug.cgi?id=1814681

As such, I think we can safely take this in OCS 4.7 as an additional verification of the feature.
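
For context, topology spread constraints are a standard Kubernetes pod spec feature. A minimal sketch of the kind of constraint that keeps OSD pods within one of each other per node could look like this (illustrative only; the exact constraint generated by the operator may differ, and the app=rook-ceph-osd pod label is an assumption here):

topologySpreadConstraints:
- maxSkew: 1                          # no node may exceed the least-loaded node by more than one OSD pod
  topologyKey: kubernetes.io/hostname # spread across individual nodes
  whenUnsatisfiable: ScheduleAnyway   # prefer, rather than hard-require, the even spread
  labelSelector:
    matchLabels:
      app: rook-ceph-osd              # assumed OSD pod label

With maxSkew: 1 over kubernetes.io/hostname, the scheduler keeps the per-node OSD pod counts within one of each other, which is exactly the behaviour missing from the placements in comments 4 through 8.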

Comment 11 Martin Bukatovic 2021-01-27 20:16:41 UTC
Since this should be addressed via BZ 1814681, which is already acked, providing QA ack on the assumption that the QE team will also check that the topologySpreadConstraints fix works with LSO.

Comment 14 Mudit Agarwal 2021-02-12 14:18:14 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1814681 is ON_QA

Comment 15 Avi Liani 2021-03-02 12:51:39 UTC
I created a cluster on VMware with LSO.

OCP : 4.7.0

# oc get csv -A
NAMESPACE                              NAME                                           DISPLAY                       VERSION                 REPLACES   PHASE
openshift-local-storage                local-storage-operator.4.7.0-202102110027.p0   Local Storage                 4.7.0-202102110027.p0              Succeeded
openshift-operator-lifecycle-manager   packageserver                                  Package Server                0.17.0                             Succeeded
openshift-storage                      ocs-operator.v4.7.0-278.ci                     OpenShift Container Storage   4.7.0-278.ci                       Succeeded

6 worker nodes with 8 disks each

# oc get nodes
NAME              STATUS   ROLES    AGE    VERSION
compute-0         Ready    worker   172m   v1.20.0+ba45583
compute-1         Ready    worker   172m   v1.20.0+ba45583
compute-2         Ready    worker   172m   v1.20.0+ba45583
compute-3         Ready    worker   172m   v1.20.0+ba45583
compute-4         Ready    worker   172m   v1.20.0+ba45583
compute-5         Ready    worker   172m   v1.20.0+ba45583
control-plane-0   Ready    master   3h1m   v1.20.0+ba45583
control-plane-1   Ready    master   3h1m   v1.20.0+ba45583
control-plane-2   Ready    master   3h1m   v1.20.0+ba45583

After creating the cluster (3 OSDs), the OSD tree looks like:

sh-4.4# ceph osd tree
ID CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF 
-1       0.29306 root default                               
-7       0.09769     host compute-3                         
 2   hdd 0.09769         osd.2          up  1.00000 1.00000 
-3       0.09769     host compute-4                         
 0   hdd 0.09769         osd.0          up  1.00000 1.00000 
-5       0.09769     host compute-5                         
 1   hdd 0.09769         osd.1          up  1.00000 1.00000 

We can see that each OSD is on a different host.

After adding capacity, the OSD tree looks like:

sh-4.4# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF 
 -1       0.58612 root default                               
-13       0.09769     host compute-0                         
  5   hdd 0.09769         osd.5          up  1.00000 1.00000 
-11       0.09769     host compute-1                         
  3   hdd 0.09769         osd.3          up  1.00000 1.00000 
 -9       0.09769     host compute-2                         
  4   hdd 0.09769         osd.4          up  1.00000 1.00000 
 -7       0.09769     host compute-3                         
  2   hdd 0.09769         osd.2          up  1.00000 1.00000 
 -3       0.09769     host compute-4                         
  0   hdd 0.09769         osd.0          up  1.00000 1.00000 
 -5       0.09769     host compute-5                         
  1   hdd 0.09769         osd.1          up  1.00000 1.00000 

We can see that the OSDs are spread evenly across all workers (one OSD on each worker).

After adding capacity again, the OSD tree looks like:

sh-4.4# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF 
 -1       0.87918 root default                               
-13       0.19537     host compute-0                         
  5   hdd 0.09769         osd.5          up  1.00000 1.00000 
  7   hdd 0.09769         osd.7          up  1.00000 1.00000 
-11       0.09769     host compute-1                         
  3   hdd 0.09769         osd.3          up  1.00000 1.00000 
 -9       0.19537     host compute-2                         
  4   hdd 0.09769         osd.4          up  1.00000 1.00000 
  6   hdd 0.09769         osd.6          up  1.00000 1.00000 
 -7       0.09769     host compute-3                         
  2   hdd 0.09769         osd.2          up  1.00000 1.00000 
 -3       0.09769     host compute-4                         
  0   hdd 0.09769         osd.0          up  1.00000 1.00000 
 -5       0.19537     host compute-5                         
  1   hdd 0.09769         osd.1          up  1.00000 1.00000 
  8   hdd 0.09769         osd.8          up  1.00000 1.00000 

After adding capacity again, the OSD tree looks like:

sh-4.4# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF 
 -1       1.17224 root default                               
-13       0.19537     host compute-0                         
  5   hdd 0.09769         osd.5          up  1.00000 1.00000 
  7   hdd 0.09769         osd.7          up  1.00000 1.00000 
-11       0.19537     host compute-1                         
  3   hdd 0.09769         osd.3          up  1.00000 1.00000 
  9   hdd 0.09769         osd.9          up  1.00000 1.00000 
 -9       0.19537     host compute-2                         
  4   hdd 0.09769         osd.4          up  1.00000 1.00000 
  6   hdd 0.09769         osd.6          up  1.00000 1.00000 
 -7       0.19537     host compute-3                         
  2   hdd 0.09769         osd.2          up  1.00000 1.00000 
 11   hdd 0.09769         osd.11         up  1.00000 1.00000 
 -3       0.19537     host compute-4                         
  0   hdd 0.09769         osd.0          up  1.00000 1.00000 
 10   hdd 0.09769         osd.10         up  1.00000 1.00000 
 -5       0.19537     host compute-5                         
  1   hdd 0.09769         osd.1          up  1.00000 1.00000 
  8   hdd 0.09769         osd.8          up  1.00000 1.00000 

And again, we see that the OSDs are spread evenly across all workers.

After adding capacity again, the OSD tree looks like:

sh-4.4# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF 
 -1       1.46530 root default                               
-13       0.19537     host compute-0                         
  5   hdd 0.09769         osd.5          up  1.00000 1.00000 
  7   hdd 0.09769         osd.7          up  1.00000 1.00000 
-11       0.19537     host compute-1                         
  3   hdd 0.09769         osd.3          up  1.00000 1.00000 
  9   hdd 0.09769         osd.9          up  1.00000 1.00000 
 -9       0.29306     host compute-2                         
  4   hdd 0.09769         osd.4          up  1.00000 1.00000 
  6   hdd 0.09769         osd.6          up  1.00000 1.00000 
 12   hdd 0.09769         osd.12         up  1.00000 1.00000 
 -7       0.19537     host compute-3                         
  2   hdd 0.09769         osd.2          up  1.00000 1.00000 
 11   hdd 0.09769         osd.11         up  1.00000 1.00000 
 -3       0.29306     host compute-4                         
  0   hdd 0.09769         osd.0          up  1.00000 1.00000 
 10   hdd 0.09769         osd.10         up  1.00000 1.00000 
 14   hdd 0.09769         osd.14         up  1.00000 1.00000 
 -5       0.29306     host compute-5                         
  1   hdd 0.09769         osd.1          up  1.00000 1.00000 
  8   hdd 0.09769         osd.8          up  1.00000 1.00000 
 13   hdd 0.09769         osd.13         up  1.00000 1.00000 

After adding capacity again, the OSD tree looks like:

sh-4.4# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF 
 -1       1.75836 root default                               
-13       0.29306     host compute-0                         
  5   hdd 0.09769         osd.5          up  1.00000 1.00000 
  7   hdd 0.09769         osd.7          up  1.00000 1.00000 
 15   hdd 0.09769         osd.15         up  1.00000 1.00000 
-11       0.29306     host compute-1                         
  3   hdd 0.09769         osd.3          up  1.00000 1.00000 
  9   hdd 0.09769         osd.9          up  1.00000 1.00000 
 17   hdd 0.09769         osd.17         up  1.00000 1.00000 
 -9       0.29306     host compute-2                         
  4   hdd 0.09769         osd.4          up  1.00000 1.00000 
  6   hdd 0.09769         osd.6          up  1.00000 1.00000 
 12   hdd 0.09769         osd.12         up  1.00000 1.00000 
 -7       0.29306     host compute-3                         
  2   hdd 0.09769         osd.2          up  1.00000 1.00000 
 11   hdd 0.09769         osd.11         up  1.00000 1.00000 
 16   hdd 0.09769         osd.16         up  1.00000 1.00000 
 -3       0.29306     host compute-4                         
  0   hdd 0.09769         osd.0          up  1.00000 1.00000 
 10   hdd 0.09769         osd.10         up  1.00000 1.00000 
 14   hdd 0.09769         osd.14         up  1.00000 1.00000 
 -5       0.29306     host compute-5                         
  1   hdd 0.09769         osd.1          up  1.00000 1.00000 
  8   hdd 0.09769         osd.8          up  1.00000 1.00000 
 13   hdd 0.09769         osd.13         up  1.00000 1.00000 


So, IMO this can be verified.

Comment 18 errata-xmlrpc 2021-05-19 09:17:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

