Bug 1983961 - [ROKS] ODF/OCP - Rebuilding Data resiliency shows 99% for long time [NEEDINFO]
Summary: [ROKS] ODF/OCP - Rebuilding Data resiliency shows 99% for long time
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph
Version: 4.7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Travis Nielsen
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-07-20 09:51 UTC by Elvir Kuric
Modified: 2023-08-09 16:37 UTC
CC: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-06 09:57:57 UTC
Embargoed:
tnielsen: needinfo? (ekuric)
muagarwa: needinfo? (ekuric)



Description Elvir Kuric 2021-07-20 09:51:06 UTC
Description of problem:

Panel "Data Resiliency" in OCP web interface shows "Rebuilding data resiliency 99% " for very long time.

Version-Release number of selected component (if applicable):

$ oc get clusterversion 
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.16    True        False         14d     Cluster version is 4.7.16
[elvir@makina datapresent]$ oc get storagecluster  -n openshift-storage
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   14d   Ready              2021-07-06T08:56:17Z   4.7.0


How reproducible:
Always (2 of 2 tries).


Steps to Reproduce:

1. Write 50 TB of data to the Ceph storage backend (available storage is 100 TB, so the storage is not full). We believe this is also visible with a smaller data set.

2. Replace one of the nodes in the cluster (delete the old node, create a new node) via "ibmcloud ks worker replace <worker id>" from the command line.

3. Monitor Data Resiliency in the console - it has been stuck at 99% for over 3 hours now - please check the attached graph.

Actual results:
Data Resiliency in the console has been stuck at 99% for over 3 hours - please check the attached graph.


Expected results:
Data Resiliency should finish (reach 100%) faster.

Additional info:

The ODF/OCP cluster was OK and remained usable. It is unclear why the web console reported that status for so long, even though the ODF cluster was up and HEALTHY for hours before and after the test.

The Ceph cluster was HEALTHY while the OCP web console was reporting the above status.

$ ceph -s 
  cluster:
    id:     5506601c-7254-498c-aeea-d9331b4be16e
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,d (age 23h)
    mgr: a(active, since 21h)
    mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-b=up:active} 1 up:standby-replay
    osd: 24 osds: 24 up (since 16h), 24 in (since 16h)
    rgw: 1 daemon active (ocs.storagecluster.cephobjectstore.a)
 
  data:
    pools:   10 pools, 656 pgs
    objects: 12.59M objects, 48 TiB
    usage:   144 TiB used, 231 TiB / 375 TiB avail
    pgs:     655 active+clean
             1   active+clean+scrubbing+deep+repair
 
  io:
    client:   938 B/s rd, 12 KiB/s wr, 2 op/s rd, 1 op/s wr

Comment 3 Elvir Kuric 2021-07-20 13:46:50 UTC
What is the impact of this taking so long? E.g., if other nodes went down during this period, do we risk any data/availability loss?

Comment 4 afrahman 2021-07-23 05:07:25 UTC
From the UI perspective, the message looks correct.
Your PGs are:
   pgs:     655 active+clean
             1   active+clean+scrubbing+deep+repair

Resiliency = active+clean / total = 655/656 (~99.8%), which the console shows as 99%.
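
For reference, a minimal sketch of how that ratio can be checked from the command line, assuming the Rook toolbox pod (label app=rook-ceph-tools) is enabled in openshift-storage:

# Assumption: the Rook toolbox is enabled; adjust the selector if it differs.
TOOLS=$(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name | head -n1)
# PG summary that the panel's percentage is derived from
oc -n openshift-storage rsh "$TOOLS" ceph pg stat
# With the numbers from this report: 655 clean PGs out of 656 total
echo "scale=4; 655 / 656 * 100" | bc    # 99.84..., shown as 99% in the panel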

We should move this BZ to operator/ceph; they can investigate it better.

Comment 6 Travis Nielsen 2021-07-26 19:25:38 UTC
Since one PG is still not active+clean, the recovery is still in progress:

pgs:     655 active+clean
         1   active+clean+scrubbing+deep+repair

Can you collect a must-gather on the cluster? Ceph would need more details to troubleshoot.
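
For reference, a typical ODF/OCS must-gather invocation looks roughly like the sketch below; the image tag is an assumption based on the 4.7 version reported here, not something confirmed in this BZ.

# Sketch only - verify the must-gather image/tag against the ODF docs for your version.
oc adm must-gather \
  --image=registry.redhat.io/ocs4/ocs-must-gather-rhel8:v4.7 \
  --dest-dir=./ocs-must-gather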

Comment 7 Neha Ojha 2021-07-28 00:59:49 UTC
(In reply to Travis Nielsen from comment #6)
> Since one PG is still not active+clean, the recovery is still in progress:
> 
> pgs:     655 active+clean
>          1   active+clean+scrubbing+deep+repair

You should be able to run I/O against a PG that is in a state of active+*something*. In this case, the PG is in active+clean+scrubbing+deep+repair, which means one of the following:

1. during a deep-scrub, issues were found that needed a repair (hence "+repair")
2. we are hitting the cosmetic issue tracked in https://tracker.ceph.com/issues/50446, which puts a PG in the active+clean+scrubbing+deep+repair state instead of active+clean+scrubbing+deep when a deep-scrub is run on it

You can set nodeep-scrub in the cluster during the tests to rule out either of the above.
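
For reference, toggling those flags from the Rook toolbox pod looks roughly like this (a sketch; remember to unset the flags after the test):

ceph osd set noscrub        # pause regular scrubbing
ceph osd set nodeep-scrub   # pause deep scrubbing
# ... run the node-replace test ...
ceph osd unset noscrub
ceph osd unset nodeep-scrub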

> 
> Can you collect a must-gather on the cluster? Ceph would need more details
> to troubleshoot.

Comment 8 Elvir Kuric 2021-07-28 10:15:52 UTC
Thank you Travis and Neha! 

Setting

# ceph osd set noscrub
# ceph osd set nodeep-scrub

helps: "Rebuilding data resiliency" reaches 100% noticeably faster.

After additional tests we figured out the following:

1. The node is replaced, e.g. with the command:

ibmcloud ks worker replace --worker kube-c3hidk8d0cktobbskgr0-dmodff1-default-00000f73 --cluster dm-odff1

2. OSDs on the replaced node go down:

 cluster:
    id:     5506601c-7254-498c-aeea-d9331b4be16e
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            3 osds down
            1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
            3 hosts (3 osds) down
            Degraded data redundancy: 5126099/40375602 objects degraded (12.696%), 275 pgs degraded, 262 pgs undersized
 
  services:
    mon: 3 daemons, quorum a,b,d (age 5h)
    mgr: a(active, since 10m)
    mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-b=up:active} 1 up:standby-replay
    osd: 24 osds: 21 up (since 6m), 24 in (since 14h); 9 remapped pgs
         flags noscrub,nodeep-scrub
    rgw: 1 daemon active (ocs.storagecluster.cephobjectstore.a)
 
  data:
    pools:   10 pools, 656 pgs
    objects: 13.46M objects, 51 TiB
    usage:   134 TiB used, 194 TiB / 328 TiB avail
    pgs:     5126099/40375602 objects degraded (12.696%)
             364 active+clean
             237 active+undersized+degraded
             28  active+recovery_wait+degraded
             16  active+undersized
             8   active+recovery_wait+undersized+degraded+remapped
             1   active+recovering+degraded
             1   active+recovering+undersized+degraded+remapped
             1   active+recovery_wait
 
  io:
    client:   29 KiB/s rd, 9.2 MiB/s wr, 5 op/s rd, 13 op/s wr
    recovery: 47 MiB/s, 11 objects/s

3. After some time, the new node starts and the OSDs come back up:

 cluster:
    id:     5506601c-7254-498c-aeea-d9331b4be16e
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
            Degraded data redundancy: 35856/40380021 objects degraded (0.089%), 203 pgs degraded
 
  services:
    mon: 3 daemons, quorum a,b,d (age 5h)
    mgr: a(active, since 23m)
    mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-b=up:active} 1 up:standby-replay
    osd: 24 osds: 24 up (since 9s), 24 in (since 15h); 197 remapped pgs
         flags noscrub,nodeep-scrub
    rgw: 1 daemon active (ocs.storagecluster.cephobjectstore.a)
 
  data:
    pools:   10 pools, 656 pgs
    objects: 13.46M objects, 51 TiB
    usage:   154 TiB used, 221 TiB / 375 TiB avail
    pgs:     35856/40380021 objects degraded (0.089%)
             450 active+clean
             197 active+recovery_wait+undersized+degraded+remapped
             5   active+recovery_wait+degraded
             2   active+recovery_wait
             1   active+recovering
             1   active+recovering+degraded
 
  io:
    client:   6.7 MiB/s wr, 0 op/s rd, 7 op/s wr
    recovery: 134 B/s, 10 objects/s
 
(ceph -s output captured Wed 2021-07-28 10:52:06)


4. It takes time until all of these degraded objects are recovered; with noscrub and nodeep-scrub set, this completes faster.
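
A rough sketch of waiting for that recovery to drain, run from the Rook toolbox pod (assumes the "objects degraded" line disappears from ceph status once recovery completes):

# Poll until no objects remain degraded, then print the final status.
while ceph status | grep -q 'objects degraded'; do
    sleep 60
done
ceph status
# Remember to unset the noscrub/nodeep-scrub flags afterwards.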



We also found that after the node-replace test, OSDs are not spread equally across nodes:

$ oc get pods -n openshift-storage -o wide |grep osd


rook-ceph-osd-0-fd95fb959-9dqwx                                   2/2     Running     1          5d21h   172.17.99.137    10.240.64.10    <none>           <none>
rook-ceph-osd-1-d45fbd654-tt4dk                                   2/2     Running     1          5d21h   172.17.99.140    10.240.64.10    <none>           <none>
rook-ceph-osd-12-5d9cbdc748-8f846                                 2/2     Running     1          5d22h   172.17.99.138    10.240.64.10    <none>           <none>
rook-ceph-osd-8-78c8d778f9-tcjvp                                  2/2     Running     1          5d21h   172.17.99.139    10.240.64.10    <none>           <none>

rook-ceph-osd-10-78647675cf-cwh4h                                 2/2     Running     0          5d19h   172.17.89.109    10.240.128.12   <none>           <none>
rook-ceph-osd-11-7f88545866-72s2x                                 2/2     Running     0          5d21h   172.17.89.74     10.240.128.12   <none>           <none>
rook-ceph-osd-19-5675f5d695-d7g4m                                 2/2     Running     0          5d21h   172.17.89.73     10.240.128.12   <none>           <none>
rook-ceph-osd-3-5656f8598f-jph8k                                  2/2     Running     0          15h     172.17.89.76     10.240.128.12   <none>           <none>
rook-ceph-osd-4-547ff5697f-bc2xm                                  2/2     Running     0          5d21h   172.17.89.75     10.240.128.12   <none>           <none>
rook-ceph-osd-9-5fbb9c9d9c-lkwgs                                  2/2     Running     0          15h     172.17.89.111    10.240.128.12   <none>           <none>

rook-ceph-osd-13-7574d8f5c-qjfj9                                  2/2     Running     3          5d19h   172.17.104.9     10.240.0.19     <none>           <none>
rook-ceph-osd-14-7d6858967f-x9jsq                                 2/2     Running     0          5d19h   172.17.104.12    10.240.0.19     <none>           <none>
rook-ceph-osd-21-6474d6c979-xc6p2                                 2/2     Running     1          5d19h   172.17.104.11    10.240.0.19     <none>           <none>
rook-ceph-osd-22-57b89578f6-gdvz4                                 2/2     Running     3          5d19h   172.17.104.10    10.240.0.19     <none>           <none>


rook-ceph-osd-15-76fbdc74f4-62qqc                                 2/2     Running     0          5d21h   172.17.88.144    10.240.64.9     <none>           <none>
rook-ceph-osd-17-7998f86fc6-rsmmb                                 2/2     Running     1          5d21h   172.17.88.142    10.240.64.9     <none>           <none>
rook-ceph-osd-6-6647966dd6-d6bcz                                  2/2     Running     2          5d21h   172.17.88.140    10.240.64.9     <none>           <none>
rook-ceph-osd-7-7b758d7bb9-x9xgn                                  2/2     Running     0          5d21h   172.17.88.143    10.240.64.9     <none>           <none>


rook-ceph-osd-16-6f5c8f7548-5ssps                                 2/2     Running     3          6d22h   172.17.88.79     10.240.0.17     <none>           <none>
rook-ceph-osd-18-7b87f6c669-24mlb                                 2/2     Running     3          6d22h   172.17.88.78     10.240.0.17     <none>           <none>
rook-ceph-osd-20-6957574bc9-79mdm                                 2/2     Running     1          6d22h   172.17.88.80     10.240.0.17     <none>           <none>
rook-ceph-osd-23-85f6c89956-kvsh4                                 2/2     Running     2          5d19h   172.17.88.94     10.240.0.17     <none>           <none>

rook-ceph-osd-2-8fcdd546f-4qqjn                                   2/2     Running     0          15h     172.17.113.137   10.240.128.14   <none>           <none>
rook-ceph-osd-5-756f9cd4ff-nvctd                                  2/2     Running     0          15h     172.17.113.136   10.240.128.14   <none>           <none>


You can see that node "10.240.128.12" got 2 more OSDs (if we delete the newest ones on 10.240.128.12, they move to 10.240.128.14).
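
A quick way to tally that spread (assuming the usual app=rook-ceph-osd label on Rook OSD pods):

# Count OSD pods per node; column 7 of "oc get pods -o wide" is the node name.
oc -n openshift-storage get pods -l app=rook-ceph-osd -o wide --no-headers \
  | awk '{print $7}' | sort | uniq -c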

In the second attempt the output was:

rook-ceph-osd-0-fd95fb959-28hdw                                   2/2     Running     0          85m     172.17.88.155    10.240.64.9     <none>           <none>
rook-ceph-osd-15-76fbdc74f4-62qqc                                 2/2     Running     0          6d1h    172.17.88.144    10.240.64.9     <none>           <none>
rook-ceph-osd-17-7998f86fc6-rsmmb                                 2/2     Running     1          6d1h    172.17.88.142    10.240.64.9     <none>           <none>
rook-ceph-osd-6-6647966dd6-d6bcz                                  2/2     Running     3          6d1h    172.17.88.140    10.240.64.9     <none>           <none>
rook-ceph-osd-7-7b758d7bb9-x9xgn                                  2/2     Running     1          6d1h    172.17.88.143    10.240.64.9     <none>           <none>



rook-ceph-osd-1-d45fbd654-ll4qx                                   2/2     Running     0          85m     172.17.103.73    10.240.64.11    <none>           <none>
rook-ceph-osd-12-5d9cbdc748-lmzhw                                 2/2     Running     0          85m     172.17.103.74    10.240.64.11    <none>           <none>
rook-ceph-osd-8-78c8d778f9-xs97w                                  2/2     Running     1          85m     172.17.103.72    10.240.64.11    <none>           <none>




rook-ceph-osd-10-78647675cf-cwh4h                                 2/2     Running     0          5d23h   172.17.89.109    10.240.128.12   <none>           <none>
rook-ceph-osd-11-7f88545866-72s2x                                 2/2     Running     0          6d      172.17.89.74     10.240.128.12   <none>           <none>
rook-ceph-osd-19-5675f5d695-d7g4m                                 2/2     Running     0          6d      172.17.89.73     10.240.128.12   <none>           <none>
rook-ceph-osd-4-547ff5697f-bc2xm                                  2/2     Running     0          6d      172.17.89.75     10.240.128.12   <none>           <none>



rook-ceph-osd-13-7574d8f5c-qjfj9                                  2/2     Running     3          5d23h   172.17.104.9     10.240.0.19     <none>           <none>
rook-ceph-osd-14-7d6858967f-x9jsq                                 2/2     Running     0          5d23h   172.17.104.12    10.240.0.19     <none>           <none>
rook-ceph-osd-21-6474d6c979-xc6p2                                 2/2     Running     1          5d23h   172.17.104.11    10.240.0.19     <none>           <none>
rook-ceph-osd-22-57b89578f6-gdvz4                                 2/2     Running     3          5d23h   172.17.104.10    10.240.0.19     <none>           <none>



rook-ceph-osd-16-6f5c8f7548-5ssps                                 2/2     Running     4          7d1h    172.17.88.79     10.240.0.17     <none>           <none>
rook-ceph-osd-18-7b87f6c669-24mlb                                 2/2     Running     4          7d1h    172.17.88.78     10.240.0.17     <none>           <none>
rook-ceph-osd-20-6957574bc9-79mdm                                 2/2     Running     1          7d1h    172.17.88.80     10.240.0.17     <none>           <none>
rook-ceph-osd-23-85f6c89956-kvsh4                                 2/2     Running     2          5d23h   172.17.88.94     10.240.0.17     <none>           <none>

rook-ceph-osd-2-8fcdd546f-4qqjn                                   2/2     Running     0          18h     172.17.113.137   10.240.128.14   <none>           <none>
rook-ceph-osd-3-5656f8598f-zmx9k                                  2/2     Running     0          95m     172.17.113.145   10.240.128.14   <none>           <none>
rook-ceph-osd-5-756f9cd4ff-nvctd                                  2/2     Running     0          18h     172.17.113.136   10.240.128.14   <none>           <none>
rook-ceph-osd-9-5fbb9c9d9c-r8rs9                                  2/2     Running     0          95m     172.17.113.143   10.240.128.14   <none>           <none>


Is it realistic to expect that, after replacing a node (ibmcloud ks worker replace --worker kube-c3hidk8d0cktobbskgr0-dmodff1-default-00000f73 --cluster dm-odff1), the new node gets the same number of OSDs and that OSDs are balanced equally across nodes?

Comment 9 Mudit Agarwal 2021-07-29 16:35:07 UTC
Not a 4.8 blocker

Comment 10 Travis Nielsen 2021-07-29 22:14:57 UTC
A few questions:
1. Please show the output of "ceph osd tree" from the toolbox, to show whether there is any hierarchy other than the nodes (see the sketch below this list).
2. You followed the docs to purge the old OSDs from Ceph, right? For example, ceph osd tree should not still show the OSDs from the bad node.
3. Please share the CephCluster CR. This will show the topology spread constraints and other settings used to create the new OSDs. This might be related to an OCS bug that was fixed recently for spreading OSDs evenly across racks, though I'll have to look for that BZ another day...
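
A sketch of how items 1 and 3 could be collected (the toolbox label and CephCluster name are taken from elsewhere in this report; the toolbox must be enabled):

TOOLS=$(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name | head -n1)
oc -n openshift-storage rsh "$TOOLS" ceph osd tree                               # item 1
oc -n openshift-storage get cephcluster ocs-storagecluster-cephcluster -o yaml   # item 3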

Comment 11 Elvir Kuric 2021-08-04 19:12:34 UTC
1. See [1] below.
2. Only "ibmcloud ks worker replace <worker id>" is executed, from a machine (a laptop) that has the rights to run commands against the cluster on IBM Cloud; the replacement of the machine in the OCP cluster then happens in the background.
3. See [2] below.

[1] 

# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME                                    STATUS REWEIGHT PRI-AFF 
 -1       375.00000 root default                                                         
 -5       375.00000     region us-south                                                  
-30       125.00000         zone us-south-1                                              
-55        15.62500             host ocs-deviceset-0-data-0mfz2p                         
 23   hdd  15.62500                 osd.23                           up  1.00000 1.00000 
-57        15.62500             host ocs-deviceset-0-data-4tb4cl                         
 22   hdd  15.62500                 osd.22                           up  1.00000 1.00000 
-29        15.62500             host ocs-deviceset-0-data-6mbqx6                         
 13   hdd  15.62500                 osd.13                           up  1.00000 1.00000 
-47        15.62500             host ocs-deviceset-1-data-39j6rg                         
 18   hdd  15.62500                 osd.18                           up  1.00000 1.00000 
-41        15.62500             host ocs-deviceset-1-data-4265dj                         
 16   hdd  15.62500                 osd.16                           up  1.00000 1.00000 
-45        15.62500             host ocs-deviceset-2-data-1ljbq8                         
 21   hdd  15.62500                 osd.21                           up  1.00000 1.00000 
-37        15.62500             host ocs-deviceset-2-data-2gnr96                         
 14   hdd  15.62500                 osd.14                           up  1.00000 1.00000 
-49        15.62500             host ocs-deviceset-2-data-7gbks9                         
 20   hdd  15.62500                 osd.20                           up  1.00000 1.00000 
 -4       125.00000         zone us-south-2                                              
 -3        15.62500             host ocs-deviceset-0-data-26kq8b                         
  0   hdd  15.62500                 osd.0                            up  1.00000 1.00000 
 -9        15.62500             host ocs-deviceset-0-data-5bxw5z                         
  1   hdd  15.62500                 osd.1                            up  1.00000 1.00000 
-39        15.62500             host ocs-deviceset-1-data-0dkqxr                         
 15   hdd  15.62500                 osd.15                           up  1.00000 1.00000 
-51        15.62500             host ocs-deviceset-1-data-2hwzcm                         
  8   hdd  15.62500                 osd.8                            up  1.00000 1.00000 
-27        15.62500             host ocs-deviceset-1-data-65q279                         
  7   hdd  15.62500                 osd.7                            up  1.00000 1.00000 
-17        15.62500             host ocs-deviceset-2-data-0vqtjj                         
  6   hdd  15.62500                 osd.6                            up  1.00000 1.00000 
-35        15.62500             host ocs-deviceset-2-data-4d2sp6                         
 12   hdd  15.62500                 osd.12                           up  1.00000 1.00000 
-53        15.62500             host ocs-deviceset-2-data-6x467v                         
 17   hdd  15.62500                 osd.17                           up  1.00000 1.00000 
-12       125.00000         zone us-south-3                                              
-43        15.62500             host ocs-deviceset-0-data-1bxjfw                         
 19   hdd  15.62500                 osd.19                           up  1.00000 1.00000 
-19        15.62500             host ocs-deviceset-0-data-3gsspv                         
  2   hdd  15.62500                 osd.2                            up  1.00000 1.00000 
-33        15.62500             host ocs-deviceset-0-data-7cljwc                         
 11   hdd  15.62500                 osd.11                           up  1.00000 1.00000 
-25        15.62500             host ocs-deviceset-1-data-12dvjw                         
  3   hdd  15.62500                 osd.3                            up  1.00000 1.00000 
-23        15.62500             host ocs-deviceset-1-data-575th8                         
  9   hdd  15.62500                 osd.9                            up  1.00000 1.00000 
-11        15.62500             host ocs-deviceset-1-data-7jvrkg                         
  4   hdd  15.62500                 osd.4                            up  1.00000 1.00000 
-21        15.62500             host ocs-deviceset-2-data-3ddvnb                         
 10   hdd  15.62500                 osd.10                           up  1.00000 1.00000 
-15        15.62500             host ocs-deviceset-2-data-59qvrt                         
  5   hdd  15.62500                 osd.5                            up  1.00000 1.00000

[2] 
$ oc get storagecluster -n openshift-storage -o yaml 
apiVersion: v1
items:
- apiVersion: ocs.openshift.io/v1
  kind: StorageCluster
  metadata:
    annotations:
      uninstall.ocs.openshift.io/cleanup-policy: delete
      uninstall.ocs.openshift.io/mode: graceful
    creationTimestamp: "2021-07-06T08:56:17Z"
    finalizers:
    - storagecluster.ocs.openshift.io
    generation: 3
    managedFields:
    - apiVersion: ocs.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:encryption: {}
          f:externalStorage: {}
          f:monPVCTemplate:
            .: {}
            f:metadata: {}
            f:spec:
              .: {}
              f:accessModes: {}
              f:resources:
                .: {}
                f:requests: {}
              f:storageClassName: {}
              f:volumeMode: {}
            f:status: {}
      manager: manager
      operation: Update
      time: "2021-07-06T08:56:17Z"
    - apiVersion: ocs.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:monPVCTemplate:
            f:spec:
              f:resources:
                f:requests:
                  f:storage: {}
      manager: oc
      operation: Update
      time: "2021-07-06T09:32:58Z"
    - apiVersion: ocs.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:uninstall.ocs.openshift.io/cleanup-policy: {}
            f:uninstall.ocs.openshift.io/mode: {}
          f:finalizers: {}
        f:spec:
          f:arbiter: {}
          f:encryption:
            f:kms: {}
          f:managedResources:
            .: {}
            f:cephBlockPools: {}
            f:cephConfig: {}
            f:cephFilesystems: {}
            f:cephObjectStoreUsers: {}
            f:cephObjectStores: {}
          f:storageDeviceSets: {}
          f:version: {}
        f:status:
          .: {}
          f:conditions: {}
          f:failureDomain: {}
          f:failureDomainKey: {}
          f:failureDomainValues: {}
          f:images:
            .: {}
            f:ceph:
              .: {}
              f:actualImage: {}
              f:desiredImage: {}
            f:noobaaCore:
              .: {}
              f:actualImage: {}
              f:desiredImage: {}
            f:noobaaDB:
              .: {}
              f:actualImage: {}
              f:desiredImage: {}
          f:nodeTopologies:
            .: {}
            f:labels:
              .: {}
              f:kubernetes.io/hostname: {}
              f:topology.kubernetes.io/region: {}
              f:topology.kubernetes.io/zone: {}
          f:phase: {}
          f:relatedObjects: {}
      manager: ocs-operator
      operation: Update
      time: "2021-07-06T09:59:57Z"
    name: ocs-storagecluster
    namespace: openshift-storage
    resourceVersion: "21454329"
    selfLink: /apis/ocs.openshift.io/v1/namespaces/openshift-storage/storageclusters/ocs-storagecluster
    uid: 143ed9e8-cd7a-4242-9108-c95bdda70106
  spec:
    arbiter: {}
    encryption:
      kms: {}
    externalStorage: {}
    managedResources:
      cephBlockPools: {}
      cephConfig: {}
      cephFilesystems: {}
      cephObjectStoreUsers: {}
      cephObjectStores: {}
    monPVCTemplate:
      metadata: {}
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 25Gi
        storageClassName: ibmc-vpc-block-metro-10iops-tier
        volumeMode: Filesystem
      status: {}
    storageDeviceSets:
    - config: {}
      count: 8
      dataPVCTemplate:
        metadata: {}
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 16000Gi
          storageClassName: ibmc-vpc-block-metro-3iops-tier
          volumeMode: Block
        status: {}
      name: ocs-deviceset
      placement: {}
      portable: true
      preparePlacement: {}
      replica: 3
      resources: {}
    version: 4.7.0
  status:
    conditions:
    - lastHeartbeatTime: "2021-08-04T19:11:24Z"
      lastTransitionTime: "2021-08-04T16:11:30Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: ReconcileComplete
    - lastHeartbeatTime: "2021-08-04T19:11:24Z"
      lastTransitionTime: "2021-07-30T20:04:18Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: Available
    - lastHeartbeatTime: "2021-08-04T19:11:24Z"
      lastTransitionTime: "2021-07-30T20:04:18Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "False"
      type: Progressing
    - lastHeartbeatTime: "2021-08-04T19:11:24Z"
      lastTransitionTime: "2021-07-30T20:04:18Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "False"
      type: Degraded
    - lastHeartbeatTime: "2021-08-04T19:11:24Z"
      lastTransitionTime: "2021-07-30T20:04:18Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: Upgradeable
    failureDomain: zone
    failureDomainKey: topology.kubernetes.io/zone
    failureDomainValues:
    - us-south-3
    - us-south-2
    - us-south-1
    images:
      ceph:
        actualImage: registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:725f93133acc0fb1ca845bd12e77f20d8629cad0e22d46457b2736578698eb6c
        desiredImage: registry.redhat.io/rhceph/rhceph-4-rhel8@sha256:725f93133acc0fb1ca845bd12e77f20d8629cad0e22d46457b2736578698eb6c
      noobaaCore:
        actualImage: registry.redhat.io/ocs4/mcg-core-rhel8@sha256:6ff8645efdde95fa97d496084d3555b7680895f0b79c147f2a880b43742af3a4
        desiredImage: registry.redhat.io/ocs4/mcg-core-rhel8@sha256:6ff8645efdde95fa97d496084d3555b7680895f0b79c147f2a880b43742af3a4
      noobaaDB:
        actualImage: registry.redhat.io/rhel8/postgresql-12@sha256:f486bbe07f1ddef166bab5a2a6bdcd0e63e6e14d15b42d2425762f83627747bf
        desiredImage: registry.redhat.io/rhel8/postgresql-12@sha256:f486bbe07f1ddef166bab5a2a6bdcd0e63e6e14d15b42d2425762f83627747bf
    nodeTopologies:
      labels:
        kubernetes.io/hostname:
        - 10.240.128.7
        - 10.240.64.4
        - 10.240.64.5
        - 10.240.0.6
        - 10.240.0.7
        - 10.240.128.6
        - 10.240.0.12
        - 10.240.0.13
        - 10.240.128.10
        - 10.240.0.14
        - 10.240.128.11
        - 10.240.0.15
        - 10.240.0.17
        - 10.240.64.10
        - 10.240.64.9
        - 10.240.128.12
        - 10.240.128.13
        - 10.240.0.19
        - 10.240.128.14
        - 10.240.64.11
        - 10.240.64.12
        topology.kubernetes.io/region:
        - us-south
        topology.kubernetes.io/zone:
        - us-south-3
        - us-south-2
        - us-south-1
    phase: Ready
    relatedObjects:
    - apiVersion: ceph.rook.io/v1
      kind: CephCluster
      name: ocs-storagecluster-cephcluster
      namespace: openshift-storage
      resourceVersion: "21453982"
      uid: 83035032-0845-40f7-8e40-a3caa0de138d
    - apiVersion: noobaa.io/v1alpha1
      kind: NooBaa
      name: noobaa
      namespace: openshift-storage
      resourceVersion: "21454327"
      uid: bc2a2fce-e507-41f4-8543-233169a292b0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Comment 12 Travis Nielsen 2021-08-05 00:10:55 UTC
Could you also share the CephCluster CR in addition to the StorageCluster CR? The CephCluster would have the topology spread constraints to show exactly how the OSDs are expected to be spread across hosts. Or if the TSCs are only specified for zones, then this explains why the hosts are not evenly spread. At least the OSDs are spread evenly across zones as expected.

Comment 13 Elvir Kuric 2021-09-17 16:00:41 UTC
(In reply to Travis Nielsen from comment #12)
> Could you also share the CephCluster CR in addition to the StorageCluster
This is a ROKS installation - would you mind writing out the commands you want me to run on this cluster? I do not have the CephCluster CR, as the cluster is installed via the add-on.
> CR? The CephCluster would have the topology spread constraints to show
> exactly how the OSDs are expected to be spread across hosts. Or if the TSCs
> are only specified for zones, then this explains why the hosts are not
> evenly spread. At least the OSDs are spread evenly across zones as expected.

Also, from comment #11 we see [1] below: the cluster originally has 6 nodes, but the StorageCluster configuration keeps the old node entries once --replace is issued and a new node is created. From [1] one could conclude that we have many nodes (there are actually only 6); all the others were cluster members at some point in time but are not any more.

--- [1] 
 kubernetes.io/hostname:
        - 10.240.128.7
        - 10.240.64.4
        - 10.240.64.5
        - 10.240.0.6
        - 10.240.0.7
        - 10.240.128.6
        - 10.240.0.12
        - 10.240.0.13
        - 10.240.128.10
        - 10.240.0.14
        - 10.240.128.11
        - 10.240.0.15
        - 10.240.0.17
        - 10.240.64.10
        - 10.240.64.9
        - 10.240.128.12
        - 10.240.128.13
        - 10.240.0.19
        - 10.240.128.14
        - 10.240.64.11
        - 10.240.64.12
---
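
A simple way to compare that list with the nodes the cluster currently has (sketch):

# Hostnames in [1] that do not appear here are left over from replaced workers.
oc get nodes --no-headers | awk '{print $1}' | sort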

Comment 14 Travis Nielsen 2021-09-17 19:48:58 UTC
The CephCluster CR can be retrieved with this:

oc -n openshift-storage get cephcluster -o yaml

Comment 17 Mudit Agarwal 2021-10-06 09:57:57 UTC
Please reopen once we have all the details.

