Bug 2142013

Summary: Not all osds are up after upscale
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Component: odf-managed-service
Version: 4.10
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Keywords: TestBlocker
Reporter: Filip Balák <fbalak>
Assignee: Leela Venkaiah Gangavarapu <lgangava>
QA Contact: Filip Balák <fbalak>
CC: aeyal, lgangava, ocs-bugs, odf-bz-bot, rchikatw
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Type: Bug
Last Closed: 2023-03-14 15:28:50 UTC

Description Filip Balák 2022-11-11 12:03:56 UTC
Description of problem:
If an upscale is performed with the command:
$ rosa edit service --id=<service_ID>  --size="<new_size in TiB>"
then some OSDs do not come up and the user does not get the full requested capacity. For example, when the size is changed from 4 TiB to 8 TiB and then to 20 TiB:
$ oc rsh -n openshift-storage $(oc get pods -n openshift-storage|grep tool|awk '{print$1}') ceph -s
  cluster:
    id:     f009e4c6-06e5-4f09-9476-3d55e9d439b0
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 3h)
    mgr: a(active, since 3h)
    mds: 1/1 daemons up, 1 hot standby
    osd: 15 osds: 9 up (since 9m), 11 in (since 14s)
 
  data:
    volumes: 1/1 healthy
    pools:   4 pools, 609 pgs
    objects: 23 objects, 14 KiB
    usage:   156 MiB used, 36 TiB / 36 TiB avail
    pgs:     609 active+clean
 
  io:
    client:   1.2 KiB/s rd, 2 op/s rd, 0 op/s wr

$ oc rsh -n openshift-storage $(oc get pods -n openshift-storage|grep tool|awk '{print$1}') ceph df
--- RAW STORAGE ---
CLASS    SIZE   AVAIL     USED  RAW USED  %RAW USED
ssd    36 TiB  36 TiB  155 MiB   155 MiB          0
TOTAL  36 TiB  36 TiB  155 MiB   155 MiB          0
 
--- POOLS ---
POOL                                                                ID  PGS  STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics                                                1    1     0 B        0      0 B      0     10 TiB
ocs-storagecluster-cephfilesystem-metadata                           2   32  21 KiB       22  155 KiB      0     10 TiB
ocs-storagecluster-cephfilesystem-data0                              3  512     0 B        0      0 B      0     10 TiB
cephblockpool-storageconsumer-3717e957-2e13-4339-b433-dc9e65fdc3ae   4   64    19 B        1   12 KiB      0     10 TiB
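
A quick way to spot the OSD pods that did not start after the resize (a sketch, assuming the standard Rook label app=rook-ceph-osd on OSD pods):
$ oc get pods -n openshift-storage -l app=rook-ceph-osd -o wide
$ oc get pods -n openshift-storage -l app=rook-ceph-osd --field-selector=status.phase=Pending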

Version-Release number of selected component (if applicable):
ocs-osd-deployer.v2.0.8
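
The installed deployer version can be read from its CSV (a sketch, assuming the deployer runs in the openshift-storage namespace):
$ oc get csv -n openshift-storage | grep ocs-osd-deployer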

How reproducible:
1/1

Steps to Reproduce:
1. Deploy a 4 TiB provider cluster.
2. Upscale it to 20 TiB:
$ rosa edit service --id=<service_ID>  --size="20"
3. Check the Ceph status:
$ oc rsh -n openshift-storage $(oc get pods -n openshift-storage|grep tool|awk '{print$1}') ceph -s
  cluster:
    id:     f009e4c6-06e5-4f09-9476-3d55e9d439b0
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 3h)
    mgr: a(active, since 3h)
    mds: 1/1 daemons up, 1 hot standby
    osd: 15 osds: 9 up (since 9m), 11 in (since 14s)
 
  data:
    volumes: 1/1 healthy
    pools:   4 pools, 609 pgs
    objects: 23 objects, 14 KiB
    usage:   156 MiB used, 36 TiB / 36 TiB avail
    pgs:     609 active+clean
 
  io:
    client:   1.2 KiB/s rd, 2 op/s rd, 0 op/s wr
 
4. Check the events of the OSD pods that are down (for example, as sketched below).
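
A minimal sketch, assuming the standard Rook label app=rook-ceph-osd on OSD pods; <osd-pod-name> is a placeholder for one of the non-Running pods:
$ oc get pods -n openshift-storage -l app=rook-ceph-osd | grep -v Running
$ oc describe pod <osd-pod-name> -n openshift-storage | grep -A 10 Events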

Actual results:
Some OSDs are down because their pods cannot be scheduled due to insufficient memory:

0/18 nodes are available: 12 Insufficient memory, 12 node(s) had no available volume zone, 12 node(s) had volume node affinity conflict, 15 node(s) didn't match Pod's node affinity/selector, 3 Insufficient cpu, 3 node(s) had untolerated taint {node-role.kubernetes.io/infra: }, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/18 nodes are available: 15 Preemption is not helpful for scheduling, 3 Insufficient memory.
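
To confirm the memory pressure, the allocated vs. allocatable resources can be checked per storage node (a sketch; <storage-node> is a placeholder for a worker node name):
$ oc describe node <storage-node> | grep -A 8 "Allocated resources"
$ oc adm top nodes   # if cluster metrics are available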

Users have only 10 TiB available instead of 20 TiB.
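
The reported numbers are consistent with only 9 of the 15 OSDs serving data (assuming the default 3-way replication used by ODF):
  15 OSDs x 4 TiB = 60 TiB raw -> ~20 TiB usable (the requested size)
   9 OSDs x 4 TiB = 36 TiB raw -> ~12 TiB usable, of which ~10 TiB appears as MAX AVAIL once Ceph's full-ratio headroom is accounted for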

Expected results:
All new OSDs are up and the requested capacity is available.

Additional info:
$ oc rsh -n openshift-storage $(oc get pods -n openshift-storage|grep tool|awk '{print$1}') ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME                               STATUS  REWEIGHT  PRI-AFF
 -1         36.00000  root default                                                     
 -5         36.00000      region us-east-1                                             
 -4         12.00000          zone us-east-1a                                          
-17          4.00000              host default-0-data-1997d8                           
  3    ssd   4.00000                  osd.3                       up   1.00000  1.00000
 -3          4.00000              host default-2-data-0kmcrj                           
  0    ssd   4.00000                  osd.0                       up   1.00000  1.00000
-23          4.00000              host default-2-data-3vz9vb                           
 11    ssd   4.00000                  osd.11                      up   1.00000  1.00000
-14         12.00000          zone us-east-1b                                          
-27          4.00000              host default-0-data-2jt598                           
  9    ssd   4.00000                  osd.9                       up   1.00000  1.00000
-13          4.00000              host default-1-data-0sq66t                           
  2    ssd   4.00000                  osd.2                       up   1.00000  1.00000
-21          4.00000              host default-2-data-1fk8ck                           
  5    ssd   4.00000                  osd.5                       up   1.00000  1.00000
-10         12.00000          zone us-east-1c                                          
 -9          4.00000              host default-0-data-0vlnfx                           
  1    ssd   4.00000                  osd.1                       up   1.00000  1.00000
-25          4.00000              host default-0-data-4z6vps                           
  6    ssd   4.00000                  osd.6                       up   1.00000  1.00000
-19          4.00000              host default-1-data-12xm7l                           
  4    ssd   4.00000                  osd.4                       up   1.00000  1.00000
  7                0  osd.7                                     down         0  1.00000
  8                0  osd.8                                     down         0  1.00000
 10                0  osd.10                                    down         0  1.00000
 12                0  osd.12                                    down         0  1.00000
 13                0  osd.13                                    down   1.00000  1.00000
 14                0  osd.14                                    down   1.00000  1.00000
$ oc rsh -n openshift-storage $(oc get pods -n openshift-storage|grep tool|awk '{print$1}') ceph df
--- RAW STORAGE ---
CLASS    SIZE   AVAIL     USED  RAW USED  %RAW USED
ssd    36 TiB  36 TiB  156 MiB   156 MiB          0
TOTAL  36 TiB  36 TiB  156 MiB   156 MiB          0
 
--- POOLS ---
POOL                                                                ID  PGS  STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics                                                1    1     0 B        0      0 B      0     10 TiB
ocs-storagecluster-cephfilesystem-metadata                           2   32  21 KiB       22  155 KiB      0     10 TiB
ocs-storagecluster-cephfilesystem-data0                              3  512     0 B        0      0 B      0     10 TiB
cephblockpool-storageconsumer-3717e957-2e13-4339-b433-dc9e65fdc3ae   4   64    19 B        1   12 KiB      0     10 TiB
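
It may also help to compare the desired scale in the StorageCluster CR with the OSD deployments that actually exist (a sketch; the CR name ocs-storagecluster matches the pool names above, and the field names assume the usual StorageCluster schema):
$ oc get storagecluster ocs-storagecluster -n openshift-storage -o jsonpath='{.spec.storageDeviceSets[*].count}{"\n"}'
$ oc get deploy -n openshift-storage -l app=rook-ceph-osd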

Comment 6 Filip Balák 2023-02-28 08:39:56 UTC
All OSDs are in after an upscale from 4 TiB to 20 TiB. --> VERIFIED

Tested with:
ocs-osd-deployer.v2.0.11

Comment 7 Ritesh Chikatwar 2023-03-14 15:28:50 UTC
Closing this bug as fixed in v2.0.11 and tested by QE.