Description of problem:

After installing the ODF Managed Service addon on a 1 TiB ROSA cluster, 2 of the 3 OSDs are running on the same node, and rack0 is not used.

$ oc get nodes --show-labels
NAME                                         STATUS   ROLES          AGE   VERSION                LABELS
ip-10-0-148-15.us-east-2.compute.internal    Ready    master         92m   v1.22.0-rc.0+a44d0f0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.2xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-148-15.us-east-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=m5.2xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a
ip-10-0-151-136.us-east-2.compute.internal   Ready    infra,worker   62m   v1.22.0-rc.0+a44d0f0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=r5.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-151-136.us-east-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/infra=,node-role.kubernetes.io/worker=,node-role.kubernetes.io=infra,node.kubernetes.io/instance-type=r5.xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a
ip-10-0-155-80.us-east-2.compute.internal    Ready    master         93m   v1.22.0-rc.0+a44d0f0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.2xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-155-80.us-east-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=m5.2xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a
ip-10-0-161-52.us-east-2.compute.internal    Ready    worker         83m   v1.22.0-rc.0+a44d0f0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.2xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-161-52.us-east-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.2xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a,topology.rook.io/rack=rack0
ip-10-0-189-113.us-east-2.compute.internal   Ready    worker         82m   v1.22.0-rc.0+a44d0f0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.2xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-189-113.us-east-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.2xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a,topology.rook.io/rack=rack1
ip-10-0-195-98.us-east-2.compute.internal    Ready    worker         86m   v1.22.0-rc.0+a44d0f0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.2xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-195-98.us-east-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m5.2xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a,topology.rook.io/rack=rack2
ip-10-0-199-64.us-east-2.compute.internal    Ready    master         93m   v1.22.0-rc.0+a44d0f0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.2xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-199-64.us-east-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=m5.2xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a
ip-10-0-219-37.us-east-2.compute.internal    Ready    infra,worker   61m   v1.22.0-rc.0+a44d0f0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=r5.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-10-0-219-37.us-east-2.compute.internal,kubernetes.io/os=linux,node-role.kubernetes.io/infra=,node-role.kubernetes.io/worker=,node-role.kubernetes.io=infra,node.kubernetes.io/instance-type=r5.xlarge,node.openshift.io/os_id=rhcos,topology.ebs.csi.aws.com/zone=us-east-2a,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a

$ oc get pods -n openshift-storage -o wide|grep ceph-osd
rook-ceph-osd-0-56c8dd8864-zw4sr                       2/2   Running     0   43m   10.129.2.10   ip-10-0-189-113.us-east-2.compute.internal   <none>   <none>
rook-ceph-osd-1-85f94b9957-fk5g8                       2/2   Running     0   44m   10.131.0.34   ip-10-0-195-98.us-east-2.compute.internal    <none>   <none>
rook-ceph-osd-2-75ff66487-nmmfg                        2/2   Running     0   43m   10.129.2.9    ip-10-0-189-113.us-east-2.compute.internal   <none>   <none>
rook-ceph-osd-prepare-default-0-data-098bp7--1-b7n2t   0/1   Completed   0   45m   10.131.0.33   ip-10-0-195-98.us-east-2.compute.internal    <none>   <none>

$ oc rsh -n openshift-storage rook-ceph-tools-798b4968cc-pfx4p ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME                                STATUS  REWEIGHT  PRI-AFF
 -1         3.00000  root default
 -6         3.00000      region us-east-2
 -5         3.00000          zone us-east-2a
-12         2.00000              rack rack1
-11         1.00000                  host default-1-data-0wlkbt
  0    ssd  1.00000                      osd.0                up      1.00000   1.00000
-15         1.00000                  host default-2-data-0jqgkn
  2    ssd  1.00000                      osd.2                up      1.00000   1.00000
 -4         1.00000              rack rack2
 -3         1.00000                  host default-0-data-098bp7
  1    ssd  1.00000                      osd.1                up      1.00000   1.00000

Version-Release number of selected component (if applicable):
ocs-operator.v4.8.2
ocs-osd-deployer.v1.1.1

How reproducible:
Not sure
2/3 OSDs on the same node is not expected. Is this a product bug?
Can you attach the StorageCluster CR?
Please attach the CephCluster CR as well.
The likely cause is that one of the nodes went down during the ODF Managed Service addon installation, which led the TopologySpreadConstraints (TSC) to place the OSDs on the two nodes that were still available. TSC currently has no mechanism to check for a minimum number of nodes before scheduling, so StorageCluster creation still proceeds with 3 replicas even when fewer than 3 usable nodes are present. A sketch of the kind of constraint involved is shown below.
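For illustration only, here is a minimal sketch of a rack-based spread constraint of the kind applied to OSD pods; the field values are assumptions for this example and are not copied from the affected cluster or from the operator source:

# Hedged sketch; values are assumptions, not the actual operator-generated spec.
# With only two rack-labeled nodes schedulable at install time (or with
# whenUnsatisfiable: ScheduleAnyway), nothing blocks the third OSD, so two
# OSDs can end up on the same node/rack.
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.rook.io/rack
  whenUnsatisfiable: ScheduleAnyway   # assumption; DoNotSchedule would leave the pod Pending instead
  labelSelector:
    matchLabels:
      app: rook-ceph-osd

With DoNotSchedule the third OSD would stay Pending until a third rack became available, which may or may not be the behaviour we want for the managed service.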
Jose, this looks like a regression from introducing TopologySpreadConstraints. Can we solve it in the product?
We observed this problem in scale tests too; here is the BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2004801
I'm not 100% sure this is a regression, but it's certainly a problem we should resolve. Since the requirements on the managed service(s) are changing frequently, I'll leave it to you to prioritize this BZ. As long as there is a fully ACKed OCS/ODF BZ, it can go into any release of ocs-operator.
In a single-zone AWS deployment, flexible scaling should be enabled and rack labels would not be added. In addition, 2 OSDs on the same node means there is no node left to hold the third replica, i.e. the cluster is permanently degraded. That makes this a regression. A sketch of the expected single-zone StorageCluster shape is included below.
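For reference, a minimal sketch of what a flexible-scaling (single-AZ) StorageCluster could look like; the field names and values below are assumptions for illustration and are not taken from this cluster's CR or the addon defaults:

# Hedged sketch of a single-zone StorageCluster; values are assumptions.
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  flexibleScaling: true            # single zone: failure domain becomes host, no rack labels applied
  storageDeviceSets:
  - name: default
    count: 3                       # with flexible scaling, capacity grows by count rather than replica
    replica: 1
    dataPVCTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Ti           # assumed size for the 1 TiB addon offering
        storageClassName: gp2      # assumed storage class name
        volumeMode: Block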
The tracking bug is fixed in the product, so this needs to be verified and closed.
I am moving this back to NEW. BZ 2100713 was closed, but only because it is a duplicate of BZ 2004801, which is still in NEW state.