Bug 2248487 - CephBlockPool creation failed when custom deviceClass used in StorageCluster
Summary: CephBlockPool creation failed when custom deviceClass used in StorageCluster
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.14
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Parth Arora
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-11-07 10:00 UTC by Vijay Avuthu
Modified: 2024-07-26 04:25 UTC
CC: 8 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
.CephBlockPool creation fails when a custom deviceClass is used in the StorageCluster
Due to a known issue, CephBlockPool creation fails when the StorageCluster uses a custom deviceClass and the CephBlockPool requests a deviceClass that is not defined in the StorageCluster.
Clone Of:
Environment:
Last Closed: 2024-03-27 15:26:34 UTC
Embargoed:



Description Vijay Avuthu 2023-11-07 10:00:11 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
CephBlockPool creation fails when a custom deviceClass is used in the StorageCluster


Version of all relevant components (if applicable):
ocs-registry:4.14.0-161


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?
Yes: edit the CephBlockPool and change its deviceClass to match the one in the StorageCluster (in this scenario, change it to "fast").
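
A minimal sketch of the workaround, assuming the failing pool name testpool1 from this report and the "fast" deviceClass defined in the StorageCluster:

# Point the pool at the deviceClass the StorageCluster actually defines
$ oc -n openshift-storage patch cephblockpool testpool1 --type merge \
    -p '{"spec":{"deviceClass":"fast"}}'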

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
1/1

Can this issue be reproduced from the UI?
1/1

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install ODF using ocs-ci with the config "conf/ocsci/use_device_class_fast.yaml"
2. From the UI, create a BlockPool (a CLI equivalent is sketched after these steps)
3. Check whether the BlockPool is in the Ready state
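
For reference, a CLI equivalent of step 2, sketched from the testpool1 spec shown under "Additional info" below; on this cluster the UI-created pool ended up with deviceClass: ssd, which is what triggers the failure:

$ cat <<EOF | oc apply -f -
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: testpool1
  namespace: openshift-storage
spec:
  deviceClass: ssd        # does not match the "fast" class defined in the StorageCluster
  failureDomain: rack
  replicated:
    size: 3
EOF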


Actual results:

$ oc get CephBlockPool 
NAME                               PHASE
ocs-storagecluster-cephblockpool   Ready
testpool1                          Failure
testpool2                          Failure


Expected results:

BlockPool should be in Ready state

Additional info:

$ oc describe CephBlockPool testpool1 
Name:         testpool1
Namespace:    openshift-storage
Labels:       <none>
Annotations:  <none>
API Version:  ceph.rook.io/v1
Kind:         CephBlockPool
Metadata:
  Creation Timestamp:  2023-11-07T07:42:36Z
  Finalizers:
    cephblockpool.ceph.rook.io
  Generation:  2
  Managed Fields:
    API Version:  ceph.rook.io/v1
    Fields Type:  FieldsV1
    fieldsV1:

Status:
  Phase:  Failure
Events:
  Type     Reason           Age                  From                             Message
  ----     ------           ----                 ----                             -------
  Warning  ReconcileFailed  15m (x23 over 104m)  rook-ceph-block-pool-controller  failed to reconcile CephBlockPool "openshift-storage/testpool1". failed to create pool "testpool1".: failed to create pool "testpool1".: failed to create pool "testpool1": failed to create replicated crush rule "testpool1": failed to create crush rule testpool1: exit status 22

> $ oc get CephBlockPool  testpool1 -o yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  creationTimestamp: "2023-11-07T07:42:36Z"
  finalizers:
  - cephblockpool.ceph.rook.io
  generation: 2
  name: testpool1
  namespace: openshift-storage
  resourceVersion: "49462"
  uid: 714b288a-bc37-4e84-95ab-ce3d02a2322b
spec:
  compressionMode: none
  deviceClass: ssd      <-------------- should be same as in StorageCluster
  erasureCoded:
    codingChunks: 0
    dataChunks: 0
  failureDomain: rack
  mirroring: {}
  parameters:
    compression_mode: none
  quotas: {}
  replicated:
    size: 3
  statusCheck:
    mirror: {}
status:
  phase: Failure


> rook operator log

2023-11-07 07:42:36.828525 I | ceph-spec: adding finalizer "cephblockpool.ceph.rook.io" on "testpool1"
2023-11-07 07:42:36.842911 W | ceph-block-pool-controller: failed to set pool "testpool1" status to "Progressing". failed to update object "openshift-storage/testpool1" status: Operation cannot be fulfilled on cephblockpools.ceph.rook.io "testpool1": the object has been modified; please apply your changes to the latest version and try again
2023-11-07 07:42:36.853990 I | ceph-spec: parsing mon endpoints: a=172.30.32.9:3300,b=172.30.14.52:3300,c=172.30.63.229:3300
2023-11-07 07:42:37.121402 W | ceph-block-pool-controller: compressionMode is DEPRECATED, use Parameters instead
2023-11-07 07:42:37.121436 I | ceph-block-pool-controller: creating pool "testpool1" in namespace "openshift-storage"
2023-11-07 07:42:37.394713 E | ceph-block-pool-controller: failed to reconcile CephBlockPool "openshift-storage/testpool1". failed to create pool "testpool1".: failed to create pool "testpool1".: failed to create pool "testpool1": failed to create replicated crush rule "testpool1": failed to create crush rule testpool1: exit status 22
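
Note: exit status 22 is Ceph returning EINVAL; the replicated crush rule references device class "ssd", but no OSDs carry that class in this cluster. To confirm which device classes actually exist, the rook-ceph toolbox can be used (a sketch, assuming the toolbox is enabled; the output reflects the "fast" class defined by the StorageCluster):

$ oc -n openshift-storage rsh deploy/rook-ceph-tools
sh-5.1$ ceph osd crush class ls
[
    "fast"
]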

> storageCluster yaml
$ oc get storagecluster ocs-storagecluster -o yaml
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  annotations:
    uninstall.ocs.openshift.io/cleanup-policy: delete
    uninstall.ocs.openshift.io/mode: graceful
  creationTimestamp: "2023-11-07T07:22:13Z"
  finalizers:
  - storagecluster.ocs.openshift.io
  generation: 2
  name: ocs-storagecluster
  namespace: openshift-storage
  ownerReferences:
  - apiVersion: odf.openshift.io/v1alpha1
    kind: StorageSystem
    name: ocs-storagecluster-storagesystem
    uid: 2af12681-839b-4e38-b7a6-fabea433d3c2
  resourceVersion: "120517"
  uid: c1e4c6a3-cb1e-4075-af27-3e3ce4f5df5c
spec:
  arbiter: {}
  encryption:
    kms: {}
  externalStorage: {}
  managedResources:
    cephBlockPools: {}
    cephCluster: {}
    cephConfig: {}
    cephDashboard: {}
    cephFilesystems: {}
    cephNonResilientPools: {}
    cephObjectStoreUsers: {}
    cephObjectStores: {}
    cephRBDMirror: {}
    cephToolbox: {}
  mirroring: {}
  resources:
    mds: {}
    mgr: {}
    mon: {}
    noobaa-core: {}
    noobaa-db: {}
    noobaa-endpoint:
      limits:
        cpu: "1"
        memory: 500Mi
      requests:
        cpu: "1"
        memory: 500Mi
    rgw: {}
  storageDeviceSets:
  - config: {}
    count: 1
    dataPVCTemplate:
      metadata: {}
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: thin-csi-odf
        volumeMode: Block
      status: {}
    deviceClass: fast
    name: ocs-deviceset
    placement: {}
    portable: true
    preparePlacement: {}
    replica: 3
    resources: {}

> The deviceClass for the pool should be the same as in the StorageCluster. The pool becomes Ready once we change the deviceClass.
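
After the deviceClass is corrected (see the patch sketch under the workaround above), the pool should report Ready, like the in-box pool:

$ oc get CephBlockPool testpool1
NAME        PHASE
testpool1   Ready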

mustgather: https://url.corp.redhat.com/0d278a6

Comment 2 Parth Arora 2023-11-07 15:30:55 UTC
deviceClass: ssd      <-------------- should be same as in StorageCluster

Use `deviceClass: fast` in the CephBlockPool CR, as this is the deviceClass defined in the StorageCluster.
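
For example, a minimal CR trimmed from the testpool1 yaml above, with only the deviceClass changed:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: testpool1
  namespace: openshift-storage
spec:
  deviceClass: fast        # must name a deviceClass defined in the StorageCluster
  failureDomain: rack
  replicated:
    size: 3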

Comment 3 Parth Arora 2023-11-07 15:31:43 UTC
Refer to this KCS article if there is any confusion: https://access.redhat.com/node/7041497/draft#comments

Comment 7 Parth Arora 2024-01-16 09:19:34 UTC
(In reply to Parth Arora from comment #2)
> deviceClass: ssd      <-------------- should be same as in StorageCluster
> 
> Use `deviceClass: fast` in the CephBlockPool CR, as this is the deviceClass
> defined in the StorageCluster.

Have you tried the changes suggested above?

Comment 8 Travis Nielsen 2024-03-27 15:26:34 UTC
Please reopen if there is still an issue here to investigate

Comment 9 Red Hat Bugzilla 2024-07-26 04:25:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

