Bug 1997738 - RBD pvc creation fails on VMware
Summary: RBD pvc creation fails on VMware
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.9.0
Assignee: Scott Ostapovicz
QA Contact: Anna Sandler
URL:
Whiteboard:
Depends On: 1986794 2000434
Blocks:
 
Reported: 2021-08-25 17:49 UTC by Pratik Surve
Modified: 2023-08-09 16:37 UTC
CC: 9 users

Fixed In Version: v4.9.0-164.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-13 17:45:28 UTC
Embargoed:




Links
Red Hat Product Errata RHSA-2021:5086, Last Updated: 2021-12-13 17:46:09 UTC

Description Pratik Surve 2021-08-25 17:49:58 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

RBD PVC creation fails on VMware 

Version of all relevant components (if applicable):

OCS operator:- ocs-operator.v4.9.0-112.ci
OCP version:- 4.9.0-0.nightly-2021-08-18-144658

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Can this issue be reproducible?
yes

Can this issue reproduce from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy an OCS 4.9 cluster over VMware
2. Create an RBD PVC
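Step 2 can be reproduced with a minimal PVC manifest such as the following (a sketch; the claim name matches "openshift-storage/test2" from the events below, while the requested size is illustrative):

```yaml
# Hypothetical reproducer PVC. The storage class and namespace come from
# the provisioning events in this report; name and size are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test2
  namespace: openshift-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ocs-storagecluster-ceph-rbd
```

Apply it with `oc apply -f pvc.yaml` and watch the claim's events with `oc describe pvc test2 -n openshift-storage`.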


Actual results:
Events:
  Type     Reason                Age                    From                                                                                                                Message
  ----     ------                ----                   ----                                                                                                                -------
  Warning  ProvisioningFailed    66m                    openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-6d68d86c5c-sj8sm_9103b034-edbd-4796-ab1c-115bb0cf713c  failed to provision volume with StorageClass "ocs-storagecluster-ceph-rbd": rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Warning  ProvisioningFailed    42m (x14 over 66m)     openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-6d68d86c5c-sj8sm_9103b034-edbd-4796-ab1c-115bb0cf713c  failed to provision volume with StorageClass "ocs-storagecluster-ceph-rbd": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-3fb011ec-4e83-402f-a2d6-184e6d31a083 already exists
  Normal   ExternalProvisioning  3m51s (x270 over 68m)  persistentvolume-controller                                                                                         waiting for a volume to be created, either by external provisioner "openshift-storage.rbd.csi.ceph.com" or manually created by system administrator
  Normal   Provisioning          2m53s (x26 over 68m)   openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-6d68d86c5c-sj8sm_9103b034-edbd-4796-ab1c-115bb0cf713c  External provisioner is provisioning volume for claim "openshift-storage/test2"

Expected results:
The PVC should reach the Bound state.

Additional info:

Comment 4 Mudit Agarwal 2021-08-26 01:29:00 UTC
Niels, can this be another instance of https://bugzilla.redhat.com/show_bug.cgi?id=1986794

We issue a PVC creation request but it never comes back:

>> 2021-08-25T16:10:26.610860120Z I0825 16:10:26.610816       1 utils.go:176] ID: 22 Req-ID: pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe GRPC call: /csi.v1.Controller/CreateVolume
>> 2021-08-25T16:10:26.611284529Z I0825 16:10:26.611261       1 utils.go:180] ID: 22 Req-ID: pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe GRPC request: {"capacity_range":{"required_bytes":42949672960},"name":"pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe","parameters":{"clusterID":"openshift-storage","csi.storage.k8s.io/pv/name":"pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe","csi.storage.k8s.io/pvc/name":"my-prometheus-claim-prometheus-k8s-0","csi.storage.k8s.io/pvc/namespace":"openshift-monitoring","imageFeatures":"layering","imageFormat":"2","pool":"ocs-storagecluster-cephblockpool","thickProvision":"false"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
>> 2021-08-25T16:10:26.611514312Z I0825 16:10:26.611501       1 rbd_util.go:1202] ID: 22 Req-ID: pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe setting disableInUseChecks: false image features: [layering] mounter: rbd
2021-08-25T16:10:26.626951213Z E0825 16:10:26.626906       1 omap.go:77] ID: 22 Req-ID: pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe omap not found (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volumes.default"): rados: ret=-2, No such file or directory
>> 2021-08-25T16:10:26.636517322Z I0825 16:10:26.636468       1 omap.go:154] ID: 22 Req-ID: pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe set omap keys (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volumes.default"): map[csi.volume.pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe:f5bb7432-05be-11ec-b0e7-0a580a810260])
>> 2021-08-25T16:10:26.641030165Z I0825 16:10:26.640965       1 omap.go:154] ID: 22 Req-ID: pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe set omap keys (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volume.f5bb7432-05be-11ec-b0e7-0a580a810260"): map[csi.imagename:csi-vol-f5bb7432-05be-11ec-b0e7-0a580a810260 csi.volname:pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe csi.volume.owner:openshift-monitoring])
>> 2021-08-25T16:10:26.641106844Z I0825 16:10:26.641096       1 rbd_journal.go:484] ID: 22 Req-ID: pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe generated Volume ID (0001-0011-openshift-storage-0000000000000002-f5bb7432-05be-11ec-b0e7-0a580a810260) and image name (csi-vol-f5bb7432-05be-11ec-b0e7-0a580a810260) for request name (pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe)
>> 2021-08-25T16:10:26.641242352Z I0825 16:10:26.641208       1 rbd_util.go:242] ID: 22 Req-ID: pvc-c0d0f028-61a0-425e-9e78-0d1baa30cabe rbd: create ocs-storagecluster-cephblockpool/csi-vol-f5bb7432-05be-11ec-b0e7-0a580a810260 size 40960M (features: [layering]) using mon 172.30.120.106:6789,172.30.246.236:6789,172.30.193.17:6789

Comment 5 Niels de Vos 2021-09-01 06:24:43 UTC
(In reply to Mudit Agarwal from comment #4)
> Niels, can this be another instance of
> https://bugzilla.redhat.com/show_bug.cgi?id=1986794

Yes, quite possible. Bug 1986794 shows hangs while creating an RBD image (before thick-provisioning started), it hangs at the same location here.

Comment 6 Niels de Vos 2021-09-01 06:26:28 UTC
We'll continue working in bz 1986794 for now. Once that is resolved, the problem reported in this bug might be fixed as well.

Comment 7 Mudit Agarwal 2021-09-20 07:52:08 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=2000434 is ON_QA which means we have a fix in Ceph.

This can be moved to ON_QA once we have an OCS build with the Ceph fix.

Comment 14 Anna Sandler 2021-10-13 22:51:17 UTC
Created a test SC with thick provisioning enabled and a pool.
Created a PVC under this SC and it is in Bound state:

default                    rbd-test-pvc                                Bound    pvc-046869ae-96f0-4f83-8ba2-158ab88b1322   1Gi        RWO            test-sc                       52s

moving to verified
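The verification StorageClass can be sketched roughly as follows (hypothetical; the `test-sc` name comes from the output above, and the pool, image parameters, and `thickProvision` flag mirror the CreateVolume request logged in comment 4, flipped to `"true"` to exercise thick provisioning — secret parameter names follow the Rook/ODF defaults and are assumptions):

```yaml
# Hypothetical StorageClass for the verification in comment 14.
# Parameters mirror the CSI CreateVolume request shown in comment 4.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: test-sc
provisioner: openshift-storage.rbd.csi.ceph.com
parameters:
  clusterID: openshift-storage
  pool: ocs-storagecluster-cephblockpool
  imageFeatures: layering
  imageFormat: "2"
  thickProvision: "true"
  # Secret names below are the usual Rook/ODF defaults (assumption):
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
reclaimPolicy: Delete
```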

Comment 16 errata-xmlrpc 2021-12-13 17:45:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:5086

