Bug 2219628
| Summary: | [RDR] After workloads are deleted, VRG deletion remains stuck for several hours, rbd false image count is shown and ceph command hangs on secondary | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Aman Agrawal <amagrawa> |
| Component: | Multi-Cloud Object Gateway | Assignee: | Romy Ayalon <rayalon> |
| Status: | ASSIGNED --- | QA Contact: | krishnaram Karthick <kramdoss> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.13 | CC: | akupczyk, bniver, kmanohar, muagarwa, nbecker, nojha, odf-bz-bot, prsurve, rayalon, sostapov, srangana |
| Target Milestone: | --- | Flags: | amagrawa: needinfo? (nojha) amagrawa: needinfo? (akupczyk) rayalon: needinfo? (srangana) amagrawa: needinfo? (srangana) |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
(In reply to kmanohar from comment #6)
> Same issue has been observed on RDR Longevity setup
> [...]
> Live setup is available for debugging

This issue has been reported separately and is being tracked by BZ2221716.

@akupczyk Could you please update this BZ instead with your observations from the longevity setup Pratik shared with you offline?
Same issue has been observed on RDR Longevity setup

OCP Version - 4.13.0-0.nightly-2023-06-05-164816
ODF - ODF 4.13.0-219.snaptrim
SUBMARINER version:- v0.15.1
VOLSYNC version:- volsync-product.v0.7.1

Issue seen in volumereplication yaml

vr yaml output
--------------

oc get vr busybox-pvc-61 -o yaml

apiVersion: replication.storage.openshift.io/v1alpha1
kind: VolumeReplication
metadata:
  creationTimestamp: "2023-07-10T08:04:25Z"
  finalizers:
  - replication.storage.openshift.io
  generation: 1
  name: busybox-pvc-61
  namespace: appset-busybox-4
  ownerReferences:
  - apiVersion: ramendr.openshift.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: VolumeReplicationGroup
    name: busybox-4-placement-drpc
    uid: 6f21ad83-16e0-4eb9-98bf-e43b9fb9bdf0
  resourceVersion: "36486402"
  uid: a85e701c-4109-49a5-9dd6-fcb682a818bf
spec:
  autoResync: false
  dataSource:
    apiGroup: ""
    kind: PersistentVolumeClaim
    name: busybox-pvc-61
  replicationHandle: ""
  replicationState: primary
  volumeReplicationClass: rbd-volumereplicationclass-2263283542
status:
  conditions:
  - lastTransitionTime: "2023-07-10T08:04:26Z"
    message: ""
    observedGeneration: 1
    reason: FailedToPromote
    status: "False"
    type: Completed
  - lastTransitionTime: "2023-07-10T08:04:26Z"
    message: ""
    observedGeneration: 1
    reason: Error
    status: "True"
    type: Degraded
  - lastTransitionTime: "2023-07-10T08:04:26Z"
    message: ""
    observedGeneration: 1
    reason: NotResyncing
    status: "False"
    type: Resyncing
  message: 'rados: ret=-11, Resource temporarily unavailable'
  observedGeneration: 1
  state: Unknown

Must gather logs
----------------

c1 - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2219628/july10/c1/
c2 - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2219628/july10/c2/
hub - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2219628/july10/hub/

Live setup is available for debugging
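
For anyone triaging similar VolumeReplication objects, here is a minimal Python sketch of how the status block in this comment could be read. It only inspects the conditions reported above: Completed is "False" with reason FailedToPromote, and Degraded is "True". The `summarize_vr_status` helper and its definition of "healthy" (Completed true and Degraded not true) are assumptions made for illustration, not part of any ODF or csi-addons tooling. The status dict is copied by hand from the output above and trimmed to the fields used.

```python
# Status fields copied from the `oc get vr busybox-pvc-61 -o yaml` output in
# this comment, trimmed to what the helper below needs.
vr_status = {
    "conditions": [
        {"type": "Completed", "status": "False", "reason": "FailedToPromote"},
        {"type": "Degraded", "status": "True", "reason": "Error"},
        {"type": "Resyncing", "status": "False", "reason": "NotResyncing"},
    ],
    "message": "rados: ret=-11, Resource temporarily unavailable",
    "state": "Unknown",
}


def summarize_vr_status(status):
    """Hypothetical helper: map each condition type to (status, reason)
    and derive a simple health flag from Completed and Degraded."""
    conditions = {
        c["type"]: (c["status"], c["reason"])
        for c in status.get("conditions", [])
    }
    completed = conditions.get("Completed", ("Unknown", ""))[0] == "True"
    degraded = conditions.get("Degraded", ("Unknown", ""))[0] == "True"
    return {
        "conditions": conditions,
        # Assumed definition of healthy, chosen for this sketch only.
        "healthy": completed and not degraded,
        "state": status.get("state", ""),
        "message": status.get("message", ""),
    }


summary = summarize_vr_status(vr_status)
for cond_type, (cond_status, reason) in summary["conditions"].items():
    print(f"{cond_type}: status={cond_status} reason={reason}")
print("healthy:", summary["healthy"])
print("message:", summary["message"])
```

With the values from this comment, the promote step is reported as failed and the volume as degraded. The top-level message carries the rados return code -11, which on Linux corresponds to EAGAIN (the same "Resource temporarily unavailable" text shown in the message).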