Bug 2019931
| Summary: | [DR] When Relocate action is performed, PVCs are in Terminating state and workload is not moved to preferred cluster | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Pratik Surve <prsurve> |
| Component: | ceph | Assignee: | Deepika Upadhyay <dupadhya> |
| Status: | CLOSED WORKSFORME | QA Contact: | krishnaram Karthick <kramdoss> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.9 | CC: | bmekhiss, bniver, dupadhya, ebenahar, jespy, kseeger, madam, mmuench, mrajanna, muagarwa, ocs-bugs, odf-bz-bot, olakra, owasserm, sagrawal, sostapov, srangana, sunkumar, tmuthami |
| Target Milestone: | --- | Flags: | sostapov: needinfo? (prsurve) |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Known Issue |

Doc Text:

.Relocate action results in PVCs in Terminating state and the workload not moved to the preferred cluster
Relocating a disaster recovery (DR) protected workload can result in the workload not stopping on the current primary cluster and its PVCs remaining in the Terminating state. This prevents the pods and PVCs from being relocated to the preferred cluster. To recover, perform a failover action to move the workload to the preferred cluster. The workload is recovered on the preferred cluster, but this may involve data loss because the action is a failover.

| Story Points: | --- | | |
|---|---|---|---|
| Clone Of: | | | |
| : | 2030744 (view as bug list) | Environment: | |
| Last Closed: | 2022-03-30 12:57:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2011326, 2030744 | | |
| Attachments: | | | |
Description
Pratik Surve
2021-11-03 16:28:01 UTC
The user requested the workload to be relocated to the preferred cluster. In summary, during the relocation the VolumeReplicationGroup (VRG) requests the VolumeReplication (VR) to move its state to secondary. During that process we demote the image, resync if necessary, and then report success.
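For context, here is a minimal sketch of what the demote step looks like at the librbd level using the go-ceph bindings. The pool and image names are taken from the logs below; the connection handling is simplified, and this is illustrative only, not the actual VolumeReplication controller code.
```go
package main

import (
	"fmt"
	"log"

	"github.com/ceph/go-ceph/rados"
	"github.com/ceph/go-ceph/rbd"
)

func main() {
	// Connect to the local Ceph cluster using the default config and keyring.
	conn, err := rados.NewConn()
	if err != nil {
		log.Fatal(err)
	}
	if err := conn.ReadDefaultConfigFile(); err != nil {
		log.Fatal(err)
	}
	if err := conn.Connect(); err != nil {
		log.Fatal(err)
	}
	defer conn.Shutdown()

	ioctx, err := conn.OpenIOContext("ocs-storagecluster-cephblockpool")
	if err != nil {
		log.Fatal(err)
	}
	defer ioctx.Destroy()

	// RBD image backing the mysql-pv-claim PVC (name taken from the logs below).
	img, err := rbd.OpenImage(ioctx, "csi-vol-f038968b-389c-11ec-bca9-0a580a800241", rbd.NoSnapshot)
	if err != nil {
		log.Fatal(err)
	}
	defer img.Close()

	// Moving the VR to secondary demotes the image on this cluster so that the
	// peer cluster can promote its copy and become the new primary.
	if err := img.MirrorDemote(); err != nil {
		log.Fatalf("demote failed: %v", err)
	}

	info, err := img.GetMirrorImageInfo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("image demoted, primary=%v\n", info.Primary)
}
```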
Our investigation revealed that the primary image was deleted by a resync operation. The VR requested a resync due to a split-brain error, as seen in the VR log snippet from the primary. Out of the 7 PVCs, the image backing the mysql-pv-claim PVC was the one that failed with a split-brain error.
```
"2021-11-02T11:39:10.473Z","logger":"controllers.VolumeReplication","caller":"controllers/volumereplication_controller.go:164","msg":"adding finalizer to PersistentVolumeClaim object","Request.Name":"mysql-pv-claim","Request.Namespace":"busybox-workloads","Finalizer":"replication.storage.openshift.io/pvc-protection"}
{"level":"info","timestamp":"2021-11-02T11:39:11.490Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume is marked primary","Reques
t.Name":"mysql-pv-claim","Request.Namespace":"busybox-workloads"}
"2021-11-02T13:56:21.002Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume handle","Request.Name":"my
sql-pv-claim","Request.Namespace":"busybox-workloads","VolumeHandleName":"0001-0011-openshift-storage-0000000000000001-f038968b-389c-11ec-bca9-0a580a800241"}
{"level":"info","timestamp":"2021-11-02T13:56:22.384Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume is not ready to use","Requ
est.Name":"mysql-pv-claim","Request.Namespace":"busybox-workloads"}
"2021-11-02T13:56:37.875Z","logger":"controllers.VolumeReplication","caller":"controllers/volumereplication_controller.go:236","msg":"failed to resync volume","Request.Name":"mysql-pv-claim","Request.Namespace":"busybox-workloads","error":"rpc error: code = Unavailable desc = awaiting initial resync due to split brain"}
```
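The split-brain description in the last message is the mirror status that the CSI replication server reads back from RBD before deciding whether to resync. Below is a small illustrative helper, extending the earlier go-ceph sketch (not the actual ceph-csi code), that performs the same status query:
```go
package main

import (
	"fmt"
	"strings"

	"github.com/ceph/go-ceph/rbd"
)

// reportMirrorStatus prints the local site's mirroring status for an image,
// roughly the information csi-rbdplugin logs as "local status" before it
// decides whether a resync is needed. It assumes an *rbd.Image opened as in
// the earlier sketch.
func reportMirrorStatus(img *rbd.Image) error {
	status, err := img.GetGlobalMirrorStatus()
	if err != nil {
		return err
	}
	local, err := status.LocalStatus()
	if err != nil {
		return err
	}
	fmt.Printf("daemon up=%v, state=%v, description=%q\n",
		local.Up, local.State, local.Description)

	// An "error" state whose description mentions split-brain is what led to
	// the resync request seen in the VR log above.
	if strings.Contains(local.Description, "split-brain") {
		return fmt.Errorf("image reports split-brain: %s", local.Description)
	}
	return nil
}
```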
We believe that the resync ended up deleting the local image, and the VR ended up in a deadlock of sorts, waiting for the image to reappear with no further progress, as there is no primary to sync from.
```
"2021-11-02T13:57:13.199Z","logger":"controllers.VolumeReplication","caller":"controllers/volumereplication_controller.go:198","msg":"failed to enable volume replication","Request.Name":"mysql-pv-claim","Request.Namespace":"busybox-workloads","error":"rpc error: code = NotFound desc = volume 0001-0011-openshift-storage-0000000000000001-f038968b-389c-11ec-bca9-0a580a800241 not found"}
```
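This matches the NotFound error above: once the resync removed the image, even opening it fails. A trivial check, again extending the earlier go-ceph sketch (illustrative only):
```go
package main

import (
	"fmt"

	"github.com/ceph/go-ceph/rados"
	"github.com/ceph/go-ceph/rbd"
)

// checkImage attempts to open the RBD image read-only. After the resync
// removed the image here, this is the failure the VR keeps hitting while
// waiting for the volume to reappear (surfaced as the gRPC NotFound above).
func checkImage(ioctx *rados.IOContext, name string) error {
	img, err := rbd.OpenImageReadOnly(ioctx, name, rbd.NoSnapshot)
	if err != nil {
		return fmt.Errorf("image %q not accessible: %w", name, err)
	}
	return img.Close()
}
```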
The rbd-mirror log shows the following:
```
debug 2021-11-02T13:56:35.104+0000 7fa98c2ac700 -1 rbd::mirror::image_replayer::snapshot::Replayer: 0x564bbfe2b400 scan_remote_mirror_snapshots: split-brain detected: failed to find matching non-primary snapshot in remote image: local_snap_id_start=16491, local_snap_ns=[mirror state=primary (demoted), complete=1, mirror_peer_uuids=d3aaa712-4bc9-4aef-967e-56dcf3ae3585, primary_mirror_uuid=, primary_snap_id=head, last_copied_object_number=0, snap_seqs={}]
```
The error is "failed to find matching non-primary snapshot in remote image".
```
rbd snap ls csi-vol-f038968b-389c-11ec-bca9-0a580a800241 --pool=ocs-storagecluster-cephblockpool
```
shows no snapshots on either the remote cluster or the local cluster.
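The same check can be done programmatically. Here is an illustrative helper, extending the earlier go-ceph sketch, that lists whatever snapshots the image has; with snapshot-based mirroring, a healthy image would be expected to carry mirror snapshots.
```go
package main

import (
	"fmt"

	"github.com/ceph/go-ceph/rbd"
)

// listSnapshots prints every snapshot on the image. For the affected image,
// both clusters returned an empty list, matching the rbd snap ls output above.
func listSnapshots(img *rbd.Image) error {
	snaps, err := img.GetSnapshotNames()
	if err != nil {
		return err
	}
	if len(snaps) == 0 {
		fmt.Println("no snapshots found")
		return nil
	}
	for _, snap := range snaps {
		fmt.Printf("id=%d name=%s\n", snap.Id, snap.Name)
	}
	return nil
}
```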
Next, we will be investigating RBD to see if we can determine the reason behind the split-brain error.
One last thing to note: on the primary cluster, the VR requested a resync due to the split-brain error (which ended up deleting the primary image). We believe that is wrong; a resync should be requested from the secondary cluster, not the primary.
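In other words, the suggested guard is to gate the resync on the image's mirroring role. A hedged sketch of such a check, extending the earlier go-ceph sketch (the safeResync helper is hypothetical, not existing ceph-csi code):
```go
package main

import (
	"fmt"

	"github.com/ceph/go-ceph/rbd"
)

// safeResync issues an RBD resync only when the local image is no longer
// primary. A resync discards the local copy and rebuilds it from the peer's
// primary, so running it against the primary side (as happened here) removes
// the only up-to-date copy of the data.
func safeResync(img *rbd.Image) error {
	info, err := img.GetMirrorImageInfo()
	if err != nil {
		return err
	}
	if info.Primary {
		return fmt.Errorf("refusing to resync: local image is still primary")
	}
	return img.MirrorResync()
}
```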
-- Log files attached ---
Created attachment 1839887 [details]: rbd-mirror.log (from primary)
Created attachment 1839888 [details]: VR log from primary
Created attachment 1839889 [details]: VR log from secondary
Created attachment 1839890 [details]: rbdplugin log from primary
Created attachment 1839891 [details]: csi-rbdplugin from secondary
Adding one more observation from the csi-rbdplugin logs on the current primary (comment #6): the following request is the one that detects the split-brain report from RBD and issues the resync that ends up deleting the image, leaving no known primary to sync from.
```
I1102 13:56:37.529982 1 utils.go:177] ID: 6096 Req-ID: 0001-0011-openshift-storage-0000000000000001-f038968b-389c-11ec-bca9-0a580a800241 GRPC call: /replication.Controller/ResyncVolume
I1102 13:56:37.530223 1 utils.go:179] ID: 6096 Req-ID: 0001-0011-openshift-storage-0000000000000001-f038968b-389c-11ec-bca9-0a580a800241 GRPC request: {"parameters":{"mirroringMode":"snapshot","schedulingInterval":"5m"},"secrets":"***stripped***","volume_id":"0001-0011-openshift-storage-0000000000000001-f038968b-389c-11ec-bca9-0a580a800241"}
I1102 13:56:37.539866 1 omap.go:87] ID: 6096 Req-ID: 0001-0011-openshift-storage-0000000000000001-f038968b-389c-11ec-bca9-0a580a800241 got omap values: (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volume.f038968b-389c-11ec-bca9-0a580a800241"): map[csi.imageid:63cb7d9a035e csi.imagename:csi-vol-f038968b-389c-11ec-bca9-0a580a800241 csi.volname:pvc-76502a33-701c-423c-a097-1d1078a29fd6 csi.volume.owner:busybox-workloads]
I1102 13:56:37.611589 1 replicationcontrollerserver.go:612] ID: 6096 Req-ID: 0001-0011-openshift-storage-0000000000000001-f038968b-389c-11ec-bca9-0a580a800241 local status: daemon up=true, image mirroring state="error", description="split-brain" and lastUpdate=2021-11-02 13:56:35 +0000 UTC
===> NOTE: The response from RBD on the image status reads "up+error" with a description of split-brain, causing the resync to be issued.
E1102 13:56:37.874855 1 utils.go:186] ID: 6096 Req-ID: 0001-0011-openshift-storage-0000000000000001-f038968b-389c-11ec-bca9-0a580a800241 GRPC error: rpc error: code = Unavailable desc = awaiting initial resync due to split brain
```
Further notes:
- As the image was demoted and later promoted on a peer cluster, the subsequent demote on the peer cluster should not result in a split-brain report from RBD. The status should ideally have been up+stopped (after syncing the last snapshot).
- As Benamar points out in comment #2, the RBD mirror daemon decides the image is in split-brain due to missing snapshots, which appears to be the issue causing the problem here.

For doc text confirmation: Pratik, please confirm that this works. It can be tested with 4.10.

So this issue is no longer reproducible? Removing the test blocker flag.

If this issue is no longer reproducible, please close it as such and reopen it if it occurs again.