Bug 2012143

Summary: [DR] VolumeReplication reports resyncing for a split-brained image, possibly due to unhandled RBD states state=starting_replay or up+stopping_replay
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Shyamsundar <srangana>
Component: csi-driver
Assignee: Madhu Rajanna <mrajanna>
Status: CLOSED CURRENTRELEASE
QA Contact: Pratik Surve <prsurve>
Severity: high
Priority: unspecified
Version: 4.9
CC: jespy, madam, muagarwa, ocs-bugs, odf-bz-bot, rperiyas, sunkumar
Keywords: AutomationBackLog
Target Release: ODF 4.9.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: v4.9.0-193.ci
Doc Type: No Doc Update
Last Closed: 2022-01-07 17:46:31 UTC
Type: Bug

Description Shyamsundar 2021-10-08 12:12:49 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

An RBD image with mirroring enabled is left in split-brain on the old primary after being marked Secondary in its VolumeReplication CR, while the VolumeReplication CR reports status as "Degraded: true" and "Resyncing: true". This occurs after the image has been force-promoted on the peer cluster, which therefore requires a resync to resolve the split-brain.

Order of operations:
- VolRep created as primary on cluster "one" (named as per log files attached)
- Later VolRep created as primary on cluster "two"
  - This would force promote the volume on "two"
- VolRep on "one" marked Secondary
  - This would technically enter split-brain as remote is primary as well, and should have issued a resync once this is detected
- VolRep reports "Degraded: true" and "Resyncing: true", indicating image is now resyncing
- Image is still left as split-brain on "one"

RBD image name investigated: ocs-storagecluster-cephblockpool/csi-vol-97e46d10-25cf-11ec-9fa9-0a580a8103a8
PVC name: busybox-pvc-2
PV name: pvc-6a03c816-27a2-43cf-9486-2e1d74e8276c

RBD mirror image status on cluster "one", where the volume was previously primary and has since been demoted to Secondary (the volume would have split-brained because it was force-promoted on the remote peer):

bash-4.4$ rbd mirror image status ocs-storagecluster-cephblockpool/csi-vol-97e46d10-25cf-11ec-9fa9-0a580a8103a8
csi-vol-97e46d10-25cf-11ec-9fa9-0a580a8103a8:
  global_id:   a991b001-0470-492e-93d5-2fd0ccfa4063
  state:       up+stopping_replay
  description: split-brain
  last_update: 2021-10-05 13:08:02
  peer_sites:
    name: afc6aaac-199c-472e-bf35-390eb2799b3e
    state: up+stopped
    description: local image is primary
    last_update: 2021-10-05 13:07:44
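
Note that in the status above the split-brain is surfaced only in the description field, not in the state itself. A minimal sketch of detecting it programmatically (assuming the field names produced by `rbd mirror image status --format json`, which mirror the plain-text fields shown above):

```python
import json
import subprocess


def image_mirror_status(pool_image: str) -> dict:
    """Fetch mirror status for one image as a dict (assumes the rbd CLI is on PATH)."""
    out = subprocess.check_output(
        ["rbd", "mirror", "image", "status", pool_image, "--format", "json"]
    )
    return json.loads(out)


def is_split_brain(status: dict) -> bool:
    """Split-brain is reported via the status description, not the state field."""
    return "split-brain" in status.get("description", "")
```

For the image above, `is_split_brain(image_mirror_status("ocs-storagecluster-cephblockpool/csi-vol-97e46d10-25cf-11ec-9fa9-0a580a8103a8"))` would return True, even though the state is the transitional up+stopping_replay.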

Relevant VolRep logs from cluster "one" (log file: csi-volrep-one-265t7.txt):

Initially marked for replication primary:
{"level":"info","timestamp":"2021-10-05T11:44:32.939Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume handle","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","VolumeHandleName":"0001-0011-openshift-storage-0000000000000001-97e46d10-25cf-11ec-9fa9-0a580a8103a8"}
{"level":"info","timestamp":"2021-10-05T11:44:32.939Z","logger":"controllers.VolumeReplication","caller":"controllers/volumereplication_controller.go:160","msg":"adding finalizer to volumeReplication object","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","Finalizer":"replication.storage.openshift.io"}
{"level":"info","timestamp":"2021-10-05T11:44:32.952Z","logger":"controllers.VolumeReplication","caller":"controllers/volumereplication_controller.go:164","msg":"adding finalizer to PersistentVolumeClaim object","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","Finalizer":"replication.storage.openshift.io/pvc-protection"}
{"level":"info","timestamp":"2021-10-05T11:44:33.950Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume is marked primary","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik"}

Moved to Secondary and resync:
{"level":"info","timestamp":"2021-10-05T12:20:50.660Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume handle","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","VolumeHandleName":"0001-0011-openshift-storage-0000000000000001-97e46d10-25cf-11ec-9fa9-0a580a8103a8"}
{"level":"info","timestamp":"2021-10-05T12:20:51.661Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume is not ready to use","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik"}

{"level":"info","timestamp":"2021-10-05T12:21:15.562Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume handle","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","VolumeHandleName":"0001-0011-openshift-storage-0000000000000001-97e46d10-25cf-11ec-9fa9-0a580a8103a8"}
{"level":"info","timestamp":"2021-10-05T12:21:15.698Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume is not ready to use, requeuing for resync","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik"}

Disable and delete:
{"level":"info","timestamp":"2021-10-05T12:21:56.971Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume handle","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","VolumeHandleName":"0001-0011-openshift-storage-0000000000000001-97e46d10-25cf-11ec-9fa9-0a580a8103a8"}
{"level":"error","timestamp":"2021-10-05T12:21:57.025Z","logger":"controllers.VolumeReplication","caller":"controllers/volumereplication_controller.go:170","msg":"failed to disable volume replication","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","error":"rpc error: code = InvalidArgument desc = secondary image status is up=true and state=starting_replay"}
{"level":"error","timestamp":"2021-10-05T12:21:57.025Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"failed to disable replication","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","error":"rpc error: code = InvalidArgument desc = secondary image status is up=true and state=starting_replay"}
{"level":"error","timestamp":"2021-10-05T12:21:57.025Z","logger":"controller-runtime.manager.controller.volumereplication","caller":"controller/controller.go:253","msg":"Reconciler error","reconciler group":"replication.storage.openshift.io","reconciler kind":"VolumeReplication","name":"busybox-pvc-2","namespace":"busybox-pratik","error":"rpc error: code = InvalidArgument desc = secondary image status is up=true and state=starting_replay"}

{"level":"info","timestamp":"2021-10-05T12:23:22.506Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volume handle","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","VolumeHandleName":"0001-0011-openshift-storage-0000000000000001-97e46d10-25cf-11ec-9fa9-0a580a8103a8"}
{"level":"info","timestamp":"2021-10-05T12:23:22.561Z","logger":"controllers.VolumeReplication","caller":"controllers/volumereplication_controller.go:175","msg":"removing finalizer from PersistentVolumeClaim object","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","Finalizer":"replication.storage.openshift.io/pvc-protection"}
{"level":"info","timestamp":"2021-10-05T12:23:22.572Z","logger":"controllers.VolumeReplication","caller":"controllers/volumereplication_controller.go:182","msg":"removing finalizer from volumeReplication object","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik","Finalizer":"replication.storage.openshift.io"}
{"level":"info","timestamp":"2021-10-05T12:23:22.585Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volumeReplication object is terminated, skipping reconciliation","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik"}
{"level":"info","timestamp":"2021-10-05T12:23:22.585Z","logger":"controllers.VolumeReplication","caller":"controller/controller.go:298","msg":"volumeReplication resource not found","Request.Name":"busybox-pvc-2","Request.Namespace":"busybox-pratik"}

The issue seems to be that VolRep (and/or ceph-csi) reports the move to Secondary with "Resyncing: true" and "Degraded: true" based on an image-state check that possibly does not cover state=starting_replay or up+stopping_replay.
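
To illustrate the suspected gap (a hypothetical sketch, not ceph-csi's actual code): if the controller derives the Degraded/Resyncing conditions from a fixed set of known-healthy states and lumps everything else into a generic "resyncing" bucket, transitional states such as up+starting_replay or up+stopping_replay would be reported as a normal resync even when the description says split-brain:

```python
def replication_conditions(state: str, description: str) -> dict:
    """Hypothetical mapping of RBD mirror status -> VolRep conditions.

    Illustrates the suspected bug: transitional states fall into the
    catch-all branch and are reported as an in-progress resync, which
    masks a split-brain that is surfaced only in the description.
    """
    if state == "up+stopped" and "primary" in description:
        return {"Degraded": False, "Resyncing": False}  # healthy primary
    if state == "up+replaying":
        return {"Degraded": False, "Resyncing": False}  # healthy secondary
    # Catch-all: up+error, up+starting_replay, up+stopping_replay, ...
    # all reported as "degraded but resyncing" -- split-brain goes unnoticed.
    return {"Degraded": True, "Resyncing": True}
```

With this mapping, the status captured above ("up+stopping_replay" / "split-brain") yields "Degraded: true, Resyncing: true", matching the misleading CR status reported in this bug; a correct implementation would need to inspect the description (or an explicit state) to distinguish split-brain from a genuine resync.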

Also, if the disable failed, why did the subsequent delete succeed?

List of log files attached:
csi-volrep-one-265t7.txt (CSI volume replication operator sidecar from cluster "one")
csi-rbdplugin-one-265t7.txt (CSI provisioner logs from rbdplugin container, in cluster "one")
csi-volrep-two-dcqd2.txt (CSI volume replication operator sidecar from cluster "two", for reference)
csi-rbdplugin-two-dcqd2.txt (CSI provisioner logs from rbdplugin container, in cluster "two", for reference)
rbd-mirror-one.log (RBD mirror daemon logs from cluster "one")
rbd-mirror-two.log (RBD mirror daemon logs from cluster "two")

Version of all relevant components (if applicable):

4.9 build #164

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

This is a silent error, and hence undetectable, and it would cause data corruption in the future if the workload is moved back to cluster "one".


Is there any workaround available to the best of your knowledge?

None


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

3

The issue was reproduced in a mirroring setup across BLR and RDU, which has high-latency interconnects; this may affect the ability to reproduce the problem elsewhere.

Is this issue reproducible?

Unsure as yet. Other instances of split-brain have been root-caused to different issues, so split-brain does occur, but whether it recurs with the above RBD mirroring states is unclear.

Comment 19 Ramakrishnan Periyasamy 2022-08-17 06:51:06 UTC
Automation priority is low since DR automation is yet to be implemented.