Bug 1424687
Summary: | [rbd-mirror] : after split-brain is detected, unable to resync image using 'rbd mirror image resync <image-spec>' | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Rachana Patel <racpatel> |
Component: | RBD-Mirror | Assignee: | Ilya Dryomov <idryomov> |
Status: | CLOSED ERRATA | QA Contact: | Vasishta <vashastr> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 2.2 | CC: | ceph-eng-bugs, cephqe-warriors, flucifre, hnallurv, kdreyer, ocs-bugs, vashastr |
Target Milestone: | rc | ||
Target Release: | 2.2 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Fixed In Version: | RHEL: ceph-10.2.5-28.el7cp Ubuntu: ceph_10.2.5-20redhat1xenial | Doc Type: | If docs needed, set a value |
Last Closed: | 2017-03-14 15:49:45 UTC | Type: | Bug |
Description
Rachana Patel
2017-02-18 04:43:51 UTC
In repro, step 1, it reads "(site B has bidirectional relation with A while C has one-directional)". What is the difference there? Multiple secondaries are not a blocker for release 2.2.

@Federico: "site B has bidirectional relation with A while C has one-directional" means site B was configured to mirror primary images from site A and site A was configured to mirror primary images from site B. Site C was configured to only mirror primary images from site A. This resync issue applies whenever you hit a split-brain condition, regardless of whether multiple secondaries are in use.

Thanks Jason, understood. One-directional A->B with an optional A->C is the key use case. If we can get bi-directional A->B and B->A for different images/pools in this release, that is great. Do not worry about multiple secondaries at this late stage; we can punt those bugs to 2.3.

Executed the case below to verify the defect.

precondition
============
--> have 3 clusters. Site A is primary; site B and site C are secondary sites
    (site B has bidirectional relation with A while C has one-directional)
--> enable pool level or image level mirroring for a few images.
--> create images and let them sync to the secondaries (A->B, A->C)

1) orderly shutdown
a) Failover
--> demote image on A, promote image on B
--> shut down cluster A
--> I/O on image from cluster B

b) Failback
--> bring up cluster A and let image sync to A
--> demote image on B, promote image on A
--> resync image on C
--> do I/O on image from cluster A and let it sync to clusters B & C

2) non-orderly shutdown
a) Failover
--> bring down cluster A
--> force promote image on B
--> **WORKAROUND** - restart rbd-mirror on cluster B
--> do I/O on image from cluster B

b) Failback
--> bring cluster A back
--> demote image on A, resync image on A
--> demote image on cluster B, promote image on cluster A
--> resync image from cluster C

Resync worked in both cases, hence moving back to verified.
Verified with version 10.2.5-29.el7cp.x86_64

(In reply to Rachana Patel from comment #13)
> Executed the case below to verify the defect.
>
> precondition
> ============
> --> have 3 clusters. Site A is primary; site B and site C are secondary sites
>     (site B has bidirectional relation with A while C has one-directional)
> --> enable pool level or image level mirroring for a few images.
> --> create images and let them sync to the secondaries (A->B, A->C)
>
> 1) orderly shutdown
> a) Failover
> --> demote image on A, promote image on B
> --> shut down cluster A
> --> I/O on image from cluster B
>
> b) Failback
> --> bring up cluster A and let image sync to A
> --> demote image on B, promote image on A
> --> resync image on C

This should be 'resync image on cluster C from cluster A'.

> --> do I/O on image from cluster A and let it sync to clusters B & C
>
> 2) non-orderly shutdown
> a) Failover
> --> bring down cluster A
> --> force promote image on B
> --> **WORKAROUND** - restart rbd-mirror on cluster B
> --> do I/O on image from cluster B
>
> b) Failback
> --> bring cluster A back
> --> demote image on A, resync image on A
> --> demote image on cluster B, promote image on cluster A
> --> resync image from cluster C

It should be 'resync image from cluster A to cluster C'.

> Resync worked in both cases, hence moving back to verified.
> Verified with version 10.2.5-29.el7cp.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0514.html
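
For readers who want to replay the verification scenario, the following is a minimal sketch of how the demote, promote, and resync steps from the comments above might map onto `rbd mirror image` commands. It only builds the command strings and does not run anything. The cluster names (`site-a`, `site-b`, `site-c`), the image spec `data/image1`, and the helper function names are assumptions made for illustration and do not come from this bug report. Steps that are not `rbd` calls (shutting clusters down, restarting the rbd-mirror daemon, waiting for sync, generating I/O) are left to the operator and are not modeled.

```python
from typing import List

# Hypothetical cluster names, chosen only for illustration.
SITE_A, SITE_B, SITE_C = "site-a", "site-b", "site-c"


def rbd_mirror(cluster: str, action: str, image_spec: str, *flags: str) -> str:
    """Format one `rbd mirror image <action>` call against a named cluster."""
    parts = ["rbd", "--cluster", cluster, "mirror", "image", action, *flags, image_spec]
    return " ".join(parts)


def orderly_failover(image_spec: str) -> List[str]:
    # Demote on A, then promote on B, before cluster A is shut down.
    return [
        rbd_mirror(SITE_A, "demote", image_spec),
        rbd_mirror(SITE_B, "promote", image_spec),
    ]


def orderly_failback(image_spec: str) -> List[str]:
    # Run after cluster A is back and the image has synced to A.
    return [
        rbd_mirror(SITE_B, "demote", image_spec),
        rbd_mirror(SITE_A, "promote", image_spec),
        # Resync is requested on C so that C re-copies the image from A.
        rbd_mirror(SITE_C, "resync", image_spec),
    ]


def nonorderly_failover(image_spec: str) -> List[str]:
    # Cluster A is down, so B must be force promoted.
    return [rbd_mirror(SITE_B, "promote", image_spec, "--force")]


def nonorderly_failback(image_spec: str) -> List[str]:
    # A comes back still believing it is primary (split-brain), so it is
    # demoted and resynced before roles are swapped back.
    return [
        rbd_mirror(SITE_A, "demote", image_spec),
        rbd_mirror(SITE_A, "resync", image_spec),
        rbd_mirror(SITE_B, "demote", image_spec),
        rbd_mirror(SITE_A, "promote", image_spec),
        rbd_mirror(SITE_C, "resync", image_spec),
    ]


if __name__ == "__main__":
    image = "data/image1"  # hypothetical pool/image spec
    for plan in (orderly_failover, orderly_failback,
                 nonorderly_failover, nonorderly_failback):
        print(f"# {plan.__name__}")
        for command in plan(image):
            print(command)
```

In this sketch, resync is always issued against the cluster holding the stale copy, which matches the correction in the reply above ('resync image on cluster C from cluster A').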