Note: this issue impacts the integration of the Cinder RBD replication driver. A forced-promotion results in a read-only image for OpenStack instances attempting to access the volume.
Upstream, master branch PR: https://github.com/ceph/ceph/pull/11090
Added dependencies on two upstream issues that are required for a clean cherry-pick of this fix.
Verified with 10.2.3-8.el7cp.x86_64: able to write to the promoted image, hence moving to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2016-2815.html
Description:
============
Able to reproduce an issue where the remote is unreachable and the admin performs a failover after a non-orderly shutdown, hence reopening this bug.

Version:
========
10.2.5-13.el7cp.x86_64

Steps to Reproduce:
===================
1. Have 3 clusters: site A is primary, and sites B and C are secondary (site B has a bidirectional relation with A, while C has a one-directional relation).
2. Enable pool-level or image-level mirroring for a few images.
3. Create images and let them sync to the secondaries.
4. Perform a non-orderly shutdown of the master and fail over to site B.

Images on site B are read-only even after successful promotion.
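The failover step above can be sketched with the `rbd` CLI as follows. This is a hedged illustration only: the cluster name (`siteb`), pool (`mirrorpool`), and image (`vol1`) are hypothetical, not taken from the report. The commands cannot be run without a live mirrored Ceph cluster.

```shell
# Run on site B after site A has gone down non-orderly.

# Confirm the image is still a non-primary (read-only) replica:
rbd --cluster siteb info mirrorpool/vol1 | grep 'mirroring primary'

# Force-promote; --force is required because an orderly demotion
# on the unreachable site A is impossible:
rbd --cluster siteb mirror image promote --force mirrorpool/vol1

# The promoted image should now be writable; on the affected builds
# it remains read-only even though the promotion reports success:
rbd --cluster siteb bench-write mirrorpool/vol1 --io-total 4M
```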
This worked in 2.1 (see comment 15) and is failing in 2.2 now, which looks like a regression. Setting the target release to 2.2.
@Harish: your assumption is not correct -- when this was validated for 2.1, the remote cluster was never shut down. This behavior has always been present and has never been fixed (thus it is not a regression). Moving back to 2.3 since this will not be fixed in time for 2.2.
@rachan, @harish, let's rewrite the test case. Shutdown should not be "orderly" for this test. If you want to be gentler on the primary cluster than pulling plugs would be, I think pulling the network link to the secondary is just fine. Jason, do you agree?
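One way to make the shutdown non-orderly without pulling power, per the suggestion above, is to hard-cut the replication link on the primary cluster's hosts. A hedged sketch; the interface name (`eth1`) and peer subnet (`192.0.2.0/24`) are illustrative assumptions, not values from this report:

```shell
# Hard-cut the inter-site link on a site A host:
sudo ip link set eth1 down

# Or, more surgically, drop traffic to/from the peer cluster
# while leaving other networking intact:
sudo iptables -I INPUT  -s 192.0.2.0/24 -j DROP
sudo iptables -I OUTPUT -d 192.0.2.0/24 -j DROP
```

Either way, site B sees the primary vanish without any orderly demotion, which is the precondition for the forced promotion under test.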
Since this BZ was attached to a shipped errata but the issue is unfixed, I recommend we open another BZ to track it, because this one cannot be attached to any advisory now. Jason and Harish, are you OK with this plan?