Bug 1421311 - [rbd-mirror] : renaming of image is not synced to secondary sites
Summary: [rbd-mirror] : renaming of image is not synced to secondary sites
Keywords:
Status: CLOSED DUPLICATE of bug 1365034
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RBD
Version: 2.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 2.2
Assignee: Jason Dillaman
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-11 00:10 UTC by Rachana Patel
Modified: 2017-07-30 15:26 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-12 17:42:38 UTC
Embargoed:



Description Rachana Patel 2017-02-11 00:10:01 UTC
Description of problem:
=======================
Renamed multiple images on the primary site. All renames were synced to the secondary sites except one. For that image the rename was not synced, and on the secondary sites the mirror status description says 'failed to commit journal event'.


Version-Release number of selected component (if applicable):
==============================================================
10.2.5-13.el7cp.x86_64


How reproducible:
=================
only once/intermittent


Steps to Reproduce:
===================
1. Set up a Ceph cluster where one site is primary for mirroring and 2 sites are secondary (each site has one MON, 3 OSDs and 1 rbd-mirror node).
2. Enabled pool-level mirroring on pool data1.
3. Created a few images in pool data1. All images were synced to both secondaries.
4. Renamed all those images on the primary site.


Actual results:
===============
All renames except one were synced to the secondary sites; the rename of dataset10 was not.

primary site:-
-------------
rename data1/dataset10 to data1/dataset10new


secondary site
--------------
[root@magna099 ubuntu]# rbd mirror image status data1/dataset10 --cluster slave2
dataset10:
global_id: 1aefcc7a-1f08-40be-9073-2715d49bdc9f
state: up+error
description: failed to commit journal event
last_update: 2017-02-09 19:53:34
[root@magna099 ubuntu]# rbd ls data1 --cluster slave2 | grep 10
dataset10
dataset101
dataset102

[root@magna100 ubuntu]# rbd mirror image status data1/dataset10new --cluster slave1
rbd: error opening image dataset10new: (2) No such file or directory
[root@magna100 ubuntu]# rbd mirror image status data1/dataset10 --cluster slave1
dataset10:
global_id: 1aefcc7a-1f08-40be-9073-2715d49bdc9f
state: up+error
description: failed to commit journal event
last_update: 2017-02-09 19:49:56
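
For scripting this check across several images, the status output above can be parsed into fields. This is only a minimal sketch: the helper name parse_mirror_status is hypothetical, and it assumes the output keeps the "name:" line followed by "key: value" lines shown above.

```python
def parse_mirror_status(text: str) -> dict:
    """Parse the text printed by `rbd mirror image status` into a dict.

    Hypothetical helper: assumes the first non-empty line is "<image>:"
    and every following line is "<key>: <value>".
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    fields = {"image": lines[0].rstrip(":")}
    for ln in lines[1:]:
        # Split on the first colon only, so timestamps keep their colons.
        key, _, value = ln.partition(":")
        fields[key.strip()] = value.strip()
    return fields


sample = """dataset10:
global_id: 1aefcc7a-1f08-40be-9073-2715d49bdc9f
state: up+error
description: failed to commit journal event
last_update: 2017-02-09 19:49:56
"""

status = parse_mirror_status(sample)
if status["state"] != "up+replaying":
    print(status["image"], status["state"], status["description"])
```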


Expected results:
=================
The rename should sync to the secondary sites.


Additional info:

Comment 2 Jason Dillaman 2017-02-12 17:42:38 UTC
Issue occurred when a "snap protect" was used against an image that did not support the layering feature. This recorded an error in the journal which resulted in a split-brain as expected.

*** This bug has been marked as a duplicate of bug 1365034 ***

Comment 3 Jason Dillaman 2017-02-12 17:49:28 UTC
Journal records:

# journal_id: 377a238e1f29
89 {"tag_id":101,"commit_tid":1,"type":7,"entry":"AgEXAAAABwAAAAEAAAAAAAAABwAAAHNuYXAxMDA="}
93 {"tag_id":101,"commit_tid":2,"type":3,"entry":"AgEYAAAAAwAAAAEAAAAAAAAAAQAAAAAAAADa\/\/\/\/"}
89 {"tag_id":102,"commit_tid":3,"type":7,"entry":"AgEWAAAABwAAAAEAAAAAAAAABgAAAHNuYXA5MA=="}
93 {"tag_id":102,"commit_tid":4,"type":3,"entry":"AgEYAAAAAwAAAAEAAAAAAAAAAQAAAAAAAADa\/\/\/\/"}
89 {"tag_id":103,"commit_tid":5,"type":7,"entry":"AgEXAAAABwAAAAEAAAAAAAAABwAAAHNuYXAxMDA="}
93 {"tag_id":103,"commit_tid":6,"type":3,"entry":"AgEYAAAAAwAAAAEAAAAAAAAAAQAAAAAAAADa\/\/\/\/"}
98 {"tag_id":104,"commit_tid":7,"type":10,"entry":"AgEcAAAACgAAAAEAAAAAAAAADAAAAGRhdGFzZXQxMG5ldw=="}
89 {"tag_id":104,"commit_tid":8,"type":3,"entry":"AgEYAAAAAwAAAAEAAAAAAAAAAQAAAAAAAAAAAAAA"}
86 {"tag_id":105,"commit_tid":9,"type":11,"entry":"AgEUAAAACwAAAAEAAAAAAAAAAAAAgAcAAAA="}
90 {"tag_id":105,"commit_tid":10,"type":3,"entry":"AgEYAAAAAwAAAAEAAAAAAAAAAQAAAAAAAAAAAAAA"}
87 {"tag_id":106,"commit_tid":11,"type":11,"entry":"AgEUAAAACwAAAAEAAAAAAAAAAAAAAAUAAAA="}
90 {"tag_id":106,"commit_tid":12,"type":3,"entry":"AgEYAAAAAwAAAAEAAAAAAAAAAQAAAAAAAAAAAAAA"}
87 {"tag_id":107,"commit_tid":13,"type":11,"entry":"AgEUAAAACwAAAAEAAAAAAAAAAAAAgAcAAAA="}
90 {"tag_id":107,"commit_tid":14,"type":3,"entry":"AgEYAAAAAwAAAAEAAAAAAAAAAQAAAAAAAAAAAAAA"}

The first uncommitted event entry is the request for snap protect (type 7). The second uncommitted event entry records the failure result code "-ENOSYS": the last four bytes decoded from the base64 entry string are 0xDA 0xFF 0xFF 0xFF, which read as a little-endian signed 32-bit integer is -38, i.e. -ENOSYS.
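
The decoding of the result code can be reproduced with a short script. This is a rough sketch, not a full journal entry parser: the function name result_code is made up for illustration, and it only assumes what is stated above, namely that the last four bytes of a type 3 entry hold the result as a little-endian signed 32-bit integer. The two records are copied from the listing (commit_tid 2 and commit_tid 8).

```python
import base64
import json
import struct


def result_code(record: str) -> int:
    """Return the trailing little-endian int32 of a journal record's entry.

    Hypothetical helper: it does not interpret the rest of the entry.
    """
    entry = json.loads(record)["entry"]  # json.loads turns "\/" into "/"
    raw = base64.b64decode(entry)
    return struct.unpack("<i", raw[-4:])[0]


# Records copied verbatim from the journal listing above.
failed = r'{"tag_id":101,"commit_tid":2,"type":3,"entry":"AgEYAAAAAwAAAAEAAAAAAAAAAQAAAAAAAADa\/\/\/\/"}'
succeeded = r'{"tag_id":104,"commit_tid":8,"type":3,"entry":"AgEYAAAAAwAAAAEAAAAAAAAAAQAAAAAAAAAAAAAA"}'

print(result_code(failed))     # -38, which is -ENOSYS on Linux
print(result_code(succeeded))  # 0
```

The same call on the other type 3 records distinguishes the failed snap protect operations (tags 101 to 103) from the later operations that completed with result 0.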

