Bug 1501958 - [CephFS]:- Cluster ended up in "damaged" mds when subtree pinning is in progress and tried to do mds failover
Summary: [CephFS]:- Cluster ended up in "damaged" mds when subtree pinning is in progress and tried to do mds failover
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 3.0
Hardware: x86_64
OS: Linux
Target Milestone: z2
Target Release: 3.0
Assignee: Patrick Donnelly
QA Contact: Ramakrishnan Periyasamy
Depends On:
Reported: 2017-10-13 15:08 UTC by shylesh
Modified: 2018-04-26 17:39 UTC
CC List: 9 users

Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-04-26 17:38:39 UTC

Attachments: None

System                     ID              Last Updated
Ceph Project Bug Tracker   21821           2017-10-17 17:37:45 UTC
Red Hat Product Errata     RHBA-2018:1259  2018-04-26 17:39:38 UTC

Comment 5 Yan, Zheng 2017-10-16 13:25:58 UTC
External Bug ID: Ceph Project Bug Tracker 21812

Seems like the standby-replay MDS submitted a log entry.

Comment 6 Yan, Zheng 2017-10-17 02:50:56 UTC
I interpreted the log incorrectly. It looks like two MDSs wrote to object 200.00004273 at the same time; something must be wrong with blacklisting.

In osd.3.log at magna103:/var/log/ceph
2017-10-13 14:10:16.312400 7f308a412700 10 osd.3 pg_epoch: 849 pg[2.3( v 849'910079 (841'908563,849'910079] local-lis/les=668/669 n=76028 ec=3/3 lis/c 668/668 les/c/f 669/670/0 668/668/371) [3,0,8] r=0 lpr=668 luod=849'910050 lua=849'910055 crt=849'910079 lcod 848'910049 mlcod 845'910047 active+clean]  sending reply on osd_op(mds.0.2269:12216 2.3 2:c78e7855:::200.00004273:head [write 842784~1373 [fadvise_dontneed]] snapc 0=[] ondisk+write+known_if_redirected+full_force e849) v8 0x9914830a80


2017-10-13 14:11:10.061530 7f309ac33700 10 osd.3 pg_epoch: 851 pg[2.3( v 851'910221 (841'908663,851'910221] local-lis/les=668/669 n=76028 ec=3/3 lis/c 668/668 les/c/f 669/670/0 668/668/371) [3,0,8] r=0 lpr=668 luod=851'910216 lua=851'910215 crt=851'910221 lcod 851'910215 mlcod 851'910214 active+clean]  sending reply on osd_op(mds.0.2207:27831 2.3 2:c78e7855:::200.00004273:head [write 842784~2354 [fadvise_dontneed]] snapc 0=[] ondisk+write+known_if_redirected+full_force e846) v8 0x99126e2700

mds.0.2269 first wrote a log entry at offset 842784, then mds.0.2207 wrote another log entry at the same offset. mds.0.2207 was the laggy MDS, which should have been blacklisted.
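
To confirm whether the failed MDS instance was actually blacklisted, the OSD map's blacklist can be listed directly. A minimal check (a sketch; the address and nonce in the sample output are placeholders, not taken from this cluster):

sudo ceph -c /etc/ceph/cfs.conf osd blacklist ls
# A correctly failed-over MDS should appear as addr:port/nonce with an
# expiry timestamp, e.g.:
#   listed 1 entries
#   10.8.128.103:6800/1234567 2017-10-13 14:15:16.000000

If the laggy mds.0.2207 is missing from this list (or its entry expired after only a few seconds), the OSDs would still accept its journal writes.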

Comment 7 Yan, Zheng 2017-10-17 03:00:38 UTC
sudo ceph -c /etc/ceph/cfs.conf daemon mon.magna023 config get mds_blacklist_interval
    "mds_blacklist_interval": "5.000000"

Five seconds is too short; you should use the default value. The issue was caused by a wrong config setting.

Comment 9 John Spray 2017-10-17 12:55:15 UTC
mds_blacklist_interval is only used on monitor daemons. You do not need to modify this from the default.

Setting a short blacklist interval is effectively the same as preventing the monitors from blacklisting failed MDSs, and it will break the system.
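
For illustration (a sketch, not a command sequence from this bug): the fix is to drop the 5-second override so the monitors fall back to the compiled-in default. The 1440-second value below is an assumption based on the upstream Luminous default of 24.0*60.0 seconds:

# 1. Remove any "mds blacklist interval = 5" override from the [mon]
#    section of /etc/ceph/cfs.conf on all monitor hosts.
# 2. Restart the monitors, or inject the default into the running mons:
sudo ceph -c /etc/ceph/cfs.conf tell mon.* injectargs '--mds_blacklist_interval=1440'
# 3. Verify, as in comment 7:
sudo ceph -c /etc/ceph/cfs.conf daemon mon.magna023 config get mds_blacklist_interval
#     "mds_blacklist_interval": "1440.000000"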

Comment 19 Ramakrishnan Periyasamy 2018-04-03 09:33:12 UTC
Provided qa_ack and cleared the needinfo. Please move the bug to ON_QA.

Comment 21 Ramakrishnan Periyasamy 2018-04-03 10:10:26 UTC
Ken, could you please move this bug to ON_QA?

Comment 23 tserlin 2018-04-03 14:30:16 UTC
(In reply to Ramakrishnan Periyasamy from comment #21)
> Ken, could you please move this bug to ON_QA?



Comment 24 Ramakrishnan Periyasamy 2018-04-03 15:31:45 UTC
Moving this bug to VERIFIED state; updated the command output in comment 20.

Tested in ceph version ceph-12.2.4-4.el7cp.

Comment 28 errata-xmlrpc 2018-04-26 17:38:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

