Bug 1501958 - [CephFS]:- Cluster ended up in "damaged" mds when subtree pinning is in progress and tried to do mds failover
Summary: [CephFS]:- Cluster ended up in "damaged" mds when subtree pinning is in progress and tried to do mds failover
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 3.0
Hardware: x86_64
OS: Linux
Target Milestone: z2
Target Release: 3.0
Assignee: Patrick Donnelly
QA Contact: Ramakrishnan Periyasamy
Depends On:
Reported: 2017-10-13 15:08 UTC by shylesh
Modified: 2018-04-26 17:39 UTC
CC List: 9 users

Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-04-26 17:38:39 UTC

Attachments: None

System                     ID              Last Updated
Ceph Project Bug Tracker   21821           2017-10-17 17:37:45 UTC
Red Hat Product Errata     RHBA-2018:1259  2018-04-26 17:39:38 UTC

Comment 5 Yan, Zheng 2017-10-16 13:25:58 UTC
External Bug ID: Ceph Project Bug Tracker 21812

Seems like the standby-replay MDS submitted a log entry.

Comment 6 Yan, Zheng 2017-10-17 02:50:56 UTC
I interpreted the log incorrectly. It looks like two MDSs wrote to object 200.00004273 at the same time; something must be wrong with blacklisting.

In osd.3.log at magna103:/var/log/ceph
2017-10-13 14:10:16.312400 7f308a412700 10 osd.3 pg_epoch: 849 pg[2.3( v 849'910079 (841'908563,849'910079] local-lis/les=668/669 n=76028 ec=3/3 lis/c 668/668 les/c/f 669/670/0 668/668/371) [3,0,8] r=0 lpr=668 luod=849'910050 lua=849'910055 crt=849'910079 lcod 848'910049 mlcod 845'910047 active+clean]  sending reply on osd_op(mds.0.2269:12216 2.3 2:c78e7855:::200.00004273:head [write 842784~1373 [fadvise_dontneed]] snapc 0=[] ondisk+write+known_if_redirected+full_force e849) v8 0x9914830a80


2017-10-13 14:11:10.061530 7f309ac33700 10 osd.3 pg_epoch: 851 pg[2.3( v 851'910221 (841'908663,851'910221] local-lis/les=668/669 n=76028 ec=3/3 lis/c 668/668 les/c/f 669/670/0 668/668/371) [3,0,8] r=0 lpr=668 luod=851'910216 lua=851'910215 crt=851'910221 lcod 851'910215 mlcod 851'910214 active+clean]  sending reply on osd_op(mds.0.2207:27831 2.3 2:c78e7855:::200.00004273:head [write 842784~2354 [fadvise_dontneed]] snapc 0=[] ondisk+write+known_if_redirected+full_force e846) v8 0x99126e2700

mds.0.2269 first wrote a log entry at offset 842784, then mds.0.2207 wrote another log entry at the same offset. mds.0.2207 was the laggy MDS, which should have been blacklisted.
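
To confirm whether the failed MDS instance was actually blacklisted, the OSD map's blacklist can be listed directly. A minimal check (a sketch; the address and nonce in the sample output are placeholders, not taken from this cluster):

sudo ceph -c /etc/ceph/cfs.conf osd blacklist ls
# A correctly failed-over MDS should appear as addr:port/nonce with an
# expiry timestamp, e.g.:
#   listed 1 entries
#   10.8.128.103:6800/1234567 2017-10-13 14:15:16.000000

If the laggy mds.0.2207 is missing from this list (or its entry expired after only a few seconds), the OSDs would still accept its journal writes.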

Comment 7 Yan, Zheng 2017-10-17 03:00:38 UTC
sudo ceph -c /etc/ceph/cfs.conf daemon mon.magna023 config get mds_blacklist_interval
    "mds_blacklist_interval": "5.000000"

Five seconds is too short; you should use the default value. The issue was caused by a wrong config setting.

Comment 9 John Spray 2017-10-17 12:55:15 UTC
mds_blacklist_interval is only used on monitor daemons. You do not need to modify this from the default.

Setting a short blacklist interval is effectively the same as preventing the monitors from blacklisting failed MDSs, and it will break the system.
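
For illustration (a sketch, not a command sequence from this bug): the fix is to drop the 5-second override so the monitors fall back to the compiled-in default. The 1440-second value below is an assumption based on the upstream Luminous default of 24.0*60.0 seconds:

# 1. Remove any "mds blacklist interval = 5" override from the [mon]
#    section of /etc/ceph/cfs.conf on all monitor hosts.
# 2. Restart the monitors, or inject the default into the running mons:
sudo ceph -c /etc/ceph/cfs.conf tell mon.* injectargs '--mds_blacklist_interval=1440'
# 3. Verify, as in comment 7:
sudo ceph -c /etc/ceph/cfs.conf daemon mon.magna023 config get mds_blacklist_interval
#     "mds_blacklist_interval": "1440.000000"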

Comment 19 Ramakrishnan Periyasamy 2018-04-03 09:33:12 UTC
Provided qa_ack and cleared the needinfo. Please move the bug to ON_QA.

Comment 21 Ramakrishnan Periyasamy 2018-04-03 10:10:26 UTC
Ken, could you please move this bug to ON_QA?

Comment 23 tserlin 2018-04-03 14:30:16 UTC
(In reply to Ramakrishnan Periyasamy from comment #21)
> Ken, could you please move this bug to ON_QA?



Comment 24 Ramakrishnan Periyasamy 2018-04-03 15:31:45 UTC
Moving this bug to VERIFIED state; updated the command output in comment 20.

Tested in ceph version ceph-12.2.4-4.el7cp.

Comment 28 errata-xmlrpc 2018-04-26 17:38:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

