1507629 – mds gets significantly behind on trimming while creating millions of files

Summary: mds gets significantly behind on trimming while creating millions of files

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	CephFS
Sub Component:
Version:	3.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	z2
Target Release:	3.0
Assignee:	Yan, Zheng
QA Contact:	Ramakrishnan Periyasamy
Docs Contact:	Erin Donnelly
URL:
Whiteboard:
Depends On:	1548067
Blocks:	1557269

Reported:	2017-10-30 18:29 UTC by Patrick Donnelly
Modified:	2021-09-09 12:47 UTC
CC List:	8 users
Fixed In Version:	RHEL: ceph-12.2.4-5.el7cp Ubuntu: ceph_12.2.1-6redhat1xenial
Doc Type:	Bug Fix
Doc Text:	Previously, Metadata Server (MDS) daemons could get behind on trimming for large metadata workloads in larger clusters. With this update, MDS daemons no longer get behind on trimming for large metadata workloads.
Clone Of:
Environment:
Last Closed:	2018-04-26 17:38:39 UTC
Embargoed:
Dependent Products:

Attachments
ceph status logs (103.95 KB, text/plain) 2018-04-03 06:28 UTC, Ramakrishnan Periyasamy	no flags
ceph status after upgrading cluster to ceph version 12.2.4-6.el7cp (205.12 KB, text/plain) 2018-04-09 09:13 UTC, Ramakrishnan Periyasamy	no flags

Links
System	ID	Priority	Status	Summary	Last Updated
Ceph Project Bug Tracker	21975	None	None	None	2017-10-30 18:34:02 UTC
Github	ceph ceph pull 18783	None	closed	luminous: mds: trim 'N' log segments according to how many log segments are there	2021-02-05 05:28:11 UTC
Red Hat Issue Tracker	RHCEPH-1532	None	None	None	2021-09-09 12:47:57 UTC
Red Hat Product Errata	RHBA-2018:1259	None	None	None	2018-04-26 17:39:38 UTC

Comment 5 Yan, Zheng 2017-11-21 13:59:33 UTC

looks good

Comment 12 Ramakrishnan Periyasamy 2018-03-19 10:48:57 UTC

Provided qa_ack, clearing needinfo tag.

Comment 17 Ramakrishnan Periyasamy 2018-04-03 06:28:52 UTC

Created attachment 1416603
ceph status logs

Comment 19 Yan, Zheng 2018-04-04 02:28:45 UTC

It seems that there were lots of slow OSD requests. I don't think bumping mds_log_max_segments will help in this case.

Comment 20 Ramakrishnan Periyasamy 2018-04-04 05:19:47 UTC

Moving this bug to assigned state, since MDS journal trimming is slow.

Comment 21 Yan, Zheng 2018-04-04 07:29:49 UTC

Sorry, I spoke too early. I guess setting mds_log_max_segments to around 100 may silence the warning.
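
To illustrate why raising the limit could silence the warning, here is a minimal sketch in Python. It assumes the "behind on trimming" health check fires when the number of journal segments exceeds mds_log_max_segments multiplied by a fixed factor. The factor of 2, the function name, and the segment counts are assumptions for illustration and are not stated in this report.

```python
def behind_on_trimming(num_segments: int, max_segments: int,
                       warn_factor: float = 2.0) -> bool:
    """Return True if the (assumed) trim warning condition holds.

    warn_factor=2.0 is an assumption for this sketch, not a value
    taken from this bug report.
    """
    return num_segments > max_segments * warn_factor


# Hypothetical backlog of 150 journal segments.
backlog = 150
print(behind_on_trimming(backlog, max_segments=30))   # 150 > 60
print(behind_on_trimming(backlog, max_segments=100))  # 150 > 200
```

Under these assumptions, the same backlog trips the check with a limit of 30 but not with a limit of 100. A higher limit only raises the threshold; it does not make trimming itself faster.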

Comment 23 Yan, Zheng 2018-04-05 04:24:16 UTC

The test has had slow OSD request warnings all along. In my opinion, having MDS behind-on-trim warnings is somewhat expected.

Comment 24 Ramakrishnan Periyasamy 2018-04-05 08:36:51 UTC

On the existing setup, I updated "mds_log_max_segments" to "128" on all MDS daemons. After running I/O for more than 5 hours, the trim message has not appeared in the logs.

The downstream setup always produces a lot of slow request messages. Is there any way to avoid them? Is there an upstream bug for this?

Comment 29 Ramakrishnan Periyasamy 2018-04-09 09:13:35 UTC

Created attachment 1419159
ceph status after upgrading cluster to ceph version 12.2.4-6.el7cp

Comment 30 Ramakrishnan Periyasamy 2018-04-09 09:14:33 UTC

Moving this bug to verified state.

No trim notifications observed in ceph status.

[root@magna113 ceph]# ceph --admin-daemon ceph-mds.magna113.asok config get mds_log_max_segments 
{
    "mds_log_max_segments": "128"
}
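
When repeating this check across several MDS daemons, the JSON printed by the command above can be parsed rather than read by eye. The following is a small sketch in Python; the function name is hypothetical, and the sample string is simply the output shown above (note that the value is returned as a string, not a number).

```python
import json


def parse_max_segments(config_get_output: str) -> int:
    """Extract mds_log_max_segments from 'config get' JSON output."""
    data = json.loads(config_get_output)
    # The admin socket reports the value as a string, so convert it.
    return int(data["mds_log_max_segments"])


# Sample output copied from the command above.
sample = """{
    "mds_log_max_segments": "128"
}"""

print(parse_max_segments(sample))  # prints 128
```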

Comment 34 errata-xmlrpc 2018-04-26 17:38:39 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1259
