Bug 1507629 - mds gets significantly behind on trimming while creating millions of files
Summary: mds gets significantly behind on trimming while creating millions of files
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: z2
Target Release: 3.0
Assignee: Yan, Zheng
QA Contact: Ramakrishnan Periyasamy
Docs Contact: Erin Donnelly
URL:
Whiteboard:
Depends On: 1548067
Blocks: 1557269
Reported: 2017-10-30 18:29 UTC by Patrick Donnelly
Modified: 2021-09-09 12:47 UTC (History)
CC List: 8 users

Fixed In Version: RHEL: ceph-12.2.4-5.el7cp Ubuntu: ceph_12.2.1-6redhat1xenial
Doc Type: Bug Fix
Doc Text:
Previously, Metadata Server (MDS) daemons could get behind on trimming for large metadata workloads in larger clusters. With this update, MDS daemons no longer get behind on trimming for large metadata workloads.
Clone Of:
Environment:
Last Closed: 2018-04-26 17:38:39 UTC
Embargoed:


Attachments
ceph status logs (103.95 KB, text/plain)
2018-04-03 06:28 UTC, Ramakrishnan Periyasamy
ceph status after upgrading cluster to ceph version 12.2.4-6.el7cp (205.12 KB, text/plain)
2018-04-09 09:13 UTC, Ramakrishnan Periyasamy


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 21975 | 0 | None | None | None | 2017-10-30 18:34:02 UTC
Github | ceph ceph pull 18783 | 0 | None | closed | luminous: mds: trim 'N' log segments according to how many log segments are there | 2021-02-05 05:28:11 UTC
Red Hat Issue Tracker | RHCEPH-1532 | 0 | None | None | None | 2021-09-09 12:47:57 UTC
Red Hat Product Errata | RHBA-2018:1259 | 0 | None | None | None | 2018-04-26 17:39:38 UTC

Comment 5 Yan, Zheng 2017-11-21 13:59:33 UTC
looks good

Comment 12 Ramakrishnan Periyasamy 2018-03-19 10:48:57 UTC
Provided qa_ack, clearing needinfo tag.

Comment 17 Ramakrishnan Periyasamy 2018-04-03 06:28:52 UTC
Created attachment 1416603
ceph status logs

Comment 19 Yan, Zheng 2018-04-04 02:28:45 UTC
It seems that there were lots of slow OSD requests. I don't think bumping mds_log_max_segments will help in this case.

Comment 20 Ramakrishnan Periyasamy 2018-04-04 05:19:47 UTC
Moving this bug to assigned state, since MDS journal trimming is slow.

Comment 21 Yan, Zheng 2018-04-04 07:29:49 UTC
Sorry, I spoke too early. I guess setting mds_log_max_segments to around 100 may silence the warning.

Comment 23 Yan, Zheng 2018-04-05 04:24:16 UTC
The test has always had slow OSD request warnings. In my opinion, seeing MDS "behind on trimming" warnings is somewhat expected in that situation.

Comment 24 Ramakrishnan Periyasamy 2018-04-05 08:36:51 UTC
In the existing setup I updated "mds_log_max_segments": "128" on all MDS daemons. After running I/O for more than 5 hours, the trim message has not appeared in the logs.
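
For reference, one way to apply such a change per daemon is through each MDS admin socket. The following is a minimal sketch in Python, not the procedure actually used in this test: it only builds and prints the `ceph --admin-daemon <socket> config set` command line. The helper name `config_set_command` is hypothetical, and the socket list contains only the one socket name that appears in this report; a real cluster would list one socket per MDS.

```python
import shlex


def config_set_command(asok_path, option, value):
    """Build the argv for `ceph --admin-daemon <asok> config set <option> <value>`."""
    return ["ceph", "--admin-daemon", asok_path, "config", "set", option, str(value)]


# Assumption: one admin socket per MDS daemon. Only this socket name
# appears in the report; add the others for a multi-MDS cluster.
sockets = ["ceph-mds.magna113.asok"]

for asok in sockets:
    argv = config_set_command(asok, "mds_log_max_segments", 128)
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(argv))
```

A value set through the admin socket changes the running daemon only, so it would presumably also need to be placed in the cluster configuration to survive a daemon restart.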

In the downstream setup we always get a lot of slow request messages. Is there any way to avoid them? Is there an upstream bug for this?

Comment 29 Ramakrishnan Periyasamy 2018-04-09 09:13:35 UTC
Created attachment 1419159
ceph status after upgrading cluster to ceph version 12.2.4-6.el7cp

Comment 30 Ramakrishnan Periyasamy 2018-04-09 09:14:33 UTC
Moving this bug to verified state.

No trim notifications were observed in ceph status.

[root@magna113 ceph]# ceph --admin-daemon ceph-mds.magna113.asok config get mds_log_max_segments 
{
    "mds_log_max_segments": "128"
}
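
If this check needs to be repeated across daemons, the JSON printed by `config get` can be parsed rather than read by eye. A minimal sketch in Python, assuming the output always has the shape shown above (a single key whose value is a quoted number); the function name `read_max_segments` is hypothetical:

```python
import json


def read_max_segments(config_get_output):
    """Return mds_log_max_segments as an int from `config get` JSON output."""
    data = json.loads(config_get_output)
    # The admin socket reports the value as a string, so convert it explicitly.
    return int(data["mds_log_max_segments"])


# Sample text copied from the output in this comment.
sample = '{\n    "mds_log_max_segments": "128"\n}'
assert read_max_segments(sample) == 128
```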

Comment 34 errata-xmlrpc 2018-04-26 17:38:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1259

