Bug 1507629

Summary: mds gets significantly behind on trimming while creating millions of files
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Patrick Donnelly <pdonnell>
Component: CephFS    Assignee: Yan, Zheng <zyan>
Status: CLOSED ERRATA QA Contact: Ramakrishnan Periyasamy <rperiyas>
Severity: medium Docs Contact: Erin Donnelly <edonnell>
Priority: high    
Version: 3.0    CC: ceph-eng-bugs, ceph-qe-bugs, edonnell, hnallurv, john.spray, kdreyer, pdonnell, zyan
Target Milestone: z2   
Target Release: 3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-12.2.4-5.el7cp, Ubuntu: ceph_12.2.1-6redhat1xenial    Doc Type: Bug Fix
Doc Text:
Previously, Metadata Server (MDS) daemons could get behind on trimming for large metadata workloads in larger clusters. With this update, the MDS no longer gets behind on trimming for large metadata workloads.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-26 17:38:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1548067    
Bug Blocks: 1557269    
Attachments:
Description                                                           Flags
ceph status logs                                                      none
ceph status after upgrading cluster to ceph version 12.2.4-6.el7cp    none

Comment 5 Yan, Zheng 2017-11-21 13:59:33 UTC
looks good

Comment 12 Ramakrishnan Periyasamy 2018-03-19 10:48:57 UTC
Provided qa_ack, clearing needinfo tag.

Comment 17 Ramakrishnan Periyasamy 2018-04-03 06:28:52 UTC
Created attachment 1416603 [details]
ceph status logs

Comment 19 Yan, Zheng 2018-04-04 02:28:45 UTC
It seems that there were lots of slow OSD requests. I don't think bumping mds_log_max_segments will help in this case.
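
For reference, a minimal sketch of how those warnings can be inspected on a 12.2.x cluster (the OSD id is a placeholder, not taken from these logs):

# Cluster-wide health; slow OSD requests and "behind on trimming" warnings show up here
ceph health detail

# Requests currently stuck on a suspect OSD (run on the host carrying that OSD; osd.3 is an example id)
ceph daemon osd.3 dump_ops_in_flight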

Comment 20 Ramakrishnan Periyasamy 2018-04-04 05:19:47 UTC
Moving this bug to assigned state, since MDS journal trimming is slow.

Comment 21 Yan, Zheng 2018-04-04 07:29:49 UTC
Sorry, I spoke too early. I guess setting mds_log_max_segments to around 100 may silence the warning.
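
For reference, a sketch of one way to apply that suggestion (the MDS name and the value 128 come from later comments in this bug; an injected value is runtime-only and would also need to be set under [mds] in ceph.conf to survive restarts):

# Change the live value through the MDS admin socket on the MDS host
ceph --admin-daemon /var/run/ceph/ceph-mds.magna113.asok config set mds_log_max_segments 128

# Or from an admin node, per MDS daemon
ceph tell mds.magna113 injectargs '--mds_log_max_segments=128'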

Comment 23 Yan, Zheng 2018-04-05 04:24:16 UTC
The test had slow OSD request warnings all along. In my opinion, having MDS behind-on-trimming warnings is somewhat expected in that situation.

Comment 24 Ramakrishnan Periyasamy 2018-04-05 08:36:51 UTC
In the existing setup I updated "mds_log_max_segments" to "128" on all MDS daemons; after running I/O for more than 5 hours, I have not seen the trim warning in the logs.
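
A quick way to double-check from the MDS host (a sketch; the segment count reported under mds_log should stay near or below mds_log_max_segments):

ceph --admin-daemon /var/run/ceph/ceph-mds.magna113.asok perf dump mds_log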

In the downstream setup we always get a lot of slow request messages. Is there any way to avoid them? Do we have an upstream bug for this?

Comment 29 Ramakrishnan Periyasamy 2018-04-09 09:13:35 UTC
Created attachment 1419159 [details]
ceph status after upgrading cluster to ceph version 12.2.4-6.el7cp

Comment 30 Ramakrishnan Periyasamy 2018-04-09 09:14:33 UTC
Moving this bug to verified state.

Trim warnings were not observed in ceph status.

[root@magna113 ceph]# ceph --admin-daemon ceph-mds.magna113.asok config get mds_log_max_segments 
{
    "mds_log_max_segments": "128"
}

Comment 34 errata-xmlrpc 2018-04-26 17:38:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1259