1785472 – MDS may assert due to creating too many openfiletable objects

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	CephFS
Sub Component:
Version:	4.0
Hardware:	All
OS:	All
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	4.1
Assignee:	Yan, Zheng
QA Contact:	subhash
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-12-20 00:33 UTC by Patrick Donnelly
Modified:	2020-05-19 17:31 UTC
CC List:	7 users
Fixed In Version:	ceph-14.2.8-3.el8, ceph-14.2.8-3.el7
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-05-19 17:31:40 UTC
Embargoed:
Dependent Products:
Flags:	hyelloji: needinfo-

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Ceph Project Bug Tracker	43348	0	None	None	None	2019-12-20 00:33:28 UTC
Red Hat Product Errata	RHSA-2020:2231	0	None	None	None	2020-05-19 17:31:54 UTC

Description Patrick Donnelly 2019-12-20 00:33:29 UTC

Description of problem:

See: https://tracker.ceph.com/issues/36094

Comment 2 Patrick Donnelly 2020-02-22 00:12:42 UTC

Zheng, please address Ken's comment.

Comment 3 Yan, Zheng 2020-02-24 14:17:34 UTC

Steps to Reproduce:

1. git clone upstream ceph source
2. git checkout -b nautilus origin/nautilus
3. git show 4802061055d676fb8b98cb03a3e441b3537acfc0 | patch -p1 -R #revert fix
4. ./do_cmake.sh; cd build; make -j8 # compile ceph
5. ../src/vstart.sh -n # setup vstart cluster
6. edit ./ceph.conf and add 'osd_deep_scrub_large_omap_object_key_threshold = 1000' to the [mds] section
7. killall ceph-mds; bin/ceph-mds -i a # restart mds
8. mount cephfs at /mnt/ceph # must be kernel mount, you can find admin keyring at ./keyring
9. run the following Python script with sudo


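#!/bin/python
import os
import time

mnt_path = '/mnt/ceph'
ceph_path = os.getcwd()  # directory the script is started from (the build dir)


def flush_journal():
    # Ask mds.a to flush its journal through the admin socket,
    # then change back into the CephFS mount.
    os.chdir(ceph_path)
    os.system('bin/ceph daemon mds.a flush journal')
    os.chdir(mnt_path)

# Descriptor for file0; it stays open from one iteration into the next.
fd = -1

os.chdir(mnt_path)
# Drop the kernel's page cache, dentries and inodes (needs root).
os.system('echo 3 > /proc/sys/vm/drop_caches')

while True:
    # Open (creating if needed) file1 .. file1009 and keep them all open.
    # 0o666 is the octal mode spelling accepted by Python 2.6+ and Python 3.
    fds = []
    for i in range(1, 1010):
        f = os.open("file%d" % i, os.O_CREAT | os.O_RDWR, 0o666)
        fds.append(f)

    # Close file0 if it is still open from the previous iteration.
    if fd >= 0:
        os.close(fd)
        os.system('echo 3 > /proc/sys/vm/drop_caches')
        time.sleep(10)
        flush_journal()

    # Reopen file0 while file1 .. file1009 are still open.
    fd = os.open("file0", os.O_CREAT | os.O_RDWR, 0o666)
    flush_journal()

    # Close file1 .. file1009; file0 stays open.
    for f in fds:
        os.close(f)
    os.system('echo 3 > /proc/sys/vm/drop_caches')
    time.sleep(10)
    flush_journal()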

10. run 'rados -p cephfs.a.meta ls | grep mds0_openfiles' in another terminal and check whether the object count keeps increasing (about every 30s)
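
The check in the last step can also be scripted. Below is a minimal sketch, assuming the 'rados' binary is on PATH (in a vstart build directory it may be bin/rados instead) and that the metadata pool is named cephfs.a.meta as above. The names count_openfiles and poll are hypothetical helpers for illustration, not part of Ceph.

```python
import subprocess
import time

POOL = "cephfs.a.meta"      # metadata pool name taken from the step above
PREFIX = "mds0_openfiles"   # object name pattern taken from the step above


def count_openfiles(listing, prefix=PREFIX):
    """Count object names in 'rados ls' output that contain the prefix.

    A substring match is used, mirroring the plain 'grep' in the manual step.
    """
    return sum(1 for name in listing.splitlines() if prefix in name)


def poll(interval=30, rounds=10):
    """Print the matching object count every `interval` seconds.

    Hypothetical helper: shells out to 'rados -p <pool> ls' each round.
    """
    for _ in range(rounds):
        out = subprocess.check_output(["rados", "-p", POOL, "ls"])
        print(count_openfiles(out.decode()))
        time.sleep(interval)
```

Calling poll() while the reproducer runs prints one count per round; a count that keeps growing past 2 corresponds to the behaviour described under Actual Results.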


Actual Results:

count of 'mds0_openfiles.xxx' objects keeps increasing


Expected Results:

The above Python script should create only two mds0_openfiles objects.

Comment 8 errata-xmlrpc 2020-05-19 17:31:40 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:2231