Bug 1785472
| Summary: | MDS may assert due to creating too many openfiletable objects | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Patrick Donnelly <pdonnell> |
| Component: | CephFS | Assignee: | Yan, Zheng <zyan> |
| Status: | CLOSED ERRATA | QA Contact: | subhash <vpoliset> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.0 | CC: | ceph-eng-bugs, ceph-qe-bugs, hyelloji, kdreyer, sweil, tserlin, zyan |
| Target Milestone: | rc | Flags: | hyelloji: needinfo- |
| Target Release: | 4.1 | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | ceph-14.2.8-3.el8, ceph-14.2.8-3.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-05-19 17:31:40 UTC | Type: | Bug |
Description
Patrick Donnelly
2019-12-20 00:33:29 UTC
Zheng, please address Ken's comment.
Steps to Reproduce:
1. git clone upstream ceph source
2. git checkout -b nautilus origin/nautilus
3. git show 4802061055d676fb8b98cb03a3e441b3537acfc0 | patch -p1 -R #revert fix
4. ./do_cmake.sh; cd build; make -j8 # compile ceph
5. ../src/vstart.sh -n # setup vstart cluster
6. edit ./ceph.conf and add 'osd_deep_scrub_large_omap_object_key_threshold = 1000' to mds section
7. killall ceph-mds; bin/ceph-mds -i a # restart mds
8. mount cephfs at /mnt/ceph # must be kernel mount, you can find admin keyring at ./keyring
9. run the following python script with sudo
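#!/bin/python
import os
import sys
import time

mnt_path = '/mnt/ceph'
ceph_path = os.getcwd()  # build directory, so 'bin/ceph' resolves

def flush_journal():
    # 'ceph daemon' must run from the build directory; return to the mount afterwards
    os.chdir(ceph_path)
    os.system('bin/ceph daemon mds.a flush journal')
    os.chdir(mnt_path)

fd = -1
os.chdir(mnt_path)
os.system('echo 3 > /proc/sys/vm/drop_caches')

while True:
    # open more than 1000 files, exceeding the omap key threshold set in step 6
    fds = []
    for i in range(1, 1010):
        # 0o666 is valid in Python 2.6+ and Python 3 (the original 0666 is Python 2 only)
        f = os.open("file%d" % i, os.O_CREAT | os.O_RDWR, 0o666)
        fds.append(f)

    # close file0 from the previous iteration, if any
    if fd >= 0:
        os.close(fd)

    os.system('echo 3 > /proc/sys/vm/drop_caches')
    time.sleep(10)
    flush_journal()

    fd = os.open("file0", os.O_CREAT | os.O_RDWR, 0o666)
    flush_journal()

    for f in fds:
        os.close(f)

    os.system('echo 3 > /proc/sys/vm/drop_caches')
    time.sleep(10)
    flush_journal()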
10. run 'rados -p cephfs.a.meta ls | grep mds0_openfiles' in another terminal and check whether the object count keeps increasing (about every 30s)
Actual Results:
count of 'mds0_openfiles.xxx' objects keeps increasing
Expected Results:
The python script above should create only two mds0_openfiles objects.
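The manual check in the final reproduction step could be automated roughly as follows. This is only a sketch, not part of the original report: the helper names (count_openfile_objects, keeps_growing, poll) are hypothetical, the pool name and object prefix are copied from the steps above, and 'rados' is assumed to be on PATH (in a vstart build directory it may need to be 'bin/rados'). It matches object names by prefix rather than by substring as grep does.

```python
import subprocess
import time

PREFIX = "mds0_openfiles"  # object name prefix taken from the report
POOL = "cephfs.a.meta"     # metadata pool name used in the reproduction steps


def count_openfile_objects(listing, prefix=PREFIX):
    """Count object names in 'rados ls' output that start with prefix."""
    return sum(1 for name in listing.splitlines()
               if name.strip().startswith(prefix))


def keeps_growing(counts, expected_max=2):
    """True if the last count exceeds expected_max and the series never shrank."""
    if not counts or counts[-1] <= expected_max:
        return False
    return all(b >= a for a, b in zip(counts, counts[1:]))


def poll(samples=5, interval=30):
    """Sample the object count every `interval` seconds (hypothetical helper)."""
    counts = []
    for _ in range(samples):
        out = subprocess.check_output(["rados", "-p", POOL, "ls"]).decode()
        counts.append(count_openfile_objects(out))
        print(counts[-1])
        time.sleep(interval)
    return counts
```

Calling keeps_growing(poll()) while the reproduction script is running would be expected to return True on an affected build and False once the fix is applied, assuming the count then stays at two.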
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:2231