Bug 1491246 - [CephFS] "rm -rf" is slow on directory which has lot of subdirectories and files.
Status: ASSIGNED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: CephFS
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z1
Target Release: 3.3
Assignee: Patrick Donnelly
QA Contact: Ramakrishnan Periyasamy
Docs Contact: Bara Ancincova
URL:
Whiteboard: DevNeeded
Keywords:
Depends On:
Blocks: 1494421
Reported: 2017-09-13 11:44 UTC by Ramakrishnan Periyasamy
Modified: 2019-06-06 02:25 UTC (History)
CC List: 9 users

Doc Text:
.Deleting directories that contain hard links is slow

Deleting directories and subdirectories that contain many hard links on a Ceph File System with the `rm -rf` command is significantly slower than deleting directories that do not contain any hard links.
Clone Of:
Last Closed:


Description Ramakrishnan Periyasamy 2017-09-13 11:44:58 UTC
Description of problem:
Deleting a directory that contains subdirectories and files with "rm -rf" is slow. The behaviour is the same on kernel and FUSE mounts.

Created the directory "level035000" with the Crefi tool, which created ~4500 subdirectories. After creation, performed rename, chmod, chgrp, chown, etc. operations on the subdirectories and files. The commands used are below.

Crefi Command To create: "sudo python crefi.py --fop create --multi -b 4500 -d 1 -n 10 -t binary  --random --min=120K --max=256K /mnt/cephfs/level035000/"

Crefi command to rename/symlink/hardlink: "for i in create rename chmod chgrp chown setxattr symlink hardlink ; do sudo python crefi.py --fop $i --multi -b 4500 -d 1 -n 10 -t binary /mnt/cephfs/level035000/; done"

Tried deleting the directory with "sudo rm -rf level035000"; it felt slow.
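
Since the Steps to Reproduce field below is empty, here is a minimal sketch of a smaller reproducer in Python. It is not Crefi and does not reproduce its options; the function names (`build_tree`, `timed_remove`), the file counts, and the file size are assumptions chosen for illustration. Only the `hardlink_to_files` and `symlink_to_files` directory names are taken from the listings in this report. It creates regular files plus one hard link and one symlink per file, then times a single-threaded recursive delete.

```python
import os
import shutil
import time


def build_tree(root, n_dirs=5, n_files=10, size=1024):
    """Create n_dirs subdirectories under root, each holding n_files regular
    files plus one hard link and one symlink per file.
    Returns the number of non-directory entries created."""
    entries = 0
    for d in range(n_dirs):
        sub = os.path.join(root, "dir%04d" % d)
        hard = os.path.join(sub, "hardlink_to_files")
        soft = os.path.join(sub, "symlink_to_files")
        os.makedirs(hard)
        os.makedirs(soft)
        for f in range(n_files):
            name = "file%04d" % f
            path = os.path.join(sub, name)
            with open(path, "wb") as fh:
                fh.write(os.urandom(size))
            os.link(path, os.path.join(hard, name))     # second name, same inode
            os.symlink(path, os.path.join(soft, name))  # separate inode
            entries += 3
    return entries


def timed_remove(root):
    """Remove root recursively (single-threaded, like rm -rf) and
    return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    shutil.rmtree(root)
    return time.monotonic() - start
```

To compare against this report, point `root` at a new directory under the CephFS mount (for example a hypothetical `/mnt/cephfs/repro`), run it once as written and once with the `os.link` line commented out, and compare the two `timed_remove` results.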

Deleting another folder from a similar scenario took almost 3.7 hours of command execution time. The total directory size was 15G.
[ubuntu@host086 cephfs]$ time sudo rm -rf level015000
real    222m12.605s
user    0m17.085s
sys     0m53.402s

Version-Release number of selected component (if applicable):
ceph: 12.2.0-1.el7cp (b661348f156f148d764b998b65b90451f096cb27) luminous (rc)

How reproducible:
3/3

Steps to Reproduce:
NA

Actual results:
NA

Expected results:
NA

Additional info:

Created a directory "fio" and ran the fio tool from 3 clients, which created 180GB of data in the folder. Removing the fio directory with "rm -rf fio" completed within a minute.
[ubuntu@host080 cephfs]$ ll -h | grep G
drwxr-xr-x. 1 root root  187G Sep 13 09:31 fio
drwxr-xr-x. 1 root root  8.3G Sep 12 08:55 level035000
[ubuntu@host080 cephfs]$ time sudo rm -rf fio
real	0m0.239s
user	0m0.002s
sys	0m0.011s

Comment 2 Brett Niver 2017-09-13 14:52:24 UTC
What is the cluster configuration?

Comment 4 Patrick Donnelly 2017-09-15 19:16:24 UTC
Okay, ya that's a bug. Thanks!

http://tracker.ceph.com/issues/21406

Comment 5 Patrick Donnelly 2017-09-15 19:17:11 UTC
Sorry, wrong BZ. Ignore prior comment.

Comment 8 Yan, Zheng 2017-09-29 14:50:08 UTC
"rm -rf" inside a kernel mount point is fast. The slowness of "rm -rf" inside a FUSE mount is likely caused by http://tracker.ceph.com/issues/20938.

Besides, debug_mds=20 slows down the MDS significantly. Please retry with debug_mds=0.

Comment 9 Ramakrishnan Periyasamy 2017-10-05 10:15:55 UTC
Hi Zheng,
As you mentioned, it is faster with debug_mds=0, but it is still slow for the folder with symlinks and hardlinks.

Below are some observations for folders with larger files and for folders with smaller files, symlinks and hardlinks. They were collected on a FUSE client.

=============================================================
"rm -rf" command on normal directory with files.

[ubuntu@host070 dir1]$ ll -h
total 2.0K
drwxr-xr-x. 1 root root 306M Sep 21 18:06 dir2
drwxr-xr-x. 1 root root 160G Oct  5 04:43 dir3
drwxr-xr-x. 1 root root 187G Oct  5 09:37 dir4
drwxr-xr-x. 1 root root 187G Oct  5 08:17 dir5

[ubuntu@host070 dir4]$ ll -h
total 187G
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.0.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.1.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.10.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.11.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.12.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.13.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.14.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.15.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.16.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.17.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.18.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.19.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.2.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.20.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.21.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.22.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.23.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.24.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.25.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.26.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.27.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.28.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.29.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.3.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.30.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.31.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.4.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.5.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.6.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.7.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.8.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.9.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:12 random-writes.0.0
-rw-r--r--. 1 root root 1.7G Sep 18 14:32 random-writes.27.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:29 random-writes.28.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:23 random-writes.29.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:19 random-writes.30.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:16 random-writes.31.0
======Removing the entire 187G data with debug_mds=20===============
[ubuntu@host070 dir1]$ time sudo rm -rf dir4/*
real	0m5.299s
user	0m0.007s
sys	0m2.658s
========Removing the entire 187G data with debug_mds=0==============
[ubuntu@host070 dir1]$ time sudo rm -rf dir5/*
real	0m2.587s
user	0m0.003s
sys	0m0.007s
=========Removing 372M data with debug_mds=0========================
"rm -rf" command on directory which has symlink and hardlink.

[ubuntu@host070 dir3]$ ll -h | grep level0279
drwxr-xr-x. 1 root root 372M Sep 21 23:17 level0279
[ubuntu@host070 dir3]$ cd level0279
[ubuntu@host070 level0279]$ ll -h
total 12M
-rw---x---. 3 43132 12092 149K Sep 22 00:09 59c44885%%0XB14U302F
-r-x----w-. 3 41016  8935 474K Sep 22 00:09 59c44885%%183NMRCS5B
-rw--wxrw-. 3 51526 46458 212K Sep 22 00:09 59c44885%%1X983SCKIW
--wx--x--x. 3 28659  8887 189K Sep 22 00:09 59c44885%%432NBZP9QH
----r-xrwx. 3 35001 64032 439K Sep 22 00:09 59c44885%%4RF9YN756M
---x---r--. 3 45861 22082 430K Sep 22 00:09 59c44885%%4RQ3FDM4W2
-r---wx--x. 3  5878 20451 422K Sep 22 00:09 59c44885%%5NUV7S5MYO
-rw--wx-w-. 3 28651 35053 188K Sep 22 00:09 59c44885%%5TZX2PEBPP
-rwxrwx-wx. 3 31620 35639 505K Sep 22 00:09 59c44885%%AS20F0RSN3
--wxr---wx. 3 28392 44139 374K Sep 22 00:09 59c44885%%AYJ60OTFTE
---x-wx-w-. 3 44971  5178 132K Sep 22 00:09 59c44885%%C4KZ9A67XH
-r---w-rw-. 3 63946 15811 152K Sep 22 00:09 59c44885%%CP794GT2P6
--wx--xr-x. 3 51744 17793 364K Sep 22 00:09 59c44885%%DHLY9ULKW4
---xr---w-. 3  3293 38872 213K Sep 22 00:09 59c44885%%DJPAAIB6BD
---x-w----. 3 36892 37832 301K Sep 22 00:09 59c44885%%EC0ZKX1F6T
---xr----x. 3 41108 59912 414K Sep 22 00:09 59c44885%%EMSD9X156H
---xrwxr-x. 3 52623 17613 377K Sep 22 00:09 59c44885%%ESP47FJBU9
-r-xr-xr-x. 3 49811 25065 338K Sep 22 00:09 59c44885%%EUA11EB29C
-rw-------. 3 60709 40485 346K Sep 22 00:09 59c44885%%EX40B8PGVQ
-r--rw-rw-. 3 36248 53070 339K Sep 22 00:09 59c44885%%FEKG7EQEF1
-r--rwxrwx. 3 53555 40892 199K Sep 22 00:09 59c44885%%G0LQ38YBCK
----rwx--x. 3 48924 29532 430K Sep 22 00:09 59c44885%%HTV1TIVW3H
---xr--r-x. 3 18724 56432 262K Sep 22 00:09 59c44885%%K9IPKHEBA0
-----w-r--. 3 55797 61916 151K Sep 22 00:09 59c44885%%KKB52DMV7P
--wx-w---x. 3 11196 41155 429K Sep 22 00:09 59c44885%%KZVWGPZ7A0
-r---wxrwx. 3 52323 39793 247K Sep 22 00:09 59c44885%%MNYC0SKQBG
----r--rw-. 3 40807 41068 248K Sep 22 00:09 59c44885%%MOCWA8J1DW
-r-xr---wx. 3 54665 15726 402K Sep 22 00:09 59c44885%%NNNKBZW0GZ
-r--rw--wx. 3 53790 65491 163K Sep 22 00:09 59c44885%%NZD9PGYCKV
---xrwxrwx. 3 12889 21464 325K Sep 22 00:09 59c44885%%PKE6SL0RCF
----r-xrw-. 3 36561  1691 327K Sep 22 00:09 59c44885%%PP3Q54PWKK
--wxrwxr--. 3 46244 43723 372K Sep 22 00:09 59c44885%%Q7CIEONNHP
--wxr---wx. 3 36265 26640 271K Sep 22 00:09 59c44885%%SKBJNOLGFQ
-rw---xr--. 3 37457 57553 321K Sep 22 00:09 59c44885%%T0NCZMZ5M0
--w-rw-r-x. 3  5065 21835 168K Sep 22 00:09 59c44885%%VNH3MAK3P8
--w--wxr-x. 3 16289 60364 410K Sep 22 00:09 59c44885%%VXNX66L5GQ
-rwxrw-r--. 3 31447 15821 326K Sep 22 00:09 59c44885%%W6D26P8R6K
--wx---rw-. 3 33809 15692 150K Sep 22 00:09 59c44885%%WBM12AAPVJ
-rw-r-x-w-. 3 32545 39873 480K Sep 22 00:09 59c44885%%WYLS7T8DDH
-r--r-x-wx. 3 14503  1487 156K Sep 22 00:09 59c44885%%ZG8B4KT7MV
d-------w-. 1  2929 37360    0 Sep 22 12:51 hardlink_to_files
dr-xrwxrw-. 1 12023 34582 361M Sep 21 23:17 level1279
dr---w---x. 1 52093 47560 4.5K Sep 22 09:44 symlink_to_files
[ubuntu@host070 level0279]$ cd ..
[ubuntu@host070 dir3]$ time sudo rm -rf level0279/
real    0m8.030s
user    0m0.026s
sys     0m0.279s
=================Removing 372M data with debug_mds=20========================
[ubuntu@host070 dir3]$ ll -h | grep level0278
drwxr-xr-x. 1 root root 372M Sep 21 23:17 level0278
[ubuntu@host070 dir3]$ cd level0278
[ubuntu@host070 level0278]$ ll -h
total 13M
---------x. 3 28433  2166 260K Sep 22 00:09 59c44882%%0UT13SX2DB
-rwx------. 3 59254  5450 273K Sep 22 00:09 59c44882%%0VAWIT92V5
-r------w-. 3 63035 20242 143K Sep 22 00:09 59c44882%%2R128EH2ME
-rwx---rwx. 3 58399 44531 234K Sep 22 00:09 59c44882%%560WBJRL3A
---x-w-r-x. 3 53004 27127 485K Sep 22 00:09 59c44882%%5JYTVBW2UR
------x--x. 3 25085 28449 131K Sep 22 00:09 59c44882%%68FDULC5SZ
-rwx-wx-w-. 3 58919 53311 130K Sep 22 00:09 59c44882%%6PE7FQZR9H
--w-r-----. 3  6254 61042 318K Sep 22 00:09 59c44882%%6T5GQDRWFR
--w--wxr--. 3 12897 21431 490K Sep 22 00:09 59c44882%%6YZH3NWOBL
----r--r--. 3  5850 42003 153K Sep 22 00:09 59c44882%%7CQZWOIZ32
--wxr---w-. 3  8775 64576 445K Sep 22 00:09 59c44882%%7M1QPJDDPQ
-rwx-w--w-. 3 27947 55791 157K Sep 22 00:09 59c44882%%9AQ8A5OUOE
--wx-wxr--. 3 40436 15397 344K Sep 22 00:09 59c44882%%9EURCXHYNO
--wx-wx---. 3  1629  1775 182K Sep 22 00:09 59c44882%%AKZO30DLU9
-rw---xrw-. 3 20444  1869 171K Sep 22 00:09 59c44882%%AVFWSZDFFV
-rw----r--. 3 48014 49300 474K Sep 22 00:09 59c44882%%BIL6NBPKE7
--w-------. 3 46605 64292 270K Sep 22 00:09 59c44882%%CUD8IVF0OS
-rwx--xr-x. 3  3240 11537 340K Sep 22 00:09 59c44882%%DV0241U65K
--w-r----x. 3  9733 30912 406K Sep 22 00:09 59c44882%%EXVO0MSR8Z
-r-xrw--w-. 3 24826 37510 480K Sep 22 00:09 59c44882%%G09JQ6AD8H
--wx----w-. 3  3594  1943 290K Sep 22 00:09 59c44882%%H607VW5XSS
---xrwxr--. 3 31316 31973 305K Sep 22 00:09 59c44882%%HJD1UQ5UJO
-rwx--xr--. 3 20203 55182 403K Sep 22 00:09 59c44882%%JDXB5RUXN8
-r-xrwx-w-. 3 25090 61487 495K Sep 22 00:09 59c44882%%KLIE01PJFA
-----w-rw-. 3 49627 62829 212K Sep 22 00:09 59c44882%%M8TNL0PURE
-r--r-x---. 3 31849 63084 310K Sep 22 00:09 59c44882%%M9O81KXRHZ
--wxr-xrw-. 3 26670 61713 251K Sep 22 00:09 59c44882%%OLF3YP2C64
-rw-rw--w-. 3 49193 13091 400K Sep 22 00:09 59c44882%%OUEACV20WF
--wx-w-rw-. 3 39617 31389 438K Sep 22 00:09 59c44882%%P0JD28PYYB
-r--rwxrw-. 3 46701 37357 489K Sep 22 00:09 59c44882%%Q6C2CCQWR2
---xr--rw-. 3 36814 23522 235K Sep 22 00:09 59c44882%%SVY8FIDQVJ
-r--r-xrwx. 3 50763 17334 349K Sep 22 00:09 59c44882%%UAICM5KO4R
-----w-rw-. 3 20154 31778 172K Sep 22 00:09 59c44882%%WEUH2766M3
-rw---x--x. 3 47895 57075 438K Sep 22 00:09 59c44882%%XFJZYRXM8J
--w-r--r--. 3  3903 27945 221K Sep 22 00:09 59c44882%%YCLX54GGET
-rwx-w--wx. 3 55652 28052 500K Sep 22 00:09 59c44882%%YIMVE2U88C
-r--r-xrw-. 3 44250  3985 468K Sep 22 00:09 59c44882%%YKSK3VYQG6
-rw-rwxr--. 3 43655 40571 160K Sep 22 00:09 59c44882%%YSO4Q3YET5
-r---wx-wx. 3  5510  2952 155K Sep 22 00:09 59c44882%%ZCB628QMTQ
--w--w-r--. 3 17630 13592 357K Sep 22 00:09 59c44882%%ZHPKIOUVKV
dr---wx-w-. 1 64268 29259    0 Sep 22 12:50 hardlink_to_files
d--x----w-. 1 19554 34419 359M Sep 21 23:17 level1278
drw-rwx-wx. 1 23140 59398 4.5K Sep 22 09:43 symlink_to_files
[ubuntu@host070 level0278]$ cd ..
[ubuntu@host070 dir3]$
[ubuntu@host070 dir3]$ time sudo rm -rf level0278/
real    1m7.481s
user    0m0.018s
sys     0m0.484s
======================================================================

-- Ramakrishnan

Comment 14 Yan, Zheng 2017-10-09 14:39:05 UTC
How long did "rm -rf" of the test directory take in your test with debug_mds=0? In my local test it took about 10 minutes. There were about 200000 files in the test directory. 300 requests per second is not too bad in my opinion.
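
The arithmetic behind that estimate can be checked with a short Python sketch. The helper name `unlink_rate` is made up for illustration, and the inputs are the rounded figures quoted above (about 200000 files in about 10 minutes), so the result is approximate.

```python
def unlink_rate(n_files, seconds):
    """Return (files removed per second, average milliseconds per file)."""
    rate = n_files / seconds
    latency_ms = 1000.0 / rate
    return rate, latency_ms


rate, latency_ms = unlink_rate(200_000, 10 * 60)
print(f"{rate:.0f} files/s, {latency_ms:.1f} ms/file")
# prints "333 files/s, 3.0 ms/file"
```

That is roughly one removal every 3 ms, in line with the "about 300 requests per second" figure.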

Comment 16 Ramakrishnan Periyasamy 2017-11-16 14:44:21 UTC
Hi Patrick/Zheng,

Doc text is required for this bug so that it can be release noted. Please provide it at the earliest; we are starting 3.0 RC testing tomorrow.

Regards,
Ramakrishnan

Comment 17 Yan, Zheng 2017-11-17 00:59:18 UTC
What content should the doc include?

Comment 19 Yan, Zheng 2017-11-20 11:00:31 UTC
I can't reproduce this performance issue locally, so I can't provide doc text for Cause, Consequence, Workaround, Result.

Comment 21 Ramakrishnan Periyasamy 2017-11-20 11:26:30 UTC
I filed this bug because deleting directories and subdirectories that contain softlinks and hardlinks with "rm -rf *" was slow. Compared to directories with only normal files, deleting directories with hardlink and softlink files took more time to complete.

@Patrick, please check whether this is OK for the release notes; if anything is misleading, please comment.

Comment 23 Yan, Zheng 2017-11-22 13:27:22 UTC
Symbolic links shouldn't affect deletion speed.

Comment 32 Patrick Donnelly 2018-07-11 21:49:25 UTC
No time yet to reproduce. Moving.

Comment 33 Ben England 2019-02-21 18:18:19 UTC
Another way to isolate this is to look at whether rm -rf is slow because of unlink() response time. rm -rf is single-threaded, so the response time of the unlink() system call in CephFS determines the throughput. The client has to talk to the MDS for each file, it has to talk to the primary OSD to commit the change, and it then has to talk to the secondary OSDs in the PG. The network hops and context switching alone will probably account for much of the 300 files/sec = 3.3 msec/file.

This single-threaded performance issue is common to many distributed filesystems, including Gluster and others that I've studied. It is a consequence of the fact that POSIX requires changes to directories to be *immediately* visible to all clients. If we relax that constraint and allow batched updates to directories, or if we allow parallelized processing, then performance can be improved. For example, this Python script removes files from a directory in parallel; try it:

https://github.com/parallel-fs-utils/multi-thread-posix/blob/master/parallel-rm-rf.py
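
For readers who only want the idea, here is a minimal sketch of parallel removal in Python. It is not the linked script and makes no claim about how that script works; the function name `parallel_rm_rf` and the default of 16 workers are assumptions. It issues the unlink() calls from a thread pool so that several round trips are in flight at once, then removes the emptied directories bottom-up.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_rm_rf(root, workers=16):
    """Remove the tree at root, issuing unlink() calls from a thread pool.
    Returns (number of entries unlinked, number of directories removed)."""
    files, dirs = [], []
    # Bottom-up walk: a directory's contents are listed before the directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            files.append(os.path.join(dirpath, name))
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                files.append(path)  # symlink to a directory: unlink it, do not rmdir
            else:
                dirs.append(path)
    dirs.append(root)

    # os.unlink releases the GIL while the system call is in progress,
    # so the threads overlap their waits on the filesystem.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(os.unlink, files))  # list() re-raises any worker error

    # All files are gone; remove the now-empty directories, deepest first.
    for d in dirs:
        os.rmdir(d)
    return len(files), len(dirs)
```

Like rm -rf, this needs write permission on every directory in the tree, so trees with randomized modes such as the ones listed above would have to be removed as root. Whether the extra parallelism helps on CephFS depends on how many concurrent requests the MDS handles well, which this sketch does not measure.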

