Description of problem:
Deleting a directory that contains subdirectories and files with "rm -rf" is slow. The behaviour is the same on kernel and FUSE mounts.

Created the directory "level035000" using the Crefi tool, with ~4500 subdirectories. After creation, performed rename, chmod, chgrp, chown, etc. operations on the subdirectories and files. The commands used are below.

Crefi command to create:
"sudo python crefi.py --fop create --multi -b 4500 -d 1 -n 10 -t binary --random --min=120K --max=256K /mnt/cephfs/level035000/"

Crefi command to rename/symlink/hardlink:
"for i in create rename chmod chgrp chown setxattr symlink hardlink ; do sudo python crefi.py --fop $i --multi -b 4500 -d 1 -n 10 -t binary /mnt/cephfs/level035000/; done"

Deleting the directory with "sudo rm -rf level035000" felt slow. Another directory built the same way took almost 3.7 hours to delete. Its total size was 15G.

[ubuntu@host086 cephfs]$ time sudo rm -rf level015000
real 222m12.605s
user 0m17.085s
sys 0m53.402s

Version-Release number of selected component (if applicable):
ceph: 12.2.0-1.el7cp (b661348f156f148d764b998b65b90451f096cb27) luminous (rc)

How reproducible: 3/3

Steps to Reproduce: NA

Actual results: NA

Expected results: NA

Additional info:
Created a directory "fio" and ran the fio tool from 3 clients, which created 180GB of data in it. Removing the fio directory with "rm -rf fio" completed within a minute.

[ubuntu@host080 cephfs]$ ll -h | grep G
drwxr-xr-x. 1 root root 187G Sep 13 09:31 fio
drwxr-xr-x. 1 root root 8.3G Sep 12 08:55 level035000
[ubuntu@host080 cephfs]$ time sudo rm -rf fio
real 0m0.239s
user 0m0.002s
sys 0m0.011s
What is the cluster configuration?
Okay, ya that's a bug. Thanks! http://tracker.ceph.com/issues/21406
Sorry, wrong BZ. Ignore prior comment.
"rm -rf" inside a kernel mount point is fast. The slowness of "rm -rf" inside a FUSE mount is likely caused by http://tracker.ceph.com/issues/20938. Besides, debug_mds=20 slows down the MDS significantly. Please retry with debug_mds=0.
Hi Zheng,

As you mentioned, with debug_mds=0 it is faster, but deleting the folder with symlinks and hardlinks is still slow. Below are some observations for folders with larger files and for a folder with smaller files, symlinks and hardlinks. They were collected on the FUSE client.

=============================================================
"rm -rf" command on normal directory with files.

[ubuntu@host070 dir1]$ ll -h
total 2.0K
drwxr-xr-x. 1 root root 306M Sep 21 18:06 dir2
drwxr-xr-x. 1 root root 160G Oct 5 04:43 dir3
drwxr-xr-x. 1 root root 187G Oct 5 09:37 dir4
drwxr-xr-x. 1 root root 187G Oct 5 08:17 dir5
[ubuntu@host070 dir4]$ ll -h
total 187G
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.0.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.1.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.10.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.11.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.12.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.13.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.14.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.15.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.16.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.17.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.18.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.19.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.2.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.20.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.21.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.22.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.23.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.24.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.25.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.26.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.27.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.28.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.29.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.3.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.30.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.31.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.4.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.5.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.6.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.7.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.8.0
-rw-r--r--. 1 root root 5.0G Sep 18 15:27 random-read.9.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:12 random-writes.0.0
-rw-r--r--. 1 root root 1.7G Sep 18 14:32 random-writes.27.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:29 random-writes.28.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:23 random-writes.29.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:19 random-writes.30.0
-rw-r--r--. 1 root root 5.0G Sep 18 14:16 random-writes.31.0

======Removing the entire 187G data with debug_mds=20===============
[ubuntu@host070 dir1]$ time sudo rm -rf dir4/*
real 0m5.299s
user 0m0.007s
sys 0m2.658s

========Removing the entire 187G data with debug_mds=0==============
[ubuntu@host070 dir1]$ time sudo rm -rf dir5/*
real 0m2.587s
user 0m0.003s
sys 0m0.007s

=========Removing 372M data with debug_mds=0========================
"rm -rf" command on directory which has symlink and hardlink.

[ubuntu@host070 dir3]$ ll -h | grep level0279
drwxr-xr-x. 1 root root 372M Sep 21 23:17 level0279
[ubuntu@host070 dir3]$ cd level0279
[ubuntu@host070 level0279]$ ll -h
total 12M
-rw---x---. 3 43132 12092 149K Sep 22 00:09 59c44885%%0XB14U302F
-r-x----w-. 3 41016 8935 474K Sep 22 00:09 59c44885%%183NMRCS5B
-rw--wxrw-. 3 51526 46458 212K Sep 22 00:09 59c44885%%1X983SCKIW
--wx--x--x. 3 28659 8887 189K Sep 22 00:09 59c44885%%432NBZP9QH
----r-xrwx. 3 35001 64032 439K Sep 22 00:09 59c44885%%4RF9YN756M
---x---r--. 3 45861 22082 430K Sep 22 00:09 59c44885%%4RQ3FDM4W2
-r---wx--x. 3 5878 20451 422K Sep 22 00:09 59c44885%%5NUV7S5MYO
-rw--wx-w-. 3 28651 35053 188K Sep 22 00:09 59c44885%%5TZX2PEBPP
-rwxrwx-wx. 3 31620 35639 505K Sep 22 00:09 59c44885%%AS20F0RSN3
--wxr---wx. 3 28392 44139 374K Sep 22 00:09 59c44885%%AYJ60OTFTE
---x-wx-w-. 3 44971 5178 132K Sep 22 00:09 59c44885%%C4KZ9A67XH
-r---w-rw-. 3 63946 15811 152K Sep 22 00:09 59c44885%%CP794GT2P6
--wx--xr-x. 3 51744 17793 364K Sep 22 00:09 59c44885%%DHLY9ULKW4
---xr---w-. 3 3293 38872 213K Sep 22 00:09 59c44885%%DJPAAIB6BD
---x-w----. 3 36892 37832 301K Sep 22 00:09 59c44885%%EC0ZKX1F6T
---xr----x. 3 41108 59912 414K Sep 22 00:09 59c44885%%EMSD9X156H
---xrwxr-x. 3 52623 17613 377K Sep 22 00:09 59c44885%%ESP47FJBU9
-r-xr-xr-x. 3 49811 25065 338K Sep 22 00:09 59c44885%%EUA11EB29C
-rw-------. 3 60709 40485 346K Sep 22 00:09 59c44885%%EX40B8PGVQ
-r--rw-rw-. 3 36248 53070 339K Sep 22 00:09 59c44885%%FEKG7EQEF1
-r--rwxrwx. 3 53555 40892 199K Sep 22 00:09 59c44885%%G0LQ38YBCK
----rwx--x. 3 48924 29532 430K Sep 22 00:09 59c44885%%HTV1TIVW3H
---xr--r-x. 3 18724 56432 262K Sep 22 00:09 59c44885%%K9IPKHEBA0
-----w-r--. 3 55797 61916 151K Sep 22 00:09 59c44885%%KKB52DMV7P
--wx-w---x. 3 11196 41155 429K Sep 22 00:09 59c44885%%KZVWGPZ7A0
-r---wxrwx. 3 52323 39793 247K Sep 22 00:09 59c44885%%MNYC0SKQBG
----r--rw-. 3 40807 41068 248K Sep 22 00:09 59c44885%%MOCWA8J1DW
-r-xr---wx. 3 54665 15726 402K Sep 22 00:09 59c44885%%NNNKBZW0GZ
-r--rw--wx. 3 53790 65491 163K Sep 22 00:09 59c44885%%NZD9PGYCKV
---xrwxrwx. 3 12889 21464 325K Sep 22 00:09 59c44885%%PKE6SL0RCF
----r-xrw-. 3 36561 1691 327K Sep 22 00:09 59c44885%%PP3Q54PWKK
--wxrwxr--. 3 46244 43723 372K Sep 22 00:09 59c44885%%Q7CIEONNHP
--wxr---wx. 3 36265 26640 271K Sep 22 00:09 59c44885%%SKBJNOLGFQ
-rw---xr--. 3 37457 57553 321K Sep 22 00:09 59c44885%%T0NCZMZ5M0
--w-rw-r-x. 3 5065 21835 168K Sep 22 00:09 59c44885%%VNH3MAK3P8
--w--wxr-x. 3 16289 60364 410K Sep 22 00:09 59c44885%%VXNX66L5GQ
-rwxrw-r--. 3 31447 15821 326K Sep 22 00:09 59c44885%%W6D26P8R6K
--wx---rw-. 3 33809 15692 150K Sep 22 00:09 59c44885%%WBM12AAPVJ
-rw-r-x-w-. 3 32545 39873 480K Sep 22 00:09 59c44885%%WYLS7T8DDH
-r--r-x-wx. 3 14503 1487 156K Sep 22 00:09 59c44885%%ZG8B4KT7MV
d-------w-. 1 2929 37360 0 Sep 22 12:51 hardlink_to_files
dr-xrwxrw-. 1 12023 34582 361M Sep 21 23:17 level1279
dr---w---x. 1 52093 47560 4.5K Sep 22 09:44 symlink_to_files
[ubuntu@host070 level0279]$ cd ..
[ubuntu@host070 dir3]$ time sudo rm -rf level0279/
real 0m8.030s
user 0m0.026s
sys 0m0.279s

=================Removing 372M data with debug_mds=20========================
[ubuntu@host070 dir3]$ ll -h | grep level0278
drwxr-xr-x. 1 root root 372M Sep 21 23:17 level0278
[ubuntu@host070 dir3]$ cd level0278
[ubuntu@host070 level0278]$ ll -h
total 13M
---------x. 3 28433 2166 260K Sep 22 00:09 59c44882%%0UT13SX2DB
-rwx------. 3 59254 5450 273K Sep 22 00:09 59c44882%%0VAWIT92V5
-r------w-. 3 63035 20242 143K Sep 22 00:09 59c44882%%2R128EH2ME
-rwx---rwx. 3 58399 44531 234K Sep 22 00:09 59c44882%%560WBJRL3A
---x-w-r-x. 3 53004 27127 485K Sep 22 00:09 59c44882%%5JYTVBW2UR
------x--x. 3 25085 28449 131K Sep 22 00:09 59c44882%%68FDULC5SZ
-rwx-wx-w-. 3 58919 53311 130K Sep 22 00:09 59c44882%%6PE7FQZR9H
--w-r-----. 3 6254 61042 318K Sep 22 00:09 59c44882%%6T5GQDRWFR
--w--wxr--. 3 12897 21431 490K Sep 22 00:09 59c44882%%6YZH3NWOBL
----r--r--. 3 5850 42003 153K Sep 22 00:09 59c44882%%7CQZWOIZ32
--wxr---w-. 3 8775 64576 445K Sep 22 00:09 59c44882%%7M1QPJDDPQ
-rwx-w--w-. 3 27947 55791 157K Sep 22 00:09 59c44882%%9AQ8A5OUOE
--wx-wxr--. 3 40436 15397 344K Sep 22 00:09 59c44882%%9EURCXHYNO
--wx-wx---. 3 1629 1775 182K Sep 22 00:09 59c44882%%AKZO30DLU9
-rw---xrw-. 3 20444 1869 171K Sep 22 00:09 59c44882%%AVFWSZDFFV
-rw----r--. 3 48014 49300 474K Sep 22 00:09 59c44882%%BIL6NBPKE7
--w-------. 3 46605 64292 270K Sep 22 00:09 59c44882%%CUD8IVF0OS
-rwx--xr-x. 3 3240 11537 340K Sep 22 00:09 59c44882%%DV0241U65K
--w-r----x. 3 9733 30912 406K Sep 22 00:09 59c44882%%EXVO0MSR8Z
-r-xrw--w-. 3 24826 37510 480K Sep 22 00:09 59c44882%%G09JQ6AD8H
--wx----w-. 3 3594 1943 290K Sep 22 00:09 59c44882%%H607VW5XSS
---xrwxr--. 3 31316 31973 305K Sep 22 00:09 59c44882%%HJD1UQ5UJO
-rwx--xr--. 3 20203 55182 403K Sep 22 00:09 59c44882%%JDXB5RUXN8
-r-xrwx-w-. 3 25090 61487 495K Sep 22 00:09 59c44882%%KLIE01PJFA
-----w-rw-. 3 49627 62829 212K Sep 22 00:09 59c44882%%M8TNL0PURE
-r--r-x---. 3 31849 63084 310K Sep 22 00:09 59c44882%%M9O81KXRHZ
--wxr-xrw-. 3 26670 61713 251K Sep 22 00:09 59c44882%%OLF3YP2C64
-rw-rw--w-. 3 49193 13091 400K Sep 22 00:09 59c44882%%OUEACV20WF
--wx-w-rw-. 3 39617 31389 438K Sep 22 00:09 59c44882%%P0JD28PYYB
-r--rwxrw-. 3 46701 37357 489K Sep 22 00:09 59c44882%%Q6C2CCQWR2
---xr--rw-. 3 36814 23522 235K Sep 22 00:09 59c44882%%SVY8FIDQVJ
-r--r-xrwx. 3 50763 17334 349K Sep 22 00:09 59c44882%%UAICM5KO4R
-----w-rw-. 3 20154 31778 172K Sep 22 00:09 59c44882%%WEUH2766M3
-rw---x--x. 3 47895 57075 438K Sep 22 00:09 59c44882%%XFJZYRXM8J
--w-r--r--. 3 3903 27945 221K Sep 22 00:09 59c44882%%YCLX54GGET
-rwx-w--wx. 3 55652 28052 500K Sep 22 00:09 59c44882%%YIMVE2U88C
-r--r-xrw-. 3 44250 3985 468K Sep 22 00:09 59c44882%%YKSK3VYQG6
-rw-rwxr--. 3 43655 40571 160K Sep 22 00:09 59c44882%%YSO4Q3YET5
-r---wx-wx. 3 5510 2952 155K Sep 22 00:09 59c44882%%ZCB628QMTQ
--w--w-r--. 3 17630 13592 357K Sep 22 00:09 59c44882%%ZHPKIOUVKV
dr---wx-w-. 1 64268 29259 0 Sep 22 12:50 hardlink_to_files
d--x----w-. 1 19554 34419 359M Sep 21 23:17 level1278
drw-rwx-wx. 1 23140 59398 4.5K Sep 22 09:43 symlink_to_files
[ubuntu@host070 level0278]$ cd ..
[ubuntu@host070 dir3]$
[ubuntu@host070 dir3]$ time sudo rm -rf level0278/
real 1m7.481s
user 0m0.018s
sys 0m0.484s
======================================================================

-- Ramakrishnan
How long did "rm -rf" of the test directory take in your test with debug_mds=0? In my local test, it took about 10 minutes. There were about 200000 files in the test directory. 300 requests per second is not too bad in my opinion.
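For reference, the arithmetic behind that figure can be sketched as below. It assumes roughly one request per file removed, which is a simplification made for illustration; the helper name unlink_rate is hypothetical and the inputs are the approximate numbers quoted above.

```python
def unlink_rate(n_files, elapsed_seconds):
    """Return (files removed per second, milliseconds per file)."""
    per_second = n_files / elapsed_seconds
    ms_per_file = 1000.0 / per_second
    return per_second, ms_per_file


# Approximate figures from the local test: ~200000 files in ~10 minutes.
per_second, ms_per_file = unlink_rate(200000, 10 * 60)
print("%.0f files/s, %.1f ms/file" % (per_second, ms_per_file))
```

Since both inputs are rough, the result should be read as an order of magnitude (a few hundred removals per second, a few milliseconds each) rather than a precise measurement.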
Hi Patrick/Zheng, We require doc text for this bug so that it can be release noted. Please provide it at the earliest; we are starting 3.0 RC testing from tomorrow. Regards, Ramakrishnan
What content should the doc include?
I can't reproduce this performance issue locally, so I can't provide doc text for Cause, Consequence, Workaround, Result.
I filed this bug because deleting directories and subdirectories containing softlinks and hardlinks with "rm -rf *" was slow. Compared to directories with normal files, deleting directories with hardlinked and softlinked files took more time to complete. @Patrick, please check whether this is OK for the release notes; if anything is misleading, please comment.
Symbolic links shouldn't affect deletion speed.
No time yet to reproduce. Moving.
Another way to isolate this is to look at whether rm -rf is slow because of unlink() response time. rm -rf is single-threaded, so the response time of the unlink() system call in CephFS will determine the throughput. The client has to talk to the MDS for each file, and it has to talk to the primary OSD to commit the change, and it then has to talk to the secondary OSDs in the PG. The network hops and context switching alone will probably account for much of the 300 files/sec = 3.3 msec/file. This single-threaded performance issue is common to many distributed filesystems, including Gluster and other distributed filesystems that I've studied. It is a consequence of the fact that POSIX requires changes to directories to be *immediately* visible to all clients. If we relax that constraint and allow batched updates to directories, or if we allow parallelized processing, then performance can be improved. For example, this Python script removes files from a directory in parallel. Try it: https://github.com/parallel-fs-utils/multi-thread-posix/blob/master/parallel-rm-rf.py
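To illustrate the parallelized approach, here is a minimal sketch using only the Python standard library. It is not the linked script and makes no claim about how that script works; the function name parallel_rm_rf and the default of 16 worker threads are assumptions chosen for the example. The idea is simply to issue many unlink() calls concurrently so that their round trips overlap, then remove the emptied directories.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_rm_rf(root, workers=16):
    """Remove the tree under root, unlinking files from a pool of threads.

    Returns (number of entries unlinked, number of directories removed).
    """
    files, dirs = [], []
    # Walk bottom-up so child directories are recorded before their parents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        files.extend(os.path.join(dirpath, name) for name in filenames)
        for name in dirnames:
            path = os.path.join(dirpath, name)
            # os.walk lists symlinks to directories under dirnames;
            # those must be unlinked rather than rmdir'ed.
            (files if os.path.islink(path) else dirs).append(path)
    dirs.append(root)

    # Each unlink() blocks on the filesystem, so running them from
    # several threads overlaps the per-request latency.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(os.unlink, files))

    # Directories are removed serially, children first.
    for path in dirs:
        os.rmdir(path)
    return len(files), len(dirs)
```

Comparing the wall-clock time of this against a plain "rm -rf" on an identical tree would show how much of the slowness comes from serialized unlink() latency. The sketch does not handle entries created or removed by other clients while it runs.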