Description of problem:
NFS-Ganesha service stops while running IOs on directories that were not removed when an rm operation was run on them. Observed this problem while trying to verify bug 1422822. Ran different IOs on those directories from 3 clients using crefi and smallfile, and observed the NFS-Ganesha service stop.

Console output of the failed directory delete command:

[root@host1 hello]# ll
total 0
drwxr-xr-x. 2 nfsnobody  nfsnobody  0 May  4 16:17 fol1
drwxr-xr-x. 2 4294967294 4294967294 0 May 11 12:29 new_dir
drwxr-xr-x. 2 4294967294 4294967294 0 May 12 17:30 nfs
drwxr-xr-x. 2 4294967294 4294967294 0 May 12 17:30 nfs1
drwxr-xr-x. 2 4294967294 4294967294 0 May 12 17:40 nfs2
drwxrwxrwx. 2 root       root       0 May 11 15:14 s3_bucket1
drwxrwxrwx. 2 root       root       0 May 11 15:17 s3_bucket2
drwxrwxrwx. 2 root       root       0 May 12 17:41 s3_bucket4
[root@host1 hello]# rm -rf *
rm: cannot remove ‘fol1’: Directory not empty
rm: cannot remove ‘new_dir’: Directory not empty
rm: cannot remove ‘nfs’: Directory not empty
rm: cannot remove ‘nfs1’: Directory not empty
rm: cannot remove ‘nfs2’: Directory not empty
rm: cannot remove ‘s3_bucket1’: Directory not empty
rm: cannot remove ‘s3_bucket2’: Directory not empty
rm: cannot remove ‘s3_bucket4’: Directory not empty
[root@host1 hello]# screen -ls

IO tools and commands used on different directories:

Client 1:
1. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,append,overwrite,read,rename,delete-renamed,create,mkdir,readdir,rmdir,stat,delete}; do ./smallfile_cli.py --operation $i --top /hello/fol1/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done

Client 2:
1. for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 50 -n 50 -t text --random --min=1k --max=10k /hello/nfs/; done
2. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,delete,cleanup}; do ./smallfile_cli.py --operation $i --top /hello/new_dir/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done

Client 3:
1. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,append,overwrite,read,rename,delete-renamed,create,mkdir,readdir,rmdir,stat,delete}; do ./smallfile_cli.py --operation $i --top /hello/s3_bucket1/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done
2. for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 50 -n 50 -t text --random --min=1k --max=10k /hello/s3_bucket2/; done

The NFS-Ganesha service stop was observed after starting the IOs as described above.

Version-Release number of selected component (if applicable):
ceph: ceph-common-10.2.7-12.el7cp.x86_64
nfs: nfs-ganesha-2.4.5-4.el7cp.x86_64

How reproducible:
2/2

Steps to Reproduce:
Steps provided in the description section.

Actual results:
NFS-Ganesha service stopped.

Expected results:
Should not observe the NFS service stop.

Additional info:
Can be a duplicate of bug 1422822, since this problem was observed on directories that were not removed.
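The smallfile loops above all follow the same pattern: brace-expand a fixed sequence of operations and invoke ./smallfile_cli.py once per operation. A minimal sketch of that driver is below as a dry run (it echoes the commands instead of executing them, since smallfile_cli.py and the /hello mount are specific to the reporter's setup); the function name is hypothetical:

```shell
#!/usr/bin/env bash
# Dry-run sketch of the per-client smallfile driver used in the report.
# Echoes each smallfile_cli.py invocation rather than running it, so the
# 22-operation sequence can be inspected before pointing it at a real mount.

run_smallfile_ops() {
    local top="$1"   # top-level test directory, e.g. /hello/fol1/
    # Operation sequence taken verbatim from the client 1 loop above.
    local ops=(create chmod setxattr getattr ls-l delete create rename
               overwrite append read append overwrite read rename delete-renamed
               create mkdir readdir rmdir stat delete)
    local op
    for op in "${ops[@]}"; do
        echo ./smallfile_cli.py --operation "$op" --top "$top" \
             --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128
    done
}

run_smallfile_ops /hello/fol1/
```

Replacing the echo with a direct invocation (and running one instance per client/directory, as in the report) reproduces the workload.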
Discussed at the program meeting; closely related to another NFS bug. Will be ON_QA by May 22nd.
Moving this bug to the verified state. Bug verified in the "ceph version 10.2.7-21.el7cp (ebe0fca146985f59e6ab136a860d1f063a26c700)" build.

Steps followed for verification:

1. Started client IOs from 3 clients, as mentioned below, in different directories.

Client 1:
a. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,append,overwrite,read,rename,delete-renamed,create,mkdir,readdir,rmdir,stat,delete}; do ./smallfile_cli.py --operation $i --top /hello/fol1/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done
b. for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 50 -n 50 -t text --random --min=1k --max=10k /hello/nfs1/; done

Client 2:
a. for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 50 -n 50 -t text --random --min=1k --max=10k /hello/new_dir/; done

Client 3:
a. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,append,overwrite,read,rename,delete-renamed,create,mkdir,readdir,rmdir,stat,delete}; do ./smallfile_cli.py --operation $i --top /hello/nfs/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done
b. for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 10 -n 50 -t text --random --min=1k --max=10k /hello/s3_bucket2/; done

2. After completion of the above IOs, deleted all the directories in the mount point; deletion of the data took about 2 hours.
3. Recreated the same set of directories.
4. Started the same IOs as mentioned in step 1.

Note: The above steps were executed on a setup configured with the ganesha.conf file as in the documentation; no extra params were added.

No service stop was observed after 4 hours of IOs.
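The delete-and-recreate portion of the verification (steps 2-3) can be sketched as a small helper. This is an illustrative sketch only: the function name is hypothetical, and the directory set is the one listed in the original ll output; point it at the actual NFS mount when reproducing:

```shell
#!/usr/bin/env bash
# Sketch of verification steps 2-3: remove everything under the mount
# point, then recreate the same directory set for the next IO round.
# recreate_test_dirs is a hypothetical helper name.

recreate_test_dirs() {
    local mount="$1"
    # Directory set taken from the ll output in the original report.
    local dirs=(fol1 new_dir nfs nfs1 nfs2 s3_bucket1 s3_bucket2 s3_bucket4)

    # Step 2: delete all data under the mount point.
    # ${mount:?} aborts if $mount is empty, guarding against "rm -rf /*".
    rm -rf "${mount:?}"/*

    # Step 3: recreate the same set of directories.
    local d
    for d in "${dirs[@]}"; do
        mkdir -p "$mount/$d"
    done
}
```

In the report the deletion alone took about 2 hours over NFS, so a direct rm -rf on the mount like this should be expected to run long.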
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1497