Bug 1451305

Summary: [RGW:NFS]: NFS-Ganesha service stops while running IO's on directories which were not removed when rm operation was done on them.
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Ramakrishnan Periyasamy <rperiyas>
Component: RGW Assignee: Matt Benjamin (redhat) <mbenjamin>
Status: CLOSED ERRATA QA Contact: Ramakrishnan Periyasamy <rperiyas>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 2.3 CC: cbodley, ceph-eng-bugs, hnallurv, kbader, mbenjamin, owasserm, sweil, tserlin
Target Milestone: rc   
Target Release: 2.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-10.2.7-20.el7cp Ubuntu: ceph_10.2.7-22redhat1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-06-19 13:33:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ramakrishnan Periyasamy 2017-05-16 11:16:09 UTC
Description of problem:

The NFS-Ganesha service stops while running I/Os on directories that were not removed when an rm operation was performed on them.

Observed this problem while trying to verify bug 1422822.

Ran different I/Os on those directories from 3 clients using crefi and smallfile, and observed the NFS-Ganesha service stop.
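
For context, a minimal sketch of how such a client mount of the Ganesha export at /hello might look, assuming NFSv4.1 over TCP; the server name ganesha-host is a placeholder, not taken from this report:

# mount the NFS-Ganesha export on a client (server name is hypothetical)
mount -t nfs -o nfsvers=4.1,proto=tcp ganesha-host:/ /hello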

Console output of the failed directory delete command:
[root@host1 hello]# ll
total 0
drwxr-xr-x. 2 nfsnobody  nfsnobody  0 May  4 16:17 fol1
drwxr-xr-x. 2 4294967294 4294967294 0 May 11 12:29 new_dir
drwxr-xr-x. 2 4294967294 4294967294 0 May 12 17:30 nfs
drwxr-xr-x. 2 4294967294 4294967294 0 May 12 17:30 nfs1
drwxr-xr-x. 2 4294967294 4294967294 0 May 12 17:40 nfs2
drwxrwxrwx. 2 root       root       0 May 11 15:14 s3_bucket1
drwxrwxrwx. 2 root       root       0 May 11 15:17 s3_bucket2
drwxrwxrwx. 2 root       root       0 May 12 17:41 s3_bucket4
[root@host1 hello]# rm -rf *
rm: cannot remove ‘fol1’: Directory not empty
rm: cannot remove ‘new_dir’: Directory not empty
rm: cannot remove ‘nfs’: Directory not empty
rm: cannot remove ‘nfs1’: Directory not empty
rm: cannot remove ‘nfs2’: Directory not empty
rm: cannot remove ‘s3_bucket1’: Directory not empty
rm: cannot remove ‘s3_bucket2’: Directory not empty
rm: cannot remove ‘s3_bucket4’: Directory not empty
[root@host1 hello]# screen -ls
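
One way to cross-check from the RGW side whether the backing buckets really still hold objects (which would explain the "Directory not empty" errors) is radosgw-admin; the bucket name below is simply one of the directories listed above:

# list all buckets known to RGW
radosgw-admin bucket list
# object count and usage for one of the buckets that could not be removed
radosgw-admin bucket stats --bucket=s3_bucket1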

IO tools and commands used on different directories:

client1: 
     1. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,append,overwrite,read,rename,delete-renamed,create,mkdir,readdir,rmdir,stat,delete}; do ./smallfile_cli.py --operation $i --top /hello/fol1/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done

client2: 
     1. for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 50 -n 50 -t text --random --min=1k --max=10k /hello/nfs/; done
     2. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,delete,cleanup}; do ./smallfile_cli.py --operation $i --top /hello/new_dir/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done

client3: 
     1. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,append,overwrite,read,rename,delete-renamed,create,mkdir,readdir,rmdir,stat,delete}; do ./smallfile_cli.py --operation $i --top /hello/s3_bucket1/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done
     2. for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 50 -n 50 -t text --random --min=1k --max=10k /hello/s3_bucket2/; done

The NFS-Ganesha service stop was observed after starting the I/Os as described above.
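
A quick way to confirm the stop and pull the relevant logs, assuming the standard nfs-ganesha systemd unit name used by these packages:

# check whether the ganesha daemon is still running and how it exited
systemctl status nfs-ganesha
# recent log entries for the unit around the time of the failure
journalctl -u nfs-ganesha --since "1 hour ago"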

Version-Release number of selected component (if applicable):
ceph: ceph-common-10.2.7-12.el7cp.x86_64
nfs: nfs-ganesha-2.4.5-4.el7cp.x86_64
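
The installed builds can be double-checked with rpm; the nfs-ganesha-rgw package name (which carries the RGW FSAL) is assumed from the RHCS packaging:

rpm -q ceph-common nfs-ganesha nfs-ganesha-rgw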

How reproducible:
2/2

Steps to Reproduce:
Steps provided in description section

Actual results:
NFS-Ganesha service stopped while I/Os were running.

Expected results:
NFS-Ganesha service should not stop.

Additional info:
Can be a duplicate of bug 1422822, since this problem was observed on directories that were not removed.

Comment 5 John Poelstra 2017-05-17 15:13:01 UTC
Discussed at the program meeting; closely related to the other NFS bug. Will be ON_QA by May 22nd.

Comment 25 Ramakrishnan Periyasamy 2017-05-26 15:46:12 UTC
Moving this bug to verified state.

Bug verified in "ceph version 10.2.7-21.el7cp (ebe0fca146985f59e6ab136a860d1f063a26c700)" build.

Steps followed for verification:

1. Started client I/Os from 3 clients, as mentioned below, in different directories.

Client 1:
a. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,append,overwrite,read,rename,delete-renamed,create,mkdir,readdir,rmdir,stat,delete}; do ./smallfile_cli.py --operation $i --top /hello/fol1/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done

b. for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 50 -n 50 -t text --random --min=1k --max=10k /hello/nfs1/; done 
===========
Client 2:
a.  for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 50 -n 50 -t text --random --min=1k --max=10k /hello/new_dir/; done
===========
Client 3:
a. for i in {create,chmod,setxattr,getattr,ls-l,delete,create,rename,overwrite,append,read,append,overwrite,read,rename,delete-renamed,create,mkdir,readdir,rmdir,stat,delete}; do ./smallfile_cli.py --operation $i --top /hello/nfs/ --files-per-dir 50 --dirs-per-dir 10 --threads 3 --file-size 128 ; done

b. for i in {create,rename,chmod,chown,truncate}; do ./crefi.py --fop $i --multi -b 10 -d 10 -n 50 -t text --random --min=1k --max=10k /hello/s3_bucket2/; done
===========

2. After completion of the above I/Os, deleted all the directories in the mount point. Deletion of the data took about 2 hours.

3. Recreated the same set of directories.

4. Started the same I/Os as mentioned in Step 1.

Note: The above steps were executed on a setup configured with the ganesha.conf file as in the documentation; no extra parameters were added (a minimal sketch of such an export block is shown below).
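
A minimal sketch of the kind of EXPORT block the RGW FSAL uses, with placeholder values for the export ID, user, and keys (the real values come from the RGW user created for the export):

EXPORT
{
        Export_ID = 1;                        # placeholder export ID
        Path = "/";
        Pseudo = "/";                         # pseudo path that the clients mount
        Access_Type = RW;
        SecType = "sys";
        NFS_Protocols = 4;
        Transports = TCP;

        FSAL {
                Name = RGW;
                User_Id = "testuser";         # placeholder RGW user
                Access_Key_Id = "<access key>";
                Secret_Access_Key = "<secret key>";
        }
}

RGW {
        ceph_conf = "/etc/ceph/ceph.conf";    # path to the cluster's ceph.conf
}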

No service stop was observed after 4 hours of I/Os.

Comment 27 errata-xmlrpc 2017-06-19 13:33:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1497