Bugzilla will be upgraded to version 5.0 on a still to be determined date in the near future. The original upgrade date has been delayed.
Bug 1381416 - Bonnie test suite failed while deleting files with "drastic I/O error (rmdir)". [NEEDINFO]
Bonnie test suite failed while deleting files with "drastic I/O error (rmdir)".
Status: ASSIGNED
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: nfs-ganesha (Show other bugs)
3.2
x86_64 Linux
unspecified Severity high
: ---
: ---
Assigned To: Frank Filz
Manisha Saini
: Triaged, ZStream
Depends On:
Blocks: 1351530
  Show dependency treegraph
 
Reported: 2016-10-04 00:05 EDT by Shashank Raj
Modified: 2018-10-25 13:13 EDT (History)
13 users (show)

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
When a READDIR is issued on directory which is mutating, the cookie sent as part of request could be of the file already deleted. In such cases, server returns BAD_COOKIE error. Due to this, some applications (like bonnie test-suite) which do not handle that error may error out. Workaround: This is an expected behaviour of NFS server and the applications have to be fixed to fix such errors.
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
kkeithle: needinfo? (asoman)


Attachments (Terms of Use)

  None (edit)
Description Shashank Raj 2016-10-04 00:05:08 EDT
Description of problem:

Bonnie test suite failed while deleting files with "drastic I/O error (rmdir)".

Version-Release number of selected component (if applicable):

[root@dhcp42-59 ~]# rpm -qa|grep glusterfs
glusterfs-api-3.8.4-2.el6rhs.x86_64
glusterfs-libs-3.8.4-2.el6rhs.x86_64
glusterfs-cli-3.8.4-2.el6rhs.x86_64
glusterfs-3.8.4-2.el6rhs.x86_64
glusterfs-server-3.8.4-2.el6rhs.x86_64
glusterfs-geo-replication-3.8.4-2.el6rhs.x86_64
glusterfs-client-xlators-3.8.4-2.el6rhs.x86_64
glusterfs-rdma-3.8.4-2.el6rhs.x86_64
glusterfs-fuse-3.8.4-2.el6rhs.x86_64
glusterfs-ganesha-3.8.4-2.el6rhs.x86_64

[root@dhcp42-59 ~]# rpm -qa|grep ganesha
nfs-ganesha-2.4.0-2.el6rhs.x86_64
nfs-ganesha-gluster-2.4.0-2.el6rhs.x86_64
glusterfs-ganesha-3.8.4-2.el6rhs.x86_64


How reproducible:

2/2

Steps to Reproduce:
1.Create a ganesha cluster, create a volume and enable ganesha on it.
2.Mount the volume with vers=4 on client
3.Start executing bonnie test suite.
4.Observe that it fails during deletion of sequential files with below error message:

executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir): Directory not empty
Cleaning up test directory after error.

real	13m13.703s
user	0m1.448s
sys	0m34.192s
bonnie failed
0
Total 0 tests were successful

5. No relevant error log messages found on server side.
6. Data on the mount point:

[root@Client2 ~]# cd /mnt/nfs1
[root@Client2 nfs1]# ls
run17420
[root@Client2 nfs1]# cd run17420/
[root@Client2 run17420]# ls
Bonnie.17442
[root@Client2 run17420]# 


Some of the files inside mount point:

-rw-------. 1 nobody nobody 0 Oct  4 14:41 0000003f88YnOqrNJWIYl
-rw-------. 1 nobody nobody 0 Oct  4 14:41 0000003f87pXVF
-rw-------. 1 nobody nobody 0 Oct  4 14:41 0000003f86oljSnaoso
-rw-------. 1 nobody nobody 0 Oct  4 14:41 0000003f85VqpcTKC
-rw-------. 1 nobody nobody 0 Oct  4 14:41 0000003f84WJj2FfAosdb3
-rw-------. 1 nobody nobody 0 Oct  4 14:41 0000003f83B7DbTyhp
-rw-------. 1 nobody nobody 0 Oct  4 14:41 0000003f823PnFTN8hD5AK
-rw-------. 1 nobody nobody 0 Oct  4 14:41 0000003f81ymP5B6WWjB
-rw-------. 1 nobody nobody 0 Oct  4 14:41 0000003f80j2vK

7. Manually removing these files works fine.


Actual results:

Bonnie test suite failed while deleting files with "drastic I/O error (rmdir)".

Expected results:

Bonnie test suite should pass without any errors.

Additional info:
Comment 4 Jiffin 2016-10-24 14:02:09 EDT
Following is the RCA(in my environment I included one of Dan's fix[1] )

While running the bonnie test , the one of readdir call got failed due to bad cookie error. Hence test resulted in failure.
Cause (assumed) :
The last readdir call should return mdcache_avl_lookup_k() MDCACHE_AVL_LAST, but it returning MDCACHE_AVL_NO_ERROR and following readdir results in bad cookie error.

[1] https://review.gerrithub.io/#/c/298859/
Comment 7 Jiffin 2016-11-15 07:58:30 EST
The readdir calls works based on cookie value. It is kind of offset value. For example, if we need to read 1000 directory entries, client will send lots of readdir call and on each readdir call the value of cookie will be incremented.

Here incase of Bonnie test after doing the creates, reads, stats etc it will try to delete the test dir. As part of this removal operation client sends a readdir calls. It goes smoothly and read all the contents & then client again sends another call with a previous cookie value(don't know the reason why client send so), MDCACHE layer(in ganesha) complains it is a BADCOOKIE. bonnie test fails with this error and tries to clean up test dir(again it fails with directory NOTEMPTY error). Please note after this when we directly deletes the same test directory from mount point it works fine.

Explanation from Frank why MDCACHE returns BADCOOKIE error( from IRC logs) :
 MDCACHE populates the entire directory from FSAL in one go, then it feeds protocol layer from dirent cache, when protocol layer has a subsequent request with a non-zero cookie, MDCACHE uses that cookie (which is the dirent hash key) to find the dirent that can break if the directory mutates

If understand correctly , cookie used MDCACHE cannot passed to upper FSAL layer(libgfapi). i.e it is not possible to forward a single readdir request(which is failed in MDCACHE) to FSAL layer.

Solution should be used readdir call based FSAL_COOKIE
Comment 11 Arthy Loganathan 2016-11-23 03:34:41 EST
Executed bonnie test in the latest build,

glusterfs-ganesha-3.8.4-5.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-1.el7rhgs.x86_64

And fails with the same error.

executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir): Directory not empty
Cleaning up test directory after error.

real	34m8.225s
user	0m2.673s
sys	1m18.910s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory
Removing /mnt/test_nfs//run1183/
rmdir: failed to remove ‘/mnt/test_nfs//run1183/’: Directory not empty
rmdir failed:Directory not empty
Comment 12 surabhi 2016-11-23 04:02:20 EST
As mentioned in #C11 , I also see following error with bonnie:

Changing to the specified mountpoint
/mnt/nfs/run16037
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir): Directory not empty
Cleaning up test directory after error.

real	27m49.168s
user	0m1.347s
sys	0m32.871s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory
Removing /mnt/nfs/run16037/
rmdir: failed to remove ‘/mnt/nfs/run16037/’: Directory not empty
rmdir failed:Directory not empty



When trying manually using rm -rf it removes the directory successfully.
may be we need to check with script maintainer or debug more on nfs-client issue.
Comment 18 Bhavana 2017-03-13 01:19:16 EDT
Hi Soumya,

I have edited the known issues doc text further. Let me know if there is anything specific that needs to be added as a workaround
Comment 19 Soumya Koduri 2017-03-13 04:30:38 EDT
Hi Bhavana,
Doc text looks good to me.
Comment 20 Kaleb KEITHLEY 2017-08-23 09:05:43 EDT
address test suite issues?

Note You need to log in before you can comment on or make changes to this bug.