Bug 1381416
Summary: | Bonnie test suite failed while deleting files with "drastic I/O error (rmdir)". | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Shashank Raj <sraj> |
Component: | nfs-ganesha | Assignee: | Frank Filz <ffilz> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Manisha Saini <msaini> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | rhgs-3.2 | CC: | amukherj, bmohanra, dang, ffilz, jdanek, jthottan, kkeithle, lbailey, ndevos, rcyriac, rhinduja, rhs-bugs, skoduri, storage-qa-internal |
Target Milestone: | --- | Keywords: | Triaged, ZStream |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Release Note | |
Doc Text: |
When a READDIR is issued on a directory that is mutating, the cookie sent as part of the request can belong to an already deleted file. In such cases, the server returns a BAD_COOKIE error, which was previously not handled by some applications, such as the bonnie test suite, and caused those applications to error out. This is expected behavior. Affected applications have been updated to handle these errors.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-11-19 06:35:56 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1351530 |
Description
Shashank Raj
2016-10-04 04:05:08 UTC
Following is the RCA (in my environment I included one of Dan's fixes [1]).

While running the bonnie test, one of the readdir calls failed with a bad cookie error, so the test failed.

Cause (assumed): for the last readdir call, mdcache_avl_lookup_k() should return MDCACHE_AVL_LAST, but it returns MDCACHE_AVL_NO_ERROR, and the following readdir results in a bad cookie error.

[1] https://review.gerrithub.io/#/c/298859/

Readdir calls work based on a cookie value, which is a kind of offset. For example, to read 1000 directory entries, the client sends many readdir calls, and the cookie value is incremented on each call.

In the Bonnie test, after doing the creates, reads, stats, etc., it tries to delete the test directory. As part of this removal, the client sends readdir calls. These go smoothly and read all the contents, and then the client sends another call with a previous cookie value (I don't know why the client does so). The MDCACHE layer (in ganesha) rejects it with BADCOOKIE. The bonnie test fails with this error and tries to clean up the test directory, which fails in turn with a directory NOTEMPTY error. Note that deleting the same test directory directly from the mount point afterwards works fine.

Explanation from Frank of why MDCACHE returns the BADCOOKIE error (from IRC logs):

MDCACHE populates the entire directory from the FSAL in one go, then feeds the protocol layer from the dirent cache. When the protocol layer has a subsequent request with a non-zero cookie, MDCACHE uses that cookie (which is the dirent hash key) to find the dirent. That can break if the directory mutates.

If I understand correctly, the cookie used by MDCACHE cannot be passed to the FSAL layer (libgfapi), i.e. it is not possible to forward a single readdir request (the one that failed in MDCACHE) to the FSAL layer.
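The failure mode described above can be reproduced with a toy model. This is only a sketch under stated assumptions: `DirentCache`, `BadCookie`, and the counter-based cookies are hypothetical stand-ins, not nfs-ganesha's actual structures or hash keys. It shows a cookie that was valid when handed out becoming unresolvable once the entry it names is removed.

```python
class BadCookie(Exception):
    """Raised when a cookie no longer matches any cached dirent."""


class DirentCache:
    """Toy dirent cache (hypothetical; not the real MDCACHE implementation)."""

    def __init__(self, names):
        # Assumption: each entry gets a fixed key at population time.
        # A simple counter starting at 1 stands in for the dirent hash key.
        self.by_cookie = {i + 1: name for i, name in enumerate(names)}

    def remove(self, name):
        # Drop the dirent for a deleted file, together with its key.
        for cookie, cached in list(self.by_cookie.items()):
            if cached == name:
                del self.by_cookie[cookie]

    def readdir(self, cookie, count):
        cookies = sorted(self.by_cookie)
        if cookie == 0:
            start = 0  # cookie 0 means "start from the beginning"
        elif cookie in self.by_cookie:
            start = cookies.index(cookie) + 1  # resume after that entry
        else:
            raise BadCookie(cookie)  # the entry behind the cookie is gone
        return [(c, self.by_cookie[c]) for c in cookies[start:start + count]]


cache = DirentCache([f"f{i}" for i in range(6)])
first = cache.readdir(0, 3)   # first batch; the last cookie handed out is 3
last_cookie = first[-1][0]
cache.remove("f2")            # the directory mutates: the file behind cookie 3 is deleted

try:
    cache.readdir(last_cookie, 3)
    outcome = "OK"
except BadCookie:
    outcome = "BAD_COOKIE"
```

In this model the second call fails even though three entries remain in the directory, which matches the sequence in the RCA: a bad cookie error on readdir, followed by rmdir failing because the directory is not empty.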
The solution should be a readdir call based on FSAL_COOKIE.

Executed the bonnie test on the latest build:

glusterfs-ganesha-3.8.4-5.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-1.el7rhgs.x86_64

It fails with the same error.

executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir): Directory not empty
Cleaning up test directory after error.
real 34m8.225s
user 0m2.673s
sys 1m18.910s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory
Removing /mnt/test_nfs//run1183/
rmdir: failed to remove ‘/mnt/test_nfs//run1183/’: Directory not empty
rmdir failed:Directory not empty

As mentioned in #C11, I also see the following error with bonnie:

Changing to the specified mountpoint
/mnt/nfs/run16037
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir): Directory not empty
Cleaning up test directory after error.
real 27m49.168s
user 0m1.347s
sys 0m32.871s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory
Removing /mnt/nfs/run16037/
rmdir: failed to remove ‘/mnt/nfs/run16037/’: Directory not empty
rmdir failed:Directory not empty

Removing the directory manually with rm -rf succeeds. Maybe we need to check with the script maintainer or debug further for an nfs-client issue.

Hi Soumya, I have edited the known issues doc text further.
Let me know if there is anything specific that needs to be added as a workaround.

Hi Bhavana, the doc text looks good to me.

address test suite issues?

LGTM |
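For illustration, one way an application could tolerate a bad cookie while emptying a directory is to restart the enumeration from cookie 0. The sketch below is an assumption about a possible approach, not the actual change made to bonnie or any other application; `ToyServer`, `BadCookie`, and `remove_all` are hypothetical names, and the counter-based cookies are a simplification.

```python
class BadCookie(Exception):
    """Raised by the toy server when a cookie matches no remaining entry."""


class ToyServer:
    """Hypothetical stand-in for a server that resolves cookies by exact key."""

    def __init__(self, names):
        # Assumption: a counter starting at 1 plays the role of the cookie.
        self.entries = {i + 1: name for i, name in enumerate(names)}

    def readdir(self, cookie, count):
        if cookie != 0 and cookie not in self.entries:
            raise BadCookie(cookie)
        cookies = sorted(self.entries)
        start = 0 if cookie == 0 else cookies.index(cookie) + 1
        return [(c, self.entries[c]) for c in cookies[start:start + count]]

    def unlink(self, cookie):
        del self.entries[cookie]


def remove_all(server, batch=3):
    """Delete every entry, restarting from cookie 0 whenever the cookie is rejected."""
    cookie = 0
    restarts = 0
    while True:
        try:
            chunk = server.readdir(cookie, batch)
        except BadCookie:
            cookie = 0        # the directory changed under us: start over
            restarts += 1
            continue
        if not chunk:
            if cookie == 0:
                return restarts   # a full scan from the start found nothing left
            cookie = 0            # re-check from the start before declaring it empty
            continue
        for entry_cookie, _name in chunk:
            server.unlink(entry_cookie)
        cookie = chunk[-1][0]     # resume after the last entry we saw (now deleted)


server = ToyServer([f"f{i}" for i in range(7)])
restarts = remove_all(server)
```

Because the caller deletes the very entry its resume cookie points at, every batch after the first triggers a restart in this model. Restarting is safe for a delete loop, since entries that were already removed simply do not reappear in the new scan.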