Bug 1293798
Summary: | While performing rm -rf getting unknown error 527 on tiered volume | ||||||
---|---|---|---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | RajeshReddy <rmekala> | ||||
Component: | gluster-nfs | Assignee: | Niels de Vos <ndevos> | ||||
Status: | CLOSED DUPLICATE | QA Contact: | storage-qa-internal <storage-qa-internal> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | rhgs-3.1 | CC: | mzywusko, ndevos, rmekala | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2016-01-20 09:19:53 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
RajeshReddy
2015-12-23 06:28:19 UTC
[root@tettnang ~]# mount | grep rajesh transformers.lab.eng.blr.redhat.com:/system_rajesh/linux on /mnt/system_rajesh/linux type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.70.34.44,mountvers=3,mountport=38465,mountproto=tcp,local_lock=none,addr=10.70.34.44) [root@tettnang ~]# sosreport available @ /home/repo/sosreports/bug.1293798 on rhsqe-repo.lab.eng.blr.redhat.com Is this always reproducible or is it a one time happening? Do you have a test-script (shell or distaf) that I can use to reproduce this? We observed this issue on two setups and we are not hitting the issue consistently I was running fallowing script for i in {1..1000} do tar -xvf linux-3.19.tar.gz sleep 180 rm -rf linux-3.19 done Created attachment 1116320 [details]
Script to setup the environment and reproduce the problem
I'm using this to configure three systems and run the test. Its been running for a long time already, but there have not been any errors yet. I'm checking for the errors with:
tail -f /var/log/glusterfs/nfs.log | grep ' E '
I'll keep this running over night, 1000 iterations are a lot and it seems pretty time consuming.
The snippet from your log shows that the problem happened 8 minutes after the NFS-server was started.
- Did the tarball extraction/removal start immediately after that?
- There was a message "option 'rule' is not recognized", do you know where that came from?
Could you check if my script should do the same actions as what you did on your environment?
It seems that I have reproduced the error: rm: cannot remove ‘linux-3.19/arch/arm/boot/dts/include/dt-bindings’: Unknown error 527 Rather awkward, the whole linux-3.19/ directory is gone, so the 'rm' did finish completely. However, "Volume is disabled" does not occur in the nfs.log. Instead, I have messages like these: W [MSGID: 114031] [client-rpc-fops.c:664:client3_3_unlink_cbk] 0-bz1293798-client-5: remote operation failed [Bad file descriptor] A "Bad file descriptor" should get mapped to ESTALE. It is not acceptable that error 527 is passed back to the NFS-client. Rajesh, please check if you have the same "Bad file descriptor" messages in your nfs.log. No errors in the <volume>-tier.log. In the bricks however: E [MSGID: 113005] [posix.c:1700:posix_unlink] 0-bz1293798-posix: post operation fstat failed on fd=-1 [Bad file descriptor] E [MSGID: 115062] [server-rpc-fops.c:1097:server_unlink_cbk] 0-bz1293798-server: 375880: UNLINK /linux-3.19/tools/testing/selftests/powerpc/copyloops/memcpy_power7.S (4eb32857-ce11-482b-a83d-2a0fd0639eb1/memcpy_power7.S) ==> (Bad file descriptor) [Bad file descriptor] Checking when this was last touched, it seems that upstream has a fix for it: http://review.gluster.org/13034 The chain of bugs that fixes this in a newer build of RHGS: bug 1293256 (Gluster mainline) + bug 1295347 (Gluster 3.7) + bug 1293237 (RHGS, glusterfs-3.7.5-13) This bug was opened while tests were done with an older release (-12). I expect that this problem can not be reproduced with -13 or newer. *** This bug has been marked as a duplicate of bug 1293237 *** May i know On which build you reproduced the problem To reproduce, I used the version you reported this against in comment #0: glusterfs-server-3.7.5-12.el7rhgs.x86_64 With -13 I did not get the "unknown error 527" messages anymore. |