Bug 1122732
Summary: | remove volume hang glustefs | | |
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Peter Auyeung <pauyeung> |
Component: | glusterd | Assignee: | Atin Mukherjee <amukherj> |
Status: | CLOSED EOL | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | high | | |
Version: | 3.5.1 | CC: | amukherj, bugs, hgowtham, pauyeung, sasundar |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2016-06-17 15:58:30 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |
Description
Peter Auyeung
2014-07-23 23:22:32 UTC
Can you attach the complete glusterd logs and, if possible, let us know the steps carried out before executing volume stop?

Peter, please also add the type of volume, the volume configuration and backend information. It would help us recreate the issue.

Type: Distributed-Replicate
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp

XFS on Ubuntu 12.04. Total volume size is 17TB, presented with 30% used.

Options Reconfigured:
features.quota: on
nfs.export-volumes: off
nfs.export-dirs: on
features.quota-deem-statfs: on
nfs.drc: off

We share the volume directories over NFS with directory quota.

Today I tried to remove another volume, but this time I disabled the quota first. When I disabled quota, the quota crawl took forever. While the crawl was still running, I did a gluster volume stop, and this time the stop returned immediately without hanging.

We have been experiencing quota usage misreporting over time when an NFS export has been in use for a while. We are starting to see more and more xattr errors and unlink errors. The log is too big to upload.

After removing a volume, we are getting these in nfs.log:

[2014-07-25 00:04:26.011239] W [nfs3-helpers.c:3401:nfs3_log_common_res] 0-nfs-nfsv3: XID: 9d0f866, FSSTAT: NFS: 70(Invalid file handle), POSIX: 14(Bad address)

[2014-07-25 00:04:26.011336] E [nfs3.c:301:__nfs3_get_volume_id] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/nfs/server.so(nfs3_fsstat+0x1be) [0x7f3a34534d7e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/nfs/server.so(nfs3_fsstat_reply+0x3b) [0x7f3a3453465b] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x78) [0x7f3a34527078]))) 0-nfs-nfsv3: invalid argument: xl

[2014-07-25 00:04:26.012372] W [nfs3-helpers.c:3401:nfs3_log_common_res] 0-nfs-nfsv3: XID: ad0f866, FSSTAT: NFS: 70(Invalid file handle), POSIX: 14(Bad address)

Seems like a candidate to be analysed by the Quota team; assigning it to Varun Shastry, who looks after Quota.
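
Since the full log is too big to upload, a per-error tally of the nfs.log warnings may be easier to share than the raw file. The following is a minimal sketch, assuming every line follows the layout of the nfs.log excerpt in this report (`[timestamp] LEVEL [file:line:function] message`). The function name `tally_nfs_errors` and both regular expressions are illustrative assumptions and are not part of GlusterFS.

```python
import re
from collections import Counter

# Layout assumed from the nfs.log excerpt in this report:
#   [timestamp] LEVEL [file:line:function] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^\]]+)\]\s+(?P<rest>.*)$"
)

# NFS status as it appears in the excerpt, e.g.
#   "FSSTAT: NFS: 70(Invalid file handle)"
NFS_STATUS_RE = re.compile(
    r"(?P<op>[A-Z]+): NFS: (?P<code>\d+)\((?P<text>[^)]*)\)"
)


def tally_nfs_errors(lines):
    """Count warning/error lines per (operation, NFS code, NFS message)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        # Skip lines that do not match the assumed layout or are not W/E.
        if not m or m.group("level") not in ("W", "E"):
            continue
        status = NFS_STATUS_RE.search(m.group("rest"))
        if status:
            key = (status.group("op"), int(status.group("code")), status.group("text"))
            counts[key] += 1
    return counts
```

Passing an open file object (for example the nfs.log of the affected server, wherever it lives on that system) as `lines` returns a `Counter` whose `most_common()` output is small enough to paste into a comment. Lines without an NFS status, such as the `__nfs3_get_volume_id` error, are ignored by this sketch.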
Today I ran into a similar issue with quota on this, just by running du against a replica 2 volume. It is similar to what was running when we removed a volume with quota: the find/setfattr scan went over the volume forever. It was actually coming from disk_usage_sync.sh in extras. The du -bc against the replica 2 volume crashed the bricks. Attaching the gluster logs to this case.

Created attachment 922240
gluster log during brick crashed when running du
Looking deeper at the script, it seems it is the setfattr that crashes the bricks. The setfattr action that crashes the bricks is very similar to what happens when we disable quota and the bricks crash, and also to the hang when removing a volume.

This bug is getting closed because 3.5 is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.