Bug 867253
| Summary: | DHT : If brick is down (where root directory is hashing) then lookup on nfs mount gives error ' cannot open directory .: Input/output error' | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rachana Patel <racpatel> |
| Component: | glusterfs | Assignee: | shishir gowda <sgowda> |
| Status: | CLOSED ERRATA | QA Contact: | amainkar |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 2.0 | CC: | amarts, nsathyan, rhs-bugs, saujain, sdharane, sgowda, vagarwal, vbellur |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.4.0qa5-1 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-09-23 22:33:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Rachana Patel
2012-10-17 07:05:37 UTC
Please attach the NFS server logs.

Created attachment 629846
server log

It might be related to the above-mentioned bug, but there are no similar failure messages. Seeing these errors in the log. Need input from the NFS SMEs.

    [2012-10-17 12:19:40.383182] W [client3_1-fops.c:2630:client3_1_lookup_cbk] 0-San_11-client-1: remote operation failed: Transport endpoint is not connected. Path: / (00000000-0000-0000-0000-000000000001)
    [2012-10-17 12:19:40.383876] W [client3_1-fops.c:1332:client3_1_access_cbk] 0-San_11-client-1: remote operation failed: Transport endpoint is not connected
    [2012-10-17 12:19:40.383925] W [nfs3.c:1491:nfs3svc_access_cbk] 0-nfs: 3bb08886: / => -1 (Transport endpoint is not connected)
    [2012-10-17 12:19:40.383953] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 3bb08886, ACCESS: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected)

The reporter confirmed that parallel access on the fuse mount listed all directories and files (those not hashed to the down sub-volume).

The behaviour of the NFS and fuse mounts is the same as of the latest git HEAD 232adb88512274863c9f5ad51569695af80bd6c0. Rachana, could you confirm the finding?

This is reproducible. Found that dht_access returns the EIO error as-is if one brick is down. Re-assigning.

http://review.gluster.org/4240 is posted for review upstream. Once it is in, it will be backported and merged downstream.

CHANGE: http://review.gluster.org/4240 (cluster/dht: send ACCESS call on dir to first_up_subvol if cached is down) merged in master by Vijay Bellur (vbellur)

Verified this on 3.4.0qa5. The NFS mount no longer gives the error ' cannot open directory .: Input/output error' and now shows files and directories. However, if the hashed sub-volume of a directory is down, it says ' ls: cannot access d37: Invalid argument'. We already have a defect for that issue, https://bugzilla.redhat.com/show_bug.cgi?id=856459, so closing this as verified.
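
For readers following the fix, the behaviour named in the commit message of review 4240 (send ACCESS on a directory to the first up sub-volume when the cached one is down) can be modelled roughly as below. This is a toy sketch in Python, not the GlusterFS C code: the `Subvol` class, both `access_dir_*` functions and the sub-volume names are hypothetical, and returning ENOTCONN when no sub-volume is up is an assumption made for illustration.

```python
import errno
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Subvol:
    """Hypothetical stand-in for a DHT sub-volume (one brick in this report)."""
    name: str
    up: bool


def first_up_subvol(subvols: List[Subvol]) -> Optional[Subvol]:
    """Return the first sub-volume that is currently reachable, if any."""
    for subvol in subvols:
        if subvol.up:
            return subvol
    return None


def access_dir_before_fix(cached: Subvol) -> Tuple[int, Optional[str]]:
    """Pre-fix model: the call goes to the cached sub-volume only, so a
    down brick surfaces as an error (the NFS server reported it as I/O error)."""
    if not cached.up:
        return errno.ENOTCONN, None
    return 0, cached.name


def access_dir_after_fix(cached: Subvol,
                         subvols: List[Subvol]) -> Tuple[int, Optional[str]]:
    """Post-fix model: fall back to the first up sub-volume when the cached
    one is down. ENOTCONN for 'nothing is up' is an assumption of this sketch."""
    target = cached if cached.up else first_up_subvol(subvols)
    if target is None:
        return errno.ENOTCONN, None
    return 0, target.name


# Hypothetical three-brick volume with the middle brick down.
subvols = [Subvol("client-0", True), Subvol("client-1", False), Subvol("client-2", True)]
print(access_dir_before_fix(subvols[1]))
print(access_dir_after_fix(subvols[1], subvols))
```

The fallback only makes sense for directories, which exist on every sub-volume; a file lives on a single sub-volume, so there is nothing to fall back to when that one is down.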
CHANGE: http://review.gluster.org/4421 (bug-867253.t: do a clean umount at the end) merged in master by Anand Avati (avati)

Found this defect on 3.3.0.6rhs-4.el6.x86_64.

DHT : If brick is down (where root directory is hashing) then lookup on nfs mount gives error ' cannot open directory .: Input/output error'

Same as the original defect: the fuse mount is not giving any error but the NFS mount is, so reopening.

Defect info:

    [root@cutlass tmp]# gluster v status 64-fuse
    Status of volume: 64-fuse
    Gluster process                                         Port    Online  Pid
    ------------------------------------------------------------------------------
    Brick fred.lab.eng.blr.redhat.com:/brick1/6.4-fuse      24025   Y       6063
    Brick fan.lab.eng.blr.redhat.com:/brick1/6.4-fuse       24020   N       18113
    Brick mia.lab.eng.blr.redhat.com:/brick1/6.4-fuse       24017   Y       27197
    NFS Server on localhost                                 38467   Y       31344
    NFS Server on fred.lab.eng.blr.redhat.com               38467   Y       6102
    NFS Server on 10.70.34.91                               38467   Y       18196
    NFS Server on mia.lab.eng.blr.redhat.com                38467   Y       17908

    [root@fan tmp]# getfattr -d -m . -e hex /brick1/6.4-fuse/
    getfattr: Removing leading '/' from absolute path names
    # file: brick1/6.4-fuse/
    trusted.gfid=0x00000000000000000000000000000001
    trusted.glusterfs.dht=0x00000001000000000000000055555554
    trusted.glusterfs.volume-id=0x8a5d6a111cda4406b818c413e5ae0968

    [root@fred tmp]# getfattr -d -m . -e hex /brick1/6.4-fuse/
    getfattr: Removing leading '/' from absolute path names
    # file: brick1/6.4-fuse/
    trusted.gfid=0x00000000000000000000000000000001
    trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff
    trusted.glusterfs.volume-id=0x8a5d6a111cda4406b818c413e5ae0968

    [root@mia tmp]# getfattr -d -m . -e hex /brick1/6.4-fuse/
    getfattr: Removing leading '/' from absolute path names
    # file: brick1/6.4-fuse/
    trusted.gfid=0x00000000000000000000000000000001
    trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9
    trusted.glusterfs.volume-id=0x8a5d6a111cda4406b818c413e5ae0968

nfs mount:

    [root@rhsauto037 test2]# ls
    ls: cannot open directory .: Input/output error

fuse mount:

    [root@rhsauto037 test1]# ls
    d12 d15 d2 d23 d30 d33 d36 d40 d43 d48 d50 d9 f11 f16 f2 f22 f26 f3 f32 f37 f42 f46 f5 f9
    d13 d16 d21 d25 d31 d34 d39 d41 d45 d49 d6 f1 f14 f18 f20 f23 f27 f30 f33 f38 f43 f47 f6
    d14 d17 d22 d28 d32 d35 d4 d42 d47 d5 d7 f10 f15 f19 f21 f25 f29 f31 f36 f40 f45 f49 f7

Verified on 3.4.0.4rhs-1.el6rhs.x86_64. Working as expected, hence marking it as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
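
The trusted.glusterfs.dht values in the getfattr output above can be decoded to see which hash range each brick owns for the root directory. The sketch below assumes the 16-byte value is four big-endian 32-bit fields in the order count, type, start, stop; that field order is an assumption about the on-disk layout of this GlusterFS version, and the helper name `decode_dht_layout` is hypothetical.

```python
import struct


def decode_dht_layout(hex_value: str) -> dict:
    """Decode a trusted.glusterfs.dht xattr printed by `getfattr -e hex`.

    Assumed layout (not confirmed by the bug report): four big-endian
    unsigned 32-bit integers in the order count, type, start, stop.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    count, layout_type, start, stop = struct.unpack(">IIII", raw)
    return {"count": count, "type": layout_type, "start": start, "stop": stop}


# Values copied from the getfattr output in this report.
layouts = {
    "fan": "0x00000001000000000000000055555554",
    "fred": "0x0000000100000000aaaaaaaaffffffff",
    "mia": "0x000000010000000055555555aaaaaaa9",
}

# Print the bricks ordered by the start of their hash range.
for host, value in sorted(layouts.items(),
                          key=lambda item: decode_dht_layout(item[1])["start"]):
    layout = decode_dht_layout(value)
    print(f"{host}: 0x{layout['start']:08x} - 0x{layout['stop']:08x}")
```

Under that assumption the three ranges are contiguous and together cover the full 32-bit hash space, with fan (the brick shown as offline in the status output) holding the range that starts at 0x00000000.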