| Field | Value |
|---|---|
| Summary | fileop reports mkdir failed on NFS mount over unfs3booster |
| Product | [Community] GlusterFS |
| Component | distribute |
| Version | 2.0.5 |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | urgent |
| Reporter | Shehjar Tikoo <shehjart> |
| Assignee | Anand Avati <aavati> |
| CC | chrisw, gluster-bugs |
| Hardware | All |
| OS | Linux |
| Doc Type | Bug Fix |

On a recent run with the command line below, fileop reported that the mkdir syscall failed.
[root@client12 mount]# /root/shehjart/iozone3_326/src/current/fileop 1 -f 1 -s 1M -b -w -e -t
Mkdir failed
This is not reliably reproducible. When it occurred, I had restarted the unfs3booster server just before starting the test, expecting things to keep working without problems.
A snippet from the log file (see attachment) is below:
[2009-07-31 00:00:41] N [trace.c:1341:trace_mkdir] tr: 19: (path=/fileop_L1_0, ino=0, mode=493)
[2009-07-31 00:00:41] N [trace.c:1341:trace_mkdir] tr-below-wb: 19: (path=/fileop_L1_0, ino=0, mode=493)
[2009-07-31 00:00:41] N [trace.c:1695:trace_statfs] tr-brick1: 19: (loc {path=/, ino=0})
[2009-07-31 00:00:41] N [trace.c:1695:trace_statfs] tr-brick2: 19: (loc {path=/, ino=0})
[2009-07-31 00:00:41] N [trace.c:1695:trace_statfs] tr-brick5: 19: (loc {path=/, ino=0})
[2009-07-31 00:00:41] N [trace.c:1695:trace_statfs] tr-brick4: 19: (loc {path=/, ino=0})
[2009-07-31 00:00:41] D [dht-layout.c:101:dht_layout_search] dist-repl: no subvolume for hash (value) = 893457940
[2009-07-31 00:00:41] D [dht-helper.c:228:dht_subvol_get_hashed] dist-repl: could not find subvolume for path=/fileop_L1_0
[2009-07-31 00:00:41] D [dht-common.c:3003:dht_mkdir] dist-repl: hashed subvol not found for /fileop_L1_0
[2009-07-31 00:00:41] N [trace.c:617:trace_mkdir_cbk] tr-below-wb: 19: (op_ret=-1, op_errno=22, ino=0
[2009-07-31 00:00:41] N [trace.c:617:trace_mkdir_cbk] tr: 19: (op_ret=-1, op_errno=22, ino=0
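The numeric fields in the trace are easier to read once decoded: mode=493 is the decimal form of octal 0755, and op_errno=22 is EINVAL on Linux. The sketch below also illustrates, under simplifying assumptions, how a hash lookup can end up with "no subvolume": if the layout only carries ranges for the subvolumes that were up, a hash that falls into the missing range matches nothing. The `Range` type, the `layout_search` and `mkdir_errno` functions, the subvolume names, and the range boundaries are all hypothetical and are not GlusterFS's actual data structures; only the hash value, the mode, and the errno come from the log.

```python
import errno
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Range:
    # Hypothetical layout entry: one subvolume owns hashes in [start, stop].
    subvol: str
    start: int
    stop: int


def layout_search(layout: List[Range], hash_value: int) -> Optional[str]:
    """Return the subvolume whose range covers hash_value, or None."""
    for r in layout:
        if r.start <= hash_value <= r.stop:
            return r.subvol
    return None


def mkdir_errno(layout: List[Range], hash_value: int) -> int:
    """Mimic the logged outcome: no hashed subvolume yields EINVAL."""
    if layout_search(layout, hash_value) is None:
        return errno.EINVAL
    return 0


# Illustrative ranges only: a 32-bit hash space split into two halves.
full = [Range("repl1", 0x00000000, 0x7FFFFFFF),
        Range("repl2", 0x80000000, 0xFFFFFFFF)]
# Assumed failure case: "repl1" was not up when the layout was built.
partial = [Range("repl2", 0x80000000, 0xFFFFFFFF)]

HASH = 893457940   # hash value reported by dht_layout_search in the log
MODE = 493         # mode reported by trace_mkdir in the log

print(oct(MODE))                      # decoded permission bits
print(layout_search(full, HASH))      # covered when every range is present
print(layout_search(partial, HASH))   # falls into the gap
print(errno.errorcode[mkdir_errno(partial, HASH)])
```

With the complete layout the hash resolves to a subvolume; with the partial one the lookup returns nothing and the call fails, which matches the shape of the logged sequence without claiming to be the real cause.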
We haven't experienced exactly the same problem again. However, similar errors were seen when the initialization phase in libglusterfsclient finished before all the subvols of a distribute volume were up and ready for use. We have worked around that for now by adding a short sleep, which seems to work fine.
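The workaround described above is a fixed sleep. One possible alternative, shown here only as a sketch and not as what was actually committed, is to poll a readiness check with a deadline so that initialization returns as soon as every subvolume reports up. The `wait_until_ready` function and the `all_up` callback are hypothetical names; libglusterfsclient's real initialization API is not shown in this report.

```python
import time
from typing import Callable


def wait_until_ready(all_up: Callable[[], bool],
                     timeout: float = 5.0,
                     interval: float = 0.05) -> bool:
    """Poll all_up() until it returns True or the timeout expires.

    Returns True if readiness was observed, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if all_up():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage with a stand-in check that becomes ready on the third poll.
polls = {"n": 0}


def up_after_three() -> bool:
    polls["n"] += 1
    return polls["n"] >= 3


ready = wait_until_ready(up_after_three, timeout=1.0, interval=0.001)
print(ready, polls["n"])
```

Compared with a fixed sleep, a bounded poll avoids waiting longer than needed when the subvolumes come up quickly, and it reports a timeout instead of proceeding silently when they do not.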
Created attachment 50: Tar file to reproduce bug