Created attachment 50: Tar file to reproduce bug
On a recent run with the command line below, fileop reported that the mkdir syscall failed.

[root@client12 mount]# /root/shehjart/iozone3_326/src/current/fileop 1 -f 1 -s 1M -b -w -e -t
Mkdir failed

This is not reliably reproducible. When it occurred, I had restarted the unfs3booster server just before starting the test, expecting things to work without problems.

A snippet from the log file (see the attachment) is below:

[2009-07-31 00:00:41] N [trace.c:1341:trace_mkdir] tr: 19: (path=/fileop_L1_0, ino=0, mode=493)
[2009-07-31 00:00:41] N [trace.c:1341:trace_mkdir] tr-below-wb: 19: (path=/fileop_L1_0, ino=0, mode=493)
[2009-07-31 00:00:41] N [trace.c:1695:trace_statfs] tr-brick1: 19: (loc {path=/, ino=0})
[2009-07-31 00:00:41] N [trace.c:1695:trace_statfs] tr-brick2: 19: (loc {path=/, ino=0})
[2009-07-31 00:00:41] N [trace.c:1695:trace_statfs] tr-brick5: 19: (loc {path=/, ino=0})
[2009-07-31 00:00:41] N [trace.c:1695:trace_statfs] tr-brick4: 19: (loc {path=/, ino=0})
[2009-07-31 00:00:41] D [dht-layout.c:101:dht_layout_search] dist-repl: no subvolume for hash (value) = 893457940
[2009-07-31 00:00:41] D [dht-helper.c:228:dht_subvol_get_hashed] dist-repl: could not find subvolume for path=/fileop_L1_0
[2009-07-31 00:00:41] D [dht-common.c:3003:dht_mkdir] dist-repl: hashed subvol not found for /fileop_L1_0
[2009-07-31 00:00:41] N [trace.c:617:trace_mkdir_cbk] tr-below-wb: 19: (op_ret=-1, op_errno=22, ino=0
[2009-07-31 00:00:41] N [trace.c:617:trace_mkdir_cbk] tr: 19: (op_ret=-1, op_errno=22, ino=0
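For anyone reading the trace lines: mode and op_errno are printed in decimal. The short sketch below decodes the two values that appear in the log (mode=493 and op_errno=22). It assumes Linux errno numbering; the helper names decode_mode and decode_errno are made up for this note and are not part of glusterfs.

```python
import errno
import os
import stat


def decode_mode(mode_decimal):
    """Return the permission bits of a decimal mode as a 4-digit octal string."""
    return format(stat.S_IMODE(mode_decimal), "04o")


def decode_errno(op_errno):
    """Return the symbolic errno name, or 'UNKNOWN' if this platform has none."""
    return errno.errorcode.get(op_errno, "UNKNOWN")


if __name__ == "__main__":
    # Values taken from the trace_mkdir / trace_mkdir_cbk lines in the log.
    print("mode=493     ->", decode_mode(493))
    print("op_errno=22  ->", decode_errno(22), "-", os.strerror(22))
```

On Linux this shows that the mkdir was requested with mode 0755 and came back with EINVAL.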
We haven't seen exactly this problem again. However, similar errors were seen when the initialization phase in libglusterfsclient finished before all the subvols of a distribute volume were up and ready for use. We have worked around that for now by adding a short sleep, which seems to work fine.
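As an illustration of the difference between a fixed sleep and waiting on readiness, here is a minimal sketch of a bounded poll. It is not the change made in libglusterfsclient (which is C); it is a generic Python sketch, and is_ready is a hypothetical callable standing in for "all subvols of the distribute volume are up".

```python
import time


def wait_until_ready(is_ready, timeout=5.0, interval=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or the timeout expires.

    Returns True if the condition became true in time, False otherwise.
    is_ready is a caller-supplied (hypothetical) readiness check.
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated readiness: the check succeeds on the third poll.
    polls = {"count": 0}

    def fake_subvols_up():
        polls["count"] += 1
        return polls["count"] >= 3

    print(wait_until_ready(fake_subvols_up, timeout=1.0, interval=0.01))
```

Compared with a fixed sleep, a poll like this returns as soon as the condition holds and reports a timeout explicitly instead of proceeding with a partially initialized volume.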