In the third run, dmesg output also shows:

NFS: nfs_lookup_revalidate(net/phy) is invalid
NFS: dentry_delete(net/phy, 18)
NFS: lookup(net/phy)
NFS: dentry_delete(net/phy, 10)

confirming that the problem occurs because the lookup on net/phy failed.
compilebench fails on one NFS client when there are two NFS clients running a cbench run in their own directories on the same export.

#################FIRST RUN##########################

The client that fails has the following error message:

[root@domU-12-31-39-10-66-21 compilebench-0.6]# ./compilebench -D /mnt/shehjart/mount/bench2/
using working directory /mnt/shehjart/mount/bench2/, 30 intial dirs 100 runs
Traceback (most recent call last):
  File "./compilebench", line 567, in <module>
    dset = dataset(options.sources, rnd)
  File "./compilebench", line 319, in __init__
    self.unpatched = native_order(self.unpatched, "unpatched")
  File "./compilebench", line 97, in native_order
    run_directory(tmplist, dirname, "native %s" % tag)
  File "./compilebench", line 225, in run_directory
    fp = file(fname, 'a+')
IOError: [Errno 526] Unknown error 526: '/mnt/shehjart/mount/bench2/native-0/drivers/pnp/pnpacpi/Kconfig'

A Wireshark trace shows an NFS LOOKUP reply with NFS3ERR_SERVERFAULT.

#################SECOND RUN##########################

[root@domU-12-31-39-10-66-21 compilebench-0.6]# ./compilebench -D /mnt/shehjart/mount/c2/
using working directory /mnt/shehjart/mount/c2/, 30 intial dirs 100 runs
Traceback (most recent call last):
  File "./compilebench", line 567, in <module>
    dset = dataset(options.sources, rnd)
  File "./compilebench", line 319, in __init__
    self.unpatched = native_order(self.unpatched, "unpatched")
  File "./compilebench", line 97, in native_order
    run_directory(tmplist, dirname, "native %s" % tag)
  File "./compilebench", line 225, in run_directory
    fp = file(fname, 'a+')
IOError: [Errno 526] Unknown error 526: '/mnt/shehjart/mount/c2/native-0/include/video/cyblafb.h'

Note: there is no error message in the NFS server log for cyblafb.h.

#################THIRD RUN##########################

[root@domU-12-31-39-10-66-21 compilebench-0.6]# ./compilebench -D /mnt/shehjart/mount/c2/
using working directory /mnt/shehjart/mount/c2/, 30 intial dirs 100 runs
Traceback (most recent call last):
  File "./compilebench", line 567, in <module>
    dset = dataset(options.sources, rnd)
  File "./compilebench", line 319, in __init__
    self.unpatched = native_order(self.unpatched, "unpatched")
  File "./compilebench", line 97, in native_order
    run_directory(tmplist, dirname, "native %s" % tag)
  File "./compilebench", line 225, in run_directory
    fp = file(fname, 'a+')
IOError: [Errno 526] Unknown error 526: '/mnt/shehjart/mount/c2/native-0/drivers/net/phy/broadcom.c'

The NFS server log contains:

[2010-05-21 01:40:36] D [nfs3-helpers.c:2069:nfs3_log_fh_entry_call] nfs-nfsv3: XID: 3d12dccd, LOOKUP: args: FH: hashcount 4, xlid 1, gen 5473591126151662475, ino 15402732, name: phy
###############################################################
[2010-05-21 01:40:36] D [dht-layout.c:653:dht_layout_dir_mismatch] dr: /c2/native-0/drivers/net/phy - disk layout missing
[2010-05-21 01:40:36] D [dht-common.c:278:dht_revalidate_cbk] dr: mismatching layouts for /c2/native-0/drivers/net/phy
###############################################################
[2010-05-21 01:40:36] D [nfs3-helpers.c:2242:nfs3_log_newfh_res] nfs-nfsv3: XID: 3d12dccd, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 116(Stale NFS file handle), FH: hashcount 0, xlid 0, gen 0, ino 0
[2010-05-21 01:40:36] D [rpcsvc.c:1817:rpcsvc_request_create] rpc-service: RPC XID: 3e12dccd, Ver: 2, Program: 100003, ProgVers: 3, Proc: 3
[2010-05-21 01:40:36] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - LOOKUP
[2010-05-21 01:40:36] D [nfs3-helpers.c:2069:nfs3_log_fh_entry_call] nfs-nfsv3: XID: 3e12dccd, LOOKUP: args: FH: hashcount 4, xlid 1, gen 5473591126151662475, ino 15402732, name: phy
###############################################################
[2010-05-21 01:40:36] D [dht-layout.c:653:dht_layout_dir_mismatch] dr: /c2/native-0/drivers/net/phy - disk layout missing
[2010-05-21 01:40:36] D [dht-common.c:278:dht_revalidate_cbk] dr: mismatching layouts for /c2/native-0/drivers/net/phy
###############################################################
[2010-05-21 01:40:36] D [nfs3-helpers.c:2242:nfs3_log_newfh_res] nfs-nfsv3: XID: 3e12dccd, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 116(Stale NFS file handle), FH: hashcount 0, xlid 0, gen 0, ino 0

The NFS3ERR_SERVERFAULT is returned due to an error returned by distribute.
This also happens with a single client, though not on every run. Whenever it does happen, the cause is dht layout mismatch errors.
Latest news on this. I have not been able to reproduce this over:

- distribute with no perf translators
- distribute with all perf translators

It consistently occurs with dist-repl with all performance translators, and only when running two clients, each with a compilebench load.

The log suggests the following:

#########################################
##### MKDIR is received for lib-64 ######
#########################################
[2010-06-25 02:02:08] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - MKDIR
[2010-06-25 02:02:08] D [nfs3-helpers.c:2183:nfs3_log_fh_entry_call] nfs-nfsv3: XID: 46e2ee2a, MKDIR: args: FH: hashcount 4, xlid 0, gen 5486586052216884168, ino 15402804, name: lib-64
[2010-06-25 02:02:08] T [nfs3.c:2427:nfs3_mkdir] nfs-nfsv3: FH to Volume: dr
[2010-06-25 02:02:08] T [nfs3-helpers.c:2884:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: ino: 15402804, gen: 5486586052216884168, entry: lib-64, hashidx: 0
[2010-06-25 02:02:08] T [nfs3-helpers.c:2892:nfs3_fh_resolve_entry_hard] nfs-nfsv3: Entry needs lookup: /c1/native-0/arch/mips/lib-64
[2010-06-25 02:02:08] T [nfs-fops.c:279:nfs_fop_lookup] nfs: Lookup: /c1/native-0/arch/mips/lib-64

#########################################
##### The lookup above fails for lib-64 #
#########################################
[2010-06-25 02:02:08] T [server-protocol.c:2341:server_lookup_cbk] localsrv: 55350: LOOKUP /c1/native-0/arch/mips/lib-64 (0) ==> -1 (No such file or directory)
[2010-06-25 02:02:08] T [access-control.c:205:ac_test_access] access-control: Root has access
[2010-06-25 02:02:08] T [nfs3-helpers.c:2496:nfs3_fh_resolve_entry_lookup_cbk] nfs-nfsv3: Lookup failed: /c1/native-0/arch/mips/lib-64: No such file or directory
[2010-06-25 02:02:08] T [nfs.c:407:nfs_user_create] nfs: uid: 0, gid 0, gids: 0
[2010-06-25 02:02:08] T [nfs-fops.c:697:nfs_fop_mkdir] nfs: Mkdir: /c1/native-0/arch/mips/lib-64

#################################################
##### DHT does its thing for directory creation #
#################################################
[2010-06-25 02:02:08] T [dht-layout.c:352:dht_layout_merge] dr: missing disk layout on repl2. err = -1
.....
.....
.....
[2010-06-25 02:02:08] T [nfs.c:407:nfs_user_create] nfs: uid: 0, gid 0, gids: 0
[2010-06-25 02:02:08] T [dht-layout.c:352:dht_layout_merge] dr: missing disk layout on repl1. err = -1
[2010-06-25 02:02:08] T [dht-selfheal.c:354:dht_selfheal_layout_new_directory] dr: gave fix: 0 - 2147483646 on repl1 for /c1/native-0/arch/mips/lib-64
[2010-06-25 02:02:08] T [dht-selfheal.c:354:dht_selfheal_layout_new_directory] dr: gave fix: 2147483647 - 4294967293 on repl2 for /c1/native-0/arch/mips/lib-64
[2010-06-25 02:02:08] T [dht-selfheal.c:174:dht_selfheal_dir_xattr] dr: 2 subvolumes missing xattr for /c1/native-0/arch/mips/lib-64
[2010-06-25 02:02:08] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] dr: setting hash range 0 - 2147483646 (type 0) on subvolume repl1 for /c1/native-0/arch/mips/lib-64
[2010-06-25 02:02:08] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] dr: setting hash range 2147483647 - 4294967295 (type 0) on subvolume repl2 for /c1/native-0/arch/mips/lib-64

#################################################
##### NFS mkdir finishes                    #####
#################################################
[2010-06-25 02:02:08] D [nfs3-helpers.c:2356:nfs3_log_newfh_res] nfs-nfsv3: XID: 46e2ee2a, MKDIR: NFS: 0(Call completed successfully.), POSIX: 117(Structure needs cleaning), FH: hashcount 5, xlid 0, gen 5486586052216894264, ino 15536128

#################################################
##### NFS mkdir is soon followed by a lookup ####
##### on the created directory               ####
#################################################
[2010-06-25 02:02:08] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service: Actor found: NFS3 - LOOKUP
[2010-06-25 02:02:08] D [nfs3-helpers.c:2183:nfs3_log_fh_entry_call] nfs-nfsv3: XID: 49e2ee2a, LOOKUP: args: FH: hashcount 4, xlid 0, gen 5486586052216884168, ino 15402804, name: lib-64
[2010-06-25 02:02:08] T [nfs3.c:1038:nfs3_lookup] nfs-nfsv3: FH to Volume: dr
[2010-06-25 02:02:08] T [nfs3-helpers.c:2884:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: ino: 15402804, gen: 5486586052216884168, entry: lib-64, hashidx: 0
[2010-06-25 02:02:08] T [nfs.c:407:nfs_user_create] nfs: uid: 0, gid 0, gids: 0
[2010-06-25 02:02:08] T [nfs-fops.c:279:nfs_fop_lookup] nfs: Lookup: /c1/native-0/arch/mips/lib-64
[2010-06-25 02:02:08] D [dht-layout.c:653:dht_layout_dir_mismatch] dr: /c1/native-0/arch/mips/lib-64 - disk layout missing
[2010-06-25 02:02:08] D [dht-common.c:278:dht_revalidate_cbk] dr: mismatching layouts for /c1/native-0/arch/mips/lib-64
[2010-06-25 02:02:08] D [nfs3-helpers.c:2356:nfs3_log_newfh_res] nfs-nfsv3: XID: 49e2ee2a, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 116(Stale NFS file handle), FH: hashcount 0, xlid 0, gen 0, ino 0
Going back to the start-up phase, when the working directories are created (i.e. directories c1 and c2 for the two clients), distribute reports the following while creating the directory.

###################################################
##### MKdir for the c1 directory ##################
###################################################
[2010-06-25 01:56:47] D [nfs3-helpers.c:2183:nfs3_log_fh_entry_call] nfs-nfsv3: XID: 7c4aee2a, MKDIR: args: FH: hashcount 0, xlid 0, gen 0, ino 1, name: c1
[2010-06-25 01:56:47] T [nfs3.c:2427:nfs3_mkdir] nfs-nfsv3: FH to Volume: dr
[2010-06-25 01:56:47] T [nfs3-helpers.c:2884:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: ino: 1, gen: 0, entry: c1, hashidx: 0
[2010-06-25 01:56:47] T [nfs3-helpers.c:2892:nfs3_fh_resolve_entry_hard] nfs-nfsv3: Entry needs lookup: /c1
[2010-06-25 01:56:47] T [nfs-fops.c:279:nfs_fop_lookup] nfs: Lookup: /c1
[2010-06-25 01:56:47] T [server-protocol.c:2341:server_lookup_cbk] localsrv: 16: LOOKUP /c1 (0) ==> -1 (No such file or directory)
[2010-06-25 01:56:47] T [nfs3-helpers.c:2496:nfs3_fh_resolve_entry_lookup_cbk] nfs-nfsv3: Lookup failed: /c1: No such file or directory
[2010-06-25 01:56:47] T [nfs.c:407:nfs_user_create] nfs: uid: 0, gid 0, gids: 0
[2010-06-25 01:56:47] T [nfs-fops.c:697:nfs_fop_mkdir] nfs: Mkdir: /c1
[2010-06-25 01:56:47] D [dht-diskusage.c:71:dht_du_info_cbk] dr: on subvolume 'repl2': avail_percent is: 51.00 and avail_space is: 228606398464
[2010-06-25 01:56:47] D [dht-diskusage.c:71:dht_du_info_cbk] dr: on subvolume 'repl1': avail_percent is: 99.00 and avail_space is: 443556425728
[2010-06-25 01:56:47] T [access-control.c:205:ac_test_access] access-control: Root has access
[2010-06-25 01:56:47] T [dht-layout.c:352:dht_layout_merge] dr: missing disk layout on repl2. err = -1
[2010-06-25 01:56:47] T [dht-layout.c:352:dht_layout_merge] dr: missing disk layout on repl1. err = -1
[2010-06-25 01:56:47] T [dht-selfheal.c:354:dht_selfheal_layout_new_directory] dr: gave fix: 0 - 2147483646 on repl1 for /c1
[2010-06-25 01:56:47] T [dht-selfheal.c:354:dht_selfheal_layout_new_directory] dr: gave fix: 2147483647 - 4294967293 on repl2 for /c1

##########################################################
##### MKdir for the c1 directory reports missing #########
##### xattrs even after giving fixes             #########
##########################################################
[2010-06-25 01:56:47] T [dht-selfheal.c:174:dht_selfheal_dir_xattr] dr: 2 subvolumes missing xattr for /c1
[2010-06-25 01:56:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] dr: setting hash range 0 - 2147483646 (type 0) on subvolume repl1 for /c1
[2010-06-25 01:56:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] dr: setting hash range 2147483647 - 4294967295 (type 0) on subvolume repl2 for /c1

###################################################
##### MKdir for the c1 directory succeeds #########
###################################################
[2010-06-25 01:56:47] D [nfs3-helpers.c:2356:nfs3_log_newfh_res] nfs-nfsv3: XID: 7c4aee2a, MKDIR: NFS: 0(Call completed successfully.), POSIX: 117(Structure needs cleaning), FH: hashcount 1, xlid 0, gen 5486586052216881154, ino 15335436
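For context on what the "gave fix" / "setting hash range" lines mean: distribute gives each directory a layout that partitions the 32-bit hash space across its subvolumes, records one range per subvolume in that subvolume's trusted.glusterfs.dht xattr, and hashes file names into that space to decide which subvolume holds a file. The following is a minimal conceptual sketch, not GlusterFS's actual code; the subvolume names and ranges are copied from the log above, and the hash function (distribute computes its own 32-bit hash of the name) is left out.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One hash range per subvolume, mirroring what the trusted.glusterfs.dht
 * xattr on each subvolume's copy of the directory records. The values are
 * the ranges the self-heal log above assigned to repl1 and repl2. */
struct range {
        const char *subvol;
        uint32_t    start;
        uint32_t    stop;
};

static struct range layout[] = {
        { "repl1", 0u,          2147483646u },
        { "repl2", 2147483647u, 4294967295u },
};

/* Given the 32-bit hash of a file name, pick the subvolume whose range
 * contains it. */
static const char *
subvol_for_hash (uint32_t hash)
{
        for (size_t i = 0; i < sizeof (layout) / sizeof (layout[0]); i++) {
                if (hash >= layout[i].start && hash <= layout[i].stop)
                        return layout[i].subvol;
        }
        return NULL;  /* hole in the layout: no subvolume owns this hash */
}

int
main (void)
{
        printf ("hash 0x12345678 -> %s\n", subvol_for_hash (0x12345678u));
        printf ("hash 0xcafebabe -> %s\n", subvol_for_hash (0xcafebabeu));
        return 0;
}

If the xattr is missing from a subvolume's lookup reply, dht has no range for that subvolume and flags the directory's layout as mismatched, which is exactly the "disk layout missing" condition the lookups above run into.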
The following command is the smallest compilebench load that triggers the bug:

compilebench -D <working-dir> -r 100 -i 5
Further investigation reveals the code path where it fails:

###############################################################
The point where ESTALE is handed to NFS
###############################################################
dht-common.c:313:

        if (local->layout_mismatch) {
                local->op_ret = -1;
                local->op_errno = ESTALE;
        }

###############################################################
Dht itself decides to return ESTALE by setting layout_mismatch = 1
###############################################################
dht-common.c:271:

        if (is_dir) {
                ret = dht_layout_dir_mismatch (this, layout,
                                               prev->this, &local->loc,
                                               xattr);
                if (ret != 0) {
                        gf_log (this->name, GF_LOG_DEBUG,
                                "mismatching layouts for %s",
                                local->loc.path);
                        local->layout_mismatch = 1;
                        goto unlock;
                }
        }

###############################################################
That in turn happens because dht cannot find the required xattr in the
dictionary returned by lookup.
###############################################################
In dht_layout_dir_mismatch:

        dict_ret = dict_get_ptr (xattr, "trusted.glusterfs.dht",
                                 &disk_layout_raw);
        if (dict_ret < 0) {
                if (err == 0) {
                        gf_log (this->name, GF_LOG_DEBUG,
                                "%s - disk layout missing", loc->path);
                        ret = -1;
                }
                goto out;
        }
Setting up gdb to intercept the glusterfs process at the point where it will return ESTALE:

Breakpoint 1, dht_revalidate_cbk (frame=0x10b5d70, cookie=0x10b5b00, this=0x6147b0, op_ret=<value optimized out>, op_errno=0, inode=<value optimized out>, stbuf=0x10b7ed0, xattr=0xfe19d0, postparent=0x10b7fa0) at dht-common.c:280
280             local->layout_mismatch = 1;
(gdb) p op_ret
$1 = <value optimized out>
(gdb) p xattr
$2 = (dict_t *) 0xfe19d0
(gdb) p *xattr
$3 = {is_static = 0 '\0', hash_size = 1, count = 3, refcount = 3, members = 0x10bb2e0, members_list = 0x10ba1b0, extra_free = 0x10ba130 "", lock = 1}
(gdb) p *xattr->members
$4 = (data_pair_t *) 0x10ba1b0

#############################################################
The required xattr is not present in the dict
#############################################################
(gdb) p **xattr->members
$5 = {hash_next = 0x10bb2b0, prev = 0x0, next = 0x10bb2b0, value = 0x10bb330, key = 0x10bb3a0 "glusterfs.open-fd-count"}
(gdb) p *xattr->members_list
$6 = {hash_next = 0x10bb2b0, prev = 0x0, next = 0x10bb2b0, value = 0x10bb330, key = 0x10bb3a0 "glusterfs.open-fd-count"}
(gdb) p *xattr->members_list->next
$7 = {hash_next = 0x10baf60, prev = 0x10ba1b0, next = 0x10baf60, value = 0x10b99d0, key = 0x10bb360 "glusterfs.entrylk-count"}
(gdb) p *xattr->members_list->next->next
$8 = {hash_next = 0x0, prev = 0x10bb2b0, next = 0x0, value = 0x10baf30, key = 0x10b99b0 "glusterfs.inodelk-count"}
(gdb) p *xattr->members_list->next->next->next
Cannot access memory at address 0x0
(gdb) p *xattr->members_list->prev
Cannot access memory at address 0x0
(gdb) bt
#0  dht_revalidate_cbk (frame=0x10b5d70, cookie=0x10b5b00, this=0x6147b0, op_ret=<value optimized out>, op_errno=0, inode=<value optimized out>, stbuf=0x10b7ed0, xattr=0xfe19d0, postparent=0x10b7fa0) at dht-common.c:280
#1  0x00002aaaac12badb in afr_lookup_done (frame=0x10b5b00, this=<value optimized out>, lookup_buf=<value optimized out>) at afr.c:660
#2  0x00002aaaac12c77c in afr_revalidate_lookup_cbk (frame=0x10b5b00, cookie=0x0, this=0x612ba0, op_ret=0, op_errno=17517104, inode=0x10b4b90, buf=0x7fff9383f060, xattr=0xfe19d0, postparent=0x7fff9383eff0) at afr.c:936
#3  0x00002aaaabf11c00 in client_lookup_cbk (frame=0xfde3a0, hdr=<value optimized out>, hdrlen=<value optimized out>, iobuf=<value optimized out>) at client-protocol.c:4757
#4  0x00002aaaabefd22a in protocol_client_pollin (this=0x60f080, trans=0x6217b0) at client-protocol.c:6435
#5  0x00002aaaabf0bba2 in notify (this=0x0, event=2, data=0x6217b0) at client-protocol.c:6554
#6  0x00002aaaaacdad13 in xlator_notify (xl=0x60f080, event=2, data=0x6217b0) at xlator.c:919
#7  0x00002aaaad44f0e3 in socket_event_handler (fd=<value optimized out>, idx=2, data=0x6217b0, poll_in=1, poll_out=0, poll_err=0) at socket.c:831
#8  0x00002aaaaacf5dc5 in event_dispatch_epoll (event_pool=0x609360) at event.c:804
#9  0x0000000000404397 in main (argc=8, argv=0x7fff9383ffb8) at glusterfsd.c:1494
(gdb) fr 2
#2  0x00002aaaac12c77c in afr_revalidate_lookup_cbk (frame=0x10b5b00, cookie=0x0, this=0x612ba0, op_ret=0, op_errno=17517104, inode=0x10b4b90, buf=0x7fff9383f060, xattr=0xfe19d0, postparent=0x7fff9383eff0) at afr.c:936
936             afr_lookup_done (frame, this, lookup_buf);

#############################################################
Not available at the AFR level either.
#############################################################
(gdb) p *xattr
$9 = {is_static = 0 '\0', hash_size = 1, count = 3, refcount = 3, members = 0x10bb2e0, members_list = 0x10ba1b0, extra_free = 0x10ba130 "", lock = 1}
(gdb) p *xattr->members_list
$10 = {hash_next = 0x10bb2b0, prev = 0x0, next = 0x10bb2b0, value = 0x10bb330, key = 0x10bb3a0 "glusterfs.open-fd-count"}
(gdb) p *xattr->members_list->nexyt
There is no member named nexyt.
(gdb) p *xattr->members_list->next
$11 = {hash_next = 0x10baf60, prev = 0x10ba1b0, next = 0x10baf60, value = 0x10b99d0, key = 0x10bb360 "glusterfs.entrylk-count"}
(gdb) p *xattr->members_list->next->next
$12 = {hash_next = 0x0, prev = 0x10bb2b0, next = 0x0, value = 0x10baf30, key = 0x10b99b0 "glusterfs.inodelk-count"}
(gdb) p *xattr->members_list->next->next->next
Cannot access memory at address 0x0
(gdb) bt
#0  dht_revalidate_cbk (frame=0x10b5d70, cookie=0x10b5b00, this=0x6147b0, op_ret=<value optimized out>, op_errno=0, inode=<value optimized out>, stbuf=0x10b7ed0, xattr=0xfe19d0, postparent=0x10b7fa0) at dht-common.c:280
#1  0x00002aaaac12badb in afr_lookup_done (frame=0x10b5b00, this=<value optimized out>, lookup_buf=<value optimized out>) at afr.c:660
#2  0x00002aaaac12c77c in afr_revalidate_lookup_cbk (frame=0x10b5b00, cookie=0x0, this=0x612ba0, op_ret=0, op_errno=17517104, inode=0x10b4b90, buf=0x7fff9383f060, xattr=0xfe19d0, postparent=0x7fff9383eff0) at afr.c:936
#3  0x00002aaaabf11c00 in client_lookup_cbk (frame=0xfde3a0, hdr=<value optimized out>, hdrlen=<value optimized out>, iobuf=<value optimized out>) at client-protocol.c:4757
#4  0x00002aaaabefd22a in protocol_client_pollin (this=0x60f080, trans=0x6217b0) at client-protocol.c:6435
#5  0x00002aaaabf0bba2 in notify (this=0x0, event=2, data=0x6217b0) at client-protocol.c:6554
#6  0x00002aaaaacdad13 in xlator_notify (xl=0x60f080, event=2, data=0x6217b0) at xlator.c:919
#7  0x00002aaaad44f0e3 in socket_event_handler (fd=<value optimized out>, idx=2, data=0x6217b0, poll_in=1, poll_out=0, poll_err=0) at socket.c:831
#8  0x00002aaaaacf5dc5 in event_dispatch_epoll (event_pool=0x609360) at event.c:804
#9  0x0000000000404397 in main (argc=8, argv=0x7fff9383ffb8) at glusterfsd.c:1494
(gdb) fr 3
#3  0x00002aaaabf11c00 in client_lookup_cbk (frame=0xfde3a0, hdr=<value optimized out>, hdrlen=<value optimized out>, iobuf=<value optimized out>) at client-protocol.c:4757
4757            STACK_UNWIND (frame, op_ret, op_errno, inode, &stbuf, xattr,
(gdb) p *xattr
$13 = {is_static = 0 '\0', hash_size = 1, count = 3, refcount = 3, members = 0x10bb2e0, members_list = 0x10ba1b0, extra_free = 0x10ba130 "", lock = 1}

#############################################################
Not available at the client either. Definitely not being sent by storage/posix.
#############################################################
(gdb) p *xattr->members_list
$14 = {hash_next = 0x10bb2b0, prev = 0x0, next = 0x10bb2b0, value = 0x10bb330, key = 0x10bb3a0 "glusterfs.open-fd-count"}
(gdb) p *xattr->members_list->next
$15 = {hash_next = 0x10baf60, prev = 0x10ba1b0, next = 0x10baf60, value = 0x10b99d0, key = 0x10bb360 "glusterfs.entrylk-count"}
(gdb) p *xattr->members_list->next->next
$16 = {hash_next = 0x0, prev = 0x10bb2b0, next = 0x0, value = 0x10baf30, key = 0x10b99b0 "glusterfs.inodelk-count"}
(gdb) p *xattr->members_list->next->next->next
Cannot access memory at address 0x0
(gdb) bt
#0  dht_revalidate_cbk (frame=0x10b5d70, cookie=0x10b5b00, this=0x6147b0, op_ret=<value optimized out>, op_errno=0, inode=<value optimized out>, stbuf=0x10b7ed0, xattr=0xfe19d0, postparent=0x10b7fa0) at dht-common.c:280
#1  0x00002aaaac12badb in afr_lookup_done (frame=0x10b5b00, this=<value optimized out>, lookup_buf=<value optimized out>) at afr.c:660
#2  0x00002aaaac12c77c in afr_revalidate_lookup_cbk (frame=0x10b5b00, cookie=0x0, this=0x612ba0, op_ret=0, op_errno=17517104, inode=0x10b4b90, buf=0x7fff9383f060, xattr=0xfe19d0, postparent=0x7fff9383eff0) at afr.c:936
#3  0x00002aaaabf11c00 in client_lookup_cbk (frame=0xfde3a0, hdr=<value optimized out>, hdrlen=<value optimized out>, iobuf=<value optimized out>) at client-protocol.c:4757
#4  0x00002aaaabefd22a in protocol_client_pollin (this=0x60f080, trans=0x6217b0) at client-protocol.c:6435
#5  0x00002aaaabf0bba2 in notify (this=0x0, event=2, data=0x6217b0) at client-protocol.c:6554
#6  0x00002aaaaacdad13 in xlator_notify (xl=0x60f080, event=2, data=0x6217b0) at xlator.c:919
#7  0x00002aaaad44f0e3 in socket_event_handler (fd=<value optimized out>, idx=2, data=0x6217b0, poll_in=1, poll_out=0, poll_err=0) at socket.c:831
#8  0x00002aaaaacf5dc5 in event_dispatch_epoll (event_pool=0x609360) at event.c:804
#9  0x0000000000404397 in main (argc=8, argv=0x7fff9383ffb8) at glusterfsd.c:1494

#############################################################
Get the path of the directory being looked up. We will next check for the xattr on the bricks to confirm.
#############################################################
(gdb) p local->loc
$17 = {path = 0xa8c450 "/c1/native-0/drivers/scsi/ibmvscsi", name = 0xa8c46a "ibmvscsi", ino = 15348260, inode = 0x10b4b90, parent = 0xdaa2e0}
(gdb) p local->loc2
$18 = {path = 0x0, name = 0x0, ino = 0, inode = 0x0, parent = 0x0}
(gdb) p this->name
$19 = 0x60f060 "client1"

#############################################################
Brick1 has it
#############################################################
[root@domU-12-31-39-0E-8E-31 testdir1]# getfattr -d -m ".*" c1/native-0/drivers/scsi/ibmvscsi
# file: c1/native-0/drivers/scsi/ibmvscsi
trusted.glusterfs.dht=0sAAAAAQAAAAB//////////w==
trusted.posix1.gen=0sTEbLiwAADIg=

#############################################################
So does brick2
#############################################################
[root@domU-12-31-39-0B-E0-82 testdir1]# getfattr -d -m ".*" c1/native-0/drivers/scsi/ibmvscsi
# file: c1/native-0/drivers/scsi/ibmvscsi
trusted.glusterfs.dht=0sAAAAAQAAAAB//////////w==
trusted.posix1.gen=0sTEbLjwAADIg=

#############################################################
So does brick3
#############################################################
[root@domU-12-31-39-0E-85-62 testdir1]# getfattr -d -m ".*" c1/native-0/drivers/scsi/ibmvscsi
# file: c1/native-0/drivers/scsi/ibmvscsi
trusted.glusterfs.dht=0sAAAAAQAAAAAAAAAAf////g==
trusted.posix1.gen=0sTEbLkgAADIg=

#############################################################
So does brick4
#############################################################
[root@domU-12-31-39-10-6D-E2 testdir1]# getfattr -d -m ".*" c1/native-0/drivers/scsi/ibmvscsi
# file: c1/native-0/drivers/scsi/ibmvscsi
trusted.glusterfs.dht=0sAAAAAQAAAAAAAAAAf////g==
trusted.posix1.gen=0sTEbmcQAADIg=
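As a sanity check, the trusted.glusterfs.dht values above can be decoded directly on a brick. The following is a small sketch, assuming the 16-byte on-disk format of four network-byte-order 32-bit fields (count, type, start, stop), which is how the values shown here decode:

#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>   /* ntohl */
#include <sys/xattr.h>   /* lgetxattr (Linux) */

int
main (int argc, char **argv)
{
        uint32_t raw[4];  /* count, type, start, stop in network byte order */
        ssize_t  len;

        if (argc != 2) {
                fprintf (stderr, "usage: %s <directory-on-brick>\n", argv[0]);
                return 1;
        }

        /* Read the raw layout xattr straight off the brick directory. */
        len = lgetxattr (argv[1], "trusted.glusterfs.dht", raw, sizeof (raw));
        if (len != (ssize_t) sizeof (raw)) {
                perror ("lgetxattr(trusted.glusterfs.dht)");
                return 1;
        }

        printf ("count %u, type %u, hash range %u - %u\n",
                ntohl (raw[0]), ntohl (raw[1]),
                ntohl (raw[2]), ntohl (raw[3]));
        return 0;
}

Decoded this way, the value on bricks 1 and 2 is the range 2147483647 - 4294967295 and the value on bricks 3 and 4 is 0 - 2147483646, matching the ranges dht_selfheal set. So the on-disk xattrs are intact; they are simply not reaching dht in the lookup reply.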
Avati's solution is to change wait-count in replicate to child-count so that every transaction waits for all children to reply before replicate unwinds to NFS.

i.e. in xlators/cluster/afr/src/afr.c:2967, change:

        priv->wait_count = 1;

to:

        priv->wait_count = child_count;

This change does prevent the ESTALE seen above with compilebench.
(In reply to comment #8)
> Avati's solution is to change wait-count in replicate to child-count so that
> every transaction waits for all children to reply before replicate unwinds to
> NFS.
>
> i.e. in xlators/cluster/afr/src/afr.c:2967, change:
>
>         priv->wait_count = 1;
>
> to:
>
>         priv->wait_count = child_count;
>
> This change does prevent the ESTALE seen above with compilebench.

The change prevents a race condition that occurs because NFS sends a lookup on the created directory even before replicate has received all the setxattr callbacks from its children.
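To make the race concrete, here is a minimal sketch of the wait-count pattern being described; this is illustrative code, not the actual afr transaction machinery, and the struct and function names are made up:

#include <pthread.h>

/* Illustrative fan-out bookkeeping, loosely modelled on the wait-count
 * idea discussed above; not GlusterFS/afr code. */
struct fanout {
        pthread_mutex_t lock;
        int  child_count;   /* children the setxattr was fanned out to        */
        int  replies;       /* setxattr callbacks received so far             */
        int  wait_count;    /* replies needed before unwinding to the caller: */
                            /* 1 today, child_count with the proposed change  */
        void (*unwind) (void *cookie);
        void *cookie;
};

/* Invoked once per child as that child's setxattr completes. */
void
child_setxattr_cbk (struct fanout *f)
{
        int unwind_now = 0;

        pthread_mutex_lock (&f->lock);
        f->replies++;
        /* With wait_count == 1 this fires on the very first reply, while the
         * other replica may not have written its layout xattr yet; a LOOKUP
         * issued by the NFS client right after MKDIR returns can then be
         * served before the xattr exists everywhere, and dht reports
         * "disk layout missing". With wait_count == child_count the unwind
         * happens only after every replica has acknowledged the setxattr,
         * closing the window. */
        if (f->replies == f->wait_count)
                unwind_now = 1;
        pthread_mutex_unlock (&f->lock);

        if (unwind_now)
                f->unwind (f->cookie);
}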
Regression test at: http://test.gluster.com/show_bug.cgi?id=89
PATCH: http://patches.gluster.com/patch/3922 in master (replicate: Set setxattr wait-count to child-count to avoid race)