Hide Forgot
FWIW, know that the previous run of the same volfile had also crashed due to a bug in nfsx, which could've left the replicate state in disarray.
User described the situation as follows: The following action of steps crashes glusterfs: - take down replica glusterfsd - bring it back - access storage on those replicas over nfs The log is here is annotions: ================================================================================ Version : glusterfs nfs-beta-rc5 built on May 19 2010 04:27:21 git: v3.0.0-213-gf424a88 Starting Time: 2010-05-20 19:48:54 Command line : glusterfs -f /root/nfs_triple.vol -l /tmp/nfs_triple.log PID : 4057 System name : Linux Nodename : gluster-node-01 Kernel Release : 2.6.18-164.el5 Hardware Identifier: x86_64 Given volfile: +------------------------------------------------------------------------------+ 1: volume node-11 2: type protocol/client 3: option transport-type tcp 4: option remote-port 6102 5: option remote-host 10.6.145.114 6: option remote-subvolume posix-1-locked-iot 7: end-volume 8: volume node-12 9: type protocol/client 10: option transport-type tcp 11: option remote-port 6102 12: option remote-host 10.6.145.114 13: option remote-subvolume posix-2-locked-iot 14: end-volume 15: volume node-13 16: type protocol/client 17: option transport-type tcp 18: option remote-port 6102 19: option remote-host 10.6.145.114 20: option remote-subvolume posix-3-locked-iot 21: end-volume 22: 23: volume node-21 24: type protocol/client 25: option transport-type tcp 26: option remote-port 6102 27: option remote-host 10.6.145.116 28: option remote-subvolume posix-1-locked-iot 29: end-volume 30: volume node-22 31: type protocol/client 32: option transport-type tcp 33: option remote-port 6102 34: option remote-host 10.6.145.116 35: option remote-subvolume posix-2-locked-iot 36: end-volume 37: volume node-23 38: type protocol/client 39: option transport-type tcp 40: option remote-port 6102 41: option remote-host 10.6.145.116 42: option remote-subvolume posix-3-locked-iot 43: end-volume 44: 45: volume node-31 46: type protocol/client 47: option transport-type tcp 48: option remote-port 6102 49: option remote-host 10.6.145.118 50: option remote-subvolume posix-1-locked-iot 51: end-volume 52: volume node-32 53: type protocol/client 54: option transport-type tcp 55: option remote-port 6102 56: option remote-host 10.6.145.118 57: option remote-subvolume posix-2-locked-iot 58: end-volume 59: volume node-33 60: type protocol/client 61: option transport-type tcp 62: option remote-port 6102 63: option remote-host 10.6.145.118 64: option remote-subvolume posix-3-locked-iot 65: end-volume 66: 67: volume node-41 68: type protocol/client 69: option transport-type tcp 70: option remote-port 6102 71: option remote-host 10.6.145.120 72: option remote-subvolume posix-1-locked-iot 73: end-volume 74: volume node-42 75: type protocol/client 76: option transport-type tcp 77: option remote-port 6102 78: option remote-host 10.6.145.120 79: option remote-subvolume posix-2-locked-iot 80: end-volume 81: volume node-43 82: type protocol/client 83: option transport-type tcp 84: option remote-port 6102 85: option remote-host 10.6.145.120 86: option remote-subvolume posix-3-locked-iot 87: end-volume 88: 89: volume mirror-01 90: type cluster/replicate 91: subvolumes node-11 node-21 node-31 92: end-volume 93: volume mirror-02 94: type cluster/replicate 95: subvolumes node-22 node-32 node-42 96: end-volume 97: volume mirror-03 98: type cluster/replicate 99: subvolumes node-33 node-43 node-12 100: end-volume 101: volume mirror-04 102: type cluster/replicate 103: subvolumes node-41 node-13 node-23 104: end-volume 105: 106: volume distribute 107: type cluster/distribute 108: subvolumes mirror-01 mirror-02 mirror-03 mirror-04 109: end-volume 110: 111: volume nfs-server 112: type nfs/server 113: subvolumes distribute 114: option rpc-auth.addr.allow * 115: end-volume +------------------------------------------------------------------------------+ [2010-05-20 19:48:54] N [glusterfsd.c:1477:main] glusterfs: Successfully started [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-11: Connected to 10.6.145.114:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-20 19:48:54] N [afr.c:2636:notify] mirror-01: Subvolume 'node-11' came back up; going online. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-11: Connected to 10.6.145.114:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-20 19:48:54] N [afr.c:2636:notify] mirror-01: Subvolume 'node-11' came back up; going online. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-42: Connected to 10.6.145.120:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-20 19:48:54] N [afr.c:2636:notify] mirror-02: Subvolume 'node-42' came back up; going online. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-42: Connected to 10.6.145.120:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-20 19:48:54] N [afr.c:2636:notify] mirror-02: Subvolume 'node-42' came back up; going online. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-33: Connected to 10.6.145.118:6102, attached to remote volume 'posix-3-locked-iot'. [2010-05-20 19:48:54] N [afr.c:2636:notify] mirror-03: Subvolume 'node-33' came back up; going online. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-33: Connected to 10.6.145.118:6102, attached to remote volume 'posix-3-locked-iot'. [2010-05-20 19:48:54] N [afr.c:2636:notify] mirror-03: Subvolume 'node-33' came back up; going online. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-43: Connected to 10.6.145.120:6102, attached to remote volume 'posix-3-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-43: Connected to 10.6.145.120:6102, attached to remote volume 'posix-3-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-12: Connected to 10.6.145.114:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-12: Connected to 10.6.145.114:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-41: Connected to 10.6.145.120:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-20 19:48:54] N [afr.c:2636:notify] mirror-04: Subvolume 'node-41' came back up; going online. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-41: Connected to 10.6.145.120:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-20 19:48:54] N [afr.c:2636:notify] mirror-04: Subvolume 'node-41' came back up; going online. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-13: Connected to 10.6.145.114:6102, attached to remote volume 'posix-3-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-13: Connected to 10.6.145.114:6102, attached to remote volume 'posix-3-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-23: Connected to 10.6.145.116:6102, attached to remote volume 'posix-3-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-31: Connected to 10.6.145.118:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-31: Connected to 10.6.145.118:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-21: Connected to 10.6.145.116:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-21: Connected to 10.6.145.116:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-22: Connected to 10.6.145.116:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-22: Connected to 10.6.145.116:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-32: Connected to 10.6.145.118:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-32: Connected to 10.6.145.118:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-20 19:48:54] N [client-protocol.c:5813:client_setvolume_cbk] node-23: Connected to 10.6.145.116:6102, attached to remote volume 'posix-3-locked-iot'. ######################################################################## ###### USER TOOK DOWN A SERVER, EACH SERVER HERE CONTAINS THREE ###### ###### REPLICAS, SO THE DISCONNECTION MESSAGE IS FOR THREE SUBVOS ###### ######################################################################## [2010-05-21 03:33:14] N [client-protocol.c:6558:notify] node-23: disconnected [2010-05-21 03:33:14] N [client-protocol.c:6558:notify] node-22: disconnected [2010-05-21 03:33:14] N [client-protocol.c:6558:notify] node-21: disconnected [2010-05-21 03:33:17] E [socket.c:762:socket_connect_finish] node-23: connection to 10.6.145.116:6102 failed (Connection refused) [2010-05-21 03:33:17] E [socket.c:762:socket_connect_finish] node-23: connection to 10.6.145.116:6102 failed (Connection refused) [2010-05-21 03:33:17] E [socket.c:762:socket_connect_finish] node-22: connection to 10.6.145.116:6102 failed (Connection refused) [2010-05-21 03:33:17] E [socket.c:762:socket_connect_finish] node-22: connection to 10.6.145.116:6102 failed (Connection refused) [2010-05-21 03:33:17] E [socket.c:762:socket_connect_finish] node-21: connection to 10.6.145.116:6102 failed (Connection refused) [2010-05-21 03:33:17] E [socket.c:762:socket_connect_finish] node-21: connection to 10.6.145.116:6102 failed (Connection refused) [2010-05-21 03:37:54] N [client-protocol.c:6558:notify] node-33: disconnected [2010-05-21 03:37:54] N [client-protocol.c:6558:notify] node-32: disconnected [2010-05-21 03:37:54] N [client-protocol.c:6558:notify] node-31: disconnected [2010-05-21 03:37:54] E [socket.c:762:socket_connect_finish] node-32: connection to 10.6.145.118:6102 failed (Connection refused) [2010-05-21 03:37:54] E [socket.c:762:socket_connect_finish] node-32: connection to 10.6.145.118:6102 failed (Connection refused) [2010-05-21 03:37:54] E [socket.c:762:socket_connect_finish] node-33: connection to 10.6.145.118:6102 failed (Connection refused) [2010-05-21 03:37:54] E [socket.c:762:socket_connect_finish] node-33: connection to 10.6.145.118:6102 failed (Connection refused) [2010-05-21 03:37:57] E [socket.c:762:socket_connect_finish] node-31: connection to 10.6.145.118:6102 failed (Connection refused) [2010-05-21 03:37:57] E [socket.c:762:socket_connect_finish] node-31: connection to 10.6.145.118:6102 failed (Connection refused) ##################################### ###### REPLICAS CAME BACK UP ###### ##################################### [2010-05-21 03:46:18] N [client-protocol.c:5813:client_setvolume_cbk] node-23: Connected to 10.6.145.116:6102, attached to remote volume 'posix-3-locked-iot'. [2010-05-21 03:46:18] N [client-protocol.c:5813:client_setvolume_cbk] node-23: Connected to 10.6.145.116:6102, attached to remote volume 'posix-3-locked-iot'. [2010-05-21 03:46:18] N [client-protocol.c:5813:client_setvolume_cbk] node-22: Connected to 10.6.145.116:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-21 03:46:18] N [client-protocol.c:5813:client_setvolume_cbk] node-22: Connected to 10.6.145.116:6102, attached to remote volume 'posix-2-locked-iot'. [2010-05-21 03:46:18] N [client-protocol.c:5813:client_setvolume_cbk] node-21: Connected to 10.6.145.116:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-21 03:46:18] N [client-protocol.c:5813:client_setvolume_cbk] node-21: Connected to 10.6.145.116:6102, attached to remote volume 'posix-1-locked-iot'. [2010-05-21 03:46:19] E [afr-self-heal-common.c:1196:sh_missing_entries_create] mirror-01: conflicing filetypes exist for path /gluster-test-01/.lck-b715060000000000. returning. [2010-05-21 03:46:19] E [afr-self-heal-common.c:1215:sh_missing_entries_create] mirror-04: no missing files - /gluster-test-01/.lck-c106120000000000. proceeding to metadata check [2010-05-21 03:46:19] E [afr.c:96:afr_set_split_brain] mirror-01: invalid argument: inode pending frames: patchset: v3.0.0-213-gf424a88 signal received: 11 time of crash: 2010-05-21 03:46:19 configuration details: argp 1 backtrace 1 dlfcn 1 fdatasync 1 libpthread 1 llistxattr 1 setfsid 1 spinlock 1 epoll.h 1 xattr.h 1 st_atim.tv_nsec 1 package-string: glusterfs nfs-beta-rc5 /lib64/libc.so.6[0x3ff08302d0] /usr/local/lib/glusterfs/nfs-beta-rc5/xlator/cluster/replicate.so(afr_sh_data_fix+0x1d1)[0x2b4d89ba3961] /usr/local/lib/glusterfs/nfs-beta-rc5/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x11e)[0x2b4d89ba3c3e] /usr/local/lib/glusterfs/nfs-beta-rc5/xlator/protocol/client.so(client_fstat_cbk+0xce)[0x2b4d89971cde] /usr/local/lib/glusterfs/nfs-beta-rc5/xlator/protocol/client.so(protocol_client_pollin+0xca)[0x2b4d899591ba] /usr/local/lib/glusterfs/nfs-beta-rc5/xlator/protocol/client.so(notify+0x212)[0x2b4d89967972] /usr/local/lib/libglusterfs.so.0(xlator_notify+0x43)[0x2b4d88e93433] /usr/local/lib/glusterfs/nfs-beta-rc5/transport/socket.so(socket_event_handler+0xd3)[0x2aaaaab9b073] /usr/local/lib/libglusterfs.so.0[0x2b4d88eae4e5] glusterfs(main+0xb17)[0x404367] /lib64/libc.so.6(__libc_start_main+0xf4)[0x3ff081d994] glusterfs[0x4027a9] ---------
*** This bug has been marked as a duplicate of bug 869 ***