Created an 8-subvolume DHT setup and ran iozone plus a Linux kernel untar on the NFS client mount point - this crashed the NFS server.
=========
#0  0x00fb5e85 in nfs3_fh_resolve_and_resume (cs=0xb3032024, fh=0xb12fffc8, entry=0x0, resum_fn=0xfa214a <nfs3_write_open_resume>) at nfs3-helpers.c:3065
3065            cs->resolvefh = *fh;
(gdb) bt full
#0  0x00fb5e85 in nfs3_fh_resolve_and_resume (cs=0xb3032024, fh=0xb12fffc8, entry=0x0, resum_fn=0xfa214a <nfs3_write_open_resume>) at nfs3-helpers.c:3065
        ret = -14
#1  0x00fa26e8 in nfs3_write (req=0x9c5ba80, fh=0xb12fffc8, offset=0, count=20877, stable=FILE_SYNC, payload={iov_base = 0xb7d52000, iov_len = 131072}, iob=0x9bd8338) at nfs3.c:2038
        vol = (xlator_t *) 0x9be8fc8
        stat = NFS3ERR_SERVERFAULT
        ret = -14
        nfs3 = (struct nfs3_state *) 0x9c34e08
        cs = (nfs3_call_state_t *) 0xb3032024
        __FUNCTION__ = "nfs3_write"
#2  0x00fa2a66 in nfs3svc_write_vec (req=0x9c5ba80, iob=0x9bd8338) at nfs3.c:2124
        args = (write3args *) 0xb0bcf348
        ret = -1
        payload = {iov_base = 0xb7d52000, iov_len = 131072}
        __FUNCTION__ = "nfs3svc_write_vec"
#3  0x00fbe3ad in nfs_rpcsvc_record_vectored_call_actor (conn=0x9c55648) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2208
        actor = (rpcsvc_actor_t *) 0xfcccb0
        req = (rpcsvc_request_t *) 0x9c5ba80
        rs = (rpcsvc_record_state_t *) 0x9c55658
        svc = (rpcsvc_t *) 0x9bd8cc8
        ret = -1
        __FUNCTION__ = "nfs_rpcsvc_record_vectored_call_actor"
#4  0x00fbe626 in nfs_rpcsvc_update_vectored_state (conn=0x9c55648) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2265
        rs = (rpcsvc_record_state_t *) 0x9c55658
        svc = (rpcsvc_t *) 0x9bd8cc8
        __FUNCTION__ = "nfs_rpcsvc_update_vectored_state"
#5  0x00fbe735 in nfs_rpcsvc_handle_vectored_frag (conn=0x9c55648, dataread=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2324
        __FUNCTION__ = "nfs_rpcsvc_handle_vectored_frag"
#6  0x00fbe82c in nfs_rpcsvc_record_update_state (conn=0x9c55648, dataread=20880) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2361
        rs = (rpcsvc_record_state_t *) 0x9c55658
        svc = (rpcsvc_t *) 0x0
        __FUNCTION__ = "nfs_rpcsvc_record_update_state"
#7  0x00fbeb57 in nfs_rpcsvc_conn_data_poll_in (conn=0x9c55648) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2430
        dataread = 20880
        readsize = 20880
        readaddr = 0xb7d52000 "#\n# SATA/PATA driver configuration\n#\n\nmenuconfig ATA\n\ttristate \"Serial ATA and Parallel ATA drivers\"\n\tdepends on HAS_IOMEM\n\tdepends on BLOCK\n\tdepends on !(M32R || M68K) || BROKEN\n\tselect SCSI\n\t---help"...
        ret = -1
        __FUNCTION__ = "nfs_rpcsvc_conn_data_poll_in"
#8  0x00fbefe6 in nfs_rpcsvc_conn_data_handler (fd=20, idx=3, data=0x9c55648, poll_in=1, poll_out=0, poll_err=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2559
        conn = (rpcsvc_conn_t *) 0x9c55648
        ret = 0
#9  0x0097c8af in event_dispatch_epoll_handler (event_pool=0x9beb740, events=0x9c46108, i=0) at event.c:812
        event_data = (struct event_data *) 0x9c4610c
        handler = (event_handler_t) 0xfbef68 <nfs_rpcsvc_conn_data_handler>
        data = (void *) 0x9c55648
        idx = 3
        ret = -1
        __FUNCTION__ = "event_dispatch_epoll_handler"
#10 0x0097ca9b in event_dispatch_epoll (event_pool=0x9beb740) at event.c:876
        events = (struct epoll_event *) 0x9c46108
        size = 1
        i = 0
        ret = 1
        __FUNCTION__ = "event_dispatch_epoll"
#11 0x0097ce37 in event_dispatch (event_pool=0x9beb740) at event.c:984
        ret = -1
        __FUNCTION__ = "event_dispatch"
#12 0x00fb92d9 in nfs_rpcsvc_stage_proc (arg=0x9beb718) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
        stg = (rpcsvc_stage_t *) 0x9beb718
#13 0x006c7542 in start_thread () from /lib/i686/nosegneg/libpthread.so.0
No symbol table info available.
#14 0x00c76b6e in clone () from /lib/i686/nosegneg/libc.so.6
No symbol table info available.
================
Given volfile:
+------------------------------------------------------------------------------+
  1: volume dht8-client-0
  2: type protocol/client
  3: option transport-type tcp
  4: option remote-host 10.245.210.193
  5: option transport.socket.nodelay on
  6: option remote-subvolume /mnt/oct1
  7: end-volume
  8:
  9: volume dht8-client-1
 10: type protocol/client
 11: option transport-type tcp
 12: option remote-host 10.245.209.205
 13: option transport.socket.nodelay on
 14: option remote-subvolume /mnt/oct1
 15: end-volume
 16:
 17: volume dht8-client-2
 18: type protocol/client
 19: option transport-type tcp
 20: option remote-host 10.243.113.224
 21: option transport.socket.nodelay on
 22: option remote-subvolume /mnt/oct1
 23: end-volume
 24:
 25: volume dht8-client-3
 26: type protocol/client
 27: option transport-type tcp
 28: option remote-host 10.202.54.53
 29: option transport.socket.nodelay on
 30: option remote-subvolume /mnt/oct1
 31: end-volume
 32:
 33: volume dht8-client-4
 34: type protocol/client
 35: option transport-type tcp
 36: option remote-host 10.212.117.143
 37: option transport.socket.nodelay on
 38: option remote-subvolume /mnt/oct1
 39: end-volume
 40:
 41: volume dht8-client-5
 42: type protocol/client
 43: option transport-type tcp
 44: option remote-host 10.202.57.169
 45: option transport.socket.nodelay on
 46: option remote-subvolume /mnt/oct1
 47: end-volume
 48:
 49: volume dht8-client-6
 50: type protocol/client
 51: option transport-type tcp
 52: option remote-host 10.212.70.131
 53: option transport.socket.nodelay on
 54: option remote-subvolume /mnt/oct1
 55: end-volume
 56:
 57: volume dht8-client-7
 58: type protocol/client
 59: option transport-type tcp
 60: option remote-host 10.240.94.228
 61: option transport.socket.nodelay on
 62: option remote-subvolume /mnt/oct1
 63: end-volume
 64:
 65: volume dht8-dht
 66: type cluster/distribute
 67: # option lookup-unhashed on
 68: # option min-free-disk on
 69: # option unhashed-sticky-bit on
 70: subvolumes dht8-client-0 dht8-client-1 dht8-client-2 dht8-client-3 dht8-client-4 dht8-client-5 dht8-client-6 dht8-client-7
 71: end-volume
 72:
 73: volume dht8-write-behind
 74: type performance/write-behind
 75: # option flush-behind on
 76: # option cache-size on
 77: # option disable-for-first-nbytes on
 78: # option enable-O_SYNC on
 79: # option enable-trickling-writes on
 80: subvolumes dht8-dht
 81: end-volume
 82:
 83: volume dht8-read-ahead
 84: type performance/read-ahead
 85: # option force-atime-update on
 86: # option page-count on
 87: subvolumes dht8-write-behind
 88: end-volume
 89:
 90: volume dht8-io-cache
 91: type performance/io-cache
 92: # option priority on
 93: # option cache-timeout on
 94: # option cache-size on
 95: # option min-file-size on
 96: # option max-file-size on
 97: subvolumes dht8-read-ahead
 98: end-volume
 99:
100: volume dht8-quick-read
101: type performance/quick-read
102: # option priority on
103: # option cache-timeout on
104: # option cache-size on
105: # option max-file-size on
106: subvolumes dht8-io-cache
107: end-volume
108:
109: volume dht8-stat-prefetch
110: type performance/stat-prefetch
111: subvolumes dht8-quick-read
112: end-volume
113:
114: volume dht8
115: type debug/io-stats
116: option dump-fd-stats no
117: option latency-measurement no
118: subvolumes dht8-stat-prefetch
119: end-volume
120:
121: volume nfs-server
122: type nfs/server
123: option rpc-auth.addr.dht8.allow *
124: option nfs.dynamic-volumes on
125: option nfs3.dht8.volume-id e0ca45de-ef2d-44dd-8818-579ac1501a82
126: subvolumes dht8
127: end-volume
+------------------------------------------------------------------------------+
[2010-10-01 02:37:45.796080] E [client-handshake.c:749:client_query_portmap_cbk] dht8-client-6: failed to get the port number for remote subvolume
[2010-10-01 02:37:45.822671] E [client-handshake.c:749:client_query_portmap_cbk] dht8-client-4: failed to get the port number for remote subvolume
[2010-10-01 02:37:45.887901] E [client-handshake.c:749:client_query_portmap_cbk] dht8-client-5: failed to get the port number for remote subvolume
[2010-10-01 02:37:47.887334] I [client-handshake.c:675:select_server_supported_programs] dht8-client-0: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:47.887912] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-0: Connected to 10.245.210.193:6971, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:48.893031] I [client-handshake.c:675:select_server_supported_programs] dht8-client-2: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:48.893999] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-2: Connected to 10.243.113.224:6971, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:48.895149] I [client-handshake.c:675:select_server_supported_programs] dht8-client-1: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:48.895865] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-1: Connected to 10.245.209.205:6971, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:48.898144] I [client-handshake.c:675:select_server_supported_programs] dht8-client-3: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:48.898740] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-3: Connected to 10.202.54.53:6971, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:48.945354] I [client-handshake.c:675:select_server_supported_programs] dht8-client-7: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:49.48786] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-7: Connected to 10.240.94.228:6972, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:51.933116] I [client-handshake.c:675:select_server_supported_programs] dht8-client-6: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:51.933704] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-6: Connected to 10.212.70.131:6972, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:51.935731] I [client-handshake.c:675:select_server_supported_programs] dht8-client-4: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:51.936321] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-4: Connected to 10.212.117.143:6972, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:52.941019] I [client-handshake.c:675:select_server_supported_programs] dht8-client-5: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:52.941737] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-5: Connected to 10.202.57.169:6972, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:52.944620] I [nfs.c:315:__nfs_subvolume_start] nfs: All exports up
[2010-10-01 02:38:14.388911] W [dht-diskusage.c:216:dht_is_subvol_filled] dht8-dht: disk space on subvolume 'dht8-client-1' is getting full (91.00 %), consider adding more nodes
[2010-10-01 02:46:15.412292] E [rpcsvc.c:1249:nfs_rpcsvc_program_actor] nfsrpc: RPC program not available
[2010-10-01 02:55:43.321199] W [dht-diskusage.c:216:dht_is_subvol_filled] dht8-dht: disk space on subvolume 'dht8-client-1' is getting full (91.00 %), consider adding more nodes
[2010-10-01 02:58:31.378961] W [dht-diskusage.c:216:dht_is_subvol_filled] dht8-dht: disk space on subvolume 'dht8-client-1' is getting full (91.00 %), consider adding more nodes
pending frames:

patchset: v3.1.0qa7-455-g760daf2
signal received: 11
time of crash: 2010-10-01 03:00:01
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.0qa36
[0x939420]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs3_write+0x474)[0xfa26e8]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs3svc_write_vec+0xcd)[0xfa2a66]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_record_vectored_call_actor+0xc7)[0xfbe3ad]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_update_vectored_state+0x1ec)[0xfbe626]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_handle_vectored_frag+0xa3)[0xfbe735]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_record_update_state+0xe5)[0xfbe82c]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_conn_data_poll_in+0x105)[0xfbeb57]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_conn_data_handler+0x7e)[0xfbefe6]
/usr/local/lib/libglusterfs.so.0[0x97c8af]
/usr/local/lib/libglusterfs.so.0[0x97ca9b]
/usr/local/lib/libglusterfs.so.0(event_dispatch+0x8e)[0x97ce37]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_stage_proc+0x35)[0xfb92d9]
/lib/i686/nosegneg/libpthread.so.0[0x6c7542]
/lib/i686/nosegneg/libc.so.6(clone+0x5e)[0xc76b6e]
---------
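Note on frame #0: "cs->resolvefh = *fh;" is a plain struct copy, i.e. a read of the entire file handle through fh, so a bad fh pointer handed down from the vectored WRITE path makes this line the visible crash point even though the damage happened earlier in the request path. A minimal, self-contained sketch of that pattern - hypothetical types and names, deliberately containing the bug it illustrates, not the real GlusterFS code:

/* Deliberately buggy illustration of the crash pattern in frame #0.
 * Types and names are hypothetical, not the real GlusterFS definitions. */
#include <stdlib.h>
#include <string.h>

struct fh_sketch { char opaque[64]; };        /* stand-in NFS3 file handle   */
struct call_state_sketch {
    struct fh_sketch resolvefh;               /* copy kept for fh resolution */
};

int main(void)
{
    struct call_state_sketch cs;

    /* The handle is decoded in place inside an RPC record buffer ...        */
    char *record = malloc(4096);
    if (record == NULL)
        return 1;
    struct fh_sketch *fh = (struct fh_sketch *)(record + 128);
    memset(fh, 0xab, sizeof(*fh));

    /* ... and if that buffer is freed, recycled, or its bookkeeping is      */
    /* corrupted before the copy runs, fh is left dangling.                  */
    free(record);

    /* Equivalent of "cs->resolvefh = *fh;": a sizeof(struct fh_sketch)      */
    /* read through fh. With a dangling/corrupted pointer this is undefined  */
    /* behaviour and, on an unlucky heap layout, the SIGSEGV seen above.     */
    cs.resolvefh = *fh;

    return 0;
}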
With qa37, added bricks and did a kernel untar without rebalance. This time it completed without any crash.
It is a memory corruption, the same as bug 999. It will need a very peculiar memory situation to reproduce. Keep open.
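On why reproduction needs such a peculiar memory situation: a stale or wild read like the one in frame #0 only faults when the bytes it touches happen to fall in unmapped memory; most of the time it silently reads garbage and the server keeps running. A small illustration of that distinction (generic C, not GlusterFS code), using an unmapped guard page:

/* Illustrative only: shows why the same bad read can be silent or fatal,
 * depending on what happens to sit after the buffer in memory. */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct fh_example { char data[64]; };   /* stand-in file-handle struct */

int main(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);

    /* Map two pages, then drop the second one so it acts as a guard page. */
    char *base = mmap(NULL, 2 * pagesz, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return 1;
    munmap(base + pagesz, pagesz);

    /* Case 1: the stale/bogus pointer still lands in mapped memory.
     * The struct copy silently reads garbage and nothing crashes. */
    struct fh_example copy;
    struct fh_example *ok = (struct fh_example *)base;
    copy = *ok;
    printf("read from mapped page: no crash (first byte %d)\n", copy.data[0]);

    /* Case 2: the pointer lands near the end of the mapping, so the
     * 64-byte struct read spills into the unmapped page -> SIGSEGV,
     * just like the fault on "cs->resolvefh = *fh;". */
    struct fh_example *bad =
        (struct fh_example *)(base + pagesz - sizeof(*bad) / 2);
    copy = *bad;                         /* faults here */
    printf("never reached\n");
    return 0;
}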
*** This bug has been marked as a duplicate of bug 999 ***