Bug 763487 (GLUSTER-1755) - glfs-qa36 : crash at nfs3_fh_resolve_and_resume
Summary: glfs-qa36 : crash at nfs3_fh_resolve_and_resume
Keywords:
Status: CLOSED DUPLICATE of bug 762731
Alias: GLUSTER-1755
Product: GlusterFS
Classification: Community
Component: nfs
Version: 3.1-alpha
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Shehjar Tikoo
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-10-01 07:11 UTC by Lakshmipathi G
Modified: 2015-12-01 16:45 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Lakshmipathi G 2010-10-01 07:11:14 UTC
Created an 8-brick DHT setup. Running iozone and untarring the Linux kernel on the NFS client mount point crashed the NFS server.
=========
#0  0x00fb5e85 in nfs3_fh_resolve_and_resume (cs=0xb3032024, fh=0xb12fffc8, entry=0x0, resum_fn=0xfa214a <nfs3_write_open_resume>) at nfs3-helpers.c:3065
3065	        cs->resolvefh = *fh;
(gdb) bt full
#0  0x00fb5e85 in nfs3_fh_resolve_and_resume (cs=0xb3032024, fh=0xb12fffc8, entry=0x0, resum_fn=0xfa214a <nfs3_write_open_resume>) at nfs3-helpers.c:3065
	ret = -14
#1  0x00fa26e8 in nfs3_write (req=0x9c5ba80, fh=0xb12fffc8, offset=0, count=20877, stable=FILE_SYNC, payload={iov_base = 0xb7d52000, iov_len = 131072}, iob=0x9bd8338)
    at nfs3.c:2038
	vol = (xlator_t *) 0x9be8fc8
	stat = NFS3ERR_SERVERFAULT
	ret = -14
	nfs3 = (struct nfs3_state *) 0x9c34e08
	cs = (nfs3_call_state_t *) 0xb3032024
	__FUNCTION__ = "nfs3_write"
#2  0x00fa2a66 in nfs3svc_write_vec (req=0x9c5ba80, iob=0x9bd8338) at nfs3.c:2124
	args = (write3args *) 0xb0bcf348
	ret = -1
	payload = {iov_base = 0xb7d52000, iov_len = 131072}
	__FUNCTION__ = "nfs3svc_write_vec"
#3  0x00fbe3ad in nfs_rpcsvc_record_vectored_call_actor (conn=0x9c55648) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2208
	actor = (rpcsvc_actor_t *) 0xfcccb0
	req = (rpcsvc_request_t *) 0x9c5ba80
	rs = (rpcsvc_record_state_t *) 0x9c55658
	svc = (rpcsvc_t *) 0x9bd8cc8
	ret = -1
	__FUNCTION__ = "nfs_rpcsvc_record_vectored_call_actor"
#4  0x00fbe626 in nfs_rpcsvc_update_vectored_state (conn=0x9c55648) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2265
	rs = (rpcsvc_record_state_t *) 0x9c55658
	svc = (rpcsvc_t *) 0x9bd8cc8
	__FUNCTION__ = "nfs_rpcsvc_update_vectored_state"
#5  0x00fbe735 in nfs_rpcsvc_handle_vectored_frag (conn=0x9c55648, dataread=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2324
	__FUNCTION__ = "nfs_rpcsvc_handle_vectored_frag"
#6  0x00fbe82c in nfs_rpcsvc_record_update_state (conn=0x9c55648, dataread=20880) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2361
	rs = (rpcsvc_record_state_t *) 0x9c55658
	svc = (rpcsvc_t *) 0x0
	__FUNCTION__ = "nfs_rpcsvc_record_update_state"
#7  0x00fbeb57 in nfs_rpcsvc_conn_data_poll_in (conn=0x9c55648) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2430
	dataread = 20880
	readsize = 20880
	readaddr = 0xb7d52000 "#\n# SATA/PATA driver configuration\n#\n\nmenuconfig ATA\n\ttristate \"Serial ATA and Parallel ATA drivers\"\n\tdepends on HAS_IOMEM\n\tdepends on BLOCK\n\tdepends on !(M32R || M68K) || BROKEN\n\tselect SCSI\n\t---help"...
Missing separate debuginfos, use: debuginfo-install gcc.i386 glibc.i686
	ret = -1
	__FUNCTION__ = "nfs_rpcsvc_conn_data_poll_in"
#8  0x00fbefe6 in nfs_rpcsvc_conn_data_handler (fd=20, idx=3, data=0x9c55648, poll_in=1, poll_out=0, poll_err=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2559
	conn = (rpcsvc_conn_t *) 0x9c55648
	ret = 0
#9  0x0097c8af in event_dispatch_epoll_handler (event_pool=0x9beb740, events=0x9c46108, i=0) at event.c:812
	event_data = (struct event_data *) 0x9c4610c
	handler = (event_handler_t) 0xfbef68 <nfs_rpcsvc_conn_data_handler>
	data = (void *) 0x9c55648
	idx = 3
	ret = -1
	__FUNCTION__ = "event_dispatch_epoll_handler"
#10 0x0097ca9b in event_dispatch_epoll (event_pool=0x9beb740) at event.c:876
	events = (struct epoll_event *) 0x9c46108
	size = 1
	i = 0
	ret = 1
	__FUNCTION__ = "event_dispatch_epoll"
#11 0x0097ce37 in event_dispatch (event_pool=0x9beb740) at event.c:984
	ret = -1
	__FUNCTION__ = "event_dispatch"
#12 0x00fb92d9 in nfs_rpcsvc_stage_proc (arg=0x9beb718) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
	stg = (rpcsvc_stage_t *) 0x9beb718
#13 0x006c7542 in start_thread () from /lib/i686/nosegneg/libpthread.so.0
No symbol table info available.
#14 0x00c76b6e in clone () from /lib/i686/nosegneg/libc.so.6
No symbol table info available.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
================
Given volfile:
+------------------------------------------------------------------------------+
  1: volume dht8-client-0
  2:     type protocol/client
  3:     option transport-type tcp
  4:     option remote-host 10.245.210.193
  5:     option transport.socket.nodelay on
  6:     option remote-subvolume /mnt/oct1
  7: end-volume
  8: 
  9: volume dht8-client-1
 10:     type protocol/client
 11:     option transport-type tcp
 12:     option remote-host 10.245.209.205
 13:     option transport.socket.nodelay on
 14:     option remote-subvolume /mnt/oct1
 15: end-volume
 16: 
 17: volume dht8-client-2
 18:     type protocol/client
 19:     option transport-type tcp
 20:     option remote-host 10.243.113.224
 21:     option transport.socket.nodelay on
 22:     option remote-subvolume /mnt/oct1
 23: end-volume
 24: 
 25: volume dht8-client-3
 26:     type protocol/client
 27:     option transport-type tcp
 28:     option remote-host 10.202.54.53
 29:     option transport.socket.nodelay on
 30:     option remote-subvolume /mnt/oct1
 31: end-volume
 32: 
 33: volume dht8-client-4
 34:     type protocol/client
 35:     option transport-type tcp
 36:     option remote-host 10.212.117.143
 37:     option transport.socket.nodelay on
 38:     option remote-subvolume /mnt/oct1
 39: end-volume
 40: 
 41: volume dht8-client-5
 42:     type protocol/client
 43:     option transport-type tcp
 44:     option remote-host 10.202.57.169
 45:     option transport.socket.nodelay on
 46:     option remote-subvolume /mnt/oct1
 47: end-volume
 48: 
 49: volume dht8-client-6
 50:     type protocol/client
 51:     option transport-type tcp
 52:     option remote-host 10.212.70.131
 53:     option transport.socket.nodelay on
 54:     option remote-subvolume /mnt/oct1
 55: end-volume
 56: 
 57: volume dht8-client-7
 58:     type protocol/client
 59:     option transport-type tcp
 60:     option remote-host 10.240.94.228
 61:     option transport.socket.nodelay on
 62:     option remote-subvolume /mnt/oct1
 63: end-volume
 64: 
 65: volume dht8-dht
 66: type cluster/distribute
 67: #   option lookup-unhashed on
 68: #   option min-free-disk on
 69: #   option unhashed-sticky-bit on
 70:     subvolumes dht8-client-0 dht8-client-1 dht8-client-2 dht8-client-3 dht8-client-4 dht8-client-5 dht8-client-6 dht8-client-7 
 71: end-volume
 72: 
 73: volume dht8-write-behind
 74:     type performance/write-behind
 75: #   option flush-behind on
 76: #   option cache-size on
 77: #   option disable-for-first-nbytes on
 78: #   option enable-O_SYNC on
 79: #   option enable-trickling-writes on
 80:     subvolumes dht8-dht
 81: end-volume
 82: 
 83: volume dht8-read-ahead
 84:     type performance/read-ahead
 85: #   option force-atime-update on
 86: #   option page-count on
 87:     subvolumes dht8-write-behind
 88: end-volume
 89: 
 90: volume dht8-io-cache
 91:     type performance/io-cache
 92: #   option priority on
 93: #   option cache-timeout on
 94: #   option cache-size on
 95: #   option min-file-size on
 96: #   option max-file-size on
 97:     subvolumes dht8-read-ahead
 98: end-volume
 99: 
100: volume dht8-quick-read
101:     type performance/quick-read
102: #   option priority on
103: #   option cache-timeout on
104: #   option cache-size on
105: #   option max-file-size on
106:     subvolumes dht8-io-cache
107: end-volume
108: 
109: volume dht8-stat-prefetch
110:     type performance/stat-prefetch
111:     subvolumes dht8-quick-read
112: end-volume
113: 
114: volume dht8
115:     type debug/io-stats
116:     option dump-fd-stats no
117:     option latency-measurement no
118:     subvolumes dht8-stat-prefetch
119: end-volume
120: 
121: volume nfs-server
122: type nfs/server
123: option rpc-auth.addr.dht8.allow *
124: option nfs.dynamic-volumes on
125: option nfs3.dht8.volume-id e0ca45de-ef2d-44dd-8818-579ac1501a82
126: subvolumes  dht8
127: end-volume

+------------------------------------------------------------------------------+
[2010-10-01 02:37:45.796080] E [client-handshake.c:749:client_query_portmap_cbk] dht8-client-6: failed to get the port number for remote subvolume
[2010-10-01 02:37:45.822671] E [client-handshake.c:749:client_query_portmap_cbk] dht8-client-4: failed to get the port number for remote subvolume
[2010-10-01 02:37:45.887901] E [client-handshake.c:749:client_query_portmap_cbk] dht8-client-5: failed to get the port number for remote subvolume
[2010-10-01 02:37:47.887334] I [client-handshake.c:675:select_server_supported_programs] dht8-client-0: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:47.887912] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-0: Connected to 10.245.210.193:6971, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:48.893031] I [client-handshake.c:675:select_server_supported_programs] dht8-client-2: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:48.893999] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-2: Connected to 10.243.113.224:6971, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:48.895149] I [client-handshake.c:675:select_server_supported_programs] dht8-client-1: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:48.895865] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-1: Connected to 10.245.209.205:6971, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:48.898144] I [client-handshake.c:675:select_server_supported_programs] dht8-client-3: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:48.898740] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-3: Connected to 10.202.54.53:6971, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:48.945354] I [client-handshake.c:675:select_server_supported_programs] dht8-client-7: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:49.48786] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-7: Connected to 10.240.94.228:6972, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:51.933116] I [client-handshake.c:675:select_server_supported_programs] dht8-client-6: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:51.933704] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-6: Connected to 10.212.70.131:6972, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:51.935731] I [client-handshake.c:675:select_server_supported_programs] dht8-client-4: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:51.936321] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-4: Connected to 10.212.117.143:6972, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:52.941019] I [client-handshake.c:675:select_server_supported_programs] dht8-client-5: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-10-01 02:37:52.941737] I [client-handshake.c:511:client_setvolume_cbk] dht8-client-5: Connected to 10.202.57.169:6972, attached to remote volume '/mnt/oct1'.
[2010-10-01 02:37:52.944620] I [nfs.c:315:__nfs_subvolume_start] nfs: All exports up
[2010-10-01 02:38:14.388911] W [dht-diskusage.c:216:dht_is_subvol_filled] dht8-dht: disk space on subvolume 'dht8-client-1' is getting full (91.00 %), consider adding more nodes
[2010-10-01 02:46:15.412292] E [rpcsvc.c:1249:nfs_rpcsvc_program_actor] nfsrpc: RPC program not available
[2010-10-01 02:55:43.321199] W [dht-diskusage.c:216:dht_is_subvol_filled] dht8-dht: disk space on subvolume 'dht8-client-1' is getting full (91.00 %), consider adding more nodes
[2010-10-01 02:58:31.378961] W [dht-diskusage.c:216:dht_is_subvol_filled] dht8-dht: disk space on subvolume 'dht8-client-1' is getting full (91.00 %), consider adding more nodes
pending frames:

patchset: v3.1.0qa7-455-g760daf2
signal received: 11
time of crash: 2010-10-01 03:00:01
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.0qa36
[0x939420]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs3_write+0x474)[0xfa26e8]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs3svc_write_vec+0xcd)[0xfa2a66]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_record_vectored_call_actor+0xc7)[0xfbe3ad]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_update_vectored_state+0x1ec)[0xfbe626]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_handle_vectored_frag+0xa3)[0xfbe735]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_record_update_state+0xe5)[0xfbe82c]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_conn_data_poll_in+0x105)[0xfbeb57]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_conn_data_handler+0x7e)[0xfbefe6]
/usr/local/lib/libglusterfs.so.0[0x97c8af]
/usr/local/lib/libglusterfs.so.0[0x97ca9b]
/usr/local/lib/libglusterfs.so.0(event_dispatch+0x8e)[0x97ce37]
/usr/local/lib/glusterfs/3.1.0qa36/xlator/nfs/server.so(nfs_rpcsvc_stage_proc+0x35)[0xfb92d9]
/lib/i686/nosegneg/libpthread.so.0[0x6c7542]
/lib/i686/nosegneg/libc.so.6(clone+0x5e)[0xc76b6e]
---------

Comment 1 Lakshmipathi G 2010-10-02 04:55:47 UTC
With qa37, I added bricks and ran the kernel untar without a rebalance. This time it completed without any crash.

Comment 2 Shehjar Tikoo 2010-10-04 01:53:40 UTC
This is the same memory corruption as bug 999. It will need a very peculiar memory situation to reproduce. Keep open.

Comment 3 Shehjar Tikoo 2010-10-05 08:25:14 UTC

*** This bug has been marked as a duplicate of bug 999 ***

