Bug 1197118

Summary: Perf: Crash seen during IOZone performance regression runs.
Product: [Community] GlusterFS
Component: core
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: unspecified
Reporter: Shyamsundar <srangana>
Assignee: Shyamsundar <srangana>
CC: annair, bturner, bugs, gluster-bugs, kparthas, nbalacha, rcyriac, rgowdapp, srangana, vagarwal
Fixed In Version: glusterfs-3.7.0
Doc Type: Bug Fix
Type: Bug
Clone Of: 1194525
Bug Blocks: 1193995, 1194525
Last Closed: 2015-05-14 17:29:12 UTC

Description Shyamsundar 2015-02-27 14:26:52 UTC
+++ This bug was initially created as a clone of Bug #1194525 +++

Description of problem:

When running iozone tests I am seeing a crash on the client side.

How reproducible:

Occasionally, not every time.

Steps to Reproduce:
1.  Run IOZone perf regression tests.
2.  Check client logs for crashes.

Actual results:

Poor performance, transport endpoint not connected messages, and disconnects.

Expected results:

Normal operation.

Additional info:

I have this reproduced in the lab.

--- Additional comment from Ben Turner on 2015-02-19 22:21:14 EST ---

In the log:

[2015-02-20 02:50:47.140507] I [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Exited thread with index 4
pending frames:
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2015-02-20 02:51:12
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.0.45
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f3f4806d7b6]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7f3f480883cf]
/lib64/libc.so.6[0x38acc329a0]
/usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so(+0x9594)[0x7f3f3dc56594]
/usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so(+0xad1d)[0x7f3f3dc57d1d]
/usr/lib64/libglusterfs.so.0(+0x77d1c)[0x7f3f480c4d1c]
/lib64/libpthread.so.0[0x38ad0079d1]
/lib64/libc.so.6(clone+0x6d)[0x38acce8b6d]
---------

--- Additional comment from Ben Turner on 2015-02-19 22:23:39 EST ---

I see this in the client logs:

[2015-02-20 02:07:16.497365] E [dht-helper.c:1364:dht_inode_ctx_set] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_readdirp_cbk+0x30c) [0x7feb0a9d7b6c] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_layout_preset+0x5e) [0x7feb0a9afa0e] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x52) [0x7feb0a9b1cc2]))) 0-rep-dht: invalid argument: inode
[2015-02-20 02:07:16.497405] E [dht-helper.c:1345:dht_inode_ctx_get] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_readdirp_cbk+0x30c) [0x7feb0a9d7b6c] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_layout_preset+0x5e) [0x7feb0a9afa0e] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x34) [0x7feb0a9b1ca4]))) 0-rep-dht: invalid argument: inode
[2015-02-20 02:07:16.497427] E [dht-helper.c:1364:dht_inode_ctx_set] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_readdirp_cbk+0x30c) [0x7feb0a9d7b6c] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_layout_preset+0x5e) [0x7feb0a9afa0e] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x52) [0x7feb0a9b1cc2]))) 0-rep-dht: invalid argument: inode
[2015-02-20 02:07:21.123976] E [dht-helper.c:1345:dht_inode_ctx_get] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_readdirp_cbk+0x30c) [0x7feb0a9d7b6c] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_layout_preset+0x5e) [0x7feb0a9afa0e] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x34) [0x7feb0a9b1ca4]))) 0-rep-dht: invalid argument: inode
[2015-02-20 02:07:21.124006] E [dht-helper.c:1364:dht_inode_ctx_set] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_readdirp_cbk+0x30c) [0x7feb0a9d7b6c] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_layout_preset+0x5e) [0x7feb0a9afa0e] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x52) [0x7feb0a9b1cc2]))) 0-rep-dht: invalid argument: inode

And here is the output from iozone:

	Iozone: Performance Test of File I/O
	        Version $Revision: 3.408 $
		Compiled for 64 bit mode.
		Build: linux 

	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
	             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
	             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
	             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
	             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
	             Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
	             Ben England.

	Run began: Thu Feb 19 22:15:53 2015

	Network distribution mode enabled.
	Hostname = gqac031.sbu.lab.eng.bos.redhat.com
	Setting no_unlink
	Include close in write timing
	Include fsync in write timing
	No retest option selected
	Record Size 1024 KB
	File size set to 2097152 KB
	Command line used: iozone -+m /root/clients.ioz -+h gqac031.sbu.lab.eng.bos.redhat.com -C -w -c -e -i 2 -J 3 -+n -r 1024k -s 2g -t 8
	Output is in Kbytes/sec
	Compute time 0.003000 seconds for reads and writes.
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 Kbytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
	Throughput test with 8 processes
	Each process writes a 2097152 Kbyte file in 1024 Kbyte records

	Test running:
Client # 7 reporting an error Transport endpoint is not connected !

Client # 6 reporting an error Transport endpoint is not connected !

Client # 4 reporting an error Transport endpoint is not connected !

Client # 5 reporting an error Transport endpoint is not connected !

	Children see throughput for 8 random readers 	=  481237.70 KB/sec
	Min throughput per process 			=   10994.98 KB/sec 
	Max throughput per process 			=  112833.18 KB/sec
	Avg throughput per process 			=   60154.71 KB/sec
	Min xfer 					=  264192.00 KB
	Child[0] xfer count = 2097152.00 KB, Throughput =  112833.18 KB/sec
	Child[1] xfer count = 1937408.00 KB, Throughput =  101591.91 KB/sec
	Child[2] xfer count = 2081792.00 KB, Throughput =  111696.71 KB/sec
	Child[3] xfer count = 1999872.00 KB, Throughput =  105917.29 KB/sec
	Child[4] xfer count =  264192.00 KB, Throughput =   10994.98 KB/sec
	Child[5] xfer count =  314368.00 KB, Throughput =   13162.18 KB/sec
	Child[6] xfer count =  316416.00 KB, Throughput =   13252.79 KB/sec
	Child[7] xfer count =  282624.00 KB, Throughput =   11788.66 KB/sec

	Test running:
Client # 5 reporting an error Transport endpoint is not connected !

Client # 4 reporting an error Transport endpoint is not connected !

Client # 7 reporting an error Transport endpoint is not connected !

Client # 6 reporting an error Transport endpoint is not connected !

	Children see throughput for 8 random writers 	= 1029431.46 KB/sec
	Min throughput per process 			=   22222.78 KB/sec 
	Max throughput per process 			=  237961.33 KB/sec
	Avg throughput per process 			=  128678.93 KB/sec
	Min xfer 					= 1366016.00 KB
	Child[0] xfer count = 2097152.00 KB, Throughput =  237961.33 KB/sec
	Child[1] xfer count = 2097152.00 KB, Throughput =  237687.33 KB/sec
	Child[2] xfer count = 2097152.00 KB, Throughput =  226001.80 KB/sec
	Child[3] xfer count = 2097152.00 KB, Throughput =  237303.77 KB/sec
	Child[4] xfer count = 1377280.00 KB, Throughput =   22405.48 KB/sec
	Child[5] xfer count = 1366016.00 KB, Throughput =   23476.16 KB/sec
	Child[6] xfer count = 1366016.00 KB, Throughput =   22222.78 KB/sec
	Child[7] xfer count = 1375232.00 KB, Throughput =   22372.82 KB/sec

--- Additional comment from Ben Turner on 2015-02-20 22:07:52 EST ---

Also, here is a backtrace from one of the core dumps:

[root@gqac006 tmp]# gdb /usr/sbin/glusterfs /core.21666
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/glusterfs...(no debugging symbols found)...done.
[New Thread 21672]
[New Thread 21666]
[New Thread 21667]
[New Thread 21679]
[New Thread 21678]
[New Thread 21673]
[New Thread 21670]
[New Thread 21674]
[New Thread 21668]
[New Thread 21671]
Reading symbols from /usr/lib64/libglusterfs.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libglusterfs.so.0
Reading symbols from /usr/lib64/libgfrpc.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libgfrpc.so.0
Reading symbols from /usr/lib64/libgfxdr.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libgfxdr.so.0
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libcrypto.so.10
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libz.so.1
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so
Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libssl.so.10
Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgssapi_krb5.so.2
Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libk5crypto.so.3
Reading symbols from /lib64/libkrb5support.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5support.so.0
Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkeyutils.so.1
Reading symbols from /lib64/libresolv.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libselinux.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/protocol/client.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/protocol/client.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/write-behind.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/performance/write-behind.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/read-ahead.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/performance/read-ahead.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/readdir-ahead.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/performance/readdir-ahead.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/io-cache.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/performance/io-cache.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/quick-read.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/performance/quick-read.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/md-cache.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/performance/md-cache.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/debug/io-stats.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/debug/io-stats.so
Reading symbols from /usr/lib64/glusterfs/3.6.0.45/xlator/meta.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/glusterfs/3.6.0.45/xlator/meta.so
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1
Core was generated by `/usr/sbin/glusterfs --volfile-server=gqas014.sbu.lab.eng.bos.redhat.com --volfi'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f3f3dc56594 in ?? () from /usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so
Missing separate debuginfos, use: debuginfo-install glusterfs-3.6.0.45-1.el6.x86_64
(gdb) thread info
No symbol table is loaded.  Use the "file" command.
(gdb) info thread
  10 Thread 0x7f3f3e860700 (LWP 21671)  0x00000038ad00b98e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  9 Thread 0x7f3f3fc62700 (LWP 21668)  0x00000038ad00f4b5 in sigwait () from /lib64/libpthread.so.0
  8 Thread 0x7f3f36fc1700 (LWP 21674)  0x00000038acce9163 in epoll_wait () from /lib64/libc.so.6
  7 Thread 0x7f3f3f261700 (LWP 21670)  0x00000038ad00b98e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6 Thread 0x7f3f379c2700 (LWP 21673)  0x00000038ad00e264 in __lll_lock_wait () from /lib64/libpthread.so.0
  5 Thread 0x7f3f352bb700 (LWP 21678)  0x00000038ad00e264 in __lll_lock_wait () from /lib64/libpthread.so.0
  4 Thread 0x7f3f2bfff700 (LWP 21679)  0x00000038ad00e75d in read () from /lib64/libpthread.so.0
  3 Thread 0x7f3f40663700 (LWP 21667)  0x00000038ad00ef3d in nanosleep () from /lib64/libpthread.so.0
  2 Thread 0x7f3f47c18700 (LWP 21666)  0x00000038ad00822d in pthread_join () from /lib64/libpthread.so.0
* 1 Thread 0x7f3f3da31700 (LWP 21672)  0x00007f3f3dc56594 in ?? () from /usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so
(gdb) bt
#0  0x00007f3f3dc56594 in ?? () from /usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so
#1  0x00007f3f3dc57d1d in ?? () from /usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so
#2  0x00007f3f480c4d1c in ?? () from /usr/lib64/libglusterfs.so.0
#3  0x00000038ad0079d1 in start_thread () from /lib64/libpthread.so.0
#4  0x00000038acce8b6d in clone () from /lib64/libc.so.6
(gdb) t 2
[Switching to thread 2 (Thread 0x7f3f47c18700 (LWP 21666))]#0  0x00000038ad00822d in pthread_join () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000038ad00822d in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f3f480c485d in ?? () from /usr/lib64/libglusterfs.so.0
#2  0x0000000000408022 in main ()
(gdb) t 3
[Switching to thread 3 (Thread 0x7f3f40663700 (LWP 21667))]#0  0x00000038ad00ef3d in nanosleep () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000038ad00ef3d in nanosleep () from /lib64/libpthread.so.0
#1  0x00007f3f4808a03c in gf_timer_proc () from /usr/lib64/libglusterfs.so.0
#2  0x00000038ad0079d1 in start_thread () from /lib64/libpthread.so.0
#3  0x00000038acce8b6d in clone () from /lib64/libc.so.6
(gdb) t 4
[Switching to thread 4 (Thread 0x7f3f2bfff700 (LWP 21679))]#0  0x00000038ad00e75d in read () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000038ad00e75d in read () from /lib64/libpthread.so.0
#1  0x00007f3f40676d03 in ?? () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
#2  0x00000038ad0079d1 in start_thread () from /lib64/libpthread.so.0
#3  0x00000038acce8b6d in clone () from /lib64/libc.so.6
(gdb) t 5
[Switching to thread 5 (Thread 0x7f3f352bb700 (LWP 21678))]#0  0x00000038ad00e264 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000038ad00e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000038ad009508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000038ad0093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f3f3dc54666 in ?? () from /usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so
#4  0x00007f3f47e42bd7 in rpc_clnt_submit () from /usr/lib64/libgfrpc.so.0
#5  0x00007f3f3ce127c2 in client_submit_vec_request () from /usr/lib64/glusterfs/3.6.0.45/xlator/protocol/client.so
#6  0x00007f3f3ce12d06 in client3_3_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/protocol/client.so
#7  0x00007f3f3cdff01f in client_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/protocol/client.so
#8  0x00007f3f3cbd5d89 in dht_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so
#9  0x00007f3f3c990ee4 in wb_fulfill_head () from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/write-behind.so
#10 0x00007f3f3c99107b in wb_fulfill () from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/write-behind.so
#11 0x00007f3f3c992a92 in wb_process_queue () from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/write-behind.so
#12 0x00007f3f3c993398 in wb_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/write-behind.so
#13 0x00007f3f3c78357c in ra_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/read-ahead.so
#14 0x00007f3f48073b30 in default_writev () from /usr/lib64/libglusterfs.so.0
#15 0x00007f3f3c3701c6 in ioc_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/io-cache.so
#16 0x00007f3f3c164000 in qr_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/quick-read.so
#17 0x00007f3f37df5b0b in mdc_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/performance/md-cache.so
#18 0x00007f3f37be0f37 in io_stats_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/debug/io-stats.so
#19 0x00007f3f48073b30 in default_writev () from /usr/lib64/libglusterfs.so.0
#20 0x00007f3f379c815e in meta_writev () from /usr/lib64/glusterfs/3.6.0.45/xlator/meta.so
#21 0x00007f3f40684943 in fuse_write_resume () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
#22 0x00007f3f4066c856 in ?? () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
---Type <return> to continue, or q <return> to quit---
#23 0x00007f3f4066c586 in ?? () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
#24 0x00007f3f4066c89e in ?? () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
#25 0x00007f3f4066c921 in fuse_resolve_continue () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
#26 0x00007f3f4066c52e in ?? () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
#27 0x00007f3f4066c87e in ?? () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
#28 0x00007f3f4066c8c8 in fuse_resolve_and_resume () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
#29 0x00007f3f406842e0 in ?? () from /usr/lib64/glusterfs/3.6.0.45/xlator/mount/fuse.so
#30 0x00000038ad0079d1 in start_thread () from /lib64/libpthread.so.0
#31 0x00000038acce8b6d in clone () from /lib64/libc.so.6 
(gdb) t 6
[Switching to thread 6 (Thread 0x7f3f379c2700 (LWP 21673))]#0  0x00000038ad00e264 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000038ad00e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000038ad009508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000038ad0093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f3f3dc55f2b in ?? () from /usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so
#4  0x00007f3f3dc57d1d in ?? () from /usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so
#5  0x00007f3f480c4d1c in ?? () from /usr/lib64/libglusterfs.so.0
#6  0x00000038ad0079d1 in start_thread () from /lib64/libpthread.so.0
#7  0x00000038acce8b6d in clone () from /lib64/libc.so.6
(gdb) t 7
[Switching to thread 7 (Thread 0x7f3f3f261700 (LWP 21670))]#0  0x00000038ad00b98e in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000038ad00b98e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f3f480a9ccf in syncenv_task () from /usr/lib64/libglusterfs.so.0
#2  0x00007f3f480afc00 in syncenv_processor () from /usr/lib64/libglusterfs.so.0
#3  0x00000038ad0079d1 in start_thread () from /lib64/libpthread.so.0
#4  0x00000038acce8b6d in clone () from /lib64/libc.so.6
(gdb) t 8
[Switching to thread 8 (Thread 0x7f3f36fc1700 (LWP 21674))]#0  0x00000038acce9163 in epoll_wait () from /lib64/libc.so.6
(gdb) bt
#0  0x00000038acce9163 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f3f480c4c11 in ?? () from /usr/lib64/libglusterfs.so.0
#2  0x00000038ad0079d1 in start_thread () from /lib64/libpthread.so.0
#3  0x00000038acce8b6d in clone () from /lib64/libc.so.6
(gdb) t 9
[Switching to thread 9 (Thread 0x7f3f3fc62700 (LWP 21668))]#0  0x00000038ad00f4b5 in sigwait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000038ad00f4b5 in sigwait () from /lib64/libpthread.so.0
#1  0x00000000004057bb in glusterfs_sigwaiter ()
#2  0x00000038ad0079d1 in start_thread () from /lib64/libpthread.so.0
#3  0x00000038acce8b6d in clone () from /lib64/libc.so.6
(gdb) t 10
[Switching to thread 10 (Thread 0x7f3f3e860700 (LWP 21671))]#0  0x00000038ad00b98e in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000038ad00b98e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f3f480a9ccf in syncenv_task () from /usr/lib64/libglusterfs.so.0
#2  0x00007f3f480afc00 in syncenv_processor () from /usr/lib64/libglusterfs.so.0
#3  0x00000038ad0079d1 in start_thread () from /lib64/libpthread.so.0
#4  0x00000038acce8b6d in clone () from /lib64/libc.so.6
(gdb) t 11
Thread ID 11 not known.
(gdb) quit

--- Additional comment from Ben Turner on 2015-02-20 22:19:46 EST ---

Also to note: I am not seeing problems on pure distribute volumes; so far these problems have only happened on volumes with replication. I am fairly confident I cannot reproduce this on pure distribute volumes, having run for 3 hours straight on pure distribute. After switching to replica 2, the FS hung on the first, third, fourth, and sixth runs.

--- Additional comment from Ben Turner on 2015-02-21 17:42:18 EST ---

More cores on gqac022.sbu.lab.eng.bos.redhat.com:

[root@gqac022 tmp]# ll /tmp
total 262640
-rwx------ 1 root root      3989 Feb 20 21:36 bt.PID=10611UID=0
-rwx------ 1 root root      3157 Feb 20 22:03 bt.PID=11251UID=0
-rwx------ 1 root root 142323932 Feb 20 21:36 core.dump.PID=10611UID=0
-rwx------ 1 root root 126066908 Feb 20 22:03 core.dump.PID=11251UID=0
-rw-r--r-- 1 root root       112 Feb 20 21:36 core.info.PID=10611UID=0
-rw-r--r-- 1 root root       112 Feb 20 22:03 core.info.PID=11251UID=0

(gdb) bt
#0  rpc_clnt_fill_request_info (clnt=0x7f704806bc70, info=0x0) at rpc-clnt.c:444
#1  0x000000310540fd32 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x7f704806bca0, event=<value optimized out>, 
    data=<value optimized out>) at rpc-clnt.c:877
#2  0x000000310540b4f8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>)
    at rpc-transport.c:539
#3  0x00007f7057ba12fc in __socket_read_reply (this=0x7f704807b860) at socket.c:1818
#4  __socket_read_frag (this=0x7f704807b860) at socket.c:1887
#5  __socket_proto_state_machine (this=0x7f704807b860) at socket.c:2033
#6  socket_proto_state_machine (this=0x7f704807b860) at socket.c:2139
#7  socket_event_poll_in (this=0x7f704807b860) at socket.c:2160
#8  0x00007f7057ba1d1d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x7f704807b860, poll_in=1, 
    poll_out=0, poll_err=0) at socket.c:2280
#9  0x00000031044783fc in event_dispatch_epoll_handler (data=0x1fbed90) at event-epoll.c:551
#10 event_dispatch_epoll_worker (data=0x1fbed90) at event-epoll.c:643
#11 0x00000031040079d1 in start_thread (arg=0x7f705775b700) at pthread_create.c:301
#12 0x0000003103ce8b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

--- Additional comment from Shyamsundar on 2015-02-23 12:37:45 EST ---

(In reply to Ben Turner from comment #4)

More from the core after installing the gluster debuginfo:

Core was generated by `/usr/sbin/glusterfs --volfile-server=gqas014.sbu.lab.eng.bos.redhat.com --volfi'.
Program terminated with signal 11, Segmentation fault.
#0  __socket_proto_state_machine (this=0x7f3f30088360) at socket.c:2068
2068                                    vector[count].iov_base = iobuf_ptr (in->iobuf);

(gdb) bt
#0  __socket_proto_state_machine (this=0x7f3f30088360) at socket.c:2068
#1  socket_proto_state_machine (this=0x7f3f30088360) at socket.c:2139
#2  socket_event_poll_in (this=0x7f3f30088360) at socket.c:2160
#3  0x00007f3f3dc57d1d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x7f3f30088360, poll_in=1, poll_out=0, poll_err=0) at socket.c:2280
#4  0x00007f3f480c4d1c in event_dispatch_epoll_handler (data=0x1242bc0) at event-epoll.c:551
#5  event_dispatch_epoll_worker (data=0x1242bc0) at event-epoll.c:643
#6  0x00000038ad0079d1 in start_thread (arg=0x7f3f3da31700) at pthread_create.c:301
#7  0x00000038acce8b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

(gdb) list
2063                                                    ret = -1;
2064                                                    goto out;
2065                                            }
2066                                    }
2067
2068                                    vector[count].iov_base = iobuf_ptr (in->iobuf);
2069                                    vector[count].iov_len = in->iobuf_size;
2070
2071                                    iobref = in->iobref;
2072

(gdb) info args 
pollin = <value optimized out>
this = 0x7f3f30088360

(gdb) info locals 
count = 1
ret = <value optimized out>
iobuf = <value optimized out>
frag = 0x7f3f30088fa8
priv = 0x7f3f30088f80
iobref = 0x7f3f30d524a0
vector = {{iov_base = 0x0, iov_len = 0}, {iov_base = 0x0, iov_len = 0}}
in = 0x7f3f30088fa0

(gdb) p in
$1 = (struct gf_sock_incoming *) 0x7f3f30088fa0

(gdb) p *in
$2 = {record_state = SP_STATE_NADA, frag = {fragcurrent = 0x7f3f47bf6dd8 "", bytes_read = 0, remaining_size = 0, vector = {iov_base = 0x7f3f47bf6cec, iov_len = 236}, pending_vector = 0x7f3f30088fb8, call_body = {request = {
        header_state = SP_STATE_REQUEST_HEADER_INIT, vector_state = SP_STATE_VECTORED_REQUEST_INIT, vector_sizer_state = 0}, reply = {accept_status = 0, status_state = SP_STATE_VECTORED_REPLY_STATUS_INIT, 
        accepted_state = SP_STATE_ACCEPTED_REPLY_INIT, accepted_success_state = SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT}}, simple_state = SP_STATE_SIMPLE_MSG_INIT, state = SP_STATE_RPCFRAG_INIT}, proghdr_base_addr = 0x0, iobuf = 0x0, 
  iobuf_size = 236, vector = {{iov_base = 0x7f3f30089054, iov_len = 4}, {iov_base = 0x0, iov_len = 0}}, count = 0, payload_vector = {iov_base = 0x0, iov_len = 0}, iobref = 0x7f3f30d524a0, request_info = 0x0, 
  pending_vector = 0x7f3f30088fc8, pending_count = 0, fraghdr = 2147483884, complete_record = 0 '\000', msg_type = REPLY, total_bytes_read = 236, ra_read = 236, ra_max = 236, ra_served = 236, ra_buf = 0x7f3f47bf6c00 ""}

in->iobuf is NULL, hence dereferencing it via iobuf_ptr() caused the segfault in this case.

Interestingly, record_state should have changed from NADA to something else before reaching this part of the code. The core needs more analysis to make progress.

--- Additional comment from Shyamsundar on 2015-02-23 16:15:40 EST ---

The analysis below shows that we are probably racing between two epoll threads, hitting the race called out in the review at http://review.gluster.org/#/c/3842/19/libglusterfs/src/event-epoll.c

The issue is that __socket_read_reply briefly drops the socket priv lock, which __socket_proto_state_machine assumes is held for the entire call.

From the various threads in the core, we can see three threads operating on the same socket: one in a write path and two in read paths (one waiting on the lock in __socket_proto_state_machine, and the other causing the crash above).

From the stack trace in comment #9 above, we see that record_state should not be NADA at the point of the crash. This could only happen if another thread raced on the same socket and the processing within __socket_proto_state_machine was garbled.

The other two threads presently operating on the same socket are as follows:
Thread 6 (Thread 0x7f3f379c2700 (LWP 21673)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00000038ad009508 in _L_lock_854 () from /lib64/libpthread-2.12.so
#2  0x00000038ad0093d7 in __pthread_mutex_lock (mutex=0x7f3f30089088) at pthread_mutex_lock.c:61
#3  0x00007f3f3dc55f2b in socket_proto_state_machine (this=0x7f3f30088360) at socket.c:2137
#4  socket_event_poll_in (this=0x7f3f30088360) at socket.c:2160
#5  0x00007f3f3dc57d1d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x7f3f30088360, poll_in=1, poll_out=0, poll_err=0) at socket.c:2280
#6  0x00007f3f480c4d1c in event_dispatch_epoll_handler (data=0x7f3f3002bf50) at event-epoll.c:551
#7  event_dispatch_epoll_worker (data=0x7f3f3002bf50) at event-epoll.c:643
#8  0x00000038ad0079d1 in start_thread (arg=0x7f3f379c2700) at pthread_create.c:301
#9  0x00000038acce8b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 5 (Thread 0x7f3f352bb700 (LWP 21678)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00000038ad009508 in _L_lock_854 () from /lib64/libpthread-2.12.so
#2  0x00000038ad0093d7 in __pthread_mutex_lock (mutex=0x7f3f30089088) at pthread_mutex_lock.c:61
#3  0x00007f3f3dc54666 in socket_submit_request (this=0x7f3f30088360, req=0x7f3f352ba050) at socket.c:3166
#4  0x00007f3f47e42bd7 in rpc_clnt_submit (rpc=0x7f3f30078760, prog=<value optimized out>, procnum=<value optimized out>, cbkfn=0x7f3f3ce1c3d0 <client3_3_writev_cbk>, proghdr=0x7f3f352ba1a0, proghdrcount=1, progpayload=0x7f3f352ba690, 
    progpayloadcount=1, iobref=<value optimized out>, frame=0x7f3f46f0b474, rsphdr=0x0, rsphdr_count=0, rsp_payload=0x0, rsp_payload_count=0, rsp_iobref=0x0) at rpc-clnt.c:1555
#5  0x00007f3f3ce127c2 in client_submit_vec_request (this=0x7f3f300060f0, req=0x7f3f352ba460, frame=0x7f3f46f0b474, prog=0x7f3f3d030820, procnum=13, cbkfn=0x7f3f3ce1c3d0 <client3_3_writev_cbk>, payload=0x7f3f352ba690, payloadcnt=1, 
    iobref=0x7f3f20025a90, xdrproc=0x7f3f47c25ec0 <xdr_gfs3_write_req>) at client-rpc-fops.c:88
#6  0x00007f3f3ce12d06 in client3_3_writev (frame=0x7f3f46f0b474, this=0x7f3f300060f0, data=0x7f3f352ba4f0) at client-rpc-fops.c:4187
#7  0x00007f3f3cdff01f in client_writev (frame=0x7f3f46f0b474, this=0x7f3f300060f0, fd=<value optimized out>, vector=<value optimized out>, count=<value optimized out>, off=<value optimized out>, flags=32770, iobref=0x7f3f20025a90, 
    xdata=0x0) at client.c:1016

So I believe we hit the race condition because the lock is dropped in between; errno in the crashed thread is also 11 (EAGAIN), which is not handled overall. We need to verify this analysis to move forward with a fix.

To verify:
1) Can __socket_read_reply drop the lock while processing map_xid?
2) The incoming state for this request seems to have read a few bytes or an entire RPC request (check ra_read/max/served); is it possible that the two threads caused this garbling of data?
3) The write seems to need poll_out, as the socket ioq appears to have a request cached (inspect priv->ioq, where priv is a socket_private_t).
4) The socket is also primed for poll_out, as seen in slot->events on frame 5 (0xc0000007, where 07 is IN/OUT/PRI and ONESHOT/ET are the higher bits; see sys/epoll.h). This means a poll_out was requested, hence the chance for the socket to be armed once more in the said race.
5) The slot in event_dispatch_epoll_handler also has an in-handler count of 2, and the threads evidently demonstrate the same.

Bottom line: even with the LT mode of epoll (which is another patch, to handle the throttle), we could hit this condition; as a result we need a fix for the race mentioned in the review.
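
For illustration, a minimal, self-contained C sketch of this class of race follows (all names are hypothetical; this is not GlusterFS code): thread A drops the per-connection lock mid-fragment, thread B re-enters the same path, completes and resets the incoming state, and thread A then resumes against state that has been freed and NULLed under it, which is the pattern that would leave in->iobuf NULL while record_state still reads as NADA.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct conn {
        pthread_mutex_t lock;
        char           *iobuf;   /* stands in for in->iobuf        */
        int             state;   /* stands in for in->record_state */
};

static struct conn c = { PTHREAD_MUTEX_INITIALIZER, NULL, 0 };

static void *thread_a (void *arg)
{
        (void) arg;
        pthread_mutex_lock (&c.lock);
        c.state = 1;                     /* mid-fragment */
        c.iobuf = malloc (236);

        pthread_mutex_unlock (&c.lock);  /* the buggy step: lock dropped to notify upper layers */
        sleep (1);                       /* window in which thread B runs the same path */
        pthread_mutex_lock (&c.lock);

        /* Thread A still believes it is mid-fragment; in the real code it would
         * now dereference c.iobuf, which thread B has set to NULL (the SIGSEGV
         * seen at socket.c:2068 in the core). */
        printf ("thread A resumes: iobuf=%p state=%d\n", (void *) c.iobuf, c.state);
        pthread_mutex_unlock (&c.lock);
        return NULL;
}

static void *thread_b (void *arg)
{
        (void) arg;
        pthread_mutex_lock (&c.lock);
        if (c.state == 1) {              /* finds a half-processed record and "completes" it */
                free (c.iobuf);
                c.iobuf = NULL;
                c.state = 0;             /* back to the NADA-like initial state */
        }
        pthread_mutex_unlock (&c.lock);
        return NULL;
}

int main (void)
{
        pthread_t a, b;

        pthread_create (&a, NULL, thread_a, NULL);
        usleep (100000);                 /* make sure thread A takes the lock first */
        pthread_create (&b, NULL, thread_b, NULL);
        pthread_join (a, NULL);
        pthread_join (b, NULL);
        return 0;
}

Built with "gcc -pthread", thread A typically prints iobuf=(nil) state=0, showing that the state it relied on was reset underneath it while the lock was dropped.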

--- Additional comment from Raghavendra G on 2015-02-23 23:28:46 EST ---

@shyam,

Good one!! I had been racking my brain for the past two days wondering how anyone could modify the state while we are holding the lock. Now that you point out the code, it is clearer. Yes, dropping the lock in __socket_read_reply is very likely the reason for the crash.

regards,
Raghavendra

Comment 1 Anand Avati 2015-02-27 14:44:23 UTC
REVIEW: http://review.gluster.org/9742 (socket: allow only one epoll thread to read msg fragments) posted (#3) for review on master by Shyamsundar Ranganathan (srangana)

Comment 2 Anand Avati 2015-02-27 14:45:34 UTC
REVIEW: http://review.gluster.org/9742 (socket: allow only one epoll thread to read msg fragments) posted (#4) for review on master by Shyamsundar Ranganathan (srangana)

Comment 3 Anand Avati 2015-02-28 05:16:15 UTC
COMMIT: http://review.gluster.org/9742 committed in master by Raghavendra G (rgowdapp) 
------
commit b117d4d84becd25ef79c049ebf9b8ec6c4abca88
Author: Krishnan Parthasarathi <kparthas>
Date:   Wed Feb 25 15:49:11 2015 +0530

    socket: allow only one epoll thread to read msg fragments
    
    __socket_read_reply function releases sock priv->lock briefly for
    notifying higher layers of message's xid. This could result in other
    epoll threads that are processing events on this socket to read further
    fragments of the same message. This may lead to incorrect fragment
    processing and result in a crash.
    
    Change-Id: I915665b2e54ca16f2ad65970e51bf76c65d954a4
    BUG: 1197118
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Signed-off-by: Shyam <srangana>
    Reviewed-on: http://review.gluster.org/9742
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
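
The patch itself is not quoted in this bug, but the approach the commit message describes (only one epoll thread may read message fragments on a given socket at a time) can be sketched roughly as below. The field and function names are hypothetical, not those of the actual review 9742 change; the sketch only captures the invariant the commit message names.

#include <pthread.h>

struct conn_priv {
        pthread_mutex_t lock;
        int             read_in_progress;  /* set while one thread owns the fragment reads */
};

/* Placeholder for the real fragment-reading state machine, which may still
 * drop and retake priv->lock internally (e.g. to notify the xid upward). */
static int run_state_machine (struct conn_priv *priv)
{
        (void) priv;
        return 0;
}

static int poll_in_handler (struct conn_priv *priv)
{
        int ret;

        pthread_mutex_lock (&priv->lock);
        if (priv->read_in_progress) {
                /* Another epoll thread is already consuming fragments on this
                 * socket; leave the incoming state machine alone. */
                pthread_mutex_unlock (&priv->lock);
                return 0;
        }
        priv->read_in_progress = 1;
        pthread_mutex_unlock (&priv->lock);

        ret = run_state_machine (priv);

        pthread_mutex_lock (&priv->lock);
        priv->read_in_progress = 0;
        pthread_mutex_unlock (&priv->lock);

        return ret;
}

int main (void)
{
        struct conn_priv priv = { PTHREAD_MUTEX_INITIALIZER, 0 };
        return poll_in_handler (&priv);
}

Whether the actual fix uses a flag like this, a condition variable, or some other mechanism is not stated in this bug; see review 9742 for the real change.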

Comment 4 Niels de Vos 2015-05-14 17:29:12 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
