Bug 1381970 - GlusterFS Daemon stops working after a longer runtime and higher file workload due to design flaws?
Summary: GlusterFS Daemon stops working after a longer runtime and higher file workload due to design flaws?
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: nfs
Version: mainline
Hardware: x86_64
OS: Linux
Priority: medium
Severity: urgent
Target Milestone: ---
Assignee: Niels de Vos
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1411344 1450377 1450378 1450380 1467313
 
Reported: 2016-10-05 13:01 UTC by Jules
Modified: 2017-10-26 14:36 UTC (History)
CC List: 5 users

Fixed In Version: glusterfs-3.12.0
Clone Of:
Environment:
Last Closed: 2017-09-05 17:24:45 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
/var/log/gluster from the arbiter node (12.29 MB, application/octet-stream) - 2016-12-02 19:03 UTC, Giuseppe Ragusa
/var/log/gluster from node 2 (17.05 MB, application/octet-stream) - 2016-12-02 19:05 UTC, Giuseppe Ragusa
/var/log/gluster from node 1 (part1) (13.65 MB, application/x-xz) - 2016-12-02 19:39 UTC, Giuseppe Ragusa
/var/log/gluster from node 1 (part2) (12.21 MB, application/x-xz) - 2016-12-02 19:41 UTC, Giuseppe Ragusa
/var/log/gluster from node 1 (part3) (4.94 MB, application/x-xz) - 2016-12-02 19:42 UTC, Giuseppe Ragusa
/var/log/gluster from node 1 (part4) (5.63 MB, application/x-xz) - 2016-12-02 19:44 UTC, Giuseppe Ragusa
/var/log/gluster from node 1 (part5) (1.59 MB, application/x-xz) - 2016-12-02 19:45 UTC, Giuseppe Ragusa
Logs from the time while the nfs crashed. Node2d was the failing one (11.83 KB, application/zip) - 2017-02-05 11:16 UTC, Jules

Description Jules 2016-10-05 13:01:26 UTC
Description of problem:

Since the first release of GlusterFS 3.7, after running for some time glusterfsd seems to stop working/crash and freezes all the involved nodes that are members of the GlusterFS cluster.
In that state you can't log in to the machines anymore and need to force-reboot them, since the mounted volumes become inaccessible and lock up the whole system.

Here is the nfs.log output before/after the crash:

First Node:

[2016-10-05 09:30:13.410418] W [socket.c:596:__socket_rwv] 0-NLM-client: readv on 10.1.20.32:23198 failed (No data available)
[2016-10-05 09:32:31.243012] E [socket.c:2292:socket_connect_finish] 0-NLM-client: connection to 10.1.20.32:23198 failed (Connection timed out)
pending frames:

frame : type(0) op(0)  <--- repeating thousands of times

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2016-10-05 09:32:33
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.15
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x7e)[0x7f28321a3fbe]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f28321c670d]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0)[0x7f2830c450e0]
/lib/x86_64-linux-gnu/libc.so.6(+0x91d8a)[0x7f2830ca1d8a]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/nfs/server.so(nlm_set_rpc_clnt+0x62)[0x7f2827714c32]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/nfs/server.so(nlm_rpcclnt_notify+0x35)[0x7f2827717395]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x2aa)[0x7f2831f7147a]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f2831f6d733]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/rpc-transport/socket.so(+0x4a73)[0x7f282d0d8a73]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/rpc-transport/socket.so(+0x8e1f)[0x7f282d0dce1f]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x8e722)[0x7f283220c722]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f28314270a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f2830cf862d]


Second Node:

pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2016-10-05 09:35:44
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.15
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x7e)[0x7fa8251f0fbe]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x31d)[0x7fa82521370d]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0)[0x7fa823c920e0]
/lib/x86_64-linux-gnu/libc.so.6(+0x91d8a)[0x7fa823ceed8a]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/nfs/server.so(nlm_set_rpc_clnt+0x62)[0x7fa81e907c32]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/nfs/server.so(nlm_rpcclnt_notify+0x35)[0x7fa81e90a395]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x2aa)[0x7fa824fbe47a]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fa824fba733]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/rpc-transport/socket.so(+0x4a73)[0x7fa820125a73]
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/rpc-transport/socket.so(+0x8e1f)[0x7fa820129e1f]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x8e722)[0x7fa825259722]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7fa8244740a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fa823d4562d]


And here are the most common lines from /var/log/glusterfs/bricks/*.log:

[2016-10-04 11:18:34.009309] E [MSGID: 113027] [posix.c:1538:posix_mkdir] 0-netshare-posix: mkdir of /xxxxxxx/shared/web failed [File exists]
[2016-10-04 12:06:31.511435] E [MSGID: 115050] [server-rpc-fops.c:179:server_lookup_cbk] 0-netshare-server: 1147010245: LOOKUP /www/schausteller.de/releases/20161004111428/web/anzeigenmarkt (bed02db1-dd66-4244-9adb-5e8b22513d62/anzeigenmarkt) ==> (Permission denied) [Permission denied]
[2016-10-04 15:44:03.460733] E [MSGID: 113001] [posix.c:5415:_posix_handle_xattr_keyvalue_pair] 0-netshare-posix: getxattr failed on /storage/gfs/netshare/.glusterfs/42/ef/42ef035a-7688-45e2-ae2e-1306e49fcc9f while doing xattrop: Key:trusted.afr.dirty  [No such file or directory]
[2016-10-04 22:24:13.715602] E [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2016-10-04 22:24:13.715623] E [server-helpers.c:390:server_alloc_frame] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x325) [0x7f1b892f45e5] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(server3_3_lookup+0x8b) [0x7f1b80b3649b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(get_frame_from_request+0x302) [0x7f1b80b1dc82] ) 0-server: invalid argument: client [Invalid argument]
[2016-10-04 22:24:13.715631] E [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2016-10-04 22:24:13.715654] E [server-helpers.c:390:server_alloc_frame] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x325) [0x7f1b892f45e5] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(server3_3_lookup+0x8b) [0x7f1b80b3649b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(get_frame_from_request+0x302) [0x7f1b80b1dc82] ) 0-server: invalid argument: client [Invalid argument]
[2016-10-04 22:28:08.361526] W [socket.c:596:__socket_rwv] 0-tcp.netshare-server: writev on 10.1.20.1:49124 failed (Broken pipe)
[2016-10-04 22:28:08.361996] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x6a4f5f, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.netshare-server)
[2016-10-04 22:28:08.362163] W [entrylk.c:757:pl_entrylk_log_cleanup] 0-netshare-server: releasing lock on 67caf764-b306-4e59-a8d2-891287ba33b0 held by {client=0x7f552400ff80, pid=-6 lk-owner=78bce576c67f0000}
[2016-10-04 22:28:08.362250] W [entrylk.c:757:pl_entrylk_log_cleanup] 0-netshare-server: releasing lock on 67caf764-b306-4e59-a8d2-891287ba33b0 held by {client=0x7f552400ff80, pid=-6 lk-owner=78bce576c67f0000}
[2016-10-04 22:28:08.362402] E [MSGID: 113040] [posix-helpers.c:1640:__posix_fd_ctx_get] 0-netshare-posix: Failed to get fd context for a non-anonymous fd, gfid: 4dd09768-deb2-493c-a7f3-aa70c79c21e5
[2016-10-04 22:28:08.362431] W [MSGID: 113006] [posix.c:3444:posix_flush] 0-netshare-posix: pfd is NULL on fd=0x7f552c0ce8a4 [Invalid argument]
[2016-10-04 22:28:08.363021] W [inodelk.c:404:pl_inodelk_log_cleanup] 0-netshare-server: releasing lock on 1051d710-8d00-48f7-85ae-eea721b3b86e held by {client=0x7f55240834b0, pid=-6 lk-owner=90d525f2a27f0000}
[2016-10-04 22:28:08.363027] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x472ca5d9, Program: GlusterFS 3.3, ProgVers: 330, Proc: 22) to rpc-transport (tcp.netshare-server)
[2016-10-04 22:28:08.367940] E [server-helpers.c:390:server_alloc_frame] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x325) [0x7f5538dd15e5] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(server3_3_lookup+0x8b) [0x7f55306ac49b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(get_frame_from_request+0x302) [0x7f5530693c82] ) 0-server: invalid argument: client [Invalid argument]
[2016-10-04 22:28:08.367953] E [server.c:205:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x117) [0x7f55308d3017] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(server_lookup_cbk+0x50e) [0x7f55306abe1e] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(server_submit_reply+0x2ed) [0x7f553068ffed] ) 0-: Reply submission failed
[2016-10-04 22:28:08.586426] E [MSGID: 113001] [posix.c:5415:_posix_handle_xattr_keyvalue_pair] 0-netshare-posix: getxattr failed on /storage/gfs/netshare/.glusterfs/9a/7d/9a7d25ee-a7c4-45ed-9e62-2c780e11f700 while doing xattrop: Key:trusted.afr.dirty  [No such file or directory]
[2016-10-04 23:43:03.514841] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x28bc4, Program: GlusterFS 3.3, ProgVers: 330, Proc: 40) to rpc-transport (tcp.netshare-server)
[2016-10-04 23:43:03.514923] E [server.c:205:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/debug/io-stats.so(io_stats_readdirp_cbk+0x18f) [0x7f55308d1f8f] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(server_readdirp_cbk+0xe3) [0x7f553069a653] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(server_submit_reply+0x2ed) [0x7f553068ffed] ) 0-: Reply submission failed
[2016-10-04 23:43:03.515148] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x28bc3, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.netshare-server)
[2016-10-04 23:43:03.515192] E [server.c:205:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x117) [0x7f55308d3017] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(server_lookup_cbk+0x50e) [0x7f55306abe1e] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/protocol/server.so(server_submit_reply+0x2ed) [0x7f553068ffed] ) 0-: Reply submission failed
[2016-10-04 23:43:04.518386] W [entrylk.c:757:pl_entrylk_log_cleanup] 0-netshare-server: releasing lock on 57869c3e-ab7d-42e1-8460-249c8ec1cc8d held by {client=0x7f5524481400, pid=-6 lk-owner=c46de6952c7f0000}
[2016-10-04 23:43:04.518429] W [entrylk.c:757:pl_entrylk_log_cleanup] 0-netshare-server: releasing lock on 57869c3e-ab7d-42e1-8460-249c8ec1cc8d held by {client=0x7f5524481400, pid=-6 lk-owner=c46de6952c7f0000}
[2016-10-04 23:43:04.518487] E [MSGID: 113040] [posix-helpers.c:1640:__posix_fd_ctx_get] 0-netshare-posix: Failed to get fd context for a non-anonymous fd, gfid: 4dd09768-deb2-493c-a7f3-aa70c79c21e5
[2016-10-04 23:43:04.518512] W [MSGID: 113006] [posix.c:3444:posix_flush] 0-netshare-posix: pfd is NULL on fd=0x7f552c0cfc78 [Invalid argument]
[2016-10-04 23:43:04.532080] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1ad80, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.netshare-server)
[2016-10-04 23:43:06.531762] W [entrylk.c:757:pl_entrylk_log_cleanup] 0-netshare-server: releasing lock on 67caf764-b306-4e59-a8d2-891287ba33b0 held by {client=0x7f552c0f6ae0, pid=-6 lk-owner=c8218da72e7f0000}
[2016-10-04 23:43:06.531816] E [MSGID: 113040] [posix-helpers.c:1640:__posix_fd_ctx_get] 0-netshare-posix: Failed to get fd context for a non-anonymous fd, gfid: cb026705-8b5c-45db-9333-c1ca1fa4c025
[2016-10-04 23:44:04.530858] E [MSGID: 113040] [posix-helpers.c:1640:__posix_fd_ctx_get] 0-netshare-posix: Failed to get fd context for a non-anonymous fd, gfid: 4dd09768-deb2-493c-a7f3-aa70c79c21e5
[2016-10-04 23:44:04.530880] W [MSGID: 113006] [posix.c:3444:posix_flush] 0-netshare-posix: pfd is NULL on fd=0x7f552c0d0260 [Invalid argument]
[2016-10-05 10:04:50.164154] W [MSGID: 115009] [server-resolve.c:574:server_resolve] 0-netshare-server: no resolution type for (null) (LOOKUP)
[2016-10-05 10:04:50.164281] E [MSGID: 115050] [server-rpc-fops.c:179:server_lookup_cbk] 0-netshare-server: 782572: LOOKUP (null) (00000000-0000-0000-0000-000000000000/HistoryEntry.orm.xml) ==> (Invalid argument) [Invalid argument]



Version-Release number of selected component (if applicable): 3.7.15


How reproducible:
Serve daily mixed content/data with read/write access and wait a few weeks for it to crash.

Steps to Reproduce:
1. Only reproducible on higher-workload GlusterFS clusters.
2. Produce daily mixed read/write access to NFS-mounted volumes (a crude load sketch follows after this list).
3. Wait until the gluster daemon crashes.
4. Have fun getting it under control.
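
For anyone trying to reproduce step 2, a minimal sketch of the kind of mixed read/write load meant here; the mount point and file counts are placeholders, not the actual production workload:

#!/bin/bash
# crude mixed read/write load generator against an NFS-mounted Gluster volume
MNT=/mnt/netshare          # placeholder mount point of the NFS-mounted volume
mkdir -p "$MNT/loadtest"
while true; do             # run until interrupted
    for i in $(seq 1 100); do
        dd if=/dev/zero of="$MNT/loadtest/file.$i" bs=64k count=16 2>/dev/null
    done
    find "$MNT/loadtest" -type f -exec cat {} + > /dev/null
    rm -f "$MNT/loadtest"/file.*
done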

Actual results:
Releases since the first GlusterFS 3.7 have not been production-ready.

Expected results:
Stability!

Additional info:
Since the first release of GlusterFS 3.7 I have been facing this issue: sooner or later GlusterFS becomes unstable and will overload your servers. At that state you can't do anything other than press the reset button or force a reboot.

Comment 1 Vijay Bellur 2016-10-05 17:32:55 UTC
Niels, this does look like a crash in gNFS. Can you please look into this one?

Comment 2 Giuseppe Ragusa 2016-11-27 12:15:29 UTC
I can confirm this bug (or something very similar) on the latest 3.7.17 on CentOS 7.2.

I have some replicated (replica 3 with arbiter) distributed volumes, some of which have NFS enabled (Gluster NFS, not NFS-Ganesha).

After some NFS activity (a few hours), all NFS translator processes die on the non-arbiter-brick nodes (all arbiter bricks for all volumes are confined to one node).

Simply stopping and then restarting one of the NFS-enabled volumes makes the NFS translator processes restart for all volumes, and all is fine until the next time (see the sketch below).
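
For reference, the stop/start cycle that brings the Gluster NFS processes back is just the standard volume lifecycle commands; a minimal sketch, assuming one of the NFS-enabled volumes is called home (any NFS-enabled volume on the cluster should work the same way):

# cycle one NFS-enabled volume; glusterd respawns the NFS translator for all volumes
gluster volume stop home        # asks for interactive confirmation; add --mode=script to skip it
gluster volume start home
gluster volume status home nfs  # verify the NFS server process is back online on all nodes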

Here are the relevant contents from /var/log/glusterfs/nfs.log around the time of the event (clocks are synchronized between nodes) on the two non-arbiter nodes (nothing relevant was found on the arbiter node).

Node names are shockley (arbiter), read and hall.

The Gluster cluster has been formed on a dedicated (3x1Gbps LACP bonded) interface.
The NFS traffic with NFS clients happens on a different dedicated (2x1Gbps LACP bonded) interface.

read:

[2016-11-26 19:07:41.439354] W [socket.c:596:__socket_rwv] 0-NLM-client: readv on 172.25.15.39:4001 failed (No data available)
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2016-11-26 19:07:51
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.17
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xc2)[0x7fef26c4b622]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7fef26c70ddd]
/lib64/libc.so.6(+0x35670)[0x7fef25337670]
/lib64/libc.so.6(+0x132ad6)[0x7fef25434ad6]
/usr/lib64/glusterfs/3.7.17/xlator/nfs/server.so(nlm_set_rpc_clnt+0x62)[0x7fef185de912]
/usr/lib64/glusterfs/3.7.17/xlator/nfs/server.so(nlm_rpcclnt_notify+0x35)[0x7fef185e12a5]
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x214)[0x7fef26a199b4]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fef26a15843]
/usr/lib64/glusterfs/3.7.17/rpc-transport/socket.so(+0x8d77)[0x7fef1b851d77]
/usr/lib64/glusterfs/3.7.17/rpc-transport/socket.so(+0x919f)[0x7fef1b85219f]
/lib64/libglusterfs.so.0(+0x9554a)[0x7fef26cba54a]
/lib64/libpthread.so.0(+0x7dc5)[0x7fef25ab3dc5]
/lib64/libc.so.6(clone+0x6d)[0x7fef253f8ced]
---------


hall:

[2016-11-26 19:25:42.784519] W [socket.c:596:__socket_rwv] 0-NLM-client: readv on 172.25.15.39:4001 failed (No data available)
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2016-11-26 19:25:52
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.17
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xc2)[0x7f11e5c0d622]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f11e5c32ddd]
/lib64/libc.so.6(+0x35670)[0x7f11e42f9670]
/lib64/libc.so.6(+0x132ad6)[0x7f11e43f6ad6]
/usr/lib64/glusterfs/3.7.17/xlator/nfs/server.so(nlm_set_rpc_clnt+0x62)[0x7f11d34ab912]
/usr/lib64/glusterfs/3.7.17/xlator/nfs/server.so(nlm_rpcclnt_notify+0x35)[0x7f11d34ae2a5]
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x214)[0x7f11e59db9b4]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f11e59d7843]
/usr/lib64/glusterfs/3.7.17/rpc-transport/socket.so(+0x8d77)[0x7f11da813d77]
/usr/lib64/glusterfs/3.7.17/rpc-transport/socket.so(+0x919f)[0x7f11da81419f]
/lib64/libglusterfs.so.0(+0x9554a)[0x7f11e5c7c54a]
/lib64/libpthread.so.0(+0x7dc5)[0x7f11e4a75dc5]
/lib64/libc.so.6(clone+0x6d)[0x7f11e43baced]
---------

nothing on shockley (arbiter node)

Comment 3 Giuseppe Ragusa 2016-11-29 22:00:42 UTC
Slight correction to comment #2:

As shown by the timestamps, the crashes are not synchronous; in fact, I found the following workaround:

A cron job on the arbiter node greps the output of "ctdb status" (there is a CTDB cluster which assigns a couple of IPs on the NFS network: all nodes are members, but the arbiter node does not participate in IP sharing) looking for an UNHEALTHY message (CTDB detects NFS crashes by means of the monitoring script tracked in BZ #1371178) and, if found, stops and then restarts an unused Gluster volume which has NFS enabled (sketched at the end of this comment).

Note that generating a little bit of load, like recreating yum metadata on a CentOS 7 OS mirror hosted on an NFS share, is an almost sure way of getting a crash, sometimes even corrupting the repodata/.olddata directory (simply recreating/removing those dirs is enough to restore sanity, apparently).
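
For illustration, a minimal sketch of that watchdog cron job; the volume name and log tag below are placeholders, not the actual ones used here:

#!/bin/bash
# crude Gluster NFS watchdog, run from cron on the arbiter node:
# if CTDB reports an UNHEALTHY node, bounce an unused NFS-enabled volume
# so that glusterd respawns the NFS translator on all nodes
VOL=nfs-restarter    # placeholder: an otherwise unused volume with nfs.disable off

if ctdb status 2>/dev/null | grep -q UNHEALTHY; then
    logger -t gnfs-watchdog "CTDB reports UNHEALTHY, cycling volume ${VOL}"
    gluster --mode=script volume stop "${VOL}" force
    gluster --mode=script volume start "${VOL}"
fi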

Comment 4 Giuseppe Ragusa 2016-12-02 18:54:46 UTC
Further notes:

After extensive log reading and data cross-checks, we found no corruption of user data; moreover, users did not experience any visible error on NFS-automounted areas (mounted "hard"), only some delays, and a web re-shared area got some transient 404s.

All NFS clients (both physical and virtual, hosted on the hyperconverged infrastructure) are on a dedicated jumbo-frame (9000) VLAN, while the Gluster cluster has been formed and relegated (by means of DNS node names) to a dedicated jumbo-frame (9000) VLAN with no access from virtual machines (no Linux bridge over the Gluster node LACP bonds).

We disabled all offloads on the Gluster node NICs involved in NFS traffic (since on the Gluster nodes the access to the NFS VLAN is by means of a LACP-bonded-then-bridged link); see the ethtool sketch below.
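
Disabling the offloads is typically done with ethtool; a minimal sketch, where the bond interface name is a placeholder and the exact feature list depends on the NIC/driver:

# turn off common offloads on the NFS-facing bonded interface (placeholder name bond1)
for feat in tso gso gro lro sg rx tx; do
    ethtool -K bond1 "$feat" off
done
ethtool -k bond1   # review the resulting offload settings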

We tried increasing server.outstanding-rpc-limit to 256 (often suggested on the Gluster ML, albeit mainly for performance reasons).

We tried lowering nfs.outstanding-rpc-limit to 8 (cf. BZ #1008301). Both settings are shown below.
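
For reference, both tunings are plain volume options; a minimal sketch using the home volume described later in this comment (repeat per NFS-enabled volume):

# RPC backlog tunings that were tried (they did not prevent the crashes)
gluster volume set home server.outstanding-rpc-limit 256
gluster volume set home nfs.outstanding-rpc-limit 8
gluster volume info home   # the changed options appear under "Options Reconfigured"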

We fixed an sm-notify problem (since the proper hostnames of the NFS clients map to public LAN IPs, sm-notify failed to reach them after each Gluster NFS restart because NFS traffic on the public interface is denied by iptables rules; fixed by forcing the clients' public names to their NFS-private IPs in /etc/hosts on each Gluster node).

We found that oVirt makes a constant flux of Gluster volume queries by means of VDSM (more than one per second), but these should be informative only (albeit the load they generate could matter).

We found a couple of still pending but probably irrelevant VDSM errors (one related to the split-network setup, which should be solved in more recent VDSM; the other a "device" error with a Python backtrace while invoking glusterVolumeStatus, which does not fail if invoked by hand with VdsClient).

Nothing of the above solved the crash problem.

This is the configuration of one of the involved volumes:

Volume Name: home
Type: Distributed-Replicate
Volume ID: 31636f1a-79d7-4065-b345-c14a727330ac
Status: Started
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: read.gluster.private:/srv/glusterfs/disk0/home_brick
Brick2: hall.gluster.private:/srv/glusterfs/disk0/home_brick
Brick3: shockley.gluster.private:/srv/glusterfs/disk0/home_arbiter_brick (arbiter)
Brick4: read.gluster.private:/srv/glusterfs/disk1/home_brick
Brick5: hall.gluster.private:/srv/glusterfs/disk1/home_brick
Brick6: shockley.gluster.private:/srv/glusterfs/disk1/home_arbiter_brick (arbiter)
Brick7: read.gluster.private:/srv/glusterfs/disk2/home_brick
Brick8: hall.gluster.private:/srv/glusterfs/disk2/home_brick
Brick9: shockley.gluster.private:/srv/glusterfs/disk2/home_arbiter_brick (arbiter)
Options Reconfigured:
nfs.acl: off
nfs.addr-namelookup: off
network.ping-timeout: 10
cluster.server-quorum-type: server
cluster.quorum-type: auto
storage.batch-fsync-delay-usec: 0
nfs.enable-ino32: off
cluster.lookup-optimize: on
performance.write-behind-window-size: 32MB
performance.write-behind: on
performance.io-thread-count: 8
performance.cache-refresh-timeout: 4
performance.stat-prefetch: off
client.bind-insecure: on
server.allow-insecure: on
storage.owner-gid: 0
storage.owner-uid: 0
nfs.auth-cache-ttl-sec: 600
nfs.auth-refresh-interval-sec: 600
nfs.exports-auth-enable: off
nfs.disable: off
user.cifs: disable
performance.readdir-ahead: on

I will attach xz-compressed tar archives of the /var/log/gluster dir on each node.

Brief summary of the IPs involved:

shockley (arbiter node):
NFS: 172.25.15.21
GlusterFS: 172.25.5.21

read (node1):
NFS: 172.25.15.22
GlusterFS: 172.25.5.22

hall (node2):
NFS: 172.25.15.23
GlusterFS: 172.25.5.23

The CTDB cluster assigns (only between node1 and node2) the additional NFS IPs:
172.25.15.202
172.25.15.203

Other relevant configuration notes:

Gluster/CTDB are CPU-bound by means of a systemd slice (with CPUShares=1024).

Comment 5 Giuseppe Ragusa 2016-12-02 19:03:54 UTC
Created attachment 1227459 [details]
/var/log/gluster from the arbiter node

Comment 6 Giuseppe Ragusa 2016-12-02 19:05:48 UTC
Created attachment 1227460 [details]
/var/log/gluster from node 2

Comment 7 Giuseppe Ragusa 2016-12-02 19:39:39 UTC
Created attachment 1227469 [details]
/var/log/gluster from node 1 (part1)

Comment 8 Giuseppe Ragusa 2016-12-02 19:41:19 UTC
Created attachment 1227470 [details]
/var/log/gluster from node 1 (part2)

Comment 9 Giuseppe Ragusa 2016-12-02 19:42:47 UTC
Created attachment 1227471 [details]
/var/log/gluster from node 1 (part3)

Comment 10 Giuseppe Ragusa 2016-12-02 19:44:10 UTC
Created attachment 1227472 [details]
/var/log/gluster from node 1 (part4)

Comment 11 Giuseppe Ragusa 2016-12-02 19:45:12 UTC
Created attachment 1227473 [details]
/var/log/gluster from node 1 (part5)

Comment 12 Jules 2017-02-05 09:40:55 UTC
We upgraded to the latest GlusterFS 3.9.1-1 release last night and are realizing that this makes the behavior even worse. It didn't take more than 8 hours until the same behavior happened.

Here is the output of the nfs.log:

Final graph:
+------------------------------------------------------------------------------+
  1: volume netshare-client-0
  2:     type protocol/client
  3:     option ping-timeout 5
  4:     option remote-host node2c
  5:     option remote-subvolume /storage/gfs/netshare
  6:     option transport-type socket
  7:     option username 5385f910-8b41-4c66-a1d3-f6d9c21a2733
  8:     option password 7a096f83-1017-4de4-a72d-24a1a61a2c8f
  9:     option send-gids true
 10: end-volume
 11:  
 12: volume netshare-client-1
 13:     type protocol/client
 14:     option ping-timeout 5
 15:     option remote-host node2d
 16:     option remote-subvolume /storage/gfs/netshare
 17:     option transport-type socket
 18:     option username 5385f910-8b41-4c66-a1d3-f6d9c21a2733
 19:     option password 7a096f83-1017-4de4-a72d-24a1a61a2c8f
 20:     option send-gids true
 21: end-volume
 22:  
 23: volume netshare-replicate-0
 24:     type cluster/replicate
 25:     option use-compound-fops off
 26:     subvolumes netshare-client-0 netshare-client-1
 27: end-volume
 28:  
 29: volume netshare-dht
 30:     type cluster/distribute
 31:     option min-free-disk 5%
 32:     option readdir-optimize on
 33:     option lock-migration off
 34:     subvolumes netshare-replicate-0
 35: end-volume
 36:  
 37: volume netshare-write-behind
 38:     type performance/write-behind
 39:     subvolumes netshare-dht
 40: end-volume
 41:  
 42: volume netshare
 43:     type debug/io-stats
 44:     option log-level WARNING
 45:     option latency-measurement off
 46:     option count-fop-hits off
 47:     subvolumes netshare-write-behind
 48: end-volume
 49:  
 50: volume netshare-dev-client-0
 51:     type protocol/client
 52:     option ping-timeout 5
 53:     option remote-host node2c
 54:     option remote-subvolume /storage/gfs/netshare-dev
 55:     option transport-type socket
 56:     option username edf1a0ce-db21-405d-a6f2-e96e1ecc629f
 57:     option password 8f27fe64-137e-4e62-8f03-1f6a69ad1d86
 58:     option send-gids true
 59: end-volume
 60:  
 61: volume netshare-dev-client-1
 62:     type protocol/client
 63:     option ping-timeout 5
 64:     option remote-host node2d
 65:     option remote-subvolume /storage/gfs/netshare-dev
 66:     option transport-type socket
 67:     option username edf1a0ce-db21-405d-a6f2-e96e1ecc629f
 68:     option password 8f27fe64-137e-4e62-8f03-1f6a69ad1d86
 69:     option send-gids true
 70: end-volume
 71:  
 72: volume netshare-dev-replicate-0
 73:     type cluster/replicate
 74:     option use-compound-fops off
 75:     subvolumes netshare-dev-client-0 netshare-dev-client-1
 76: end-volume
 77:  
 78: volume netshare-dev-dht
 79:     type cluster/distribute
 80:     option min-free-disk 5%
 81:     option readdir-optimize on
 82:     option lock-migration off
 83:     subvolumes netshare-dev-replicate-0
 84: end-volume
 85:  
 86: volume netshare-dev-write-behind
 87:     type performance/write-behind
 88:     subvolumes netshare-dev-dht
 89: end-volume
 90:  
 91: volume netshare-dev
 92:     type debug/io-stats
 93:     option log-level WARNING
 94:     option latency-measurement off
 95:     option count-fop-hits off
 96:     subvolumes netshare-dev-write-behind
 97: end-volume
 98:  
 99: volume nfs-server
100:     type nfs/server
101:     option rpc-auth.auth-glusterfs on
102:     option rpc-auth.auth-unix on
103:     option rpc-auth.auth-null on
104:     option rpc-auth.ports.insecure on
105:     option rpc-auth-allow-insecure on
106:     option transport-type socket
107:     option transport.socket.listen-port 2049
108:     option nfs.dynamic-volumes on
109:     option nfs.nlm on
110:     option nfs.drc off
111:     option rpc-auth.addr.netshare.allow *
112:     option nfs3.netshare.volume-id 6ddee9c9-bd06-43d0-8acf-405d6b759720
113:     option rpc-auth.addr.netshare-dev.allow *
114:     option nfs3.netshare-dev.volume-id 2e2967a8-9881-456f-bdc8-edc23f6a6ff1
115:     option nfs.gfs-vol0.disable on
116:     option nfs.netshare.disable off
117:     option nfs.netshare-dev.disable off
118:     subvolumes netshare netshare-dev
119: end-volume
120:  
+------------------------------------------------------------------------------+
[2017-02-05 08:57:30.276426] W [socket.c:588:__socket_rwv] 0-NLM-client: readv on 10.1.20.36:11205 failed (No data available)
[2017-02-05 08:59:47.746940] E [socket.c:2307:socket_connect_finish] 0-NLM-client: connection to 10.1.20.36:11205 failed (Connection timed out)
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2017-02-05 08:59:49
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.9.1
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xac)[0x7fc411ee271c]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x324)[0x7fc411eebd34]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0)[0x7fc41097d0e0]
/lib/x86_64-linux-gnu/libc.so.6(+0x91d8a)[0x7fc4109d9d8a]
/usr/lib/x86_64-linux-gnu/glusterfs/3.9.1/xlator/nfs/server.so(+0x3a352)[0x7fc40b1f3352]
/usr/lib/x86_64-linux-gnu/glusterfs/3.9.1/xlator/nfs/server.so(+0x3cc15)[0x7fc40b1f5c15]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x234)[0x7fc411cac4b4]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fc411ca87e3]
/usr/lib/x86_64-linux-gnu/glusterfs/3.9.1/rpc-transport/socket.so(+0x4b33)[0x7fc40cacab33]
/usr/lib/x86_64-linux-gnu/glusterfs/3.9.1/rpc-transport/socket.so(+0x8f07)[0x7fc40cacef07]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7fc06)[0x7fc411f38c06]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064)[0x7fc41115f064]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc410a3062d]
---------

Please fix that asap!

Comment 13 Jules 2017-02-05 11:16:10 UTC
Created attachment 1247794 [details]
Logs from the time while the nfs crashed. Node2d was the failing one

Comment 14 Jules 2017-02-05 11:17:12 UTC
I attached gluster_logs.zip with the filtered log lines from when the crash happened. Node2d was the failing one.

Comment 15 Niels de Vos 2017-02-16 09:47:49 UTC
I have been looking into this for quite some time now; unfortunately the fix is not so straightforward.

This is a use-after-free of some of the structures used while locking. There is no decent reference counting on the structures, and they are shared between the NFS protocol and NLM. Somewhere there are races that cause the structure(s?) to be freed too early.

Comment 16 Kaushal 2017-03-08 10:54:34 UTC
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.

Comment 17 Jules 2017-03-08 11:23:05 UTC
This still occurs in 3.9

Comment 18 Worker Ant 2017-05-05 10:48:58 UTC
REVIEW: https://review.gluster.org/17185 (nfs/nlm: ignore notify when there is no matching rpc request) posted (#1) for review on master by Niels de Vos (ndevos)

Comment 19 Worker Ant 2017-05-05 10:49:03 UTC
REVIEW: https://review.gluster.org/17186 (nfs/nlm: log the caller_name if nlm_client_t can be found) posted (#1) for review on master by Niels de Vos (ndevos)

Comment 20 Worker Ant 2017-05-05 10:49:09 UTC
REVIEW: https://review.gluster.org/17187 (nfs/nlm: unref rpc-client after nlm4svc_send_granted()) posted (#1) for review on master by Niels de Vos (ndevos)

Comment 21 Worker Ant 2017-05-05 10:49:16 UTC
REVIEW: https://review.gluster.org/17188 (nfs/nlm: remove lock request from the list after cancel) posted (#1) for review on master by Niels de Vos (ndevos)

Comment 22 Worker Ant 2017-05-05 10:49:24 UTC
REVIEW: https://review.gluster.org/17189 (nfs/nlm: free the nlm_client upon RPC_DISCONNECT) posted (#1) for review on master by Niels de Vos (ndevos)

Comment 23 Worker Ant 2017-05-08 13:30:05 UTC
COMMIT: https://review.gluster.org/17187 committed in master by Jeff Darcy (jeff.us) 
------
commit 52c28c0c04722a9ffaa7c39c49ffebdf0a5c75e1
Author: Niels de Vos <ndevos>
Date:   Fri Jan 13 16:05:02 2017 +0100

    nfs/nlm: unref rpc-client after nlm4svc_send_granted()
    
    nlm4svc_send_granted() uses the rpc_clnt by getting it from the
    call-state structure. It is safer to unref the rpc_clnt after the
    function is done with it.
    
    Change-Id: I7cb7c4297801463d21259c58b50d7df7c57aec5e
    BUG: 1381970
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: https://review.gluster.org/17187
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: soumya k <skoduri>
    Reviewed-by: Jeff Darcy <jeff.us>

Comment 24 Worker Ant 2017-05-09 17:59:54 UTC
REVIEW: https://review.gluster.org/17189 (nfs/nlm: free the nlm_client upon RPC_DISCONNECT) posted (#2) for review on master by Niels de Vos (ndevos)

Comment 25 Worker Ant 2017-05-09 18:36:24 UTC
REVIEW: https://review.gluster.org/17189 (nfs/nlm: free the nlm_client upon RPC_DISCONNECT) posted (#3) for review on master by Niels de Vos (ndevos)

Comment 26 Worker Ant 2017-05-10 12:27:15 UTC
COMMIT: https://review.gluster.org/17185 committed in master by Jeff Darcy (jeff.us) 
------
commit e997d752ba08f80b1b00d2c0035874befafe5200
Author: Niels de Vos <ndevos>
Date:   Fri Jan 13 14:02:45 2017 +0100

    nfs/nlm: ignore notify when there is no matching rpc request
    
    In certain (unclear) occasions it seems to happen that there are
    notifications sent to the Gluster/NFS NLM service, but no call-state can
    be found. Instead of segfaulting, log an error but keep on running.
    
    Change-Id: I0f186e56e46a86ca40314d230c1cc7719c61f0b5
    BUG: 1381970
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: https://review.gluster.org/17185
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: soumya k <skoduri>
    Reviewed-by: jiffin tony Thottan <jthottan>
    Reviewed-by: Jeff Darcy <jeff.us>

Comment 27 Worker Ant 2017-05-10 12:28:26 UTC
COMMIT: https://review.gluster.org/17186 committed in master by Jeff Darcy (jeff.us) 
------
commit 9bfb74a39954a7e63bfd762c816efc7e64b9df65
Author: Niels de Vos <ndevos>
Date:   Fri Jan 13 14:46:17 2017 +0100

    nfs/nlm: log the caller_name if nlm_client_t can be found
    
    In order to help tracking possible misbehaving clients down, log the
    'caller_name' (hostname of the NFS client) that does not have a matching
    nlm_client_t structure.
    
    Change-Id: Ib514a78d1809719a3d0274acc31ee632727d746d
    BUG: 1381970
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: https://review.gluster.org/17186
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: soumya k <skoduri>
    Reviewed-by: Jeff Darcy <jeff.us>

Comment 28 Worker Ant 2017-05-11 10:38:22 UTC
REVIEW: https://review.gluster.org/17188 (nfs/nlm: remove lock request from the list after cancel) posted (#2) for review on master by Niels de Vos (ndevos)

Comment 29 Worker Ant 2017-05-11 10:38:29 UTC
REVIEW: https://review.gluster.org/17189 (nfs/nlm: free the nlm_client upon RPC_DISCONNECT) posted (#4) for review on master by Niels de Vos (ndevos)

Comment 30 Worker Ant 2017-05-12 11:24:57 UTC
COMMIT: https://review.gluster.org/17189 committed in master by Niels de Vos (ndevos) 
------
commit 6897ba5c51b29c05b270c447adb1a34cb8e61911
Author: Niels de Vos <ndevos>
Date:   Fri Jan 20 14:15:31 2017 +0100

    nfs/nlm: free the nlm_client upon RPC_DISCONNECT
    
    When an NLM client disconnects, it should be removed from the list and
    free'd.
    
    Change-Id: Ib427c896bfcdc547a3aee42a652578ffd076e2ad
    BUG: 1381970
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: https://review.gluster.org/17189
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: jiffin tony Thottan <jthottan>

Comment 31 Worker Ant 2017-05-12 11:25:12 UTC
COMMIT: https://review.gluster.org/17188 committed in master by Niels de Vos (ndevos) 
------
commit 71cb7f3eb4fb706aab7f83906592942a2ff2e924
Author: Niels de Vos <ndevos>
Date:   Fri Jan 13 13:02:23 2017 +0100

    nfs/nlm: remove lock request from the list after cancel
    
    Once an NLM client cancels a lock request, it should be removed from the
    list. The list can also be cleaned of unneeded entries once the client
    does not have any outstanding lock/share requests/granted.
    
    Change-Id: I2f2b666b627dcb52cddc6d5b95856e420b2b2e26
    BUG: 1381970
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: https://review.gluster.org/17188
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: jiffin tony Thottan <jthottan>

Comment 32 Shyamsundar 2017-09-05 17:24:45 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/

