Bug 1346549

Summary: (Quota on) When glusterfsd init fails and exits, it sometimes leads to a crash
Product: [Community] GlusterFS
Reporter: jiademing.dd <iesool>
Component: quota
Assignee: Raghavendra G <rgowdapp>
Status: CLOSED EOL
QA Contact:
Severity: low
Docs Contact:
Priority: medium
Version: 3.7.12
CC: amukherj, bugs, iesool, smohan
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1354262 (view as bug list)
Environment:
Last Closed: 2017-03-08 10:58:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1354262

Description jiademing.dd 2016-06-15 01:53:51 UTC
Description of problem:
Create a volume like this:
Volume Name: test
Type: Distributed-Disperse
Volume ID: 78bd1b85-cfe9-401e-ac1e-dc9e072ed4db
Status: Started
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: node-1:/disk1
Brick2: node-2:/disk1
Brick3: node-3:/disk1
Brick4: node-1:/disk2
Brick5: node-2:/disk2
Brick6: node-3:/disk2
Options Reconfigured:
performance.readdir-ahead: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on

Then I unmounted /disk{1..3} and set /disk{1..3} read-only.

I made several attempts to run 'gluster vol start test force', and sometimes glusterfsd crashed.

glusterfsd's log:
[2016-06-15 09:44:20.567687] I [MSGID: 100030] [glusterfsd.c:2338:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.7.12 (args: /usr/sbin/glusterfsd -s node-1 --volfile-id test.node-1.disk2 -p /var/lib/glusterd/vols/test/run/node-1-disk2.pid -S /var/run/gluster/bddd1d1330cb529b05a3a9266879baee.socket --brick-name /disk2 -l /var/log/glusterfs/bricks/disk2.log --xlator-option *-posix.glusterd-uuid=dee1dcb8-280b-4b4c-b5a6-6ad7dbd0360a --brick-port 49153 --xlator-option test-server.listen-port=49153)
[2016-06-15 09:44:20.575048] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-06-15 09:44:20.580116] I [graph.c:269:gf_add_cmdline_options] 0-test-server: adding option 'listen-port' for volume 'test-server' with value '49153'
[2016-06-15 09:44:20.580187] I [graph.c:269:gf_add_cmdline_options] 0-test-posix: adding option 'glusterd-uuid' for volume 'test-posix' with value 'dee1dcb8-280b-4b4c-b5a6-6ad7dbd0360a'
[2016-06-15 09:44:20.580607] I [MSGID: 115034] [server.c:403:_check_for_auth_option] 0-/disk2: skip format check for non-addr auth option auth.login./disk2.allow
[2016-06-15 09:44:20.580765] I [MSGID: 115034] [server.c:403:_check_for_auth_option] 0-/disk2: skip format check for non-addr auth option auth.login.8306814a-3bf6-49b0-b75a-95665c2ba483.password
[2016-06-15 09:44:20.582297] I [rpcsvc.c:2196:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2016-06-15 09:44:20.582499] W [MSGID: 101002] [options.c:957:xl_opt_validate] 0-test-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2016-06-15 09:44:20.583012] W [socket.c:3759:reconfigure] 0-test-quota: NBIO on -1 failed (Bad file descriptor)
[2016-06-15 09:44:20.583141] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2016-06-15 09:44:20.588207] E [index.c:188:index_dir_create] 0-test-index: /disk2/.glusterfs/indices/xattrop: Failed to create (Permission denied)
[2016-06-15 09:44:20.588401] E [MSGID: 101019] [xlator.c:435:xlator_init] 0-test-index: Initialization of volume 'test-index' failed, review your volfile again
[2016-06-15 09:44:20.588512] E [graph.c:322:glusterfs_graph_init] 0-test-index: initializing translator failed
[2016-06-15 09:44:20.588613] E [graph.c:662:glusterfs_graph_activate] 0-graph: init failed
[2016-06-15 09:44:20.590554] W [glusterfsd.c:1251:cleanup_and_exit] (-->/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x307) [0x40dbe7] -->/usr/sbin/glusterfsd(glusterfs_process_volfp+0x13a) [0x408c7a] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x5f) [0x40831f] ) 0-: received signum (1), shutting down
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2016-06-15 09:44:20
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1


gdb bt:
Core was generated by `/usr/sbin/glusterfsd -s node-1 --volfile-id test.node-1.disk2 -p /var/lib/glust'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fd73c6ff688 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
(gdb) bt
#0  0x00007fd73c6ff688 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#1  0x00007fd73c7006f8 in _Unwind_Backtrace () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#2  0x00007fd7418dae26 in __GI___backtrace (array=array@entry=0x7fd735bbfb80, size=size@entry=200) at ../sysdeps/x86_64/backtrace.c:109
#3  0x00007fd742411ea2 in _gf_msg_backtrace_nomem (level=level@entry=GF_LOG_ALERT, stacksize=stacksize@entry=200) at logging.c:1095
#4  0x00007fd74243713d in gf_print_trace (signum=11, ctx=0x2049010) at common-utils.c:615
#5  <signal handler called>
#6  0x00007fd73644adb0 in ?? ()
#7  0x00007fd7421df8e4 in rpc_clnt_notify (trans=<optimized out>, mydata=0x7fd73803ef80, event=<optimized out>, data=0x7fd7380420f0) at rpc-clnt.c:957
#8  0x00007fd7421db593 in rpc_transport_notify (this=this@entry=0x7fd7380420f0, event=event@entry=RPC_TRANSPORT_CONNECT, data=data@entry=0x7fd7380420f0) at rpc-transport.c:546
#9  0x00007fd73d579f8f in socket_connect_finish (this=this@entry=0x7fd7380420f0) at socket.c:2429
#10 0x00007fd73d57a3af in socket_event_handler (fd=fd@entry=12, idx=idx@entry=3, data=0x7fd7380420f0, poll_in=0, poll_out=4, poll_err=0) at socket.c:2459
#11 0x00007fd74247f9fa in event_dispatch_epoll_handler (event=0x7fd735bc0e90, event_pool=0x2067da0) at event-epoll.c:575
#12 event_dispatch_epoll_worker (data=0x7fd73801e670) at event-epoll.c:678
#13 0x00007fd741b9d182 in start_thread (arg=0x7fd735bc1700) at pthread_create.c:312
#14 0x00007fd7418ca47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) 




Comment 1 jiademing.dd 2016-06-15 02:04:29 UTC
After analysis: rpc_clnt_notify() calls quota_enforcer_notify(), because quota registers that callback via rpc_clnt_register_notify(rpc, quota_enforcer_notify, this). When glusterfsd exits, it calls glusterfs_graph_destroy(), which calls dlclose(xl->dlhandle) on each translator.

So if dlclose(xl->dlhandle) runs before rpc_clnt_notify() fires, quota.so's quota_enforcer_notify() is no longer mapped into the process, and calling through the stale function pointer leads to the crash.

Comment 3 Kaushal 2017-03-08 10:58:00 UTC
This bug is being closed because GlusterFS-3.7 has reached its end of life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.