| Field | Value |
|---|---|
| Summary | Quota [glusterfs-3.2.1qa3]: enable/disable crashes the glusterd on other node |
| Product | [Community] GlusterFS |
| Component | quota |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | high |
| Version | mainline |
| Hardware | x86_64 |
| OS | Linux |
| Reporter | Saurabh <saurabh> |
| Assignee | Pranith Kumar K <pkarampu> |
| CC | gluster-bugs, pkarampu |
| Doc Type | Bug Fix |
Just now I tried to use quota. Enabling it works, but setting a space limit kills glusterd on the other node:
gluster> volume quota dist1 enable
quota translator is enabled
gluster> volume quotaa dist1 limit-usage /dist1 2GB
unrecognized word: quotaa (position 1)
gluster> volume quota dist1 limit-usage /dist1 2GB
Quota command failed
gluster> peer status
Number of Peers: 1
Hostname: 10.1.12.135
Uuid: eb79e865-3435-4fe0-8389-66f819026df0
State: Peer in Cluster (Disconnected)
#######################from other node#############################
root 8245 1 0 16:22 ? 00:00:01 /opt/glusterfs/3.2.1/inst//sbin/glusterfsd --xlator-option dist1-server.listen-port=24017 -s localhost --volfile-id dist1.10.1.12.135.mnt-dist1 -p /etc/glusterd/vols/dist1/run/10.1.12.135-mnt-dist1.pid -S /tmp/73c5bcf416ce90433cab2ded1614ede3.socket --brick-name /mnt/dist1 --brick-port 24017 -l /opt/glusterfs/3.2.1/inst//var/log/glusterfs/bricks/mnt-dist1.log
root 8285 1 0 16:33 ? 00:00:01 /opt/glusterfs/3.2.1/inst//sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /opt/glusterfs/3.2.1/inst//var/log/glusterfs/nfs.log
root 8373 8143 0 16:59 pts/0 00:00:00 grep glu
##################bt of the core########################
Core was generated by `/opt/glusterfs/3.2.1/inst//sbin/glusterd'.
Program terminated with signal 11, Segmentation fault.
#0 0x00002ad82e02c299 in _dict_lookup (this=0x1efc7810, key=0x2aaaaab0b58d "errstr") at dict.c:220
220 for (pair = this->members[hashval]; pair != NULL; pair = pair->hash_next) {
(gdb) bt
#0 0x00002ad82e02c299 in _dict_lookup (this=0x1efc7810, key=0x2aaaaab0b58d "errstr") at dict.c:220
#1 0x00002ad82e02d4ad in _dict_set (this=0x1efc7810, key=<value optimized out>, value=0x1efc99f0) at dict.c:251
#2 dict_set (this=0x1efc7810, key=<value optimized out>, value=0x1efc99f0) at dict.c:315
#3 0x00002aaaaaae37eb in glusterd_op_quota (dict=0x1efbdcd0, op_errstr=0x7fff27bcfd20) at glusterd-op-sm.c:4645
#4 0x00002aaaaaaed9b3 in glusterd_op_stage_validate (op=<value optimized out>, dict=0x1efbdcd0, op_errstr=0x7fff27bcfd20, rsp_dict=0x2aaaaab0b591)
at glusterd-op-sm.c:6675
#5 0x00002aaaaaaeea0c in glusterd_op_ac_stage_op (event=<value optimized out>, ctx=0x1efc7ad0) at glusterd-op-sm.c:6504
#6 0x00002aaaaaadb39f in glusterd_op_sm () at glusterd-op-sm.c:7557
#7 0x00002aaaaaac73d7 in glusterd_handle_stage_op (req=<value optimized out>) at glusterd-handler.c:565
#8 0x00002ad82e28ba7e in rpcsvc_handle_rpc_call (svc=0x1efbe020, trans=<value optimized out>, msg=0x1efc7130) at rpcsvc.c:1003
#9 0x00002ad82e28bc7c in rpcsvc_notify (trans=0x1efc8730, mydata=0x2aaaaab0b591, event=<value optimized out>, data=0x1efc7130) at rpcsvc.c:1099
#10 0x00002ad82e28cb9c in rpc_transport_notify (this=0x2aaaaab0b591, event=RPC_TRANSPORT_DISCONNECT, data=0x2aaaaab0b591) at rpc-transport.c:1029
#11 0x00002aaaaadd743f in socket_event_poll_in (this=0x1efc8730) at socket.c:1639
#12 0x00002aaaaadd75c8 in socket_event_handler (fd=<value optimized out>, idx=3, data=0x1efc8730, poll_in=1, poll_out=0, poll_err=0) at socket.c:1753
#13 0x00002ad82e053717 in event_dispatch_epoll_handler (event_pool=0x1efbc360) at event.c:812
#14 event_dispatch_epoll (event_pool=0x1efbc360) at event.c:876
#15 0x0000000000404fdb in main (argc=1, argv=0x7fff27bd0608) at glusterfsd.c:1458
PATCH: http://patches.gluster.com/patch/6588 in master (mgmt/glusterd: Fix import friend volumes)

Created attachment 466
Attached is the test script I used for unit-testing. You will have to start glusterd on both machines with valgrind for it to work. This script could be improved and added as a regression test.

Test done over a fuse mount for a dist-rep volume:

[root@centos-qa-client-3 gluster-test]# rm -rf *
[root@centos-qa-client-3 gluster-test]# dd if=/dev/zero of=f.1 bs=1KB count=512
512+0 records in
512+0 records out
512000 bytes (512 kB) copied, 0.210623 seconds, 2.4 MB/s
[root@centos-qa-client-3 gluster-test]# dd if=/dev/zero of=f.2 bs=1KB count=512
512+0 records in
512+0 records out
512000 bytes (512 kB) copied, 0.170953 seconds, 3.0 MB/s
[root@centos-qa-client-3 gluster-test]# dd if=/dev/zero of=f.3 bs=1KB count=200
dd: closing output file `f.3': Disk quota exceeded
[root@centos-qa-client-3 gluster-test]# dd if=/dev/zero of=f.4 bs=1KB count=1
1+0 records in
1+0 records out
1000 bytes (1.0 kB) copied, 0.000647 seconds, 1.5 MB/s
[root@centos-qa-client-3 gluster-test]# ls -l
total 1044
-rw-r--r-- 1 root root 512000 Apr 17 02:48 f.1
-rw-r--r-- 1 root root 512000 Apr 17 02:48 f.2
-rw-r--r-- 1 root root  14000 Apr 17 02:48 f.3
-rw-r--r-- 1 root root   1000 Apr 17 02:48 f.4

########################################################################

[root@centos-qa-client-2 sbin]# ./gluster volume quota dr2 list
        path              limit_set          size
----------------------------------------------------------------------------------
/                         1048576            0
[root@centos-qa-client-2 sbin]# ./gluster volume quota dr2 list
        path              limit_set          size
----------------------------------------------------------------------------------
/                         1048576            1024000
[root@centos-qa-client-2 sbin]# ./gluster volume quota dr2 list
        path              limit_set          size
----------------------------------------------------------------------------------
/                         1048576            1039000

Hence moving this bug to verified state.

Last time I wrongly updated this bug: the bug to be moved to verified was 2741; as tabs were open for both bugs, this one got moved to the wrong state.

Ran valgrind on the bricks and no leaks were found. Ran posix tests and an untar of the linux kernel tarball from the mount point.
I have two servers with a distribute volume. Enabling quota works, but disabling it kills the glusterd process on the other node and the quota disable fails. If I then manually kill all processes and start glusterd on both nodes again, disable works; but enabling it back kills glusterd on the other node again. I have tested in a similar fashion from node A and node B.

Here is the bt of one of the cores:

(gdb) bt
#0  0x00002ab67feba28c in _dict_lookup (this=0x1aed50b0, key=0x2aaaaab0b58d "errstr") at dict.c:220
#1  0x00002ab67febb4ad in _dict_set (this=0x1aed50b0, key=<value optimized out>, value=0x1aed3810) at dict.c:251
#2  dict_set (this=0x1aed50b0, key=<value optimized out>, value=0x1aed3810) at dict.c:315
#3  0x00002aaaaaae37eb in glusterd_op_quota (dict=0x1aed38c0, op_errstr=0x7fff9183f4f0) at glusterd-op-sm.c:4645
#4  0x00002aaaaaaed9b3 in glusterd_op_stage_validate (op=<value optimized out>, dict=0x1aed38c0, op_errstr=0x7fff9183f4f0, rsp_dict=0x2aaaaab0b591) at glusterd-op-sm.c:6675
#5  0x00002aaaaaaeea0c in glusterd_op_ac_stage_op (event=<value optimized out>, ctx=0x1aed51c0) at glusterd-op-sm.c:6504
#6  0x00002aaaaaadb39f in glusterd_op_sm () at glusterd-op-sm.c:7557
#7  0x00002aaaaaac73d7 in glusterd_handle_stage_op (req=<value optimized out>) at glusterd-handler.c:565
#8  0x00002ab680119a7e in rpcsvc_handle_rpc_call (svc=0x1aec7020, trans=<value optimized out>, msg=0x1aed7640) at rpcsvc.c:1003
#9  0x00002ab680119c7c in rpcsvc_notify (trans=0x1aed1520, mydata=0x2aaaaab0b591, event=<value optimized out>, data=0x1aed7640) at rpcsvc.c:1099
#10 0x00002ab68011ab9c in rpc_transport_notify (this=0x2aaaaab0b591, event=RPC_TRANSPORT_DISCONNECT, data=0x2aaaaab0b591) at rpc-transport.c:1029
#11 0x00002aaaaadd743f in socket_event_poll_in (this=0x1aed1520) at socket.c:1639
#12 0x00002aaaaadd75c8 in socket_event_handler (fd=<value optimized out>, idx=1, data=0x1aed1520, poll_in=1, poll_out=0, poll_err=0) at socket.c:1753
#13 0x00002ab67fee1717 in event_dispatch_epoll_handler (event_pool=0x1aec5360) at event.c:812
#14 event_dispatch_epoll (event_pool=0x1aec5360) at event.c:876
#15 0x0000000000404fdb in main (argc=1, argv=0x7fff9183fdd8) at glusterfsd.c:1458
(gdb) frame 3
#3  0x00002aaaaaae37eb in glusterd_op_quota (dict=0x1aed38c0, op_errstr=0x7fff9183f4f0) at glusterd-op-sm.c:4645
4645            ret = dict_set_str (ctx, "errstr", *op_errstr);
(gdb) info threads
  4 Thread 4770  0x0000003a8ae0e838 in do_sigwait () from /lib64/libpthread.so.0
  3 Thread 4818  0x0000003a8a69a1a1 in nanosleep () from /lib64/libc.so.6
  2 Thread 4862  0x0000003a8a699daf in waitpid () from /lib64/libc.so.6
* 1 Thread 4769  0x00002ab67feba28c in _dict_lookup (this=0x1aed50b0, key=0x2aaaaab0b58d "errstr") at dict.c:220
(gdb) p *this
No symbol "this" in current context.
(gdb) down
#2  dict_set (this=0x1aed50b0, key=<value optimized out>, value=0x1aed3810) at dict.c:315
315             ret = _dict_set (this, key, value);
(gdb) p *this
$1 = {is_static = 0 '\000', hash_size = 0, count = -2141910640, refcount = 10934, members = 0x3a8a41c4e8, members_list = 0x0, extra_free = 0x1b18fe80 "", extra_stdfree = 0x2ab680550990 "", lock = -1975401240}
(gdb) p *this->members
$2 = (data_pair_t *) 0x0

################# volume log file messages ####################

[2011-03-16 19:45:10.255600] I [glusterd-handler.c:488:glusterd_req_ctx_create] glusterd: Received op from uuid: eb79e865-3435-4fe0-8389-66f819026df0
pending frames:

patchset: v3.2.1qa3
signal received: 8
time of crash: 2011-03-16 19:45:18
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.1qa3
/lib64/libc.so.6[0x3a8a6302d0]
/opt/glusterfs/3.2.1/inst//lib/libglusterfs.so.0[0x2ab67feba28c]
/opt/glusterfs/3.2.1/inst//lib/libglusterfs.so.0(dict_set+0x8d)[0x2ab67febb4ad]
/opt/glusterfs/3.2.1/inst//lib/glusterfs/3.2.1qa3/xlator/mgmt/glusterd.so(glusterd_op_quota+0xeb)[0x2aaaaaae37eb]
/opt/glusterfs/3.2.1/inst//lib/glusterfs/3.2.1qa3/xlator/mgmt/glusterd.so(glusterd_op_stage_validate+0x723)[0x2aaaaaaed9b3]
/opt/glusterfs/3.2.1/inst//lib/glusterfs/3.2.1qa3/xlator/mgmt/glusterd.so[0x2aaaaaaeea0c]
/opt/glusterfs/3.2.1/inst//lib/glusterfs/3.2.1qa3/xlator/mgmt/glusterd.so(glusterd_op_sm+0x15f)[0x2aaaaaadb39f]
/opt/glusterfs/3.2.1/inst//lib/glusterfs/3.2.1qa3/xlator/mgmt/glusterd.so(glusterd_handle_stage_op+0xb7)[0x2aaaaaac73d7]
/opt/glusterfs/3.2.1/inst//lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x28e)[0x2ab680119a7e]
/opt/glusterfs/3.2.1/inst//lib/libgfrpc.so.0(rpcsvc_notify+0x16c)[0x2ab680119c7c]
/opt/glusterfs/3.2.1/inst//lib/libgfrpc.so.0(rpc_transport_notify+0x2c)[0x2ab68011ab9c]
/opt/glusterfs/3.2.1/inst//lib/glusterfs/3.2.1qa3/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x2aaaaadd743f]
/opt/glusterfs/3.2.1/inst//lib/glusterfs/3.2.1qa3/rpc-transport/socket.so(socket_event_handler+0x168)[0x2aaaaadd75c8]
/opt/glusterfs/3.2.1/inst//lib/libglusterfs.so.0[0x2ab67fee1717]
/opt/glusterfs/3.2.1/inst/sbin/glusterd(main+0x38b)[0x404fdb]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3a8a61d994]
/opt/glusterfs/3.2.1/inst/sbin/glusterd[0x403619]
---------