Bug 1221025 - Glusterd crashes after enabling quota limit on a distrep volume.
Summary: Glusterd crashes after enabling quota limit on a distrep volume.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: quota
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Satish Mohan
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1224152
 
Reported: 2015-05-13 07:26 UTC by Triveni Rao
Modified: 2016-06-16 13:00 UTC
CC: 6 users

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned As: 1224152
Environment:
Last Closed: 2016-06-16 13:00:40 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
core file attached (20.00 KB, application/x-tar)
2015-05-13 09:38 UTC, Triveni Rao

Description Triveni Rao 2015-05-13 07:26:01 UTC
Description of problem:

Glusterd crashes after enabling quota limit on a distrep volume.

Version-Release number of selected component (if applicable):

[root@rhsqa14-vm3 ~]# glusterfs --version
glusterfs 3.7.0beta2 built on May 11 2015 01:27:45
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
You have new mail in /var/spool/mail/root
[root@rhsqa14-vm3 ~]# 

[root@rhsqa14-vm3 ~]# rpm -qa | grep gluster
glusterfs-libs-3.7.0beta2-0.0.el6.x86_64
glusterfs-fuse-3.7.0beta2-0.0.el6.x86_64
glusterfs-rdma-3.7.0beta2-0.0.el6.x86_64
glusterfs-3.7.0beta2-0.0.el6.x86_64
glusterfs-api-3.7.0beta2-0.0.el6.x86_64
glusterfs-cli-3.7.0beta2-0.0.el6.x86_64
glusterfs-geo-replication-3.7.0beta2-0.0.el6.x86_64
glusterfs-extra-xlators-3.7.0beta2-0.0.el6.x86_64
glusterfs-client-xlators-3.7.0beta2-0.0.el6.x86_64
glusterfs-server-3.7.0beta2-0.0.el6.x86_64
[root@rhsqa14-vm3 ~]# 



How reproducible:
Easily.

Steps to Reproduce:
1. Create a standard 2x2 distributed-replicate volume.
2. Enable quota on the volume.
3. Set a quota usage limit on the volume.

Actual results:
glusterd crashed

Expected results:
Setting the quota limit should succeed.

Additional info:

[root@rhsqa14-vm4 ~]# gluster v create V1 replica 2  10.70.46.243:/rhs/brick1/t2 10.70.46.240:/rhs/brick1/t2 10.70.46.243:/rhs/brick2/t2 10.70.46.240:/rhs/brick2/t2 force
volume create: V1: success: please start the volume to access data
[root@rhsqa14-vm4 ~]# 
[root@rhsqa14-vm4 ~]# 
[root@rhsqa14-vm4 ~]# gluster  v start V1
volume start: V1: success
[root@rhsqa14-vm4 ~]# gluster v info V1
 
Volume Name: V1
Type: Distributed-Replicate
Volume ID: 99f99d6d-b24d-4cc8-96e0-25444dbf10fd
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.46.243:/rhs/brick1/t2
Brick2: 10.70.46.240:/rhs/brick1/t2
Brick3: 10.70.46.243:/rhs/brick2/t2
Brick4: 10.70.46.240:/rhs/brick2/t2
Options Reconfigured:
performance.readdir-ahead: on
[root@rhsqa14-vm4 ~]# 
[root@rhsqa14-vm4 ~]# 

[root@rhsqa14-vm1 ~]# cat options.sh 
gluster v set $1 cluster.min-free-disk 10    # keep at least 10% disk free per brick
gluster volume quota $1 enable               # turn on quota for the volume
gluster v set $1 quota-deem-statfs on        # make df on the mount reflect quota limits
gluster v quota $1 limit-usage / 20GB        # the step after which glusterd crashes
gluster v set $1 features.uss enable         # enable User Serviceable Snapshots

[root@rhsqa14-vm1 ~]# 

[root@rhsqa14-vm4 ~]# ./options.sh V1
volume set: success
volume quota : success
volume set: success
Connection failed. Please check if gluster daemon is operational.
You have new mail in /var/spool/mail/root
[root@rhsqa14-vm4 ~]# service glusterd status
glusterd dead but pid file exists
[root@rhsqa14-vm4 ~]#


Log messages:

[2015-05-13 07:14:24.088781] W [socket.c:3059:socket_connect] 0-snapd: Ignore failed connection attempt on /var/run/gluster/e97ed36149cb00fbc0a75840c9ad6cf6.socket, (No such file or directory)
[2015-05-13 07:14:24.090158] W [socket.c:642:__socket_rwv] 0-snapd: readv on /var/run/gluster/e97ed36149cb00fbc0a75840c9ad6cf6.socket failed (Invalid argument)
[2015-05-13 07:14:26.029793] I [run.c:190:runner_log] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7f71b207cfb0] (--> /usr/lib64/libglusterfs.so.0(runner_log+0x105)[0x7f71b20cdd05] (--> /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x5a0)[0x7f71a7e03070] (--> /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(+0xd3302)[0x7f71a7e03302] (--> /lib64/libpthread.so.0[0x395a0079d1] ))))) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh --volname=V1 -o quota-deem-statfs=on --gd-workdir=/var/lib/glusterd
[2015-05-13 07:14:26.050454] I [run.c:190:runner_log] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7f71b207cfb0] (--> /usr/lib64/libglusterfs.so.0(runner_log+0x105)[0x7f71b20cdd05] (--> /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x5a0)[0x7f71a7e03070] (--> /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(+0xd3302)[0x7f71a7e03302] (--> /lib64/libpthread.so.0[0x395a0079d1] ))))) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S31ganesha-set.sh --volname=V1 -o quota-deem-statfs=on --gd-workdir=/var/lib/glusterd
[2015-05-13 07:14:27.096916] W [socket.c:3059:socket_connect] 0-snapd: Ignore failed connection attempt on /var/run/gluster/e97ed36149cb00fbc0a75840c9ad6cf6.socket, (No such file or directory)
The message "I [MSGID: 106006] [glusterd-snapd-svc.c:368:glusterd_snapdsvc_rpc_notify] 0-management: snapd has disconnected from glusterd." repeated 21 times between [2015-05-13 07:13:19.275136] and [2015-05-13 07:14:24.090209]
The message "I [MSGID: 106004] [glusterd-handler.c:4809:__glusterd_peer_rpc_notify] 0-management: Peer <10.70.46.233> (<eb626ff0-c985-45b7-b088-76f7230dcfc7>), in state <Sent and Received peer request>, has disconnected from glusterd." repeated 21 times between [2015-05-13 07:13:19.279502] and [2015-05-13 07:14:24.096822]
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash:
2015-05-13 07:14:27
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.0beta2
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f71b207cb96]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7f71b209b5af]
/lib64/libc.so.6[0x3959c326a0]
/lib64/libc.so.6(gsignal+0x35)[0x3959c32625]
/lib64/libc.so.6(abort+0x175)[0x3959c33e05]
/lib64/libc.so.6[0x3959c70537]
/lib64/libc.so.6(__fortify_fail+0x37)[0x3959d02697]
/lib64/libc.so.6[0x3959d00580]
/lib64/libc.so.6(__read_chk+0x22)[0x3959d00a52]
/usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(glusterd_store_quota_config+0x23e)[0x7f71a7dd79de]
/usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(glusterd_quota_limit_usage+0x338)[0x7f71a7dd89f8]
/usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(glusterd_op_quota+0x42f)[0x7f71a7dd9bdf]
/usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(glusterd_op_commit_perform+0x233)[0x7f71a7d8ec83]
/usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(gd_commit_op_phase+0xd8)[0x7f71a7dfeaa8]
/usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(gd_sync_task_begin+0x61d)[0x7f71a7e01f2d]
/usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0x3b)[0x7f71a7e0211b]
/usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(__glusterd_handle_quota+0x302)[0x7f71a7dd7682]
/usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7f71a7d66c7f]
/usr/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7f71b20bd5c2]
/lib64/libc.so.6[0x3959c438f0]
---------

Comment 1 Kaushal 2015-05-13 09:10:39 UTC
Triveni, could you provide the core files for this?

Comment 2 Triveni Rao 2015-05-13 09:38:59 UTC
Created attachment 1024977 [details]
core file attached

Comment 3 Anand Nekkunti 2015-05-13 09:49:12 UTC
Backtrace (bt) for this bug:

(gdb) bt
#0  0x00007f56a2177625 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f56a2178e05 in abort () at abort.c:92
#2  0x00007f56a21b5537 in __libc_message (do_abort=2, fmt=0x7f56a229c5ef "*** %s ***: %s terminated\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x00007f56a2247527 in __fortify_fail (msg=0x7f56a229c595 "buffer overflow detected") at fortify_fail.c:32
#4  0x00007f56a2245410 in __chk_fail () at chk_fail.c:29
#5  0x00007f56a22458e2 in __read_chk (fd=<value optimized out>, buf=<value optimized out>, nbytes=<value optimized out>, buflen=<value optimized out>) at read_chk.c:31
#6  0x00007f56982599de in read (volinfo=0x7f56900077b0, path=0x7f5680001720 "/", gfid_str=0x7f5680001440 "00000000-0000-0000-0000-", '0' <repeats 11 times>, "1", opcode=3, op_errstr=0x7f568c404ab0) at /usr/include/bits/unistd.h:43
#7  glusterd_store_quota_config (volinfo=0x7f56900077b0, path=0x7f5680001720 "/", gfid_str=0x7f5680001440 "00000000-0000-0000-0000-", '0' <repeats 11 times>, "1", opcode=3, op_errstr=0x7f568c404ab0) at glusterd-quota.c:872
#8  0x00007f569825a9f8 in glusterd_quota_limit_usage (volinfo=0x7f56900077b0, dict=0x7f56a0d8fb14, opcode=3, op_errstr=0x7f568c404ab0) at glusterd-quota.c:1077
#9  0x00007f569825bbdf in glusterd_op_quota (dict=0x7f56a0d8fb14, op_errstr=0x7f568c404ab0, rsp_dict=0x7f56a0d8fc2c) at glusterd-quota.c:1329
#10 0x00007f5698210c83 in glusterd_op_commit_perform (op=GD_OP_QUOTA, dict=0x7f56a0d8fb14, op_errstr=0x7f568c404ab0, rsp_dict=0x7f56a0d8fc2c) at glusterd-op-sm.c:5116
#11 0x00007f5698280aa8 in gd_commit_op_phase (op=GD_OP_QUOTA, op_ctx=0x7f56a0d8d6fc, req_dict=0x7f56a0d8fb14, op_errstr=0x7f568c404ab0, txn_opinfo=0x7f568c404a20) at glusterd-syncop.c:1337
#12 0x00007f5698283f2d in gd_sync_task_begin (op_ctx=0x7f56a0d8d6fc, req=0x1f978cc) at glusterd-syncop.c:1826
#13 0x00007f569828411b in glusterd_op_begin_synctask (req=0x1f978cc, op=<value optimized out>, dict=0x7f56a0d8d6fc) at glusterd-syncop.c:1883
#14 0x00007f5698259682 in __glusterd_handle_quota (req=0x1f978cc) at glusterd-quota.c:171
#15 0x00007f56981e8c7f in glusterd_big_locked_handler (req=0x1f978cc, actor_fn=0x7f5698259380 <__glusterd_handle_quota>) at glusterd-handler.c:83
#16 0x00007f56a35925c2 in synctask_wrap (old_task=<value optimized out>) at syncop.c:375
#17 0x00007f56a21888f0 in ?? () from /lib64/libc-2.12.so
#18 0x0000000000000000 in ?? ()
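
Frames #3-#6 show the abort coming from glibc's _FORTIFY_SOURCE machinery: the read() call was compiled to __read_chk(), which aborts whenever the requested byte count exceeds the compile-time-known size of the destination buffer, before any data is copied. A minimal standalone illustration of the same abort pattern (not from glusterd; the file name and variable names are placeholders chosen only to mirror the sizes in this bug):

/* demo.c - minimal illustration of how a _FORTIFY_SOURCE build turns an
 * oversized read() into the abort seen in frames #3-#5 above.
 * Build: gcc -O2 -D_FORTIFY_SOURCE=2 -o demo demo.c && ./demo
 * (on distributions that fortify by default, plain gcc -O2 may suffice) */
#include <fcntl.h>
#include <unistd.h>

int main (void)
{
        unsigned char buf[131072] = {0};     /* same size as glusterd's buffer */
        volatile size_t entry_sz  = 139264;  /* same oversized request; volatile
                                                keeps the check at run time */
        int fd = open ("/dev/null", O_RDONLY);

        /* glibc routes this through __read_chk (fd, buf, entry_sz,
         * sizeof (buf)); since entry_sz > sizeof (buf) it calls
         * __chk_fail () and the process dies with
         * "buffer overflow detected" / SIGABRT. */
        return (int) read (fd, buf, entry_sz);
}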

Comment 4 Anand Nekkunti 2015-05-13 09:54:27 UTC
int
glusterd_store_quota_config (glusterd_volinfo_t *volinfo, char *path,
                             char *gfid_str, int opcode, char **op_errstr)
{
        int                ret                   = -1;
        int                fd                    = -1;
        int                conf_fd               = -1;
        size_t             entry_sz              = 139264;  /* bytes requested per read */
        ssize_t            bytes_read            = 0;
        size_t             bytes_to_write        = 0;
        unsigned char      buf[131072]           = {0,};    /* destination is 8 KB smaller */
        uuid_t             gfid                  = {0,};
        xlator_t          *this                  = NULL;
        gf_boolean_t       found                 = _gf_false;
        gf_boolean_t       modified              = _gf_false;
        gf_boolean_t       is_file_empty         = _gf_false;
        gf_boolean_t       is_first_read         = _gf_true;
        glusterd_conf_t   *conf                  = NULL;
        float              version               = 0.0f;
        char               type                  = 0;

        ...

        /* entry_sz (139264) > sizeof (buf) (131072): the fortified
         * read() sees the oversized request and aborts. */
        bytes_read = read (conf_fd, (void*)&buf, entry_sz);

The crash happens in this read() call: because sizeof (buf) < entry_sz, glibc's _FORTIFY_SOURCE wrapper (__read_chk, frames #3-#6 above) detects that the requested byte count exceeds the destination buffer and aborts with "buffer overflow detected".
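
The patch subject in the next comment ("number of byte read should equal to buffer length") indicates the fix direction. Below is a minimal sketch of that idea, assuming the intent is simply to never request more than sizeof (buf) bytes per read; the helper name is hypothetical, and the authoritative change is the review linked in comment 5:

#include <unistd.h>

/* Hypothetical helper sketching the fix direction only; the actual change
 * is at http://review.gluster.org/10766. Clamping the request to the
 * destination size means __read_chk can never see nbytes > buflen.
 * Call as: read_quota_entry (conf_fd, buf, sizeof (buf)); */
static ssize_t
read_quota_entry (int conf_fd, unsigned char *buf, size_t buf_len)
{
        return read (conf_fd, buf, buf_len);
}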

Comment 5 Anand Avati 2015-05-13 09:59:06 UTC
REVIEW: http://review.gluster.org/10766 (quota/glusterd: on read call number of byte read should equal to buffer length) posted (#1) for review on master by Gaurav Kumar Garg (ggarg)

Comment 6 Anand Avati 2015-05-13 10:47:01 UTC
REVIEW: http://review.gluster.org/10767 (quota/glusterd: on read call number of byte read should equal to buffer length) posted (#1) for review on release-3.7 by Gaurav Kumar Garg (ggarg)

Comment 7 Apeksha 2015-05-14 06:22:30 UTC
Saw a similar crash while running BVT on the 3.7.0beta2 build.

Backtrace of the core:
(gdb) bt
#0  0x0000003d4de32625 in raise () from /lib64/libc.so.6
#1  0x0000003d4de33e05 in abort () from /lib64/libc.so.6
#2  0x0000003d4de70537 in __libc_message () from /lib64/libc.so.6
#3  0x0000003d4df02697 in __fortify_fail () from /lib64/libc.so.6
#4  0x0000003d4df00580 in __chk_fail () from /lib64/libc.so.6
#5  0x0000003d4df00a52 in __read_chk () from /lib64/libc.so.6
#6  0x00007f51b02e09de in glusterd_store_quota_config () from /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so
#7  0x00007f51b02e19f8 in glusterd_quota_limit_usage () from /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so
#8  0x00007f51b02e2bdf in glusterd_op_quota () from /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so
#9  0x00007f51b0297c83 in glusterd_op_commit_perform () from /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so
#10 0x00007f51b0307aa8 in gd_commit_op_phase () from /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so
#11 0x00007f51b030af2d in gd_sync_task_begin () from /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so
#12 0x00007f51b030b11b in glusterd_op_begin_synctask () from /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so
#13 0x00007f51b02e0682 in __glusterd_handle_quota () from /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so
#14 0x00007f51b026fc7f in glusterd_big_locked_handler () from /usr/lib64/glusterfs/3.7.0beta2/xlator/mgmt/glusterd.so
#15 0x0000003d4fa655c2 in synctask_wrap () from /usr/lib64/libglusterfs.so.0
#16 0x0000003d4de438f0 in ?? () from /lib64/libc.so.6
#17 0x0000000000000000 in ?? ()

Comment 8 Apeksha 2015-05-14 06:30:43 UTC
sosreport and core is available at : http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1221025/

Comment 11 Niels de Vos 2016-06-16 13:00:40 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

