Bug 1221577 - glusterfsd crashed on a quota enabled volume where snapshots were scheduled
Summary: glusterfsd crashed on a quota enabled volume where snapshots were scheduled
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: quota
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Anuradha
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1227235
Reported: 2015-05-14 11:49 UTC by senaik
Modified: 2016-09-20 02:01 UTC

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1224225 1227235
Environment:
Last Closed: 2016-06-16 13:01:11 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description senaik 2015-05-14 11:49:41 UTC
Description of problem:
=======================
glusterfsd crashed on a volume on which quota, USS, and bitrot were enabled.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.7.0beta2 built on May 11 2015

How reproducible:
================
1/1


Steps to Reproduce:
===================
1. Create a 6x2 distributed-replicate volume (vol0) and an EC volume (vol1).
Fuse and NFS mount the volume.

2. Schedule some snapshots on both volumes.

3. Run several attach-tier and detach-tier operations on vol0.

4. After a while, accessing files from the mount point failed with a "Transport endpoint not connected" error. Checked gluster v status on vol1:

[root@rhs-arch-srv2 ~]# gluster v status vol1
Status of volume: vol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhs-arch-srv2.lab.eng.blr.redhat.com:
/rhs/brick5/b5                              N/A       N/A        N       26340
Brick rhs-arch-srv3.lab.eng.blr.redhat.com:
/rhs/brick5/b5                              49158     0          Y       27843
Brick rhs-arch-srv4.lab.eng.blr.redhat.com:
/rhs/brick5/b5                              N/A       N/A        N       9492 
Brick rhs-arch-srv2.lab.eng.blr.redhat.com:
/rhs/brick6/b6                              N/A       N/A        N       26357
Brick rhs-arch-srv3.lab.eng.blr.redhat.com:
/rhs/brick6/b6                              N/A       N/A        N       27860
Brick rhs-arch-srv4.lab.eng.blr.redhat.com:
/rhs/brick6/b6                              49159     0          Y       9509 
Brick rhs-arch-srv2.lab.eng.blr.redhat.com:
/rhs/brick7/b7                              N/A       N/A        N       26374
Brick rhs-arch-srv3.lab.eng.blr.redhat.com:
/rhs/brick7/b7                              49160     0          Y       27877
Brick rhs-arch-srv4.lab.eng.blr.redhat.com:
/rhs/brick7/b7                              49160     0          Y       9526 
Snapshot Daemon on localhost                49164     0          Y       26480
NFS Server on localhost                     2049      0          Y       26990
Quota Daemon on localhost                   N/A       N/A        Y       27000
Bitrot Daemon on localhost                  N/A       N/A        Y       27007
Scrubber Daemon on localhost                N/A       N/A        Y       27013
Snapshot Daemon on rhs-arch-srv3.lab.eng.bl
r.redhat.com                                49163     0          Y       28000
NFS Server on rhs-arch-srv3.lab.eng.blr.red
hat.com                                     2049      0          Y       28540
Quota Daemon on rhs-arch-srv3.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       28550
Bitrot Daemon on rhs-arch-srv3.lab.eng.blr.
redhat.com                                  N/A       N/A        N       N/A  
Scrubber Daemon on rhs-arch-srv3.lab.eng.bl
r.redhat.com                                N/A       N/A        Y       28563
Snapshot Daemon on rhs-arch-srv4.lab.eng.bl
r.redhat.com                                49162     0          Y       9632 
NFS Server on rhs-arch-srv4.lab.eng.blr.red
hat.com                                     2049      0          Y       10486
Quota Daemon on rhs-arch-srv4.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       10496
Bitrot Daemon on rhs-arch-srv4.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       10503
Scrubber Daemon on rhs-arch-srv4.lab.eng.bl
r.redhat.com                                N/A       N/A        Y       10509
 
Task Status of Volume vol1
------------------------------------------------------------------------------
There are no active volume tasks

Part of the brick log:
=====================

[2015-05-14 08:17:53.820609] I [client_t.c:401:gf_client_unref] 0-vol1-server: Shutting down connection rhs-arch-srv2.lab.eng.blr.redhat.com-27006-2015/05/14-07:47:52:577963-vol1-client-1-0-0
[2015-05-14 08:18:03.847871] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 967f3173-1c1d-431a-aa99-f26579a86fc6
[2015-05-14 08:18:03.847905] I [server-handshake.c:585:server_setvolume] 0-vol1-server: accepted client from rhs-arch-srv2.lab.eng.blr.redhat.com-26999-2015/05/14-07:47:52:572659-vol1-client-1-0-0 (version: 3.7.0beta2)
pending frames:
frame : type(0) op(13)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2015-05-14 08:18:06
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.0beta2
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x38a8624b96]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x38a86435af]
/lib64/libc.so.6[0x3a964326a0]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/quota.so(quota_writev_cbk+0x17f)[0x7f4405d0889f]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/marker.so(marker_writev_cbk+0xe7)[0x7f4405f29ef7]
/usr/lib64/libglusterfs.so.0(default_writev_cbk+0xcc)[0x38a86356fc]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/upcall.so(up_writev_cbk+0xef)[0x7f440676d00f]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/locks.so(pl_writev_cbk+0xcc)[0x7f4406979fdc]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_writev_cbk+0x170)[0x7f4406da06e0]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_writev_resume+0x1f5)[0x7f4406da0975]
/usr/lib64/libglusterfs.so.0(call_resume_wind+0x38a)[0x38a8649c3a]
/usr/lib64/libglusterfs.so.0(call_resume+0x80)[0x38a864bb60]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_fd_incversioning_cbk+0x9a)[0x7f4406d9e6fa]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/changelog.so(changelog_fsetxattr_cbk+0xf5)[0x7f4406fafb85]
/usr/lib64/glusterfs/3.7.0beta2/xlator/storage/posix.so(posix_fsetxattr+0x185)[0x7f4407c19715]
/usr/lib64/libglusterfs.so.0(default_fsetxattr+0x83)[0x38a862e0e3]
/usr/lib64/libglusterfs.so.0(default_fsetxattr+0x83)[0x38a862e0e3]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/changelog.so(changelog_fsetxattr+0x18b)[0x7f4406fb160b]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_fd_versioning+0x1dd)[0x7f4406d9e57d]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(+0x70ab)[0x7f4406da10ab]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_writev+0x30c)[0x7f4406da1e6c]
/usr/lib64/libglusterfs.so.0(default_writev+0xa0)[0x38a862db90]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/locks.so(pl_writev+0x1f6)[0x7f440697c666]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/upcall.so(up_writev+0x1b0)[0x7f440676a080]


bt from rhs-arch-srv2.lab.eng.blr.redhat.com
============================================
core.26340 :

Core was generated by `/usr/sbin/glusterfsd -s rhs-arch-srv2.lab.eng.blr.redhat.com --volfile-id vol1.'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f4405d0889f in quota_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/quota.so
Missing separate debuginfos, use: debuginfo-install glusterfs-fuse-3.7.0beta2-0.0.el6.x86_64
(gdb) bt
#0  0x00007f4405d0889f in quota_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/quota.so
#1  0x00007f4405f29ef7 in marker_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/marker.so
#2  0x00000038a86356fc in default_writev_cbk () from /usr/lib64/libglusterfs.so.0
#3  0x00007f440676d00f in up_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/upcall.so
#4  0x00007f4406979fdc in pl_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/locks.so
#5  0x00007f4406da06e0 in br_stub_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#6  0x00007f4406da0975 in br_stub_writev_resume () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#7  0x00000038a8649c3a in call_resume_wind () from /usr/lib64/libglusterfs.so.0
#8  0x00000038a864bb60 in call_resume () from /usr/lib64/libglusterfs.so.0
#9  0x00007f4406d9e6fa in br_stub_fd_incversioning_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#10 0x00007f4406fafb85 in changelog_fsetxattr_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/changelog.so
#11 0x00007f4407c19715 in posix_fsetxattr () from /usr/lib64/glusterfs/3.7.0beta2/xlator/storage/posix.so
#12 0x00000038a862e0e3 in default_fsetxattr () from /usr/lib64/libglusterfs.so.0
#13 0x00000038a862e0e3 in default_fsetxattr () from /usr/lib64/libglusterfs.so.0
#14 0x00007f4406fb160b in changelog_fsetxattr () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/changelog.so
#15 0x00007f4406d9e57d in br_stub_fd_versioning () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#16 0x00007f4406da10ab in ?? () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#17 0x00007f4406da1e6c in br_stub_writev () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#18 0x00000038a862db90 in default_writev () from /usr/lib64/libglusterfs.so.0
#19 0x00007f440697c666 in pl_writev () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/locks.so
#20 0x00007f440676a080 in up_writev () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/upcall.so
#21 0x00000038a8632262 in default_writev_resume () from /usr/lib64/libglusterfs.so.0
#22 0x00000038a8649c3a in call_resume_wind () from /usr/lib64/libglusterfs.so.0
#23 0x00000038a864bb60 in call_resume () from /usr/lib64/libglusterfs.so.0
#24 0x00007f440655e398 in iot_worker () from /usr/lib64/glusterfs/3.7.0beta2/xlator/performance/io-threads.so
#25 0x0000003a968079d1 in start_thread () from /lib64/libpthread.so.0
#26 0x0000003a964e89dd in clone () from /lib64/libc.so.6


Actual results:



Expected results:


Additional info:

Comment 1 Anand Avati 2015-05-25 05:45:57 UTC
REVIEW: http://review.gluster.org/10898 (features/quota : Do unwind if postbuf is NULL) posted (#1) for review on master by Anuradha Talur (atalur@redhat.com)

Comment 2 Anand Avati 2015-05-29 06:49:53 UTC
REVIEW: http://review.gluster.org/10898 (features/quota : Do unwind if postbuf is NULL) posted (#2) for review on master by Raghavendra G (rgowdapp@redhat.com)

Comment 3 Anand Avati 2015-06-01 07:03:06 UTC
REVIEW: http://review.gluster.org/10898 (features/quota : Do unwind if postbuf is NULL) posted (#3) for review on master by Anuradha Talur (atalur@redhat.com)

Comment 4 Anand Avati 2015-06-02 05:25:43 UTC
COMMIT: http://review.gluster.org/10898 committed in master by Raghavendra G (rgowdapp@redhat.com) 
------
commit f3a340694fcb195aa8b546578c348b41fb2208d1
Author: Anuradha <atalur@redhat.com>
Date:   Mon May 25 11:07:27 2015 +0530

    features/quota : Do unwind if postbuf is NULL
    
    If postbuf in quota_writev_cbk is NULL directly
    an unwind should be done. Trying to dereference
    it will lead to a crash.
    
    Change-Id: Idba6ce3cd1bbf37ede96c7f17d01007d6c07057a
    BUG: 1221577
    Signed-off-by: Anuradha <atalur@redhat.com>
    Reviewed-on: http://review.gluster.org/10898
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    Tested-by: Raghavendra G <rgowdapp@redhat.com>
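
The guard described in the commit message can be illustrated with a small self-contained model. This is not the committed patch: struct post_stat, struct quota_ctx, struct unwind_result, unwind_to_caller() and sketch_writev_cbk() are hypothetical stand-ins for the real translator types and the unwind step, and the accounting shown (caching a block count) is only an assumed placeholder for whatever the callback does with postbuf. The point is the ordering: test postbuf for NULL first, and unwind directly instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the post-operation stat buffer. */
struct post_stat {
    uint64_t blocks;            /* blocks allocated after the write */
};

/* Hypothetical per-inode quota context. */
struct quota_ctx {
    uint64_t cached_blocks;     /* last block count seen */
    int      updates;           /* number of times the cache was refreshed */
};

/* Hypothetical record of what was handed back to the caller. */
struct unwind_result {
    int op_ret;
    int op_errno;
    int unwound;                /* number of unwinds performed */
};

/* Models the "unwind" step: pass the result back up the stack. */
static void unwind_to_caller(struct unwind_result *res, int op_ret, int op_errno)
{
    res->op_ret = op_ret;
    res->op_errno = op_errno;
    res->unwound++;
}

/* Sketch of a write callback that tolerates a NULL postbuf. */
void sketch_writev_cbk(struct quota_ctx *ctx, struct unwind_result *res,
                       int op_ret, int op_errno,
                       const struct post_stat *postbuf)
{
    assert(res != NULL);

    /* Failed write or no postbuf: nothing to account for, so skip
     * straight to the unwind rather than touching postbuf. */
    if (op_ret < 0 || postbuf == NULL)
        goto out;

    /* Only reached when postbuf is known to be valid. */
    if (ctx != NULL) {
        ctx->cached_blocks = postbuf->blocks;
        ctx->updates++;
    }

out:
    unwind_to_caller(res, op_ret, op_errno);
}
```

In this model a call with a NULL postbuf still unwinds exactly once and leaves the cached context untouched, which is the behaviour the commit message asks for.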

Comment 5 Niels de Vos 2015-06-02 08:20:18 UTC
The required changes to fix this bug have not made it into glusterfs-3.7.1. This bug is now getting tracked for glusterfs-3.7.2.

Comment 6 Niels de Vos 2015-06-20 10:08:14 UTC
Unfortunately glusterfs-3.7.2 did not contain a code change that was associated with this bug report. This bug is now proposed to be a blocker for glusterfs-3.7.3.

Comment 7 Vijaikumar Mallikarjuna 2015-06-22 06:55:38 UTC
Fixed in
Upstream patch: http://review.gluster.org/#/c/10898/
Release-3.7 patch: http://review.gluster.org/#/c/11040/

Comment 8 Niels de Vos 2016-06-16 13:01:11 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

