Description of problem:
=======================
glusterfsd crashed on a volume on which quota, USS, and bitrot were enabled.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.7.0beta2 built on May 11 2015

How reproducible:
================
1/1

Steps to Reproduce:
===================
1. Create a 6x2 dist-rep volume (vol0) and an EC volume (vol1). FUSE and NFS mount the volumes.
2. Schedule some snapshots on both volumes.
3. Tried some attach-tier and detach-tier operations on vol0.
4. After a while, accessed files from the mount point and got a "Transport endpoint not connected" error. Checked gluster v status on vol1:

[root@rhs-arch-srv2 ~]# gluster v status vol1
Status of volume: vol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhs-arch-srv2.lab.eng.blr.redhat.com:
/rhs/brick5/b5                              N/A       N/A        N       26340
Brick rhs-arch-srv3.lab.eng.blr.redhat.com:
/rhs/brick5/b5                              49158     0          Y       27843
Brick rhs-arch-srv4.lab.eng.blr.redhat.com:
/rhs/brick5/b5                              N/A       N/A        N       9492
Brick rhs-arch-srv2.lab.eng.blr.redhat.com:
/rhs/brick6/b6                              N/A       N/A        N       26357
Brick rhs-arch-srv3.lab.eng.blr.redhat.com:
/rhs/brick6/b6                              N/A       N/A        N       27860
Brick rhs-arch-srv4.lab.eng.blr.redhat.com:
/rhs/brick6/b6                              49159     0          Y       9509
Brick rhs-arch-srv2.lab.eng.blr.redhat.com:
/rhs/brick7/b7                              N/A       N/A        N       26374
Brick rhs-arch-srv3.lab.eng.blr.redhat.com:
/rhs/brick7/b7                              49160     0          Y       27877
Brick rhs-arch-srv4.lab.eng.blr.redhat.com:
/rhs/brick7/b7                              49160     0          Y       9526
Snapshot Daemon on localhost                49164     0          Y       26480
NFS Server on localhost                     2049      0          Y       26990
Quota Daemon on localhost                   N/A       N/A        Y       27000
Bitrot Daemon on localhost                  N/A       N/A        Y       27007
Scrubber Daemon on localhost                N/A       N/A        Y       27013
Snapshot Daemon on rhs-arch-srv3.lab.eng.bl
r.redhat.com                                49163     0          Y       28000
NFS Server on rhs-arch-srv3.lab.eng.blr.red
hat.com                                     2049      0          Y       28540
Quota Daemon on rhs-arch-srv3.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       28550
Bitrot Daemon on rhs-arch-srv3.lab.eng.blr.
redhat.com                                  N/A       N/A        N       N/A
Scrubber Daemon on rhs-arch-srv3.lab.eng.bl
r.redhat.com                                N/A       N/A        Y       28563
Snapshot Daemon on rhs-arch-srv4.lab.eng.bl
r.redhat.com                                49162     0          Y       9632
NFS Server on rhs-arch-srv4.lab.eng.blr.red
hat.com                                     2049      0          Y       10486
Quota Daemon on rhs-arch-srv4.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       10496
Bitrot Daemon on rhs-arch-srv4.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       10503
Scrubber Daemon on rhs-arch-srv4.lab.eng.bl
r.redhat.com                                N/A       N/A        Y       10509

Task Status of Volume vol1
------------------------------------------------------------------------------
There are no active volume tasks

Part of the brick log:
=====================
[2015-05-14 08:17:53.820609] I [client_t.c:401:gf_client_unref] 0-vol1-server: Shutting down connection rhs-arch-srv2.lab.eng.blr.redhat.com-27006-2015/05/14-07:47:52:577963-vol1-client-1-0-0
[2015-05-14 08:18:03.847871] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 967f3173-1c1d-431a-aa99-f26579a86fc6
[2015-05-14 08:18:03.847905] I [server-handshake.c:585:server_setvolume] 0-vol1-server: accepted client from rhs-arch-srv2.lab.eng.blr.redhat.com-26999-2015/05/14-07:47:52:572659-vol1-client-1-0-0 (version: 3.7.0beta2)
pending frames:
frame : type(0) op(13)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-05-14 08:18:06
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.0beta2
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x38a8624b96]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x38a86435af]
/lib64/libc.so.6[0x3a964326a0]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/quota.so(quota_writev_cbk+0x17f)[0x7f4405d0889f]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/marker.so(marker_writev_cbk+0xe7)[0x7f4405f29ef7]
/usr/lib64/libglusterfs.so.0(default_writev_cbk+0xcc)[0x38a86356fc]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/upcall.so(up_writev_cbk+0xef)[0x7f440676d00f]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/locks.so(pl_writev_cbk+0xcc)[0x7f4406979fdc]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_writev_cbk+0x170)[0x7f4406da06e0]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_writev_resume+0x1f5)[0x7f4406da0975]
/usr/lib64/libglusterfs.so.0(call_resume_wind+0x38a)[0x38a8649c3a]
/usr/lib64/libglusterfs.so.0(call_resume+0x80)[0x38a864bb60]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_fd_incversioning_cbk+0x9a)[0x7f4406d9e6fa]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/changelog.so(changelog_fsetxattr_cbk+0xf5)[0x7f4406fafb85]
/usr/lib64/glusterfs/3.7.0beta2/xlator/storage/posix.so(posix_fsetxattr+0x185)[0x7f4407c19715]
/usr/lib64/libglusterfs.so.0(default_fsetxattr+0x83)[0x38a862e0e3]
/usr/lib64/libglusterfs.so.0(default_fsetxattr+0x83)[0x38a862e0e3]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/changelog.so(changelog_fsetxattr+0x18b)[0x7f4406fb160b]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_fd_versioning+0x1dd)[0x7f4406d9e57d]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(+0x70ab)[0x7f4406da10ab]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so(br_stub_writev+0x30c)[0x7f4406da1e6c]
/usr/lib64/libglusterfs.so.0(default_writev+0xa0)[0x38a862db90]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/locks.so(pl_writev+0x1f6)[0x7f440697c666]
/usr/lib64/glusterfs/3.7.0beta2/xlator/features/upcall.so(up_writev+0x1b0)[0x7f440676a080]

bt from rhs-arch-srv2.lab.eng.blr.redhat.com
============================================
core.26340 :
Core was generated by `/usr/sbin/glusterfsd -s rhs-arch-srv2.lab.eng.blr.redhat.com --volfile-id vol1.'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f4405d0889f in quota_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/quota.so
Missing separate debuginfos, use: debuginfo-install glusterfs-fuse-3.7.0beta2-0.0.el6.x86_64
(gdb) bt
#0  0x00007f4405d0889f in quota_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/quota.so
#1  0x00007f4405f29ef7 in marker_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/marker.so
#2  0x00000038a86356fc in default_writev_cbk () from /usr/lib64/libglusterfs.so.0
#3  0x00007f440676d00f in up_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/upcall.so
#4  0x00007f4406979fdc in pl_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/locks.so
#5  0x00007f4406da06e0 in br_stub_writev_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#6  0x00007f4406da0975 in br_stub_writev_resume () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#7  0x00000038a8649c3a in call_resume_wind () from /usr/lib64/libglusterfs.so.0
#8  0x00000038a864bb60 in call_resume () from /usr/lib64/libglusterfs.so.0
#9  0x00007f4406d9e6fa in br_stub_fd_incversioning_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#10 0x00007f4406fafb85 in changelog_fsetxattr_cbk () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/changelog.so
#11 0x00007f4407c19715 in posix_fsetxattr () from /usr/lib64/glusterfs/3.7.0beta2/xlator/storage/posix.so
#12 0x00000038a862e0e3 in default_fsetxattr () from /usr/lib64/libglusterfs.so.0
#13 0x00000038a862e0e3 in default_fsetxattr () from /usr/lib64/libglusterfs.so.0
#14 0x00007f4406fb160b in changelog_fsetxattr () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/changelog.so
#15 0x00007f4406d9e57d in br_stub_fd_versioning () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#16 0x00007f4406da10ab in ?? () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#17 0x00007f4406da1e6c in br_stub_writev () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/bitrot-stub.so
#18 0x00000038a862db90 in default_writev () from /usr/lib64/libglusterfs.so.0
#19 0x00007f440697c666 in pl_writev () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/locks.so
#20 0x00007f440676a080 in up_writev () from /usr/lib64/glusterfs/3.7.0beta2/xlator/features/upcall.so
#21 0x00000038a8632262 in default_writev_resume () from /usr/lib64/libglusterfs.so.0
#22 0x00000038a8649c3a in call_resume_wind () from /usr/lib64/libglusterfs.so.0
#23 0x00000038a864bb60 in call_resume () from /usr/lib64/libglusterfs.so.0
#24 0x00007f440655e398 in iot_worker () from /usr/lib64/glusterfs/3.7.0beta2/xlator/performance/io-threads.so
#25 0x0000003a968079d1 in start_thread () from /lib64/libpthread.so.0
#26 0x0000003a964e89dd in clone () from /lib64/libc.so.6

Actual results:

Expected results:

Additional info:
REVIEW: http://review.gluster.org/10898 (features/quota : Do unwind if postbuf is NULL) posted (#1) for review on master by Anuradha Talur (atalur)
REVIEW: http://review.gluster.org/10898 (features/quota : Do unwind if postbuf is NULL) posted (#2) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/10898 (features/quota : Do unwind if postbuf is NULL) posted (#3) for review on master by Anuradha Talur (atalur)
COMMIT: http://review.gluster.org/10898 committed in master by Raghavendra G (rgowdapp)
------
commit f3a340694fcb195aa8b546578c348b41fb2208d1
Author: Anuradha <atalur>
Date:   Mon May 25 11:07:27 2015 +0530

    features/quota : Do unwind if postbuf is NULL

    If postbuf in quota_writev_cbk is NULL directly an unwind
    should be done. Trying to dereference it will lead to a crash.

    Change-Id: Idba6ce3cd1bbf37ede96c7f17d01007d6c07057a
    BUG: 1221577
    Signed-off-by: Anuradha <atalur>
    Reviewed-on: http://review.gluster.org/10898
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
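For readers without the patch at hand, the guard described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual quota_writev_cbk code: struct stat_buf, struct quota_ctx, unwind_sketch and writev_cbk_sketch are hypothetical stand-ins, and the 512-byte block conversion is an assumption made only for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for the post-operation stat buffer (postbuf). */
struct stat_buf {
    long long blocks;   /* number of 512-byte blocks (assumed unit) */
};

/* Hypothetical per-inode accounting context. */
struct quota_ctx {
    long long size_bytes;
    int updates;
};

/* Hypothetical stand-in for unwinding to the parent translator:
 * here it only hands the operation result back to the caller. */
static int unwind_sketch(int op_ret)
{
    return op_ret;
}

/* Sketch of a write callback that unwinds directly when postbuf is
 * NULL instead of dereferencing it. */
int writev_cbk_sketch(struct quota_ctx *ctx, int op_ret,
                      const struct stat_buf *postbuf)
{
    /* On failure, or when there is nothing to account with, skip the
     * accounting update and unwind right away. */
    if (op_ret < 0 || ctx == NULL || postbuf == NULL)
        return unwind_sketch(op_ret);

    /* postbuf is known to be non-NULL from here on. */
    ctx->size_bytes = postbuf->blocks * 512;
    ctx->updates++;

    return unwind_sketch(op_ret);
}
```

The point of the early return is that the accounting update is treated as optional: a missing postbuf means the size cannot be refreshed on this call, but the write result is still passed up the stack unchanged.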
The required changes to fix this bug have not made it into glusterfs-3.7.1. This bug is now getting tracked for glusterfs-3.7.2.
Unfortunately glusterfs-3.7.2 did not contain a code change that was associated with this bug report. This bug is now proposed to be a blocker for glusterfs-3.7.3.
Fixed in:
Upstream patch: http://review.gluster.org/#/c/10898/
Release-3.7 patch: http://review.gluster.org/#/c/11040/
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user