Description of problem:
=======================
Many nodes hosting bricks of my EC volume have coredumps. I looked into some of them and almost all show the same information. Trying to reconstruct the possible cause, I strongly suspect it is due to forcefully unmounting the brick with `umount -l`.

Core was generated by `/usr/sbin/glusterfsd -s 10.70.35.239 --volfile-id ecvol.10.70.35.239.rhs-brick1'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f1f01148694 in vfprintf () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glusterfs-fuse-3.8.4-5.el7rhgs.x86_64
(gdb) bt
#0  0x00007f1f01148694 in vfprintf () from /lib64/libc.so.6
#1  0x00007f1f0120c8d5 in __vsnprintf_chk () from /lib64/libc.so.6
#2  0x00007f1f02a77a18 in gf_vasprintf () from /lib64/libglusterfs.so.0
#3  0x00007f1f02ac688a in gf_event () from /lib64/libglusterfs.so.0
#4  0x00007f1ef4e475f0 in posix_fs_health_check () from /usr/lib64/glusterfs/3.8.4/xlator/storage/posix.so
#5  0x00007f1ef4e47774 in posix_health_check_thread_proc () from /usr/lib64/glusterfs/3.8.4/xlator/storage/posix.so
#6  0x00007f1f018b2dc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f1f011f773d in clone () from /lib64/libc.so.6

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-5.el7rhgs (full rpm -qa output below)

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
glusterfsd (the brick process) dumps core with SIGSEGV in vfprintf, reached via gf_event -> gf_vasprintf from posix_fs_health_check.

Expected results:
The brick process should not crash; a failed health check should be reported without a segfault.

Additional info:
[root@dhcp35-37 ~]# rpm -qa|grep gluster
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-3.8.4-5.el7rhgs.x86_64
python-gluster-3.8.4-5.el7rhgs.noarch
glusterfs-server-3.8.4-5.el7rhgs.x86_64
glusterfs-events-3.8.4-5.el7rhgs.x86_64
glusterfs-libs-3.8.4-5.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-5.el7rhgs.x86_64
glusterfs-api-3.8.4-5.el7rhgs.x86_64
glusterfs-cli-3.8.4-5.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-5.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-5.el7rhgs.x86_64
glusterfs-fuse-3.8.4-5.el7rhgs.x86_64
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
gluster-nagios-addons-0.2.7-1.el7rhgs.x86_64

[root@dhcp35-37 ~]# cat /etc/redhat-*
cat: /etc/redhat-access-insights: Is a directory
Red Hat Enterprise Linux Server release 7.3 (Maipo)
Red Hat Gluster Storage Server 3.2.0
[root@dhcp35-37 ~]#
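For context on the backtrace above: a segfault inside vfprintf reached through __vsnprintf_chk and gf_vasprintf is the classic symptom of a printf-style varargs mismatch (e.g. %s paired with a non-pointer argument). The following is a minimal, self-contained sketch of that failure mode only; my_vasprintf and my_event are hypothetical stand-ins, NOT gluster's code, and this is an assumption about the crash mechanism, not a confirmed root cause.

/* Sketch: how a format/argument mismatch in an allocate-and-format
 * helper (similar in shape to gf_vasprintf/gf_event) can crash
 * deep inside vfprintf. */
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical allocate-and-format helper. */
static int my_vasprintf(char **out, const char *fmt, va_list ap)
{
    va_list ap2;
    int len;

    va_copy(ap2, ap);
    len = vsnprintf(NULL, 0, fmt, ap2);       /* sizing pass */
    va_end(ap2);
    if (len < 0)
        return -1;
    *out = malloc((size_t)len + 1);
    if (*out == NULL)
        return -1;
    return vsnprintf(*out, (size_t)len + 1, fmt, ap); /* format pass */
}

/* Hypothetical event notifier, analogous in shape to gf_event. */
static int my_event(const char *fmt, ...)
{
    char *msg = NULL;
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = my_vasprintf(&msg, fmt, ap);        /* crash happens in here */
    va_end(ap);
    if (ret >= 0) {
        printf("event: %s\n", msg);
        free(msg);
    }
    return ret;
}

int main(void)
{
    /* Correct usage: format and arguments agree. */
    my_event("path=%s; op_errno=%d", "/rhs/brick1", 5);

    /* Buggy usage: %s is given an int, so vsnprintf dereferences a
     * bogus pointer -- undefined behavior that typically segfaults
     * inside vfprintf, matching frame #0 of the coredump. */
    my_event("path=%s; op_errno=%s", "/rhs/brick1", 5);
    return 0;
}

Note that when built with _FORTIFY_SOURCE (as RHEL packages are), vsnprintf is compiled to __vsnprintf_chk, which is exactly frame #1 in the backtrace.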
I just checked the version and did an initial investigation. It looks like the core is triggered mainly from posix_fs_health_check when the EVENT_POSIX_HEALTH_CHECK_FAILED event is raised.

[root@dhcp35-37 /]# gluster --version
glusterfs 3.8.4 built on Nov 11 2016 06:45:08
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

I think this is a known issue, and a patch has already been sent by Pranith for it: http://review.gluster.org/#/c/15671/

This BZ is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1386097, so I would like to close it as a duplicate.
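As a side note: if the crash is indeed a format/argument mismatch in the gf_event call from posix_fs_health_check (as the backtrace suggests), this class of bug can be caught at compile time with gcc/clang's format attribute. A minimal sketch, assuming a hypothetical wrapper notify_event (not gluster API):

/* Annotating a printf-style varargs function so -Wformat flags
 * mismatched arguments at compile time instead of crashing at runtime. */
#include <stdarg.h>
#include <stdio.h>

__attribute__((format(printf, 1, 2)))
static void notify_event(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
}

int main(void)
{
    notify_event("brick=%s status=%d\n", "/rhs/brick1", 1); /* OK */
    /* notify_event("brick=%s\n", 42);
     * ^ with -Wformat (on by default with -Wall) the compiler would
     *   warn about this line rather than letting it segfault. */
    return 0;
}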
*** This bug has been marked as a duplicate of bug 1385606 ***