Bug 1227204 - glusterfsd: bricks crash while executing ls on nfs-ganesha vers=3
Summary: glusterfsd: bricks crash while executing ls on nfs-ganesha vers=3
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: upcall
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Soumya Koduri
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1221941
 
Reported: 2015-06-02 07:22 UTC by Soumya Koduri
Modified: 2016-06-16 13:07 UTC
CC: 7 users

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1221941
Environment:
Last Closed: 2016-06-16 13:07:33 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Soumya Koduri 2015-06-02 07:22:37 UTC
+++ This bug was initially created as a clone of Bug #1221941 +++

Description of problem:
A coredump was seen for several brick processes of the same volume while executing ls on the mount point. The volume was mounted using nfs-ganesha with vers=3.

Version-Release number of selected component (if applicable):
glusterfs-3.7.0beta2-0.0.el6.x86_64
nfs-ganesha-2.2.0-0.el6.x86_64

How reproducible:
seen only once

Steps to Reproduce:
1. create a 6x2 volume, start it
2. bring up nfs-ganesha after completing the pre-requisites
3. disable ACLs and bring up nfs-ganesha again, performing whatever steps are required
4. mount the volume with vers=3
5. execute ls on the mount-point

Actual results:

Result of step 5:
[root@rhsauto010 ~]# time ls /mnt/nfs-test
dir  dir1  fstest_f017b1f6b87412d79e9052d0a289ce23  rhsauto010.test

real    144m12.193s
user    0m0.003s
sys     0m0.023s

(gdb) bt
#0  0x00007fcb200605bd in __gf_free (free_ptr=0x7fcabc0036a0) at mem-pool.c:312
#1  0x00007fcb0fbe1dc7 in upcall_reaper_thread (data=0x7fcb100127a0) at upcall-internal.c:426
#2  0x0000003890c079d1 in start_thread () from /lib64/libpthread.so.0
#3  0x00000038908e88fd in clone () from /lib64/libc.so.6


Expected results:
ls should not take this long, and glusterfsd dumping core is unexpected; this problem needs to be rectified.

Additional info:
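The crash is a free of an already-invalid pointer inside upcall_reaper_thread(), and the fix tracked in the reviews below ("Upcall/cache-invalidation: Ignore fops with frame->root->client not set") makes the upcall xlator skip fops whose frame->root->client is not set. A rough, hedged illustration of that kind of guard follows; the struct layout and the helper name are simplified assumptions for the sketch, not the merged patch:

/* illustrative_guard.c - sketch of a "client not set" check, assuming a
 * stripped-down version of the GlusterFS call-frame layout */
#include <stdbool.h>

typedef struct client { char *client_uid; } client_t;
typedef struct call_root { client_t *client; } call_root_t;
typedef struct call_frame { call_root_t *root; } call_frame_t;

/* Internal fops (self-heal, rebalance, ...) can reach the upcall xlator
 * without a client attached; skip cache-invalidation bookkeeping for
 * them instead of registering an entry the reaper thread later frees. */
static bool
upcall_client_is_set (call_frame_t *frame)
{
        if (!frame || !frame->root || !frame->root->client ||
            !frame->root->client->client_uid)
                return false;
        return true;
}

With such a guard in place, the upcall translator returns early for clientless fops, so upcall_reaper_thread() only ever frees entries that were created with a valid client.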

--- Additional comment from Saurabh on 2015-05-15 06:03:35 EDT ---

[root@nfs3 ~]# gluster volume status
Status of volume: gluster_shared_storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.148:/rhs/brick1/d1r1-share   49156     0          Y       3549 
Brick 10.70.37.77:/rhs/brick1/d1r2-share    49155     0          Y       3329 
Brick 10.70.37.76:/rhs/brick1/d2r1-share    49155     0          Y       3081 
Brick 10.70.37.69:/rhs/brick1/d2r2-share    49155     0          Y       3346 
Brick 10.70.37.148:/rhs/brick1/d3r1-share   49157     0          Y       3566 
Brick 10.70.37.77:/rhs/brick1/d3r2-share    49156     0          Y       3346 
Brick 10.70.37.76:/rhs/brick1/d4r1-share    49156     0          Y       3098 
Brick 10.70.37.69:/rhs/brick1/d4r2-share    49156     0          Y       3363 
Brick 10.70.37.148:/rhs/brick1/d5r1-share   49158     0          Y       3583 
Brick 10.70.37.77:/rhs/brick1/d5r2-share    49157     0          Y       3363 
Brick 10.70.37.76:/rhs/brick1/d6r1-share    49157     0          Y       3115 
Brick 10.70.37.69:/rhs/brick1/d6r2-share    49157     0          Y       3380 
Self-heal Daemon on localhost               N/A       N/A        Y       28389
Self-heal Daemon on 10.70.37.148            N/A       N/A        Y       22717
Self-heal Daemon on 10.70.37.77             N/A       N/A        Y       4784 
Self-heal Daemon on 10.70.37.76             N/A       N/A        Y       25893
 
Task Status of Volume gluster_shared_storage
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.148:/rhs/brick1/d1r1         49153     0          Y       22219
Brick 10.70.37.77:/rhs/brick1/d1r2          49152     0          Y       4321 
Brick 10.70.37.76:/rhs/brick1/d2r1          N/A       N/A        N       25654
Brick 10.70.37.69:/rhs/brick1/d2r2          49152     0          Y       27914
Brick 10.70.37.148:/rhs/brick1/d3r1         49154     0          Y       18842
Brick 10.70.37.77:/rhs/brick1/d3r2          49153     0          Y       4343 
Brick 10.70.37.76:/rhs/brick1/d4r1          N/A       N/A        N       25856
Brick 10.70.37.69:/rhs/brick1/d4r2          N/A       N/A        N       27934
Brick 10.70.37.148:/rhs/brick1/d5r1         49155     0          Y       22237
Brick 10.70.37.77:/rhs/brick1/d5r2          49154     0          Y       4361 
Brick 10.70.37.76:/rhs/brick1/d6r1          N/A       N/A        N       25874
Brick 10.70.37.69:/rhs/brick1/d6r2          N/A       N/A        N       27952
Self-heal Daemon on localhost               N/A       N/A        Y       28389
Self-heal Daemon on 10.70.37.77             N/A       N/A        Y       4784 
Self-heal Daemon on 10.70.37.148            N/A       N/A        Y       22717
Self-heal Daemon on 10.70.37.76             N/A       N/A        Y       25893
 
Task Status of Volume vol2
------------------------------------------------------------------------------
There are no active volume tasks
 

cat /etc/ganesha/exports/export.vol2.conf
# WARNING : Using Gluster CLI will overwrite manual
# changes made to this file. To avoid it, edit the
# file, copy it over to all the NFS-Ganesha nodes
# and run ganesha-ha.sh --refresh-config.
EXPORT{
      Export_Id = 2;
      Path = "/vol2";
      FSAL {
           name = GLUSTER;
           hostname = "localhost";
           volume = "vol2";
      }
      Access_type = RW;
      Squash = "No_root_squash";
      Pseudo = "/vol2";
      Protocols = "3", "4";
      Transports = "UDP", "TCP";
      SecType = "sys";
      Disable_ACL = True;
}

Comment 1 Anand Avati 2015-06-02 09:32:04 UTC
REVIEW: http://review.gluster.org/10909 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#4) for review on master by soumya k (skoduri)

Comment 2 Anand Avati 2015-06-02 18:39:10 UTC
REVIEW: http://review.gluster.org/10909 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#5) for review on master by soumya k (skoduri)

Comment 3 Anand Avati 2015-06-03 05:26:47 UTC
REVIEW: http://review.gluster.org/10909 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#6) for review on master by soumya k (skoduri)

Comment 4 Anand Avati 2015-06-04 05:57:49 UTC
REVIEW: http://review.gluster.org/10909 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#7) for review on master by soumya k (skoduri)

Comment 5 Anand Avati 2015-06-04 07:42:55 UTC
REVIEW: http://review.gluster.org/10909 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#8) for review on master by soumya k (skoduri)

Comment 6 Anand Avati 2015-06-04 07:45:14 UTC
REVIEW: http://review.gluster.org/10909 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#9) for review on master by soumya k (skoduri)

Comment 7 Anand Avati 2015-06-05 07:24:08 UTC
REVIEW: http://review.gluster.org/10909 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#10) for review on master by soumya k (skoduri)

Comment 8 Anand Avati 2015-06-05 15:34:53 UTC
REVIEW: http://review.gluster.org/10909 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#11) for review on master by Kaleb KEITHLEY (kkeithle)

Comment 9 Anand Avati 2015-06-06 16:50:51 UTC
REVIEW: http://review.gluster.org/10909 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#12) for review on master by Vijay Bellur (vbellur)

Comment 10 Anand Avati 2015-06-09 15:41:18 UTC
REVIEW: http://review.gluster.org/11141 (Upcall/cache-invalidation: Ignore fops with frame->root->client not set) posted (#1) for review on release-3.7 by soumya k (skoduri)

Comment 11 Niels de Vos 2016-06-16 13:07:33 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

