Bug 1410406

Summary: ganesha service crashed on all nodes of ganesha cluster on disperse volume when doing lookup while copying files remotely using scp
Product: Red Hat Gluster Storage Reporter: nchilaka <nchilaka>
Component: distribute Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED ERRATA QA Contact: Prasad Desala <tdesala>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: rhgs-3.2 CC: amukherj, aspandey, jthottan, nbalacha, pkarampu, rcyriac, rhinduja, rhs-bugs, skoduri, storage-qa-internal
Target Milestone: ---   
Target Release: RHGS 3.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.8.4-12 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1410777 (view as bug list) Environment:
Last Closed: 2017-03-23 06:01:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On: 1410777, 1412119    
Bug Blocks: 1351528    

Description nchilaka 2017-01-05 12:29:26 UTC
Description of problem:
==========================
The ganesha service crashed on all the nodes of the ganesha cluster, making the volume
unavailable.
I mounted the volume on two different clients with the below mapping:
GlusterNode1:GaneshaNode1:client1 mounted on this
GlusterNode2:GaneshaNode2:client1 mounted on this
GlusterNode3:GaneshaNode3:client1 mounted on this
GlusterNode4:
GlusterNode5:
GlusterNode6:GaneshaNode4:client2 mounted on this

I was doing an scp of a media file (1.1 GB in size) from my laptop to client1 in a loop of 100, as below:
 for i in {1..100}; do scp -r gluster.mp4 root@rhs-client45:/mnt/disperse/newdir/dir45/file.$i; done

I did a lookup from client2, which displayed output immediately.

Then I did an ls on the client1 mount point (to check the scp status, i.e. how many files had been copied).

At this point the ls hung (I was not able to abort the command on the terminal using Ctrl+C) and the scp also did not proceed.
The ganesha-gfapi logs showed input/output errors.

ganesha crashed on all nodes.



Note: I had already hit this once, yesterday (after a heal while scp and ls were going on), when I did a similar thing. That time VIPs 1, 2 and 3 failed over to the 4th VIP, with nodes 1, 2 and 3 crashing (but I could not get the core as the abrt service was not enabled).




[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Missing separate debuginfo for /lib64/libntirpc.so.1.4
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/43/bbc3c15781ba5ecae854444efe636ba5348c11.debug
Core was generated by `/usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N N'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f631166e72e in dht_fsync_cbk (frame=0x7f631ae3884c, 
    cookie=<optimized out>, this=0x7f6338019250, op_ret=-1, op_errno=5, 
    prebuf=0x0, postbuf=0x0, xdata=0x0) at dht-inode-read.c:843
843	        if (IS_DHT_MIGRATION_PHASE1 (postbuf)) {
Missing separate debuginfos, use: debuginfo-install dbus-libs-1.6.12-17.el7.x86_64 glibc-2.17-157.el7_3.1.x86_64 gssproxy-0.4.1-13.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.14.1-27.el7_3.x86_64 libacl-2.2.51-12.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libblkid-2.23.2-33.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-9.el7.x86_64 libgcc-4.8.5-11.el7.x86_64 libnfsidmap-0.25-15.el7.x86_64 libselinux-2.5-6.el7.x86_64 libuuid-2.23.2-33.el7.x86_64 openssl-libs-1.0.1e-60.el7.x86_64 pcre-8.32-15.el7_2.1.x86_64 sssd-client-1.14.0-43.el7_3.4.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x00007f631166e72e in dht_fsync_cbk (frame=0x7f631ae3884c, 
    cookie=<optimized out>, this=0x7f6338019250, op_ret=-1, op_errno=5, 
    prebuf=0x0, postbuf=0x0, xdata=0x0) at dht-inode-read.c:843
#1  0x00007f639c167286 in default_fsync_cbk (frame=0x7f631ae26674, 
    cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=5, 
    prebuf=0x0, postbuf=0x0, xdata=0x0) at defaults.c:1508
#2  0x00007f63118a5d8a in ec_manager_fsync (fop=0x7f630f35f660, 
    state=<optimized out>) at ec-generic.c:355
#3  0x00007f631189f79b in __ec_manager (fop=0x7f630f35f660, error=5)
    at ec-common.c:2287
#4  0x00007f631189f978 in ec_resume (fop=0x7f630f35f660, error=0)
    at ec-common.c:289
#5  0x00007f631189fa9f in ec_complete (fop=0x7f630f35f660) at ec-common.c:362
#6  0x00007f63118a37ef in ec_fsync_cbk (frame=<optimized out>, cookie=0x2, 
    this=0x7f6338017f90, op_ret=0, op_errno=0, prebuf=0x7f6312be2000, 
    postbuf=0x7f6312be2070, xdata=0x0) at ec-generic.c:279
#7  0x00007f6311b13f64 in client3_3_fsync_cbk (req=<optimized out>, 
    iov=<optimized out>, count=<optimized out>, myframe=0x7f631ae0246c)
    at client-rpc-fops.c:976
#8  0x00007f631fdf36e0 in rpc_clnt_handle_reply (
    clnt=clnt@entry=0x7f63381abd90, pollin=pollin@entry=0x7f63383152c0)
    at rpc-clnt.c:791
#9  0x00007f631fdf39cf in rpc_clnt_notify (trans=<optimized out>, 
    mydata=0x7f63381abde8, event=<optimized out>, data=0x7f63383152c0)
    at rpc-clnt.c:971
#10 0x00007f631fdef893 in rpc_transport_notify (
    this=this@entry=0x7f63381bbaf0, 
    event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, 
    data=data@entry=0x7f63383152c0) at rpc-transport.c:538
#11 0x00007f63121d12d4 in socket_event_poll_in (
    this=this@entry=0x7f63381bbaf0) at socket.c:2267
#12 0x00007f63121d3785 in socket_event_handler (fd=<optimized out>, idx=12, 
    data=0x7f63381bbaf0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2397
#13 0x00007f639c13f650 in event_dispatch_epoll_handler (event=0x7f6312be2540, 
    event_pool=0x7f6398086b00) at event-epoll.c:571
#14 event_dispatch_epoll_worker (data=0x7f633c000dc0) at event-epoll.c:674
#15 0x00007f64287a5dc5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007f6427e7473d in clone () from /lib64/libc.so.6
(gdb) f 0
#0  0x00007f631166e72e in dht_fsync_cbk (frame=0x7f631ae3884c, 
    cookie=<optimized out>, this=0x7f6338019250, op_ret=-1, op_errno=5, 
    prebuf=0x0, postbuf=0x0, xdata=0x0) at dht-inode-read.c:843
843	        if (IS_DHT_MIGRATION_PHASE1 (postbuf)) {



Version-Release number of selected component (if applicable):
==============================

[root@dhcp35-37 ccpp-2017-01-05-17:12:21-27585]# rpm -qa|grep gluster
glusterfs-geo-replication-3.8.4-10.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-3.el7rhgs.x86_64
glusterfs-3.8.4-10.el7rhgs.x86_64
python-gluster-3.8.4-10.el7rhgs.noarch
glusterfs-cli-3.8.4-10.el7rhgs.x86_64
glusterfs-rdma-3.8.4-10.el7rhgs.x86_64
glusterfs-api-3.8.4-10.el7rhgs.x86_64
glusterfs-server-3.8.4-10.el7rhgs.x86_64
glusterfs-api-devel-3.8.4-10.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-10.el7rhgs.x86_64
glusterfs-libs-3.8.4-10.el7rhgs.x86_64
glusterfs-devel-3.8.4-10.el7rhgs.x86_64
glusterfs-fuse-3.8.4-10.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-10.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-10.el7rhgs.x86_64
glusterfs-events-3.8.4-10.el7rhgs.x86_64
[root@dhcp35-37 ccpp-2017-01-05-17:12:21-27585]# 
[root@dhcp35-37 ccpp-2017-01-05-17:12:21-27585]# rpm -qa|grep pace
pacemaker-cli-1.1.15-11.el7.x86_64
pacemaker-libs-1.1.15-11.el7.x86_64
pacemaker-1.1.15-11.el7.x86_64
pacemaker-cluster-libs-1.1.15-11.el7.x86_64
userspace-rcu-0.7.9-2.el7rhgs.x86_64
[root@dhcp35-37 ccpp-2017-01-05-17:12:21-27585]# rpm -qa|grep pcs
pcs-0.9.152-10.el7.x86_64
[root@dhcp35-37 ccpp-2017-01-05-17:12:21-27585]# rpm -qa|grep coro
corosync-2.4.0-4.el7.x86_64
corosynclib-2.4.0-4.el7.x86_64
[root@dhcp35-37 ccpp-2017-01-05-17:12:21-27585]# rpm -qa|grep ganesha
nfs-ganesha-gluster-2.4.1-3.el7rhgs.x86_64
nfs-ganesha-debuginfo-2.4.1-3.el7rhgs.x86_64
nfs-ganesha-2.4.1-3.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-10.el7rhgs.x86_64



Cores and log locations will be shared soon.

Comment 2 nchilaka 2017-01-05 12:58:40 UTC
sosreports, /var/spool (which has the core) /var/log  available @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1410406/

Comment 3 nchilaka 2017-01-05 13:02:34 UTC
Status of the related services post crash (FYI):

Unit pcs.service could not be found.
[root@dhcp35-37 ~]# service pcsd status
Redirecting to /bin/systemctl status  pcsd.service
● pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-01-05 16:26:36 IST; 2h 4min ago
 Main PID: 28346 (pcsd)
   CGroup: /system.slice/pcsd.service
           └─28346 /usr/bin/ruby /usr/lib/pcsd/pcsd > /dev/null &

Jan 05 16:26:36 dhcp35-37.lab.eng.blr.redhat.com systemd[1]: Starting PCS GU...
Jan 05 16:26:36 dhcp35-37.lab.eng.blr.redhat.com systemd[1]: Started PCS GUI...
Hint: Some lines were ellipsized, use -l to show in full.
[root@dhcp35-37 ~]# service corosync status
Redirecting to /bin/systemctl status  corosync.service
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-01-05 16:26:27 IST; 2h 4min ago
  Process: 28299 ExecStart=/usr/share/corosync/corosync start (code=exited, status=0/SUCCESS)
 Main PID: 28306 (corosync)
   CGroup: /system.slice/corosync.service
           └─28306 corosync

Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com corosync[28306]:  [TOTEM ] ...
Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com corosync[28306]:  [TOTEM ] ...
Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com corosync[28306]:  [QUORUM] ...
Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com corosync[28306]:  [MAIN  ] ...
Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com corosync[28306]:  [TOTEM ] ...
Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com corosync[28306]:  [QUORUM] ...
Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com corosync[28306]:  [QUORUM] ...
Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com corosync[28306]:  [MAIN  ] ...
Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com corosync[28299]: Starting C...
Jan 05 16:26:27 dhcp35-37.lab.eng.blr.redhat.com systemd[1]: Started Corosyn...
Hint: Some lines were ellipsized, use -l to show in full.
[root@dhcp35-37 ~]# service pacemaker status
Redirecting to /bin/systemctl status  pacemaker.service
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-01-05 16:26:27 IST; 2h 4min ago
     Docs: man:pacemakerd
           http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/index.html
 Main PID: 28322 (pacemakerd)
   CGroup: /system.slice/pacemaker.service
           ├─ 5977 /bin/bash /usr/lib/ocf/resource.d/heartbeat/ganesha_mon m...
           ├─ 5989 sleep 5
           ├─28322 /usr/sbin/pacemakerd -f
           ├─28324 /usr/libexec/pacemaker/cib
           ├─28325 /usr/libexec/pacemaker/stonithd
           ├─28326 /usr/libexec/pacemaker/lrmd
           ├─28327 /usr/libexec/pacemaker/attrd
           ├─28328 /usr/libexec/pacemaker/pengine
           └─28329 /usr/libexec/pacemaker/crmd

Jan 05 17:12:22 dhcp35-37.lab.eng.blr.redhat.com lrmd[28326]:   notice: dhcp...
Jan 05 17:12:22 dhcp35-37.lab.eng.blr.redhat.com lrmd[28326]:   notice: dhcp...
Jan 05 17:12:27 dhcp35-37.lab.eng.blr.redhat.com crmd[28329]:   notice: Resu...
Jan 05 17:12:27 dhcp35-37.lab.eng.blr.redhat.com crmd[28329]:   notice: Resu...
Jan 05 17:12:32 dhcp35-37.lab.eng.blr.redhat.com lrmd[28326]:   notice: dhcp...
Jan 05 17:12:32 dhcp35-37.lab.eng.blr.redhat.com lrmd[28326]:   notice: dhcp...
Jan 05 17:12:32 dhcp35-37.lab.eng.blr.redhat.com lrmd[28326]:   notice: dhcp...
Jan 05 17:12:32 dhcp35-37.lab.eng.blr.redhat.com crmd[28329]:   notice: Resu...
Jan 05 17:12:33 dhcp35-37.lab.eng.blr.redhat.com crmd[28329]:   notice: Resu...
Jan 05 17:12:33 dhcp35-37.lab.eng.blr.redhat.com crmd[28329]:   notice: Resu...
Hint: Some lines were ellipsized, use -l to show in full.
[root@dhcp35-37 ~]# 
[root@dhcp35-37 ~]# showmount -e localhost
rpc mount export: RPC: Unable to receive; errno = Connection refused
[root@dhcp35-37 ~]#

Comment 4 nchilaka 2017-01-05 13:30:17 UTC
root@dhcp35-37 ~]# gluster v status disperse
Status of volume: disperse
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.37:/rhs/brick2/disperse      49153     0          Y       16824
Brick 10.70.35.116:/rhs/brick2/disperse     49153     0          Y       32272
Brick 10.70.35.239:/rhs/brick2/disperse     49153     0          Y       1549 
Brick 10.70.35.135:/rhs/brick2/disperse     49152     0          Y       1891 
Brick 10.70.35.8:/rhs/brick2/disperse       49152     0          Y       1737 
Brick 10.70.35.196:/rhs/brick2/disperse     49152     0          Y       20413
Brick 10.70.35.37:/rhs/brick3/disperse      49154     0          Y       4764 
Brick 10.70.35.116:/rhs/brick3/disperse     49154     0          Y       32292
Brick 10.70.35.239:/rhs/brick3/disperse     49154     0          Y       21605
Brick 10.70.35.135:/rhs/brick3/disperse     49153     0          Y       2280 
Brick 10.70.35.8:/rhs/brick3/disperse       49153     0          Y       1757 
Brick 10.70.35.196:/rhs/brick3/disperse     49153     0          Y       32629
Self-heal Daemon on localhost               N/A       N/A        Y       17759
Self-heal Daemon on dhcp35-116.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       7539 
Self-heal Daemon on dhcp35-8.lab.eng.blr.re
dhat.com                                    N/A       N/A        Y       7833 
Self-heal Daemon on dhcp35-239.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       30090
Self-heal Daemon on dhcp35-135.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       8104 
Self-heal Daemon on dhcp35-196.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       29680
 
Task Status of Volume disperse
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp35-37 ~]# gluster v info disperse
 
Volume Name: disperse
Type: Distributed-Disperse
Volume ID: ea0b329a-f2ff-462d-a3a2-457a80b35e08
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick2/disperse
Brick2: 10.70.35.116:/rhs/brick2/disperse
Brick3: 10.70.35.239:/rhs/brick2/disperse
Brick4: 10.70.35.135:/rhs/brick2/disperse
Brick5: 10.70.35.8:/rhs/brick2/disperse
Brick6: 10.70.35.196:/rhs/brick2/disperse
Brick7: 10.70.35.37:/rhs/brick3/disperse
Brick8: 10.70.35.116:/rhs/brick3/disperse
Brick9: 10.70.35.239:/rhs/brick3/disperse
Brick10: 10.70.35.135:/rhs/brick3/disperse
Brick11: 10.70.35.8:/rhs/brick3/disperse
Brick12: 10.70.35.196:/rhs/brick3/disperse
Options Reconfigured:
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
ganesha.enable: on
features.cache-invalidation: on
performance.readdir-ahead: on
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
[root@dhcp35-37 ~]#

Comment 5 Pranith Kumar K 2017-01-05 13:34:38 UTC
After:

pk@dhcp35-190 - ~/workspace/gerrit-repo (ec-link)
19:01:46 :) ⚡ git show a9ccd0c8
commit a9ccd0c8ea6989c72073028b296f73a6fcb6b896
Author: Mohammed Rafi KC <rkavunga@redhat.com>
Date:   Fri Apr 22 12:07:31 2016 +0530

    tier/dht: check for rebalance completion for EIO error
    
    When an ongoing rebalance completion check task been
    triggered by dht, there is a possibility of a race
    between afr setting subvol as non-readable and dht updates
    the cached subvol. In this window a write can fail with EIO.
    
    Change-Id: I42638e6d4104c0dbe893d1bc73e1366188458c5d
    BUG: 1329503
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
    Reviewed-on: http://review.gluster.org/14049
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>

diff --git a/xlators/cluster/dht/src/dht-common.h b/xlators/cluster/dht/src/dht-common.h
index ad86893..1dfc5cd 100644
--- a/xlators/cluster/dht/src/dht-common.h
+++ b/xlators/cluster/dht/src/dht-common.h
@@ -613,7 +613,9 @@ typedef struct dht_fd_ctx {
                 }                                               \
         } while (0)
 
-#define dht_inode_missing(op_errno) (op_errno == ENOENT || op_errno == ESTALE)
+#define dht_inode_missing(op_errno) (op_errno == ENOENT || op_errno == ESTALE \
+                                     || op_errno == EIO) \
+/*Bad fix. Please revert the commit after fixing the bug 1329505*/

In this case the EIO came with postbuf/prebuf set to NULL, so it is a NULL dereference in dht. Changing the component.
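The backtrace shows the crash at dht-inode-read.c:843, where IS_DHT_MIGRATION_PHASE1() dereferences postbuf, which is NULL when the fsync unwinds with op_ret = -1/EIO. A minimal sketch of the kind of guard the callback needs (hypothetical names; the real struct iatt, macro, and callback signature live in gluster's dht headers and differ from this model):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for gluster's struct iatt. */
struct iatt {
        unsigned int ia_type;   /* stand-in for the rebalance-phase bits */
};

/* Stand-in for IS_DHT_MIGRATION_PHASE1(); like the real macro, it
 * dereferences its argument -- which is what segfaulted when postbuf
 * arrived as NULL together with the EIO error. */
#define IS_DHT_MIGRATION_PHASE1(buf) ((buf)->ia_type == 1)

/* Sketch of dht_fsync_cbk with the missing guard: on a failed fsync
 * (op_ret == -1) the stat buffers may be NULL, so the migration-phase
 * check must be skipped and the error unwound as-is. */
static int
dht_fsync_cbk_sketch (int op_ret, int op_errno, struct iatt *postbuf)
{
        if (op_ret == -1 || postbuf == NULL)
                return op_errno;        /* unwind the error, no deref */

        if (IS_DHT_MIGRATION_PHASE1 (postbuf))
                return 0;               /* would re-wind the fsync here */

        return 0;
}
```

With the guard in place, the EIO case that crashed ganesha (op_ret=-1, op_errno=5, postbuf=NULL) simply propagates the errno instead of dereferencing NULL.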

Comment 7 Nithya Balachandran 2017-01-06 11:32:55 UTC
This is a bug in the dht_fsync_cbk handling. I will send a patch today.

Comment 11 Prasad Desala 2017-01-25 08:53:09 UTC
Verified this BZ on glusterfs version 3.8.4-12.el7rhgs.x86_64.

STEPS:
1) Created a ganesha cluster and created a Distributed-Disperse volume.
2) Enabled md-cache settings and started the volume.
3) Mounted the volume on two clients.
4) Started a loop of scp file transfer from my laptop to the client1.
5) While step 4 was in progress, started an infinite loop of lookups from client-1 and client-2.

This test ran for ~3 hours. I did not see any ganesha crashes or IO errors; hence, moving this BZ to the Verified state.

Comment 13 errata-xmlrpc 2017-03-23 06:01:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html