Bug 1287503

Summary: Full heal of volume fails on some nodes "Commit failed on X", and glustershd logs "Couldn't get xlator xl-0"
Product: [Community] GlusterFS
Reporter: Ravishankar N <ravishankar>
Component: glusterd
Assignee: Ravishankar N <ravishankar>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: medium
Docs Contact:
Priority: unspecified
Version: mainline
CC: bugs, rkavunga
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1284863
Environment:
Last Closed: 2016-06-16 13:47:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1284863
Bug Blocks: 1288988

Description Ravishankar N 2015-12-02 08:36:19 UTC
+++ This bug was initially created as a clone of Bug #1284863 +++

Description of problem:
-----------------------
Full heal is unsuccessful on all volumes; the problems started after upgrading the 6-node cluster from 3.7.2 to 3.7.6 on Ubuntu Trusty (kernel 3.13.0-49-generic).
On a Distributed-Replicate volume named test (vol info below), executing `gluster volume heal test full` fails and returns different messages/errors depending on which node the command is executed on:

- When run from node *a*,*d* or *e* the cli tool returns:

> Launching heal operation to perform full self heal on volume test has been unsuccessful

With the following errors/warnings on the node the command is run from (no log entries on the other nodes):

> E [glusterfsd-mgmt.c:619:glusterfs_handle_translator_op] 0-glusterfs: Couldn't get xlator xl-0

==> /var/log/glusterfs/cli.log

> I [cli.c:721:main] 0-cli: Started running gluster with version 3.7.6
> I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
> W [socket.c:588:__socket_rwv] 0-glusterfs: readv on /var/run/gluster/quotad.socket failed (Invalid argument)
> I [cli-rpc-ops.c:8348:gf_cli_heal_volume_cbk] 0-cli: Received resp to heal volume
> I [input.c:36:cli_batch] 0-: Exiting with: -2

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
> I [MSGID: 106533] [glusterd-volume-ops.c:861:__glusterd_handle_cli_heal_volume] 0-management: Received heal vol req for volume test

- When run from node *b* the cli tool returns:

> Commit failed on d.storage. Please check log file for details.
> Commit failed on e.storage. Please check log file for details.

No errors in any log file on any node at that point in time; only the info messages "starting full sweep on subvol" and "finished full sweep on subvol" appear on the other 4 nodes for which no commit-failed message was returned by the cli.

- When run from node *c* the cli tool returns:

> Commit failed on e.storage. Please check log file for details.
> Commit failed on a.storage. Please check log file for details.

No errors in any log file on any node at that point in time; only the info messages "starting full sweep on subvol" and "finished full sweep on subvol" appear on the other 4 nodes for which no commit-failed message was returned by the cli.

- When run from node *f* the cli tool returns:

> Commit failed on a.storage. Please check log file for details.
> Commit failed on d.storage. Please check log file for details.

No errors in any log file on any node at that point in time; only the info messages "starting full sweep on subvol" and "finished full sweep on subvol" appear on the other 4 nodes for which no commit-failed message was returned by the cli.


Additional info:
----------------

**Volume info**
Volume Name: test
Type: Distributed-Replicate
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: a.storage:/storage/bricks/test/brick
Brick2: b.storage:/storage/bricks/test/brick
Brick3: c.storage:/storage/bricks/test/brick
Brick4: d.storage:/storage/bricks/test/brick
Brick5: e.storage:/storage/bricks/test/brick
Brick6: f.storage:/storage/bricks/test/brick
Options Reconfigured:
performance.readdir-ahead: on
features.trash: off
nfs.disable: off

**volume status info**
Status of volume: test
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick a.storage:/storage/bricks/test/brick  49156     0          Y       783  
Brick b.storage:/storage/bricks/test/brick  49160     0          Y       33394
Brick c.storage:/storage/bricks/test/brick  49156     0          Y       545  
Brick d.storage:/storage/bricks/test/brick  49158     0          Y       14983
Brick e.storage:/storage/bricks/test/brick  49156     0          Y       22585
Brick f.storage:/storage/bricks/test/brick  49155     0          Y       2397 
NFS Server on localhost                     2049      0          Y       49084
Self-heal Daemon on localhost               N/A       N/A        Y       49092
NFS Server on b.storage                     2049      0          Y       20138
Self-heal Daemon on b.storage               N/A       N/A        Y       20146
NFS Server on f.storage                     2049      0          Y       37158
Self-heal Daemon on f.storage               N/A       N/A        Y       37180
NFS Server on a.storage                     2049      0          Y       35744
Self-heal Daemon on a.storage               N/A       N/A        Y       35749
NFS Server on c.storage                     2049      0          Y       35479
Self-heal Daemon on c.storage               N/A       N/A        Y       35485
NFS Server on e.storage                     2049      0          Y       8512 
Self-heal Daemon on e.storage               N/A       N/A        Y       8520 
 
Task Status of Volume test
------------------------------------------------------------------------------
There are no active volume tasks

--- Additional comment from Ravishankar N on 2015-12-02 03:35:36 EST ---

Looks like a regression introduced by http://review.gluster.org/#/c/12344/. I'll send a fix for this particular error, but it is worth noting that heal full does not work as expected in all scenarios (see BZ 1112158). The idea is to eventually eliminate the need for heal full, because the 'replace-brick' and 'add-brick' use cases will automatically trigger heals. See comments 2 and 3 in BZ 1112158.

Comment 1 Vijay Bellur 2015-12-02 08:38:09 UTC
REVIEW: http://review.gluster.org/12843 (glusterd: add pending_node only if hxlator_count is valid) posted (#1) for review on master by Ravishankar N (ravishankar)

Comment 2 Vijay Bellur 2015-12-08 04:59:27 UTC
REVIEW: http://review.gluster.org/12843 (glusterd: add pending_node only if hxlator_count is valid) posted (#2) for review on master by Ravishankar N (ravishankar)

Comment 3 Vijay Bellur 2015-12-08 04:59:55 UTC
REVIEW: http://review.gluster.org/12843 (glusterd: add pending_node only if hxlator_count is valid) posted (#3) for review on master by Ravishankar N (ravishankar)

Comment 4 Vijay Bellur 2015-12-08 14:44:24 UTC
REVIEW: http://review.gluster.org/12843 (glusterd: add pending_node only if hxlator_count is valid) posted (#4) for review on master by Ravishankar N (ravishankar)

Comment 5 Vijay Bellur 2015-12-08 21:48:22 UTC
REVIEW: http://review.gluster.org/12843 (glusterd: add pending_node only if hxlator_count is valid) posted (#5) for review on master by Vijay Bellur (vbellur)

Comment 6 Vijay Bellur 2015-12-10 06:42:53 UTC
COMMIT: http://review.gluster.org/12843 committed in master by Atin Mukherjee (amukherj) 
------
commit d57a5a57b8e87caffce94ed497240b37172f4a27
Author: Ravishankar N <root@ravi2.(none)>
Date:   Wed Dec 2 08:20:46 2015 +0000

    glusterd: add pending_node only if hxlator_count is valid
    
    Fixes a regression introduced by commit
    0ef62933649392051e73fe01c028e41baddec489 . See BZ for bug
    description.
    
    Problem:
        To perform GLUSTERD_BRICK_XLATOR_OP, the rpc requires number of xlators (n) the
        op needs to be performed on and the xlator names are populated in dictionary
        with xl-0, xl-1...  xl-n-1 as keys. When Volume heal full is executed, for each
        replica group, glustershd on the local node may or may not be selected to
        perform heal by glusterd.  XLATOR_OP rpc should be sent to the shd running on
        the same node by glusterd only when glustershd on that node is selected at
        least once. This bug occurs when glusterd sends the rpc to local glustershd
        even when it is not selected for any of the replica groups.
    
    Fix:
        Don't send the rpc to local glustershd when it is not selected even once.
    
    Change-Id: I2c8217a8f00f6ad5d0c6a67fa56e476457803e08
    BUG: 1287503
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/12843
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>

Comment 7 Niels de Vos 2016-06-16 13:47:58 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user