Bug 1184191

Summary: DHT: Rebalance- Rebalance process crash after remove-brick
Product: [Community] GlusterFS Reporter: Raghavendra Bhat <rabhat>
Component: distribute    Assignee: bugs <bugs>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.6.1    CC: bugs, gluster-bugs, nbalacha, rabhat, rhs-bugs, shmohan, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.6.2 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1159571 Environment:
Last Closed: 2015-02-11 11:59:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1159280, 1159571, 1162767    
Bug Blocks: 1163723    

Description Raghavendra Bhat 2015-01-20 18:58:05 UTC
+++ This bug was initially created as a clone of Bug #1159571 +++

+++ This bug was initially created as a clone of Bug #1159280 +++

Description of problem:
After a remove-brick operation, the rebalance process crashes with SIGSEGV in dht_lookup_everywhere_done() (see the backtraces below).

Version-Release number of selected component (if applicable):
glusterfs-3.6.1

How reproducible:


Steps to Reproduce:
1. Created a 6x2 distributed-replicate volume
2. Created some data on the mount point
3. Started remove-brick

Actual results:
rebalance process crashed

Expected results:
remove-brick should proceed to completion without the rebalance process crashing.

Additional info:
Core was generated by `/usr/sbin/glusterfs --volfile-server=rhs-client4.lab.eng.blr.redhat.com --volfi'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fbf389e9bcf in dht_lookup_everywhere_done (frame=0x7fbf3cb2c0f8, this=0x1016db0) at dht-common.c:1189
1189                                   gf_log (this->name, GF_LOG_DEBUG,
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6_4.6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.6.x86_64 libcom_err-1.41.12-14.el6_4.4.x86_64 libgcc-4.4.7-3.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 openssl-1.0.1e-16.el6_5.15.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x00007fbf389e9bcf in dht_lookup_everywhere_done (frame=0x7fbf3cb2c0f8, this=0x1016db0) at dht-common.c:1189
#1  0x00007fbf389ede1b in dht_lookup_everywhere_cbk (frame=0x7fbf3cb2c0f8, cookie=<value optimized out>, this=0x1016db0, 
    op_ret=<value optimized out>, op_errno=<value optimized out>, inode=0x7fbf30afe0c8, buf=0x7fbf3354085c, xattr=0x7fbf3c5271ac, 
    postparent=0x7fbf335408cc) at dht-common.c:1515
#2  0x00007fbf38c6a298 in afr_lookup_done (frame=0x7fbeffffffc6, cookie=0x7ffff2e0b8e8, this=0x1016320, 
    op_ret=<value optimized out>, op_errno=8, inode=0x11ce7e0, buf=0x7ffff2e0bb40, xattr=0x7fbf3c527238, postparent=0x7ffff2e0bad0)
    at afr-common.c:2223
#3  afr_lookup_cbk (frame=0x7fbeffffffc6, cookie=0x7ffff2e0b8e8, this=0x1016320, op_ret=<value optimized out>, op_errno=8, 
    inode=0x11ce7e0, buf=0x7ffff2e0bb40, xattr=0x7fbf3c527238, postparent=0x7ffff2e0bad0) at afr-common.c:2454
#4  0x00007fbf38ea6a33 in client3_3_lookup_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, 
    myframe=0x7fbf3cb2ba40) at client-rpc-fops.c:2610
#5  0x00000035cac0e005 in rpc_clnt_handle_reply (clnt=0x1076630, pollin=0x10065e0) at rpc-clnt.c:773
#6  0x00000035cac0f5c7 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x1076660, event=<value optimized out>, 
    data=<value optimized out>) at rpc-clnt.c:906
#7  0x00000035cac0ae48 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>)
    at rpc-transport.c:512
#8  0x00007fbf3a105e36 in socket_event_poll_in (this=0x1086060) at socket.c:2136
#9  0x00007fbf3a10775d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x1086060, poll_in=1, 
    poll_out=0, poll_err=0) at socket.c:2246
#10 0x00000035ca462997 in event_dispatch_epoll_handler (event_pool=0xfe4ee0) at event-epoll.c:384
#11 event_dispatch_epoll (event_pool=0xfe4ee0) at event-epoll.c:445
#12 0x00000000004069d7 in main (argc=4, argv=0x7ffff2e0d7e8) at glusterfsd.c:2050


(gdb) bt
#0  0x00007f4d8d522bcf in dht_lookup_everywhere_done (frame=0x7f4d9145f85c, 
    this=0x22c1470) at dht-common.c:1189
#1  0x00007f4d8d526e1b in dht_lookup_everywhere_cbk (frame=0x7f4d9145f85c, 
    cookie=<value optimized out>, this=0x22c1470, 
    op_ret=<value optimized out>, op_errno=<value optimized out>, 
    inode=0x7f4d8396b53c, buf=0x7f4d8c100d38, xattr=0x7f4d90e5ab84, 
    postparent=0x7f4d8c100da8) at dht-common.c:1515
#2  0x00007f4d8d7a3298 in afr_lookup_done (frame=0x7f4cffffffc6, 
    cookie=0x7fffb7202cd8, this=0x22c09e0, op_ret=<value optimized out>, 
    op_errno=8, inode=0x25ccd70, buf=0x7fffb7202f30, xattr=0x7f4d90e5ab84, 
    postparent=0x7fffb7202ec0) at afr-common.c:2223
#3  afr_lookup_cbk (frame=0x7f4cffffffc6, cookie=0x7fffb7202cd8, 
    this=0x22c09e0, op_ret=<value optimized out>, op_errno=8, inode=0x25ccd70, 
    buf=0x7fffb7202f30, xattr=0x7f4d90e5ab84, postparent=0x7fffb7202ec0)
    at afr-common.c:2454
#4  0x00007f4d8d9dfa33 in client3_3_lookup_cbk (req=<value optimized out>, 
    iov=<value optimized out>, count=<value optimized out>, 
    myframe=0x7f4d9145eef4) at client-rpc-fops.c:2610
#5  0x00000035cac0e005 in rpc_clnt_handle_reply (clnt=0x22fc990, 
    pollin=0x230d380) at rpc-clnt.c:773
#6  0x00000035cac0f5c7 in rpc_clnt_notify (trans=<value optimized out>, 
    mydata=0x22fc9c0, event=<value optimized out>, data=<value optimized out>)
    at rpc-clnt.c:906
#7  0x00000035cac0ae48 in rpc_transport_notify (this=<value optimized out>, 
    event=<value optimized out>, data=<value optimized out>)
    at rpc-transport.c:512
#8  0x00007f4d8ea38e36 in socket_event_poll_in (this=0x230c420)
    at socket.c:2136
#9  0x00007f4d8ea3a75d in socket_event_handler (fd=<value optimized out>, 
    idx=<value optimized out>, data=0x230c420, poll_in=1, poll_out=0, 
    poll_err=0) at socket.c:2246
#10 0x00000035ca462997 in event_dispatch_epoll_handler (event_pool=0x2288ee0)
    at event-epoll.c:384
#11 event_dispatch_epoll (event_pool=0x2288ee0) at event-epoll.c:445
#12 0x00000000004069d7 in main (argc=11, argv=0x7fffb7204bd8)
    at glusterfsd.c:2050



(gdb) l
1184	                                goto unwind_hashed_and_cached;
1185	                        } else {
1186	
1187	                               local->skip_unlink.handle_valid_link = _gf_false;
1188	
1189	                               gf_log (this->name, GF_LOG_DEBUG,
1190	                                       "Linkto file found on hashed subvol "
1191	                                       "and data file found on cached "
1192	                                       "subvolume. But linkto points to "
1193	                                       "different cached subvolume (%s) "
(gdb) 
1194	                                       "path %s",
1195	                                       local->skip_unlink.hash_links_to->name,
1196	                                       local->loc.path);
1197	
1198	                               if (local->skip_unlink.opend_fd_count == 0) {
1199	


(gdb) p local->skip_unlink.hash_links_to
$2 = (xlator_t *) 0x0
(gdb) p local->skip_unlink.hash_links_to->name
Cannot access memory at address 0x0
(gdb) p local->loc.path
$1 = 0x7f4d7c019000 "/test/f411"


(gdb) p *(dht_conf_t *)this->private
$4 = {subvolume_lock = 1, subvolume_cnt = 8, subvolumes = 0x22d57c0, 
  subvolume_status = 0x22d5810 "\001\001\001\001\001\001\001\001", 
  last_event = 0x22d5830, file_layouts = 0x22d6650, dir_layouts = 0x0, 
....


The trusted.glusterfs.dht.linkto xattr for "/test/f411" is set to "qtest-replicate-8". This points to the subvolume that was removed and is therefore not found in the conf->subvolumes list.

(gdb) p ((dht_conf_t *)this->private)->subvolumes[0]->name
$18 = 0x22ba0a0 "qtest-replicate-0"
(gdb) p ((dht_conf_t *)this->private)->subvolumes[1]->name
$19 = 0x22bb780 "qtest-replicate-1"
(gdb) p ((dht_conf_t *)this->private)->subvolumes[2]->name
$20 = 0x22bc9b0 "qtest-replicate-2"
(gdb) p ((dht_conf_t *)this->private)->subvolumes[3]->name
$21 = 0x22bd420 "qtest-replicate-3"
(gdb) p ((dht_conf_t *)this->private)->subvolumes[4]->name
$22 = 0x22bdeb0 "qtest-replicate-4"
(gdb) p ((dht_conf_t *)this->private)->subvolumes[5]->name
$23 = 0x22be940 "qtest-replicate-5"
(gdb) p ((dht_conf_t *)this->private)->subvolumes[6]->name
$24 = 0x22bf3d0 "qtest-replicate-6"
(gdb) p ((dht_conf_t *)this->private)->subvolumes[7]->name
$25 = 0x22bfe60 "qtest-replicate-7"
(gdb) p ((dht_conf_t *)this->private)->subvolumes[8]->name
Cannot access memory at address 0x0
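
For illustration, the name-based resolution of the linkto target over conf->subvolumes behaves roughly like the standalone sketch below (the struct and helper names are simplified stand-ins, not the actual GlusterFS code). With only qtest-replicate-0 through qtest-replicate-7 left in the volume, resolving "qtest-replicate-8" yields NULL:

#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for xlator_t / dht_conf_t -- illustration only,
 * not the GlusterFS definitions. */
typedef struct { const char *name; } xlator_t;
typedef struct { xlator_t **subvolumes; int subvolume_cnt; } dht_conf_t;

/* Resolve the subvolume named by trusted.glusterfs.dht.linkto.
 * Returns NULL when no remaining subvolume matches -- exactly what
 * happens once the linkto target has been removed from the volume. */
static xlator_t *
subvol_by_name (dht_conf_t *conf, const char *linkto)
{
        int i;

        for (i = 0; i < conf->subvolume_cnt; i++) {
                if (strcmp (conf->subvolumes[i]->name, linkto) == 0)
                        return conf->subvolumes[i];
        }
        return NULL;
}

int
main (void)
{
        xlator_t subs[] = {
                { "qtest-replicate-0" }, { "qtest-replicate-1" },
                { "qtest-replicate-2" }, { "qtest-replicate-3" },
                { "qtest-replicate-4" }, { "qtest-replicate-5" },
                { "qtest-replicate-6" }, { "qtest-replicate-7" },
        };
        xlator_t  *ptrs[8];
        dht_conf_t conf = { ptrs, 8 };
        int        i;

        for (i = 0; i < 8; i++)
                ptrs[i] = &subs[i];

        /* "qtest-replicate-8" was removed, so the lookup yields NULL. */
        printf ("link_subvol = %p\n",
                (void *) subvol_by_name (&conf, "qtest-replicate-8"));
        return 0;
}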



The local->skip_unlink.hash_links_to value is set in dht_lookup_everywhere_cbk() without checking whether it is NULL:


                if (is_linkfile) {
                        link_subvol = dht_linkfile_subvol (this, inode, buf,
                                                           xattr);
                        gf_msg_debug (this->name, 0,
                                      "found on %s linkfile %s (-> %s)",
                                      subvol->name, loc->path,
                                      link_subvol ? link_subvol->name : "''");
                        goto unlock;
                }
 ...
 ...
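
Putting the pieces together: the NULL returned for the stale linkto target is stored in local->skip_unlink.hash_links_to without validation, and the DEBUG message at dht-common.c:1189 later formats hash_links_to->name, which is the segfault in the backtraces above. A minimal standalone sketch of that flow follows (mock types and a stubbed linkfile-subvol helper, not the translator code itself):

#include <stdio.h>
#include <stddef.h>

/* Mock types for illustration only. */
typedef struct { const char *name; } xlator_t;
typedef struct { xlator_t *hash_links_to; } skip_unlink_t;
typedef struct { skip_unlink_t skip_unlink; const char *path; } dht_local_t;

/* Stub standing in for the linkfile-subvol resolution: it returns NULL
 * because the linkto xattr names a subvolume that is no longer part of
 * the volume (see the lookup sketch above). */
static xlator_t *
linkfile_subvol_stub (void)
{
        return NULL;
}

int
main (void)
{
        dht_local_t local = { { NULL }, "/test/f411" };

        /* In dht_lookup_everywhere_cbk() the result is stored without a
         * NULL check ... */
        local.skip_unlink.hash_links_to = linkfile_subvol_stub ();

        /* ... and in dht_lookup_everywhere_done() the 3.6.1 code formats
         * hash_links_to->name unconditionally in a DEBUG message.  With
         * hash_links_to == NULL that dereference is the SIGSEGV at
         * dht-common.c:1189.  Here we only report the state instead of
         * crashing: */
        printf ("hash_links_to is %s for path %s\n",
                local.skip_unlink.hash_links_to ? "set" : "NULL",
                local.path);
        return 0;
}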

======================================================================================================
On the bricks:

[root@rhs-client4 ~]# getfattr  -d -m . /home/qtest*/test/f411
getfattr: Removing leading '/' from absolute path names
# file: home/qtest12/test/f411
trusted.gfid=0sJg43JQHHRJST/cXjXyY0wg==
trusted.glusterfs.dht.linkto="qtest-replicate-8"  <----------THIS!!!
trusted.glusterfs.quota.f3874c91-e295-45d9-a95a-252d54b15ba0.contri=0sAAAAAAAAAAA=
trusted.pgfid.f3874c91-e295-45d9-a95a-252d54b15ba0=0sAAAAAQ==

# file: home/qtest17/test/f411
trusted.afr.qtest-client-16=0sAAAAAAAAAAAAAAAA
trusted.afr.qtest-client-17=0sAAAAAAAAAAAAAAAA
trusted.gfid=0sJg43JQHHRJST/cXjXyY0wg==
trusted.glusterfs.quota.f3874c91-e295-45d9-a95a-252d54b15ba0.contri=0sAAAAAAAQAAA=
trusted.pgfid.f3874c91-e295-45d9-a95a-252d54b15ba0=0sAAAAAQ==


======================================================================================================

Comment 1 Anand Avati 2015-01-20 19:02:24 UTC
REVIEW: http://review.gluster.org/9467 (Cluster/DHT : Fixed crash due to null deref) posted (#1) for review on release-3.6 by Raghavendra Bhat (raghavendra)

Comment 2 Anand Avati 2015-01-21 12:00:14 UTC
COMMIT: http://review.gluster.org/9467 committed in release-3.6 by Raghavendra Bhat (raghavendra) 
------
commit 709d4712941adecdc0542672cd0cdea3b86ec729
Author: Nithya Balachandran <nbalacha>
Date:   Sat Nov 1 22:16:32 2014 +0530

    Cluster/DHT : Fixed crash due to null deref
    
    A lookup on a linkto file whose trusted.glusterfs.dht.linkto
    xattr points to a subvol that is not part of the volume
    can cause the brick process to segfault due to a null dereference.
    Modified to check for a non-null value before attempting to access
    the variable.
    
    > Change-Id: Ie8f9df058f842cfc0c2b52a8f147e557677386fa
    > BUG: 1159571
    > Signed-off-by: Nithya Balachandran <nbalacha>
    > Reviewed-on: http://review.gluster.org/9034
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: venkatesh somyajulu <vsomyaju>
    > Reviewed-by: Vijay Bellur <vbellur>
    > Signed-off-by: Raghavendra Bhat <raghavendra>
    
    Change-Id: I53b086289d2386d269648653629a0750baae07a4
    BUG: 1184191
    Reviewed-on: http://review.gluster.org/9467
    Reviewed-by: Vijay Bellur <vbellur>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra Bhat <raghavendra>
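
For reference, the kind of check the commit describes looks roughly like the standalone sketch below. This is only an illustration of the approach (guard the dereference of local->skip_unlink.hash_links_to), not the actual patch; see the review link above for the real change.

#include <stdio.h>
#include <stddef.h>

/* Mock types for illustration only. */
typedef struct { const char *name; } xlator_t;
typedef struct {
        xlator_t   *hash_links_to;
        const char *path;
} lookup_state_t;

/* Sketch of the guarded logging path: dereference hash_links_to->name
 * only when the pointer is non-NULL, so a linkto xattr that names a
 * removed subvolume no longer crashes the process. */
static void
log_linkto_mismatch (lookup_state_t *st)
{
        if (st->hash_links_to != NULL) {
                printf ("linkto points to different cached subvolume (%s) "
                        "path %s\n", st->hash_links_to->name, st->path);
        } else {
                printf ("linkto target is no longer part of the volume, "
                        "path %s\n", st->path);
        }
}

int
main (void)
{
        lookup_state_t stale = { NULL, "/test/f411" };

        log_linkto_mismatch (&stale);   /* safe even with a stale linkto */
        return 0;
}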

Comment 3 Raghavendra Bhat 2015-02-11 11:59:53 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.2, please reopen this bug report.

glusterfs-3.6.2 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should already be available or will become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

The fix for this bug is likely to be included in all future GlusterFS releases, i.e. releases > 3.6.2.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/5978
[2] http://news.gmane.org/gmane.comp.file-systems.gluster.user
[3] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137