Description of problem:

This is the run of the regression:
http://build.gluster.org/job/rackspace-regression-2GB-triggered/1559/consoleFull

(gdb) bt
#0  0x00007fb74f7fc418 in dht_selfheal_layout_new_directory (frame=0x7fb75a34a1a4, loc=0x7fb74c5dc898, layout=0x144f960)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/dht/src/dht-selfheal.c:1068
#1  0x00007fb74f7fc7d0 in dht_selfheal_dir_getafix (frame=0x7fb75a34a1a4, loc=0x7fb74c5dc898, layout=0x144f960)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/dht/src/dht-selfheal.c:1133
#2  0x00007fb74f7fcbe4 in dht_selfheal_directory (frame=0x7fb75a34a1a4, dir_cbk=0x7fb74f808a19 <dht_lookup_selfheal_cbk>, loc=0x7fb74c5dc898, layout=0x144f960)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/dht/src/dht-selfheal.c:1243
#3  0x00007fb74f80acb0 in dht_lookup_dir_cbk (frame=0x7fb75a34a1a4, cookie=0x7fb75a34a2fc, this=0x1455580, op_ret=0, op_errno=22, inode=0x7fb74e2e204c, stbuf=0x7fff8f7b51e0, xattr=0x7fb759d45650, postparent=0x7fff8f7b5170)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/dht/src/dht-common.c:578
#4  0x00007fb74fa7871d in client3_3_lookup_cbk (req=0x7fb74c5924e4, iov=0x7fb74c592524, count=1, myframe=0x7fb75a34a2fc)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/protocol/client/src/client-rpc-fops.c:2769
#5  0x00007fb75c0d9c49 in rpc_clnt_handle_reply (clnt=0x14901e0, pollin=0x144f4b0)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:766
#6  0x00007fb75c0da06a in rpc_clnt_notify (trans=0x14c61f0, mydata=0x1490210, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x144f4b0)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:894
#7  0x00007fb75c0d65c0 in rpc_transport_notify (this=0x14c61f0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x144f4b0)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:516
#8  0x00007fb7512beec8 in socket_event_poll_in (this=0x14c61f0)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2136
#9  0x00007fb7512bf383 in socket_event_handler (fd=8, idx=7, data=0x14c61f0, poll_in=1, poll_out=0, poll_err=0)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2249
#10 0x00007fb75c379b2b in event_dispatch_epoll_handler (event_pool=0x142f3c0, events=0x144dd60, i=0)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:384
#11 0x00007fb75c379d25 in event_dispatch_epoll (event_pool=0x142f3c0)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:445
#12 0x00007fb75c346c53 in event_dispatch (event_pool=0x142f3c0)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event.c:113
#13 0x0000000000409750 in main (argc=11, argv=0x7fff8f7b6908)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd.c:2043

One of the following divisions should be the reason:

        if (weight_by_size) {
                /* We know total_size is not zero. */
                chunk = ((unsigned long) 0xffffffff) / total_size;
                gf_log (this->name, GF_LOG_INFO,
                        "chunk size = 0xffffffff / %u = 0x%x",
                        total_size, chunk);
        } else {
                chunk = ((unsigned long) 0xffffffff) / bricks_used;
        }

Version-Release number of selected component (if applicable):

How reproducible:
Not sure

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
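A standalone sketch (illustrative only, not GlusterFS code; the file name and build command are assumptions) of why a zero divisor here is fatal: on Linux/x86_64 an integer division by zero raises SIGFPE rather than returning a garbage value, so either division above would take the process down exactly at that line.

        /* div0-sketch.c: hypothetical reproducer, not part of GlusterFS.
         * Build: gcc -Wall -o div0-sketch div0-sketch.c
         * Run:   ./div0-sketch 0      (traps with SIGFPE)
         *        ./div0-sketch 4      (prints chunk = 0x3fffffff) */
        #include <stdio.h>
        #include <stdlib.h>

        int
        main (int argc, char *argv[])
        {
                unsigned int divisor = (argc > 1) ? (unsigned int) atoi (argv[1]) : 1;
                unsigned int chunk;

                /* Same shape as "chunk = ((unsigned long) 0xffffffff) / bricks_used":
                 * if divisor is 0 the next statement never returns normally. */
                chunk = ((unsigned long) 0xffffffff) / divisor;

                printf ("chunk = 0x%x\n", chunk);
                return 0;
        }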
More information:

(gdb) p bricks_used
$1 = 0
(gdb) p weight_by_size
$2 = _gf_false
(gdb) p layout->cnt
$3 = 1
(gdb) p layout->list[0].err
$4 = -1
(gdb)
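Given those values (weight_by_size == _gf_false and bricks_used == 0), it is the else-branch division that faults. For illustration only, a hedged sketch of the kind of guard that would turn this into a logged error instead of a SIGFPE; this is not the actual fix, and the early-return convention and message text are assumptions:

        /* Hypothetical guard, for illustration only: refuse to compute a
         * chunk when no brick was counted, instead of dividing by zero. */
        if (!weight_by_size && (bricks_used == 0)) {
                gf_log (this->name, GF_LOG_ERROR,
                        "no bricks counted while computing the new layout; "
                        "skipping directory selfheal");
                return -1;      /* assumed error convention for this sketch */
        }

        if (weight_by_size) {
                /* We know total_size is not zero. */
                chunk = ((unsigned long) 0xffffffff) / total_size;
        } else {
                chunk = ((unsigned long) 0xffffffff) / bricks_used;
        }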
REVIEW: http://review.gluster.org/8792 (cluster/dht: Modified the calculation of brick_count) posted (#1) for review on master by venkatesh somyajulu (vsomyaju)
COMMIT: http://review.gluster.org/8792 committed in master by Vijay Bellur (vbellur)
------
commit f14d9bdd52b428466e7863d06c89b4684be3da07
Author: Venkatesh Somyajulu <vsomyaju>
Date:   Mon Sep 22 13:29:13 2014 +0530

    cluster/dht: Modified the calculation of brick_count

    Whenever a new_layout is calculated for a directory, we calculate the
    number of children of dht that will get the actual (non-zero) layout
    range, assign a range only to those subvolumes, and give all others 0
    as their layout->start and layout->stop values.

    This calculation is based on either
    a) weight_by_size, or
    b) the number of bricks that will be assigned a non-zero range.

    So when we are not assigning the layout based on weight_by_size, we
    should use "bricks_to_use" instead of "bricks_used".

    In a regression test we found that priv->du_stat[0].chunks was zero.
    In that case the "bricks_used" variable is zero, which causes a crash
    in the calculation

        chunk = ((unsigned long) 0xffffffff) / bricks_used;

    Change-Id: I6f1b21eff972a80d9eb22771087c1e2f53e7e724
    BUG: 1143835
    Signed-off-by: Venkatesh Somyajulu <vsomyaju>
    Reviewed-on: http://review.gluster.org/8792
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Jeff Darcy <jdarcy>
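In other words, the non-weighted path should divide by the number of subvolumes that will actually receive a range, not by a counter that only advances when du_stat reports non-zero chunks. A rough sketch of that direction (simplified, not the literal patch; only the names bricks_to_use and bricks_used come from the commit message):

        if (weight_by_size) {
                /* We know total_size is not zero. */
                chunk = ((unsigned long) 0xffffffff) / total_size;
        } else {
                /* Divide by the number of subvolumes that will be assigned a
                 * non-zero range.  Unlike bricks_used, this count does not
                 * depend on du_stat chunks and so cannot be zero here. */
                chunk = ((unsigned long) 0xffffffff) / bricks_to_use;
        }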
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user