+++ This bug was initially created as a clone of Bug #1247108 +++

Description of problem:
OS installation on a vm image in a sharded volume hangs at some point. Statedump on the fuse client taken at several points reveals that readv() fop is hung:

<statedump>
...
...
[global.callpool.stack.1.frame.10]
frame=0x7f0b0bcfd150
ref_count=0
translator=dis-rep-shard
complete=0 <==== complete is 0.
parent=dis-rep-trace
wind_from=trace_readv
wind_to=FIRST_CHILD(this)->fops->readv
unwind_to=trace_readv_cbk
...
...
[global.callpool.stack.1.frame.14]
frame=0x7f0b0bcd6f40
ref_count=1
translator=dis-rep
complete=0 <======== complete is 0
parent=fuse
wind_from=fuse_readv_resume
wind_to=FIRST_CHILD(this)->fops->readv
unwind_to=fuse_readv_cbk
...
...
</statedump>

This was found to be due to call_count being reduced to -1 at the end of shard_common_lookup_shards() because of which this particular stack never gets unwound till FUSE:

(gdb) p (call_frame_t *)0x7f0b0bcfd150
$1 = (call_frame_t *) 0x7f0b0bcfd150
(gdb) p (shard_local_t *)$1->local
$2 = (shard_local_t *) 0x7f0b0086310c
(gdb) p $2->call_count
$3 = -1
(gdb) p $2->eexist_count
$4 = 1

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--- Additional comment from Krutika Dhananjay on 2015-07-28 03:37:32 EDT ---

http://review.gluster.org/#/c/11770/

--- Additional comment from Anand Avati on 2015-07-28 09:44:12 EDT ---

REVIEW: http://review.gluster.org/11778 (features/shard: Fix block size get from xdata) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-07-28 21:53:52 EDT ---

COMMIT: http://review.gluster.org/11770 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit d051bd14223d12ca8eaea85f6988ff41e5eef2c3
Author: Krutika Dhananjay <kdhananj>
Date: Tue Jul 28 11:25:55 2015 +0530

    features/shard: (Re)initialize local->call_count before winding lookup

    Change-Id: I616409c38b86c0acf1817b3472a1fed73db293f8
    BUG: 1247108
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/11770
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Gluster Build System <jenkins.com>
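
For readers unfamiliar with this failure mode, here is a minimal sketch of how a stale fan-out counter can leave a request unanswered. It is not the shard translator code: demo_local, demo_cbk and demo_wind are hypothetical names, and the callbacks run synchronously purely to keep the example small. The only point taken from this report is that a counter which drops to -1 never matches the "last reply" check, so the frame is never unwound.

```c
#include <stdio.h>

/* Hypothetical stand-in for a per-request context; NOT the real shard_local_t. */
struct demo_local {
    int call_count; /* replies still outstanding */
    int unwound;    /* set to 1 once the reply is passed up to the parent */
};

/* Runs once per child reply; unwinds only when the counter hits exactly 0. */
static void demo_cbk(struct demo_local *local)
{
    if (--local->call_count == 0)
        local->unwound = 1;
}

/*
 * Winds n child requests and returns 1 if the request was unwound.
 * With reinit == 0 the counter left over from an earlier phase is reused,
 * which is the situation this sketch assumes led to call_count == -1.
 */
static int demo_wind(struct demo_local *local, int n, int reinit)
{
    local->unwound = 0;
    if (reinit)
        local->call_count = n;
    for (int i = 0; i < n; i++)
        demo_cbk(local); /* replies arrive synchronously in this sketch */
    return local->unwound;
}
```

Calling demo_wind(&local, 1, 0) on a context whose call_count is already 0 leaves call_count at -1 and unwound at 0, which corresponds to the hung readv() seen in the statedump. Calling it with reinit set to 1 brings the counter back to 0 and the request completes.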
REVIEW: http://review.gluster.org/11783 (features/shard: (Re)initialize local->call_count before winding lookup) posted (#1) for review on release-3.7 by Krutika Dhananjay (kdhananj)
REVIEW: http://review.gluster.org/11789 (features/shard: Fix block size get from xdata) posted (#1) for review on release-3.7 by Krutika Dhananjay (kdhananj)
http://review.gluster.org/#/c/11789/ http://review.gluster.org/#/c/11783/
REVIEW: http://review.gluster.org/11783 (features/shard: (Re)initialize local->call_count before winding lookup) posted (#2) for review on release-3.7 by Krutika Dhananjay (kdhananj)
COMMIT: http://review.gluster.org/11789 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu)
------
commit 044a5623eb9af8e6f52ed2dd02f0f07d23479638
Author: Pranith Kumar K <pkarampu>
Date: Tue Jul 28 18:38:56 2015 +0530

    features/shard: Fix block size get from xdata

    Backport of: http://review.gluster.org/11778

    Instead of using dict_get_ptr, dict_get_uint64 was used. If the first
    byte of the value is '\0' then size is returned as 0 because strtoull
    is used in data_to_uint64. This will make it seem like the file is not
    sharded at all.

    Original author: Pranith Kumar K <pkarampu>

    Change-Id: Id07a7d9523cb29d096b65dd68bbfcef395031aef
    BUG: 1247833
    Signed-off-by: Pranith Kumar K <pkarampu>
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/11789
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>
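
The effect described in the commit message above can be reproduced in isolation. The sketch below assumes the block size is stored as 8 raw bytes in big-endian order; that encoding, the base passed to strtoull, and all demo_* helper names are assumptions made for illustration and are not taken from the GlusterFS sources. It only shows why parsing a binary value as text yields 0 when the first byte is '\0'.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: store v as 8 raw bytes, most significant byte first. */
static void demo_encode_be64(uint64_t v, unsigned char out[8])
{
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Wrong approach: treat the raw bytes as a numeric string and call strtoull. */
static uint64_t demo_read_as_string(const unsigned char *buf, size_t len)
{
    char tmp[32] = { 0 };

    if (len >= sizeof(tmp))
        len = sizeof(tmp) - 1;
    memcpy(tmp, buf, len);
    /* A leading '\0' makes tmp an empty string, so strtoull returns 0. */
    return (uint64_t)strtoull(tmp, NULL, 0);
}

/* Intended approach: take the raw bytes and decode them as a 64-bit integer. */
static uint64_t demo_read_as_be64(const unsigned char buf[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return v;
}
```

For a 4 MiB block size (4194304), demo_encode_be64 produces a buffer whose first byte is 0. demo_read_as_string then returns 0, which is how a sharded file could look unsharded, while demo_read_as_be64 returns 4194304.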
REVIEW: http://review.gluster.org/11802 (features/shard: Create /.shard with 0777 permissions (for now)) posted (#1) for review on release-3.7 by Krutika Dhananjay (kdhananj)
REVIEW: http://review.gluster.org/11810 (cluster/afr: Make [f]xattrop metadata transaction) posted (#1) for review on release-3.7 by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/11802 (features/shard: Create /.shard with 0777 permissions (for now)) posted (#2) for review on release-3.7 by Krutika Dhananjay (kdhananj)
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.4, please open a new bug report.

glusterfs-3.7.4 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/12496
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user