Description of problem:
=======================
'ls' on the NFS mount lists only 21 entries although the directory contains thousands of directories. The nfs log file shows the messages below while entries are being deleted from the client.

[2015-04-06 06:56:41.451361] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 26 in readdirp request
[2015-04-06 06:56:41.483035] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 18 in readdirp request
[2015-04-06 06:56:41.483921] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 32 in readdirp request
[2015-04-06 06:56:41.509809] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 22 in readdirp request
[2015-04-06 06:56:41.510357] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 26 in readdirp request
[2015-04-06 06:56:41.537172] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 14 in readdirp request
[2015-04-06 06:56:41.537957] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 32 in readdirp request
[2015-04-06 06:56:41.569386] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 18 in readdirp request
[2015-04-06 06:56:41.570263] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 26 in readdirp request
[2015-04-06 06:56:41.597343] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 22 in readdirp request
[2015-04-06 06:56:41.597986] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 32 in readdirp request
[2015-04-06 06:56:41.623217] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 14 in readdirp request
[2015-04-06 06:56:41.623743] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 26 in readdirp request
[2015-04-06 06:56:41.654130] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 18 in readdirp request
[2015-04-06 06:56:41.654713] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 32 in readdirp request
[2015-04-06 06:56:41.683289] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 21 in readdirp request
[2015-04-06 06:56:41.683905] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 26 in readdirp request
[2015-04-06 06:56:41.712598] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 13 in readdirp request
[2015-04-06 06:56:41.713078] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 32 in readdirp request
[2015-04-06 06:56:41.745242] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 18 in readdirp request
[2015-04-06 06:56:41.746037] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 26 in readdirp request
[2015-04-06 06:56:41.775139] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-0: Invalid index 22 in readdirp request
[2015-04-06 06:56:41.775963] E [ec-dir-read.c:422:ec_manager_readdir] 0-testvol-disperse-1: Invalid index 32 in readdirp request

Output of 'ls' command on nfs mount :
=====================================
[root@dhcp37-61 nfs]# ls -ld dirs
drwxr-xr-x. 8942 root root 901120 Apr 6 12:26 dirs
[root@dhcp37-61 nfs]# cd dirs
[root@dhcp37-61 dirs]# ls | wc -l
21
[root@dhcp37-61 dirs]# rm -rf *
[root@dhcp37-61 dirs]# ls | wc -l
21
[root@dhcp37-61 dirs]#

Version-Release number of selected component (if applicable):
==============================================================
[root@dhcp37-61 dirs]# gluster --version
glusterfs 3.7dev built on Apr 5 2015 01:10:28
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@dhcp37-61 dirs]#

How reproducible:
=================
100%

Steps to reproduce :
1. NFS mount the volume on the client.
2. Create 100's of directories.
3. Delete them with 'rm -rf *'.
4. List the entries from the mount and check the nfs log file on the server side.

Gluster volume status and info :
================================
[root@vertigo ~]# gluster v status testvol
Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick vertigo:/rhs/brick1/b1                49152     0          Y       8995
Brick ninja:/rhs/brick1/b2                  49152     0          Y       12400
Brick vertigo:/rhs/brick2/b3                49153     0          Y       9014
Brick ninja:/rhs/brick2/b4                  49153     0          Y       12419
Brick vertigo:/rhs/brick3/b5                49154     0          Y       6143
Brick ninja:/rhs/brick3/b6                  49154     0          Y       5874
Brick vertigo:/rhs/brick4/b7                49155     0          Y       6160
Brick ninja:/rhs/brick4/b8                  49155     0          Y       5891
Brick vertigo:/rhs/brick1/b9                49156     0          Y       6177
Brick ninja:/rhs/brick1/b10                 49156     0          Y       5908
Brick vertigo:/rhs/brick2/b11               49157     0          Y       6194
Brick ninja:/rhs/brick2/b12                 49157     0          Y       5925
Brick vertigo:/rhs/brick1/b1-1              49159     0          Y       9149
Brick ninja:/rhs/brick1/b2-1                49159     0          Y       13401
Brick vertigo:/rhs/brick2/b3-1              49160     0          Y       9168
Brick ninja:/rhs/brick2/b4-1                49160     0          Y       13420
Brick vertigo:/rhs/brick3/b5-1              49161     0          Y       9187
Brick ninja:/rhs/brick3/b6-1                49161     0          Y       13439
Brick vertigo:/rhs/brick4/b7-1              49162     0          Y       9206
Brick ninja:/rhs/brick4/b8-1                49162     0          Y       13458
Brick vertigo:/rhs/brick1/b9-1              49163     0          Y       9225
Brick ninja:/rhs/brick1/b10-1               49163     0          Y       13477
Brick vertigo:/rhs/brick2/b11-1             49164     0          Y       9244
Brick ninja:/rhs/brick2/b12-1               49164     0          Y       13496
Snapshot Daemon on localhost                49158     0          Y       6336
NFS Server on localhost                     2049      0          Y       2546
Quota Daemon on localhost                   N/A       N/A        Y       2584
Snapshot Daemon on ninja                    49158     0          Y       6110
NFS Server on ninja                         2049      0          Y       6657
Quota Daemon on ninja                       N/A       N/A        Y       6682

Task Status of Volume testvol
------------------------------------------------------------------------------
Task   : Rebalance
ID     : f768cf44-3b79-487c-99a6-7b301c213f46
Status : in progress

[root@vertigo ~]# gluster v info testvol
Volume Name: testvol
Type: Distributed-Disperse
Volume ID: b9957725-69f5-496a-8b24-20a1c102ff1a
Status: Started
Number of Bricks: 2 x (8 + 4) = 24
Transport-type: tcp
Bricks:
Brick1: vertigo:/rhs/brick1/b1
Brick2: ninja:/rhs/brick1/b2
Brick3: vertigo:/rhs/brick2/b3
Brick4: ninja:/rhs/brick2/b4
Brick5: vertigo:/rhs/brick3/b5
Brick6: ninja:/rhs/brick3/b6
Brick7: vertigo:/rhs/brick4/b7
Brick8: ninja:/rhs/brick4/b8
Brick9: vertigo:/rhs/brick1/b9
Brick10: ninja:/rhs/brick1/b10
Brick11: vertigo:/rhs/brick2/b11
Brick12: ninja:/rhs/brick2/b12
Brick13: vertigo:/rhs/brick1/b1-1
Brick14: ninja:/rhs/brick1/b2-1
Brick15: vertigo:/rhs/brick2/b3-1
Brick16: ninja:/rhs/brick2/b4-1
Brick17: vertigo:/rhs/brick3/b5-1
Brick18: ninja:/rhs/brick3/b6-1
Brick19: vertigo:/rhs/brick4/b7-1
Brick20: ninja:/rhs/brick4/b8-1
Brick21: vertigo:/rhs/brick1/b9-1
Brick22: ninja:/rhs/brick1/b10-1
Brick23: vertigo:/rhs/brick2/b11-1
Brick24: ninja:/rhs/brick2/b12-1
Options Reconfigured:
features.quota: on
features.uss: on
server.event-threads: 3
client.event-threads: 4
cluster.disperse-self-heal-daemon: enable
[root@vertigo ~]#

sosreports of the node will be attached.
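The reproduction steps in the description could be scripted roughly as follows. This is only a sketch: the function name `reproduce`, the `dir.N` naming scheme, and the use of a scratch directory in place of the real NFS mount point are assumptions for illustration, not part of the original report. Deleting only the names that a listing returns mirrors how the shell expands `rm -rf *`.

```python
import os
import shutil
import tempfile
from typing import Tuple


def reproduce(mount_dir: str, count: int = 1000) -> Tuple[int, int]:
    """Create `count` directories, delete what a listing returns, and
    report how many entries were listed before and after the delete."""
    work = os.path.join(mount_dir, "dirs")
    os.makedirs(work, exist_ok=True)

    # Step 2: create many directories (names are hypothetical).
    for i in range(count):
        os.mkdir(os.path.join(work, f"dir.{i}"))

    # What 'ls | wc -l' would report before the delete.
    listed_before = len(os.listdir(work))

    # Step 3: like 'rm -rf *', remove only the entries the listing shows.
    for name in os.listdir(work):
        shutil.rmtree(os.path.join(work, name))

    # Step 4: list again; anything left was hidden from the first listing.
    listed_after = len(os.listdir(work))
    return listed_before, listed_after


if __name__ == "__main__":
    # A local scratch directory stands in for the NFS mount here.
    with tempfile.TemporaryDirectory() as scratch:
        print(reproduce(scratch, count=100))
```

On a healthy file system the first number should equal `count` and the second should be 0. On the affected mount, the report suggests both would stay at a small value such as 21.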
REVIEW: http://review.gluster.org/10165 (cluster/ec: Fix readdir de-itransform) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10165 (cluster/ec: Fix readdir de-itransform) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10165 (cluster/ec: Fix readdir de-itransform) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/10165 committed in master by Vijay Bellur (vbellur)
------
commit 4797cb1c9dbf3910952f9d28d8272ff83cd25e7b
Author: Pranith Kumar K <pkarampu>
Date: Wed Apr 8 21:42:49 2015 +0530

    cluster/ec: Fix readdir de-itransform

    Problem:
    gf_deitransform returns the global client-id in the complete graph. So except
    for the first disperse subvolume under dht, all the other disperse subvolumes
    will return a client-id greater than ec->nodes, so readdir will always error
    out in those subvolumes.

    Fix:
    Get the client subvolume whose client-id matches the client-id returned by
    gf_deitransform of offset.

    Change-Id: I26aa17504352d48d7ff14b390b62f49d7ab2d699
    BUG: 1209113
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/10165
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Xavier Hernandez <xhernandez>
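The difference between the old and new lookup described in the commit message can be illustrated with a small model. This is a sketch under stated assumptions, not the actual C code in ec-dir-read.c: the function names `lookup_by_index` and `lookup_by_match` are hypothetical, and client-ids are assumed to be assigned contiguously in brick order (0 to 11 for the first disperse subvolume, 12 to 23 for the second), which matches the 2 x (8 + 4) layout but is not confirmed by the report.

```python
from typing import Optional, Sequence

EC_NODES = 12  # 8 + 4 bricks per disperse subvolume, from 'gluster v info'


def lookup_by_index(global_id: int, nodes: int = EC_NODES) -> Optional[int]:
    """Pre-fix behaviour as described: use the decoded client-id directly
    as a local index and reject anything outside [0, nodes)."""
    return global_id if 0 <= global_id < nodes else None


def lookup_by_match(global_id: int, child_ids: Sequence[int]) -> Optional[int]:
    """Post-fix idea: return the local index of the child whose own
    client-id equals the decoded client-id, or None if no child matches."""
    for local_index, child_id in enumerate(child_ids):
        if child_id == global_id:
            return local_index
    return None


# Assumed global client-ids of the children of each disperse subvolume.
disperse_0 = list(range(0, 12))
disperse_1 = list(range(12, 24))

if __name__ == "__main__":
    # Client-id 14 belongs to the second subvolume in this model.
    print(lookup_by_index(14))              # rejected as out of range
    print(lookup_by_match(14, disperse_1))  # resolved to a local index
```

In this model the first subvolume works either way, because its global client-ids coincide with its local indexes. Only the matching lookup works for the second subvolume, which is the asymmetry the commit message describes.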
Dan needs to send out the fix for the second issue. Pranith
REVIEW: http://review.gluster.org/10274 (This fix corrects the subvolume id in the presence of graphs composed out of multiple volumes. Examples of such graphs are those created with the self heal daemon and snap uss. Previously, the number of bricks was computed regardless of the total number of volumes combined within the graph. With this fix, the brick count only includes those owned by the particular volume in question.) posted (#1) for review on master by Dan Lambright (dlambrig)
Please perform POST -> MODIFIED transitions after all patches needed are merged. Thanks!
Forked another bug for this, which is assigned to the tiering team; hence this one is moved to MODIFIED. Done per discussion with the EC team.
Not observing this in the recent builds. For now closing the bug. Please feel free to re-open as soon as we observe it again.