Description of problem:
------------------------
Created a 9 x (4 + 2) distributed-disperse volume and added 6 bricks, then triggered "rebalance start force". I see a huge number of files being skipped. Files should _not_ be skipped with the "force" option, especially when there is plenty of free space on the bricks:

[root@gqas013 glusterfs]# gluster v rebalance khal status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              180         3.8KB        488886             0        173827          in progress          1:27:20
      gqas005.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress          1:27:20
      gqas006.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress          1:27:20
      gqas008.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress          1:27:20
      gqas014.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress          0:00:00
      gqas015.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0          in progress          0:00:00
Estimated time left for rebalance to complete :        3:04:19
volume rebalance: khal: success
[root@gqas013 glusterfs]#

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
3.8.4-25

How reproducible:
-----------------
100% on my setup.

Additional info:
----------------
[root@gqas013 glusterfs]# gluster v info

Volume Name: khal
Type: Distributed-Disperse
Volume ID: 415b2241-0f83-4339-a558-257212fe8682
Status: Started
Snapshot Count: 0
Number of Bricks: 10 x (4 + 2) = 60
Transport-type: tcp
Bricks:
Brick1: gqas013:/bricks1/1
Brick2: gqas014:/bricks1/1
Brick3: gqas015:/bricks1/1
Brick4: gqas005:/bricks1/1
Brick5: gqas006:/bricks1/1
Brick6: gqas008:/bricks1/1
Brick7: gqas013:/bricks2/1
Brick8: gqas014:/bricks2/1
Brick9: gqas015:/bricks2/1
Brick10: gqas005:/bricks2/1
Brick11: gqas006:/bricks2/1
Brick12: gqas008:/bricks2/1
Brick13: gqas013:/bricks3/1
Brick14: gqas014:/bricks3/1
Brick15: gqas015:/bricks3/1
Brick16: gqas005:/bricks3/1
Brick17: gqas006:/bricks3/1
Brick18: gqas008:/bricks3/1
Brick19: gqas013:/bricks4/1
Brick20: gqas014:/bricks4/1
Brick21: gqas015:/bricks4/1
Brick22: gqas005:/bricks4/1
Brick23: gqas006:/bricks4/1
Brick24: gqas008:/bricks4/1
Brick25: gqas013:/bricks5/1
Brick26: gqas014:/bricks5/1
Brick27: gqas015:/bricks5/1
Brick28: gqas005:/bricks5/1
Brick29: gqas006:/bricks5/1
Brick30: gqas008:/bricks5/1
Brick31: gqas013:/bricks6/1
Brick32: gqas014:/bricks6/1
Brick33: gqas015:/bricks6/1
Brick34: gqas005:/bricks6/1
Brick35: gqas006:/bricks6/1
Brick36: gqas008:/bricks6/1
Brick37: gqas013:/bricks7/1
Brick38: gqas014:/bricks7/1
Brick39: gqas015:/bricks7/1
Brick40: gqas005:/bricks7/1
Brick41: gqas006:/bricks7/1
Brick42: gqas008:/bricks7/1
Brick43: gqas013:/bricks8/1
Brick44: gqas014:/bricks8/1
Brick45: gqas015:/bricks8/1
Brick46: gqas005:/bricks8/1
Brick47: gqas006:/bricks8/1
Brick48: gqas008:/bricks8/1
Brick49: gqas013:/bricks9/1
Brick50: gqas014:/bricks9/1
Brick51: gqas015:/bricks9/1
Brick52: gqas005:/bricks9/1
Brick53: gqas006:/bricks9/1
Brick54: gqas008:/bricks9/1
Brick55: gqas013:/bricks10/1
Brick56: gqas014:/bricks10/1
Brick57: gqas015:/bricks10/1
Brick58: gqas005:/bricks10/1
Brick59: gqas006:/bricks10/1
Brick60: gqas008:/bricks10/1
Options Reconfigured:
network.inode-lru-limit: 50000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
client.event-threads: 4
server.event-threads: 4
cluster.lookup-optimize: on
transport.address-family: inet
nfs.disable: off
[root@gqas013 glusterfs]#
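To illustrate what "force" is expected to do here: a non-forced rebalance may skip a file when the target brick lacks free space, but "force" should bypass that check and migrate the file anyway. The following is a minimal C sketch of that intent only, assuming that simplified model; it is not the actual DHT migration code, and the names has_enough_space and migrate_file are made up for illustration.

#include <stdbool.h>
#include <stdio.h>

enum migrate_result { MIGRATED, SKIPPED, FAILED };

/* hypothetical helper: does the target brick have room for the file? */
static bool has_enough_space (long long target_free, long long file_size)
{
        return target_free > file_size;
}

/* hypothetical helper: only a non-forced rebalance may skip on space */
static enum migrate_result
migrate_file (long long target_free, long long file_size, bool force)
{
        if (!force && !has_enough_space (target_free, file_size))
                return SKIPPED;

        /* ... actual data migration would happen here ... */
        return MIGRATED;
}

int main (void)
{
        /* With force set, even a tight target should not produce a skip. */
        printf ("force=0: %d\n", migrate_file (1024, 4096, false)); /* SKIPPED  */
        printf ("force=1: %d\n", migrate_file (1024, 4096, true));  /* MIGRATED */
        return 0;
}

With bricks that have plenty of free space, as in this setup, neither path should report skipped files at all, which is why the 173827 skips above look wrong.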
Looks like a regression; I didn't see this happen on 3.2. Unsure which dev build introduced this, though.
[root@gqac011 gluster-mount]# find . -mindepth 1 -type f -links +1
[root@gqac011 gluster-mount]#

There are no hardlinks.
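The hardlink check above matters because a file with more than one hard link is typically not migrated by a plain rebalance and can show up in the skipped count, so the empty find output rules that explanation out. The sketch below only illustrates that kind of check under that assumption; it is not the actual dht_migrate_file logic, and the helper name should_skip_for_hardlinks is made up.

#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* hypothetical check: a regular file with st_nlink > 1 would be skipped */
static bool should_skip_for_hardlinks (const struct stat *st)
{
        return S_ISREG (st->st_mode) && st->st_nlink > 1;
}

int main (int argc, char *argv[])
{
        struct stat st;

        if (argc < 2 || stat (argv[1], &st) != 0) {
                fprintf (stderr, "usage: %s <path>\n", argv[0]);
                return 1;
        }
        printf ("%s: %s\n", argv[1],
                should_skip_for_hardlinks (&st) ? "would be skipped (hardlinked)"
                                                : "would be migrated");
        return 0;
}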
I did a quick test on a 4 x 2 distributed-replicate volume; I could not reproduce the error:

[root@gqas013 ~]# gluster v rebalance test status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            34051         6.8MB        121275             0             0            completed          0:07:02
      gqas005.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0            completed          0:00:22
      gqas014.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0            completed          0:00:22
      gqas015.sbu.lab.eng.bos.redhat.com            13782       745.9MB         50203             0             0            completed          0:03:28
volume rebalance: test: success

[root@gqas013 ~]# gluster v info

Volume Name: test
Type: Distributed-Replicate
Volume ID: 61d155ca-05cc-4ad0-8488-aaeb0e829b91
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gqas013:/bricks1/A
Brick2: gqas014:/bricks1/A
Brick3: gqas015:/bricks1/A
Brick4: gqas005:/bricks1/A
Brick5: gqas013:/bricks4/Am
Brick6: gqas015:/bricks4/A
Brick7: gqas013:/bricks8/Am
Brick8: gqas015:/bricks8/A
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
[root@gqas013 ~]#
This is also due to the fallocate BZ. As fallocate fails, dht_migrate_file returns -1. Since the return code is -1, the task completion function sets op_errno to ENOSPC:

static int
rebalance_task_completion (int op_ret, call_frame_t *sync_frame, void *data)
{
        int32_t op_errno = EINVAL;

        if (op_ret == -1) {
                /* Failure of migration process, mostly due to
                   write process. as we can't preserve the exact
                   errno, lets say there was no space to
                   migrate-data */
                op_errno = ENOSPC;
        }

If the op_errno is ENOSPC, DHT believes the migration has been skipped.

I am marking this as dependent on BZ 1447559. This can be retested on the build with the fix for 1447559. There is nothing to be changed in DHT for this.
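To make the accounting side of that explicit: because the real errno is collapsed to ENOSPC, the caller's bookkeeping puts the file into the "skipped" bucket instead of "failures". The sketch below is a simplified illustration of that classification only, not the actual gf_defrag accounting code; the struct and function names are made up.

#include <errno.h>
#include <stdio.h>

struct rebal_counters {
        unsigned long long migrated;
        unsigned long long skipped;
        unsigned long long failures;
};

/* hypothetical accounting: ENOSPC is read as "target full, skip the file" */
static void
account_migration (int op_ret, int op_errno, struct rebal_counters *c)
{
        if (op_ret == 0) {
                c->migrated++;
        } else if (op_errno == ENOSPC) {
                /* any failure whose errno was flattened to ENOSPC
                 * (e.g. the fallocate failure above) lands here */
                c->skipped++;
        } else {
                c->failures++;
        }
}

int main (void)
{
        struct rebal_counters c = {0};

        /* fallocate failed -> dht_migrate_file returned -1 ->
         * rebalance_task_completion reported ENOSPC */
        account_migration (-1, ENOSPC, &c);

        printf ("migrated=%llu skipped=%llu failures=%llu\n",
                c.migrated, c.skipped, c.failures);
        return 0;
}

This is why the status output shows a huge skipped count with zero failures even though the underlying migrations actually failed.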
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/107051/
Works fine on glusterfs-3.8.4-32.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774