Description of problem:

During remove-brick on a 'distribute'-type volume, the option 'cluster.min-free-disk' is not respected, which is problematic when bricks have unequal capacity. This leads to "No space left on device" errors on the small bricks.

Logs:

[2016-07-26 10:39:06.589909] W [MSGID: 114031] [client-rpc-fops.c:904:client3_3_writev_cbk] 0-test-client-1: remote operation failed [No space left on device]
[2016-07-26 10:39:06.589976] E [MSGID: 109023] [dht-rebalance.c:1124:dht_migrate_file] 0-test-dht: Migrate file failed: /test_file.424: failed to migrate data
[2016-07-26 10:39:06.590375] W [MSGID: 114031] [client-rpc-fops.c:904:client3_3_writev_cbk] 0-test-client-1: remote operation failed [No space left on device]
[2016-07-26 10:39:06.590408] E [MSGID: 109023] [dht-rebalance.c:1124:dht_migrate_file] 0-test-dht: Migrate file failed: /test_file.420: failed to migrate data

Version-Release number of selected component (if applicable):

glusterfs-3.7.1-16.el7rhs.x86_64
glusterfs-fuse-3.7.1-16.el7rhs.x86_64
glusterfs-client-xlators-3.7.1-16.el7rhs.x86_64
glusterfs-api-3.7.1-16.el7rhs.x86_64
glusterfs-server-3.7.1-16.el7rhs.x86_64
glusterfs-libs-3.7.1-16.el7rhs.x86_64
glusterfs-cli-3.7.1-16.el7rhs.x86_64

How reproducible:

Always, with bricks of different sizes.

Steps to Reproduce:

Two GlusterFS servers:
- 10.209.2.164 - gluster-test1
- 10.209.2.165 - gluster-test2

XFS volumes mounted on each server:
- /srv/storage1 - 4GB
- /srv/storage2 - 8GB
- /srv/storage3 - 10GB

1. Create distribute volume 'test':
~ # gluster volume create test 10.209.2.164:/srv/storage1/test 10.209.2.165:/srv/storage1/test 10.209.2.164:/srv/storage2/test 10.209.2.165:/srv/storage2/test 10.209.2.164:/srv/storage3/test 10.209.2.165:/srv/storage3/test

2. Set option 'cluster.min-free-disk' to '2GB':
~ # gluster volume set test cluster.min-free-disk 2GB

3. Set option 'cluster.weighted-rebalance' to 'on':
~ # gluster volume set test cluster.weighted-rebalance on

4. Start volume 'test':
~ # gluster volume start test

5. Mount volume 'test' on /mnt/test.

6. Generate 25GB of test data with the 'sysbench' tool:
/mnt/test # sysbench --test=fileio --file-total-size=25G --file-num=512 prepare

7. After that, 'df' looks like this:

root ~ # df -h /srv/storage1/ /srv/storage2/ /srv/storage3/
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        4.1G  2.1G  2.0G  51% /srv/storage1
/dev/vdd        8.1G  4.0G  4.1G  50% /srv/storage2
/dev/vde         10G  6.1G  4.0G  61% /srv/storage3

root ~ # df -h /srv/storage1/ /srv/storage2/ /srv/storage3/
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        4.1G  2.1G  2.0G  51% /srv/storage1
/dev/vdd        8.1G  5.2G  2.9G  65% /srv/storage2
/dev/vde         10G  6.0G  4.1G  60% /srv/storage3

8. Remove the 8GB brick:
~ # gluster volume remove-brick test 10.209.2.164:/srv/storage2/test start
9. After that, remove-brick reports the job as 'completed' (with failures), but some files still exist on 10.209.2.164:/srv/storage2/test, and two of the bricks are completely full:

root ~ # gluster volume remove-brick test 10.209.2.164:/srv/storage2/test status
     Node  Rebalanced-files   size  scanned  failures  skipped     status  run time in secs
---------  ----------------  -----  -------  --------  -------  ---------  ----------------
localhost               113  5.5GB      301        69        0  completed             66.00

root ~ # df -h /srv/storage1/ /srv/storage2/ /srv/storage3/
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        4.1G  1.1G  3.0G  27% /srv/storage1
/dev/vdd        8.1G  2.2G  5.9G  28% /srv/storage2
/dev/vde         10G  3.7G  6.4G  37% /srv/storage3

root ~ # df -h /srv/storage1/ /srv/storage2/ /srv/storage3/
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        4.1G  4.0G   18M 100% /srv/storage1
/dev/vdd        8.1G  8.0G   13M 100% /srv/storage2
/dev/vde         10G  6.4G  3.7G  64% /srv/storage3

10. Logs from '/var/log/glusterfs/test-rebalance.log':

[2016-07-26 10:39:06.671197] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.474: attempting to move from test-client-6 to test-client-1
[2016-07-26 10:39:07.019233] I [MSGID: 109022] [dht-rebalance.c:1316:dht_migrate_file] 0-test-dht: completed migration of /test_file.427 from subvolume test-client-6 to test-client-0
[2016-07-26 10:39:07.023070] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.475: attempting to move from test-client-6 to test-client-3
[2016-07-26 10:39:07.033200] E [MSGID: 109023] [dht-rebalance.c:672:__dht_check_free_space] 0-test-dht: data movement attempted from node (test-client-6) to node (test-client-3) which does not have required free space for (/test_file.475)
[2016-07-26 10:39:07.036136] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.476: attempting to move from test-client-6 to test-client-3
[2016-07-26 10:39:07.046853] E [MSGID: 109023] [dht-rebalance.c:672:__dht_check_free_space] 0-test-dht: data movement attempted from node (test-client-6) to node (test-client-3) which does not have required free space for (/test_file.476)
[2016-07-26 10:39:07.049549] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.478: attempting to move from test-client-6 to test-client-3
[2016-07-26 10:39:07.059378] E [MSGID: 109023] [dht-rebalance.c:672:__dht_check_free_space] 0-test-dht: data movement attempted from node (test-client-6) to node (test-client-3) which does not have required free space for (/test_file.478)
[2016-07-26 10:39:07.061465] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.484: attempting to move from test-client-6 to test-client-0
[2016-07-26 10:39:07.768400] W [MSGID: 114031] [client-rpc-fops.c:904:client3_3_writev_cbk] 0-test-client-1: remote operation failed [No space left on device]
[2016-07-26 10:39:07.768632] E [MSGID: 109023] [dht-rebalance.c:1124:dht_migrate_file] 0-test-dht: Migrate file failed: /test_file.496: failed to migrate data
[2016-07-26 10:39:07.770780] W [MSGID: 114031] [client-rpc-fops.c:904:client3_3_writev_cbk] 0-test-client-1: remote operation failed [No space left on device]
[2016-07-26 10:39:07.770842] E [MSGID: 109023] [dht-rebalance.c:1124:dht_migrate_file] 0-test-dht: Migrate file failed: /test_file.474: failed to migrate data
[2016-07-26 10:39:07.783232] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.485: attempting to move from test-client-6 to test-client-3
[2016-07-26 10:39:07.788292] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.490: attempting to move from test-client-6 to test-client-1
[2016-07-26 10:39:07.795349] E [MSGID: 109023] [dht-rebalance.c:672:__dht_check_free_space] 0-test-dht: data movement attempted from node (test-client-6) to node (test-client-3) which does not have required free space for (/test_file.485)
[2016-07-26 10:39:07.799550] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.491: attempting to move from test-client-6 to test-client-1
[2016-07-26 10:39:07.802247] E [MSGID: 109023] [dht-rebalance.c:672:__dht_check_free_space] 0-test-dht: data movement attempted from node (test-client-6) to node (test-client-1) which does not have required free space for (/test_file.490)
[2016-07-26 10:39:07.806331] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.492: attempting to move from test-client-6 to test-client-1
[2016-07-26 10:39:07.813135] E [MSGID: 109023] [dht-rebalance.c:672:__dht_check_free_space] 0-test-dht: data movement attempted from node (test-client-6) to node (test-client-1) which does not have required free space for (/test_file.491)
[2016-07-26 10:39:07.817428] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.497: attempting to move from test-client-6 to test-client-3
[2016-07-26 10:39:07.820004] E [MSGID: 109023] [dht-rebalance.c:672:__dht_check_free_space] 0-test-dht: data movement attempted from node (test-client-6) to node (test-client-1) which does not have required free space for (/test_file.492)
[2016-07-26 10:39:07.823201] I [dht-rebalance.c:1002:dht_migrate_file] 0-test-dht: /test_file.511: attempting to move from test-client-6 to test-client-3
[2016-07-26 10:39:07.830852] E [MSGID: 109023] [dht-rebalance.c:672:__dht_check_free_space] 0-test-dht: data movement attempted from node (test-client-6) to node (test-client-3) which does not have required free space for (/test_file.497)
[2016-07-26 10:39:07.835856] E [MSGID: 109023] [dht-rebalance.c:672:__dht_check_free_space] 0-test-dht: data movement attempted from node (test-client-6) to node (test-client-3) which does not have required free space for (/test_file.511)
[2016-07-26 10:39:08.010471] I [MSGID: 109022] [dht-rebalance.c:1316:dht_migrate_file] 0-test-dht: completed migration of /test_file.484 from subvolume test-client-6 to test-client-0
[2016-07-26 10:39:08.415897] I [MSGID: 109022] [dht-rebalance.c:1316:dht_migrate_file] 0-test-dht: completed migration of /test_file.508 from subvolume test-client-4 to test-client-1
[2016-07-26 10:39:08.416988] I [MSGID: 109028] [dht-rebalance.c:3063:gf_defrag_status_get] 0-test-dht: Rebalance is completed. Time taken is 66.00 secs
[2016-07-26 10:39:08.417036] I [MSGID: 109028] [dht-rebalance.c:3067:gf_defrag_status_get] 0-test-dht: Files migrated: 113, size: 5924454400, lookups: 301, failures: 0, skipped: 69
[2016-07-26 10:39:08.417495] W [glusterfsd.c:1219:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7fba9dc3fdc5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7fba9f2a8785] -->/usr/sbin/glusterfs(cleanup_and_exit+0x69) [0x7fba9f2a8609] ) 0-: received signum (15), shutting down

Actual results:

During remove-brick on a 'distribute' volume with bricks of unequal capacity, file migration overfills the small bricks, leading to 'No space left on device' errors.

Expected results:

remove-brick should respect 'cluster.min-free-disk', which should prevent bricks from being overfilled.
__dht_check_free_space (dht-rebalance.c), as can be seen below, checks whether dst has enough space to accommodate the file being migrated. Nowhere does it check whether dst would still retain the free space configured in 'min-free-disk'. Moreover, since parallel migrations can happen to the same brick (either from the same rebalance process - multithreaded rebalance - or from multiple rebalance processes), we can end up saying yes to more than one parallel migration of different files when there is only enough free space to hold the largest of the files, which can result in ENOSPC during migration.

check_avail_space:

        if (((dst_statfs.f_bavail * dst_statfs.f_bsize) /
              GF_DISK_SECTOR_SIZE) < stbuf->ia_blocks) {
                gf_msg (this->name, GF_LOG_ERROR, 0,
                        DHT_MSG_MIGRATE_FILE_FAILED,
                        "data movement attempted from node (%s) to node (%s) "
                        "which does not have required free space for (%s)",
                        from->name, to->name, loc->path);
                ret = -1;
                goto out;
        }

We should either:
* use a buffer space to account for parallel migrations (like a min-free-disk relatively larger than the largest file size), or
* make check_free_space resilient against parallel migrations (for example, by atomically decrementing the free space reported by statfs by the size of the file being migrated; see the sketch after this comment).

regards,
Raghavendra
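To make the second option above concrete, here is a minimal standalone sketch (not the actual GlusterFS code) of a destination-side check that honors a min-free-disk reserve and stays correct under parallel migrations by atomically accounting for bytes already promised to in-flight files. All names here (subvol_space, migrate_reserve, migrate_release) are hypothetical; in GlusterFS the equivalent logic would live in __dht_check_free_space.

/* Sketch of a min-free-disk-aware, concurrency-safe free-space check.
 * Compile with: cc -std=c11 sketch.c */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct subvol_space {
        uint64_t         free_bytes;     /* from statfs: f_bavail * f_bsize */
        uint64_t         min_free_bytes; /* cluster.min-free-disk, in bytes */
        _Atomic uint64_t reserved;       /* bytes promised to in-flight migrations */
};

/* Try to reserve room for one file on the destination; refuse if the brick
 * would drop below min-free-disk once every in-flight migration lands. */
static bool
migrate_reserve (struct subvol_space *dst, uint64_t file_bytes)
{
        uint64_t old = atomic_load (&dst->reserved);

        for (;;) {
                if (dst->free_bytes < dst->min_free_bytes + old + file_bytes)
                        return false;   /* would violate min-free-disk */
                /* CAS so two concurrent migrations cannot both claim the
                 * same remaining headroom; 'old' is refreshed on failure. */
                if (atomic_compare_exchange_weak (&dst->reserved, &old,
                                                  old + file_bytes))
                        return true;
        }
}

/* Drop the reservation once the file has been migrated (or has failed). */
static void
migrate_release (struct subvol_space *dst, uint64_t file_bytes)
{
        atomic_fetch_sub (&dst->reserved, file_bytes);
}

int
main (void)
{
        /* Brick with 2 GB free and min-free-disk = 2 GB: nothing fits. */
        struct subvol_space dst = {
                .free_bytes     = 2ULL << 30,
                .min_free_bytes = 2ULL << 30,
                .reserved       = 0,
        };

        printf ("1 GB file allowed? %s\n",
                migrate_reserve (&dst, 1ULL << 30) ? "yes" : "no"); /* no */

        dst.min_free_bytes = 512ULL << 20;  /* relax the reserve to 512 MB */
        printf ("1 GB file allowed? %s\n",
                migrate_reserve (&dst, 1ULL << 30) ? "yes" : "no"); /* yes */
        /* A second concurrent 1 GB file must now be refused, even though
         * statfs alone still reports 2 GB available. */
        printf ("second 1 GB file allowed? %s\n",
                migrate_reserve (&dst, 1ULL << 30) ? "yes" : "no"); /* no */

        migrate_release (&dst, 1ULL << 30);
        return 0;
}

This only illustrates the accounting idea; the actual fix is in the upstream patch referenced below.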
Upstream patch: https://review.gluster.org/#/c/17034/
Downstream patches:
https://code.engineering.redhat.com/gerrit/#/c/103914/
https://code.engineering.redhat.com/gerrit/#/c/103915/
Verified this BZ on glusterfs version 3.8.4-36.el7rhgs.x86_64. The issue is fixed: remove-brick now considers cluster.min-free-disk values during file migration. Moving this BZ to Verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774