+++ This bug was initially created as a clone of Bug #1553598 +++

+++ This bug was initially created as a clone of Bug #1546941 +++

Description of problem:
=======================
After rebalance is triggered on the volume, I see ENOSPC errors for a few files in the rebalance logs even though there is enough space left on the bricks.

[2018-02-20 05:31:01.051156] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/ABI/testing/configfs-usb-gadget-serial failed: [No space left on device]
[2018-02-20 05:31:01.139027] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/ABI/testing/debugfs-pktcdvd failed: [No space left on device]
[2018-02-20 05:31:01.769876] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/ABI/testing/sysfs-bus-iio-meas-spec failed: [No space left on device]
[2018-02-20 05:31:03.973459] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/ABI/testing/sysfs-ibft failed: [No space left on device]
[2018-02-20 05:31:09.066247] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/DocBook/media/v4l/vidioc-g-fbuf.xml failed: [No space left on device]
[2018-02-20 05:31:13.597532] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/RCU/trace.txt failed: [No space left on device]
[2018-02-20 05:31:24.303807] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/cgroup-v1/cgroups.txt failed: [No space left on device]
[2018-02-20 05:31:24.992058] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/connector/connector.txt failed: [No space left on device]
[2018-02-20 05:31:30.169573] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/devicetree/bindings/arm/calxeda/l2ecc.txt failed: [No space left on device]
[2018-02-20 05:31:33.802256] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/devicetree/bindings/arm/omap/crossbar.txt failed: [No space left on device]
[2018-02-20 05:31:39.996225] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/devicetree/bindings/clock/ti/mux.txt failed: [No space left on device]
[2018-02-20 05:31:40.264191] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/devicetree/bindings/clock/clk-palmas-clk32kg-clocks.txt failed: [No space left on device]

Grep output for a file in rebalance logs:

# grep -i /linux-4.6.4/Documentation/hsi.txt /var/log/glusterfs/distrepx3-rebalance.log
[2018-02-20 05:33:04.980564] I [dht-rebalance.c:1513:dht_migrate_file] 0-distrepx3-dht: /linux-4.6.4/Documentation/hsi.txt: attempting to move from distrepx3-replicate-0 to distrepx3-replicate-2
[2018-02-20 05:33:05.096191] W [MSGID: 109023] [dht-rebalance.c:962:__dht_check_free_space] 0-distrepx3-dht: data movement of file {blocks:6 name:(/linux-4.6.4/Documentation/hsi.txt) } would result in dst node (distrepx3-replicate-2:41451088) having lower disk space then the source node (distrepx3-replicate-0:41453744).Skipping file.
[2018-02-20 05:33:05.127613] I [MSGID: 109126] [dht-rebalance.c:2714:gf_defrag_migrate_single_file] 0-distrepx3-dht: File migration skipped for /linux-4.6.4/Documentation/hsi.txt.
[2018-02-20 05:33:05.127739] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/hsi.txt failed: [No space left on device]

Version-Release number of selected component (if applicable):
3.12.2-4.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create a x3 volume and start it.
2) FUSE mount it on multiple clients.
3) Run a Linux kernel untar from two clients.
4) While I/O is in progress, add bricks to the volume and start rebalance without force.

Actual results:
===============
ENOSPC errors are logged for a few files in the rebalance logs even though there is enough space left on the bricks.

Expected results:
=================
No errors in rebalance logs.

--- Additional comment from Worker Ant on 2018-03-09 02:38:07 EST ---

REVIEW: https://review.gluster.org/19687 (cluster/dht: Skipped files are not treated as errors) posted (#1) for review on master by N Balachandran

--- Additional comment from Worker Ant on 2018-03-12 06:16:23 EDT ---

COMMIT: https://review.gluster.org/19687 committed in master by "Raghavendra G" <rgowdapp> with a commit message- cluster/dht: Skipped files are not treated as errors

For skipped files, use a return value of 1 to prevent error messages being logged.

Change-Id: I18de31ac1a64d4460e88dea7826c3ba03c895861
BUG: 1553598
Signed-off-by: N Balachandran <nbalacha>
REVIEW: https://review.gluster.org/19710 (cluster/dht: Skipped files are not treated as errors) posted (#1) for review on release-3.12 by N Balachandran
REVIEW: https://review.gluster.org/19710 (cluster/dht: Skipped files are not treated as errors) posted (#2) for review on release-3.12 by N Balachandran
REVIEW: https://review.gluster.org/19806 (cluster/dht: ENOSPC will not fail rebalance) posted (#1) for review on release-3.12 by N Balachandran
COMMIT: https://review.gluster.org/19806 committed in release-3.12 by "N Balachandran" <nbalacha> with a commit message- cluster/dht: ENOSPC will not fail rebalance

ENOSPC returned by a file migration is no longer considered a rebalance failure.

Change-Id: I21cf3a8acdc827bc478e138d6cb5db649d53a28c
BUG: 1555161
Signed-off-by: N Balachandran <nbalacha>
REVIEW: https://review.gluster.org/19710 (cluster/dht: Skipped files are not treated as errors) posted (#3) for review on release-3.12 by N Balachandran
COMMIT: https://review.gluster.org/19710 committed in release-3.12 by "jiffin tony Thottan" <jthottan> with a commit message- cluster/dht: Skipped files are not treated as errors

For skipped files, use a return value of 1 to prevent error messages being logged.

> Change-Id: I18de31ac1a64d4460e88dea7826c3ba03c895861
> BUG: 1553598
> Signed-off-by: N Balachandran <nbalacha>

Change-Id: I18de31ac1a64d4460e88dea7826c3ba03c895861
BUG: 1555161
Signed-off-by: N Balachandran <nbalacha>
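The two commit messages describe a change in how per-file migration results are classified. The sketch below is a toy model in Python, not the GlusterFS C code. Only the return value of 1 for skipped files and the special handling of ENOSPC come from the commit messages. The names `MIGRATED`, `FAILED`, `Tally` and `record` are hypothetical, and counting an ENOSPC result as "skipped" is an assumption made for illustration.

```python
import errno
from dataclasses import dataclass

# Hypothetical return codes for this sketch. Only the value 1 for a
# skipped file is taken from the commit message.
MIGRATED = 0
SKIPPED = 1
FAILED = -1


@dataclass
class Tally:
    migrated: int = 0
    skipped: int = 0
    failed: int = 0
    error_logs: int = 0  # number of error-level messages that would be logged


def record(tally, ret, op_errno=0):
    """Classify the result of one file migration and update the counters."""
    if ret == MIGRATED:
        tally.migrated += 1
    elif ret == SKIPPED or op_errno == errno.ENOSPC:
        # Skipped files, and (by assumption here) ENOSPC results, are
        # neither logged as errors nor counted as rebalance failures.
        tally.skipped += 1
    else:
        tally.failed += 1
        tally.error_logs += 1
    return tally
```

With this classification, a file skipped by the free-space check adds to `skipped` only, so it produces no error-level log entry and does not mark the rebalance as failed.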
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.8, please open a new bug report.

glusterfs-3.12.8 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-devel/2018-April/054749.html
[2] https://www.gluster.org/pipermail/gluster-users/
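To check whether the symptom from the description still occurs after an upgrade, one option is to scan the rebalance log for files that have both a "File migration skipped" entry (MSGID 109126) and a later ENOSPC error entry (MSGID 109023). The following is a minimal sketch, assuming the log line layout shown in the grep output in the description. The function name `spurious_enospc` and the regular expressions are illustrative and not part of GlusterFS.

```python
import re

# Patterns modelled on the log lines quoted in the description.
SKIPPED_RE = re.compile(
    r"\[MSGID: 109126\].*File migration skipped for (?P<path>.+)\.\s*$"
)
ENOSPC_RE = re.compile(
    r"\] E \[MSGID: 109023\].*migrate-data on (?P<path>.+) failed: "
    r"\[No space left on device\]"
)


def spurious_enospc(lines):
    """Return paths that were reported as skipped and then logged as an ENOSPC error."""
    skipped = set()
    spurious = []
    for line in lines:
        match = SKIPPED_RE.search(line)
        if match:
            skipped.add(match.group("path"))
            continue
        match = ENOSPC_RE.search(line)
        if match and match.group("path") in skipped:
            spurious.append(match.group("path"))
    return spurious
```

The function accepts any iterable of lines, so an open file object for a log such as /var/log/glusterfs/distrepx3-rebalance.log can be passed directly. An empty result means no skipped file was also logged as an ENOSPC error.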