Bug 1555161 - [Rebalance] ENOSPC errors on few files in rebalance logs
Summary: [Rebalance] ENOSPC errors on few files in rebalance logs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact:
URL:
Whiteboard:
Depends On: 1553598
Blocks: 1546941
 
Reported: 2018-03-14 04:36 UTC by Nithya Balachandran
Modified: 2018-04-24 06:53 UTC

Fixed In Version: glusterfs-3.12.8
Clone Of: 1553598
Environment:
Last Closed: 2018-04-24 06:53:38 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Nithya Balachandran 2018-03-14 04:36:31 UTC
+++ This bug was initially created as a clone of Bug #1553598 +++

+++ This bug was initially created as a clone of Bug #1546941 +++

Description of problem:
=======================
After rebalance is triggered on the volume, ENOSPC errors are logged for a few files in the rebalance logs even though there is enough space left on the bricks.

[2018-02-20 05:31:01.051156] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/ABI/testing/configfs-usb-gadget-serial failed: [No space left on device]
[2018-02-20 05:31:01.139027] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/ABI/testing/debugfs-pktcdvd failed: [No space left on device]
[2018-02-20 05:31:01.769876] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/ABI/testing/sysfs-bus-iio-meas-spec failed: [No space left on device]
[2018-02-20 05:31:03.973459] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/ABI/testing/sysfs-ibft failed: [No space left on device]
[2018-02-20 05:31:09.066247] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/DocBook/media/v4l/vidioc-g-fbuf.xml failed: [No space left on device]
[2018-02-20 05:31:13.597532] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/RCU/trace.txt failed: [No space left on device]
[2018-02-20 05:31:24.303807] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/cgroup-v1/cgroups.txt failed: [No space left on device]
[2018-02-20 05:31:24.992058] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/connector/connector.txt failed: [No space left on device]
[2018-02-20 05:31:30.169573] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/devicetree/bindings/arm/calxeda/l2ecc.txt failed: [No space left on device]
[2018-02-20 05:31:33.802256] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/devicetree/bindings/arm/omap/crossbar.txt failed: [No space left on device]
[2018-02-20 05:31:39.996225] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/devicetree/bindings/clock/ti/mux.txt failed: [No space left on device]
[2018-02-20 05:31:40.264191] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/devicetree/bindings/clock/clk-palmas-clk32kg-clocks.txt failed: [No space left on device]

Grep output for a file in rebalance logs:

# grep -i /linux-4.6.4/Documentation/hsi.txt /var/log/glusterfs/distrepx3-rebalance.log 
[2018-02-20 05:33:04.980564] I [dht-rebalance.c:1513:dht_migrate_file] 0-distrepx3-dht: /linux-4.6.4/Documentation/hsi.txt: attempting to move from distrepx3-replicate-0 to distrepx3-replicate-2
[2018-02-20 05:33:05.096191] W [MSGID: 109023] [dht-rebalance.c:962:__dht_check_free_space] 0-distrepx3-dht: data movement of file {blocks:6 name:(/linux-4.6.4/Documentation/hsi.txt) } would result in dst node (distrepx3-replicate-2:41451088) having lower disk space then the source node (distrepx3-replicate-0:41453744).Skipping file.
[2018-02-20 05:33:05.127613] I [MSGID: 109126] [dht-rebalance.c:2714:gf_defrag_migrate_single_file] 0-distrepx3-dht: File migration skipped for /linux-4.6.4/Documentation/hsi.txt.
[2018-02-20 05:33:05.127739] E [MSGID: 109023] [dht-rebalance.c:2749:gf_defrag_migrate_single_file] 0-distrepx3-dht: migrate-data on /linux-4.6.4/Documentation/hsi.txt failed: [No space left on device]


Version-Release number of selected component (if applicable):
3.12.2-4.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create a x3 volume and start it.
2) FUSE mount it on multiple clients.
3) Run a Linux kernel untar from two clients.
4) While I/O is in progress, add bricks to the volume and start rebalance without force.

Actual results:
===============
ENOSPC errors are logged for a few files in the rebalance logs even though there is enough space left on the bricks.

Expected results:
=================
No errors in rebalance logs.

--- Additional comment from Worker Ant on 2018-03-09 02:38:07 EST ---

REVIEW: https://review.gluster.org/19687 (cluster/dht:  Skipped files are not treated as errors) posted (#1) for review on master by N Balachandran

--- Additional comment from Worker Ant on 2018-03-12 06:16:23 EDT ---

COMMIT: https://review.gluster.org/19687 committed in master by "Raghavendra G" <rgowdapp> with a commit message- cluster/dht:  Skipped files are not treated as errors

For skipped files, use a return value of 1 to prevent
error messages being logged.

Change-Id: I18de31ac1a64d4460e88dea7826c3ba03c895861
BUG: 1553598
Signed-off-by: N Balachandran <nbalacha>
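
The return-value convention described in the commit message can be illustrated with a minimal sketch. This is not the code from the patch: the enum names, check_free_space() and report_migration() are hypothetical, and the free-space comparison is a simplified assumption modelled on the "Skipping file" warning in the log excerpt above. Only the use of 1 for a skipped file comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result codes. Only "1 means skipped" is taken from the
 * commit message; the other names and values are assumptions. */
enum { MIGRATE_FAILED = -1, MIGRATE_OK = 0, MIGRATE_SKIPPED = 1 };

/* Simplified stand-in for the free-space check (an assumption, not the
 * real __dht_check_free_space): skip the file when moving it would leave
 * the destination with less free space than the source. */
static int check_free_space(uint64_t src_free, uint64_t dst_free,
                            uint64_t file_blocks)
{
    if (dst_free < file_blocks)
        return MIGRATE_FAILED;   /* destination really has no room */
    if (dst_free - file_blocks < src_free + file_blocks)
        return MIGRATE_SKIPPED;  /* policy decision, not an error */
    return MIGRATE_OK;
}

/* Logs an error only for negative results and returns the number of
 * error lines written, so a skipped file yields 0. */
static int report_migration(const char *path, int ret)
{
    assert(path != NULL);
    if (ret < 0) {
        fprintf(stderr, "E migrate-data on %s failed\n", path);
        return 1;
    }
    if (ret == MIGRATE_SKIPPED)
        printf("I File migration skipped for %s\n", path);
    return 0;
}
```

With the values from the hsi.txt log entry (source 41453744, destination 41451088, 6 blocks), check_free_space() returns 1, and report_migration() then writes only the informational "skipped" line instead of an error.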

Comment 1 Worker Ant 2018-03-14 04:39:48 UTC
REVIEW: https://review.gluster.org/19710 (cluster/dht:  Skipped files are not treated as errors) posted (#1) for review on release-3.12 by N Balachandran

Comment 2 Worker Ant 2018-04-02 03:35:56 UTC
REVIEW: https://review.gluster.org/19710 (cluster/dht:  Skipped files are not treated as errors) posted (#2) for review on release-3.12 by N Balachandran

Comment 3 Worker Ant 2018-04-02 03:43:25 UTC
REVIEW: https://review.gluster.org/19806 (cluster/dht: ENOSPC will not fail rebalance) posted (#1) for review on release-3.12 by N Balachandran

Comment 4 Worker Ant 2018-04-05 15:40:55 UTC
COMMIT: https://review.gluster.org/19806 committed in release-3.12 by "N Balachandran" <nbalacha> with a commit message- cluster/dht: ENOSPC will not fail rebalance

ENOSPC returned by a file migration is no longer
considered a rebalance failure.

Change-Id: I21cf3a8acdc827bc478e138d6cb5db649d53a28c
BUG: 1555161
Signed-off-by: N Balachandran <nbalacha>

Comment 5 Worker Ant 2018-04-05 16:00:13 UTC
REVIEW: https://review.gluster.org/19710 (cluster/dht:  Skipped files are not treated as errors) posted (#3) for review on release-3.12 by N Balachandran

Comment 6 Worker Ant 2018-04-06 12:50:59 UTC
COMMIT: https://review.gluster.org/19710 committed in release-3.12 by "jiffin tony Thottan" <jthottan> with a commit message- cluster/dht:  Skipped files are not treated as errors

For skipped files, use a return value of 1 to prevent
error messages being logged.

> Change-Id: I18de31ac1a64d4460e88dea7826c3ba03c895861
> BUG: 1553598
> Signed-off-by: N Balachandran <nbalacha>

Change-Id: I18de31ac1a64d4460e88dea7826c3ba03c895861
BUG: 1555161
Signed-off-by: N Balachandran <nbalacha>

Comment 7 Jiffin 2018-04-24 06:53:38 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.8, please open a new bug report.

glusterfs-3.12.8 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-devel/2018-April/054749.html
[2] https://www.gluster.org/pipermail/gluster-users/

