Bug 1784013 - Pending self-heal when a volume is full and bricks reaches storage.reserve limit [NEEDINFO]
Summary: Pending self-heal when a volume is full and bricks reaches storage.reserve limit
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 5
Hardware: Unspecified
OS: Unspecified
medium
unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-12-16 13:52 UTC by david.spisla
Modified: 2020-03-12 12:44 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-12 12:44:24 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
ravishankar: needinfo? (david.spisla)


Attachments
Gluster vo info and status, df -hT, heal info, logs of glfsheal and all bricks (7.70 MB, application/gzip)
2019-12-16 13:52 UTC, david.spisla

Description david.spisla 2019-12-16 13:52:08 UTC
Created attachment 1645596 [details]
Gluster vo info and status, df -hT, heal info, logs of glfsheal and all bricks

Description of problem:

Setup: 3-node VMware cluster (2 storage nodes and 1 arbiter node), Distributed-Replicate volume with replica 2 and 1 arbiter brick per replica tuple (see the attached file for the detailed configuration).

Access to the volume is provided via Samba (samba-vfs-glusterfs plugin) and CTDB.

After reaching the storage.reserve limit, there is a pending self-heal which is not resolved automatically.

Version-Release number of selected component (if applicable):
GlusterFS v5.10

How reproducible:
Steps to Reproduce:
1. Mount volume from a Win10 Client via SMB.
2. Copy a lot of small files (between 50 and 1000 KB each) recursively to the share.
3. Continue copying until the volume is full and the bricks have reached the storage.reserve limit (we use the default of 1%).

During the copy process, all nodes were up and running.
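
For anyone who wants to attempt the steps above without a Windows/SMB client, here is a minimal sketch in Python. It is not the reporter's procedure: it assumes the volume is mounted at some local path on a client machine, and the function name, file naming scheme, and parameters are hypothetical. It writes randomly sized small files until the filesystem refuses a write with ENOSPC.

```python
import errno
import os
import random


def fill_with_small_files(target_dir, min_kb=50, max_kb=1000, max_files=None, seed=0):
    """Write small files of random size into target_dir.

    Stops when a write fails with ENOSPC or when max_files files have
    been written. Returns the number of files written completely.
    target_dir is assumed to be a directory on the mounted volume.
    """
    rng = random.Random(seed)
    os.makedirs(target_dir, exist_ok=True)
    written = 0
    while max_files is None or written < max_files:
        size = rng.randint(min_kb, max_kb) * 1024
        path = os.path.join(target_dir, f"file_{written:08d}.bin")
        try:
            with open(path, "wb") as fh:
                fh.write(os.urandom(size))
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                # The last file may be left behind partially written.
                break
            raise
        written += 1
    return written
```

Calling `fill_with_small_files("/mnt/myvol/testdata")` (a hypothetical mount path) runs until the volume reports that it is full. Afterwards, `gluster volume heal <volname> info` can be used to check for pending entries.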

Actual results:
There is a pending self-heal for 1 file

Expected results:
No pending self-heal


Additional info:
See attached file

The above scenario was not only reproduced on a VM cluster. We also observed it on a real hardware cluster, where the number of files pending self-heal varies (it can be as high as 7 or 10).

Comment 1 david.spisla 2019-12-17 09:27:34 UTC
The issue was reproduced with an SMB client, but we assume that this is a server-side issue. It should be reproducible with any type of client connection from a dedicated client machine.

Comment 2 Ravishankar N 2019-12-18 13:11:51 UTC
Can you provide the `getfattr -d -m. -e hex /bricks/name-of-file-needing-heal` output from all 3 bricks of the replica where the file resides?
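
To make the requested output easier to read, here is a small Python sketch that decodes the `trusted.afr.*` values printed by `getfattr -e hex`. It assumes the usual AFR changelog layout of 12 bytes holding three big-endian 32-bit counters (pending data, metadata, and entry operations). The function names and the volume name `myvol` in the usage note are hypothetical.

```python
import struct


def decode_afr_pending(hex_value):
    """Decode one trusted.afr.* value such as 0x000000020000000000000000.

    Assumes 12 bytes: three big-endian unsigned 32-bit counters for
    pending data, metadata and entry operations, in that order.
    """
    text = hex_value.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def parse_getfattr_output(text):
    """Collect decoded trusted.afr.* attributes from getfattr output."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("trusted.afr.") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        try:
            result[name] = decode_afr_pending(value)
        except ValueError:
            # Skip values that do not match the assumed 12-byte layout.
            continue
    return result
```

For example, `parse_getfattr_output("trusted.afr.myvol-client-1=0x000000020000000000000000")` returns `{"trusted.afr.myvol-client-1": {"data": 2, "metadata": 0, "entry": 0}}`. A non-zero counter on one brick indicates that it is recording pending operations for the named peer brick.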

Comment 3 david.spisla 2019-12-18 14:10:48 UTC
My latest observation is that this pending heal was resolved automatically within approx. 48 h.
I am asking myself why it takes so long and what happens in the backend. The file was small (500 KB).

Comment 4 Ravishankar N 2019-12-19 05:10:36 UTC
> we assume that this is a server-side issue

FWIW, I tried to reproduce this on the release-5 branch: I created a pending data self-heal on a file on a volume hitting the storage.reserve limit, and the self-heal completed successfully. Also, the conditional checks in the code that prevent writes when the reserve limit is hit do not apply to internal clients like the self-heal daemon. So I'm not sure what the problem was in your case. If you have a consistent reproducer, maybe you could identify what type of heal was pending, check whether there are any blocked locks from the client preventing the shd from healing, etc.

Comment 5 david.spisla 2019-12-19 14:03:36 UTC
Thank you for your efforts. I will try to get a consistent reproducer as soon as possible.
But why do the reserve limit checks not apply to internal clients like the self-heal daemon? Is this by design? In this bug I observed that this behaviour can damage the functionality of a volume:
https://bugzilla.redhat.com/show_bug.cgi?id=1784402

Comment 6 Ravishankar N 2019-12-23 11:03:33 UTC
(In reply to david.spisla from comment #5)
> But why do the reserve limit checks not apply to internal clients like
> the self-heal daemon? Is this by design? In this bug I observed that this
> behaviour can damage the functionality of a volume:
> https://bugzilla.redhat.com/show_bug.cgi?id=1784402

From https://review.gluster.org/#/c/glusterfs/+/17780/10/xlators/storage/posix/src/posix.h@67, it looks like it was coded like that by design. If you don't allow self-heals to go through, then the replicas won't be able to get in sync, which is also bad. Because success was returned to the application for the writes, the bricks of the replica must also be allowed to heal and get in sync.
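
The design described here can be illustrated with a simplified model in Python. This is not the posix translator code: the function name, the use of a negative pid to mark an internal client, and the exact comparison against the reserve are assumptions made for illustration only.

```python
import errno


def reserve_check(free_percent, reserve_percent, client_pid):
    """Return 0 if a write may proceed, or errno.ENOSPC if it is refused.

    Simplified model, assuming:
      - the brick counts as full once free space is at or below the reserve
      - internal clients (e.g. the self-heal daemon) carry a negative pid
    """
    disk_space_full = free_percent <= reserve_percent
    is_internal_client = client_pid < 0
    if disk_space_full and not is_internal_client:
        # Application writes are refused once the reserve is reached.
        return errno.ENOSPC
    # Internal clients bypass the check so that replicas can still heal.
    return 0
```

With a 1% reserve and 0.5% free space, `reserve_check(0.5, 1.0, 1234)` refuses an application write with ENOSPC, while `reserve_check(0.5, 1.0, -1)` lets a heal write through. The heal is allowed so that a brick which missed writes already acknowledged to the application can still catch up with its peers.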

Comment 7 Worker Ant 2020-03-12 12:44:24 UTC
This bug has been moved to https://github.com/gluster/glusterfs/issues/934 and will be tracked there from now on. Visit the GitHub issue URL for further details.

