Created attachment 1645596 [details]
Gluster vol info and status, df -hT, heal info, logs of glfsheal and all bricks

Description of problem:
Setup: 3-node VMware cluster (2 storage nodes and 1 arbiter node), distribute-replicate 2 volume with 1 arbiter brick per replica pair (see the attached file for the detailed configuration). Access to the volume is provided via Samba (samba-vfs-glusterfs plugin) and CTDB.
After reaching the storage.reserve limit, there is a pending self-heal which is not resolved automatically.

Version-Release number of selected component (if applicable):
GlusterFS v5.10

How reproducible:

Steps to Reproduce:
1. Mount the volume from a Win10 client via SMB.
2. Copy a lot of small files (between 50-1000 KB) recursively to the share.
3. Continue copying until the volume is full and the bricks have reached the storage.reserve limit (we use the default of 1%).

During the copy process all nodes were up and running.

Actual results:
There is a pending self-heal for 1 file.

Expected results:
No pending self-heal.

Additional info:
See attached file.
The above scenario was not reproduced only on a VM cluster. We could also observe it on a real HW cluster, and there the number of files pending self-heal varies (it can also be up to 7 or 10).
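For reference, steps 2-3 above can be sketched with a small script that generates files in the reported size range. This is a minimal illustration, not the actual reproduction (which used an SMB copy from a Win10 client); `target_dir` is a hypothetical path standing in for the mounted share.

```python
import os
import random

def fill_with_small_files(target_dir, num_files, min_kb=50, max_kb=1000):
    """Write num_files files of random size between min_kb and max_kb
    to target_dir, mimicking the recursive small-file copy in step 2."""
    os.makedirs(target_dir, exist_ok=True)
    sizes = []
    for i in range(num_files):
        size = random.randint(min_kb, max_kb) * 1024
        path = os.path.join(target_dir, f"file_{i:06d}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        sizes.append(size)
    return sizes
```

In the actual reproduction this would be repeated until `df -hT` on the bricks shows usage past the storage.reserve limit.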
The issue was reproduced with an SMB client, but we assume that this is a server-side issue. It should be reproducible with any type of client connection from a dedicated client machine.
Can you provide the `getfattr -d -m . -e hex /bricks/name-of-file-needing-heal` output from all 3 bricks of the replica where the file resides?
My newest observation is that this pending heal was resolved automatically within approx. 48h. I am asking myself why it took so long and what happened in the backend. The file was small (500 KB).
> we assume that this is a server-side issue

FWIW, I tried to reproduce this on the release-5 branch: I created a pending data self-heal on a file on a volume hitting the storage.reserve limit, and the self-heal did complete successfully. Also, the conditional checks in the code that prevent writes when the reserve limit is hit are not applicable to internal clients like the self-heal daemon. So I'm not sure what the problem was in your case. If you have a consistent reproducer, maybe you could identify what type of heal was pending, check if there are any blocked locks from the client preventing the shd from healing, etc.
Thank you for your efforts. I will try to get a consistent reproducer as soon as possible. But why are the reserve-limit checks not applicable to internal clients like the self-heal daemon? Is this by design? In this bug I observed that this behaviour can damage the functionality of a volume: https://bugzilla.redhat.com/show_bug.cgi?id=1784402
(In reply to david.spisla from comment #5)
> But why are the reserve-limit checks not applicable to internal clients
> like the self-heal daemon? Is this by design? In this bug I observed that
> this behaviour can damage the functionality of a volume:
> https://bugzilla.redhat.com/show_bug.cgi?id=1784402

From https://review.gluster.org/#/c/glusterfs/+/17780/10/xlators/storage/posix/src/posix.h@67, it looks like it was coded like that by design. If self-heals were not allowed to go through, the replicas would not be able to get back in sync, which is also bad. Because success was already returned to the application for the writes, the bricks of the replica must also be allowed to heal and be brought in sync.
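The design discussed above can be sketched as a simple predicate. This is an illustrative Python sketch, not the actual posix xlator C code; it assumes (as GlusterFS does internally) that internal clients such as the self-heal daemon are identifiable by a negative pid on the call frame. The function name and threshold values are hypothetical.

```python
def reject_write_for_reserve(client_pid, free_percent, reserve_percent=1.0):
    """Return True if a write should be rejected because the brick has
    dropped below the storage.reserve free-space threshold.

    Internal clients (negative pid, e.g. the self-heal daemon) are never
    blocked, so replicas can still be brought back in sync even on a
    brick that is rejecting application writes.
    """
    if client_pid < 0:
        # Internal client: the reserve check does not apply by design.
        return False
    return free_percent <= reserve_percent
```

With the default 1% reserve, an application write at 0.5% free space is rejected, while a heal write from an internal client at the same fill level is allowed.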
This bug is moved to https://github.com/gluster/glusterfs/issues/934, and will be tracked there from now on. Visit the GitHub issue URL for further details.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.