Bug 1693223 - [Disperse] : Client side heal is not removing dirty flag for some of the files.
Summary: [Disperse] : Client side heal is not removing dirty flag for some of the files.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: 6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1593224 1739446 1805050
Blocks: 1600918
 
Reported: 2019-03-27 11:23 UTC by Ashish Pandey
Modified: 2020-02-20 07:35 UTC
CC List: 5 users

Fixed In Version: glusterfs-6.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1593224
Environment:
Last Closed: 2019-04-22 13:33:06 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Gluster.org Gerrit 22429 0 None Open cluster/ec: Don't enqueue an entry if it is already healing 2019-04-16 10:51:27 UTC

Description Ashish Pandey 2019-03-27 11:23:39 UTC
+++ This bug was initially created as a clone of Bug #1593224 +++

Description of problem:

While server-side heal is disabled, client-side heal does not remove the dirty flag for some files.

Version-Release number of selected component (if applicable):


How reproducible:
50%

Steps to Reproduce:
1. Create a 4+2 disperse volume and mount it.
2. Touch 100 files.
3. Kill one brick.
4. Write some data to all the files using dd.
5. Bring the brick back UP.
6. Append some data to all the files using dd; this will trigger heal on all the files.
7. Read data from all the files using dd.

At the end, all files should be healed. However, I have observed that 2-3 files still show up in heal info.
When I looked at the getxattr output, all the xattrs were the same across the bricks and the dirty flag was still present for the data fop.
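
The steps above as a command-level sketch (everything here is a placeholder: volume "testvol", servers server1..server6, brick path /bricks/brick1, mount point /mnt/testvol; the exact dd sizes are not important):

    gluster volume create testvol disperse 6 redundancy 2 server{1..6}:/bricks/brick1 force
    gluster volume start testvol
    gluster volume heal testvol disable        # keep the self-heal daemon out of the picture
    mount -t glusterfs server1:/testvol /mnt/testvol
    touch /mnt/testvol/file{1..100}
    # kill one brick process (PID from "gluster volume status testvol"), then:
    for i in $(seq 1 100); do dd if=/dev/urandom of=/mnt/testvol/file$i bs=1M count=1 conv=notrunc; done
    gluster volume start testvol force         # bring the killed brick back UP
    for i in $(seq 1 100); do dd if=/dev/urandom of=/mnt/testvol/file$i bs=1M count=1 oflag=append conv=notrunc; done
    for i in $(seq 1 100); do dd if=/mnt/testvol/file$i of=/dev/null bs=1M; done
    gluster volume heal testvol info           # 2-3 files keep showing up here
    getfattr -d -m trusted.ec -e hex /bricks/brick1/file1   # trusted.ec.dirty still non-zero on such a file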

Actual results:


Expected results:


Additional info:

--- Additional comment from Worker Ant on 2018-11-28 12:51:02 UTC ---

REVIEW: https://review.gluster.org/21744 (cluster/ec: Don't enqueue an entry if it is already healing) posted (#2) for review on master by Ashish Pandey

--- Additional comment from Ashish Pandey on 2018-12-06 10:23:52 UTC ---


There are two things we need to fix.

First, if there are a number of entries to be healed and a client touches all of them while SHD is off, heal is triggered for all the files, but not all of them get placed in the queue. This is because a number of fops arrive for a single file and each of them places the same file in the heal queue, which fills the queue quickly, so new files do not get a chance.

Second, when healing of a file has started, sometimes we see that the dirty flag is not removed even though version and size have been healed and are the same.
We need to find out why this is happening.

If shd is OFF and a client accesses this file, it will not find any discrepancies, as version and size on all the bricks are the same; hence heal will not be triggered for this file and the dirty flag will remain as it is.
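
A quick way to check the second case from the backend (a sketch; the brick paths /bricks/brick*/ and the file name "file1" are placeholders):

    for b in /bricks/brick*/file1; do
        echo "== $b"; getfattr -d -m trusted.ec -e hex "$b" 2>/dev/null
    done
    # If trusted.ec.version and trusted.ec.size are identical on every brick but
    # trusted.ec.dirty is non-zero, a client access sees no mismatch, so no heal
    # is triggered and the dirty xattr is never cleared while shd stays off.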

--- Additional comment from Ashish Pandey on 2018-12-11 06:51:11 UTC ---



While debugging the failure of this patch and thinking about incorporating the comments given by Pranith and Xavi,
I found that there are some design constraints on implementing the idea of not enqueueing an entry if it is already healing.

Consider a 2+1 config and the following scenario -

1 - Create a volume and disable the self-heal daemon.
2 - Create a file and write some data while all the bricks are UP.
3 - Kill one brick and write some data to the same file.
4 - Bring the brick UP.

5 - Now, to trigger heal, we do "chmod 0666 file". This does a stat on the file, which finds that the brick is not healthy and triggers the heal.
6 - A synctask for the heal is created and started, which calls ec_heal_do, which in turn calls ec_heal_metadata and ec_heal_data.

7 - A setattr fop is also sent on the file to set the permissions.

Now, the sequence of steps could be like this -

a > stat - sees the unhealthy file and triggers heal.

b > ec_heal_metadata - takes the lock, heals metadata and the metadata part of trusted.ec.version, and releases the lock on the file. [At this point setattr is waiting for the lock.]

c > setattr - takes the lock and finds that the brick is still unhealthy, as the data version is not yet healed and is mismatching. It marks dirty for the metadata version and unlocks the file.

d > ec_heal_data - takes the lock and heals the data.

Now, if we restrict heal triggering to only one fop, then after step d the file will still have the dirty flag and mismatched metadata versions.
If we keep the heal requests from every fop in a queue and, after every heal, check whether heal is still needed, then we will end up triggering heal for every fop, which defeats the purpose of the patch.

Xavi, Pranith,
Please provide your comments. Am I correct in my understanding?

--- Additional comment from Ashish Pandey on 2018-12-18 11:23:15 UTC ---


I have found a bug which is the actual cause of the dirty flag remaining set even after heal has happened and all the versions and sizes match.
This bug is only visible when shd is disabled, as shd will clear the dirty flag if it has nothing to heal.

1 - Let's say we have disabled shd.
2 - Create a file and then kill a brick.
3 - Write data, around 1GB, to the file; this will need to be healed after the brick comes UP.
4 - Bring the brick UP.
5 - Do "chmod 0777 file"; this will trigger heal.
6 - Immediately start a write on this file, appending 512 bytes using dd.

Now, while data healing is happening, the write from the mount arrives (step 6) and takes the lock. It sees that the healing flag is set on the version xattr of the file, so it sends the write to all the bricks. Before releasing the lock it also updates version and size on ALL the bricks, including the brick which is healing.

However, in ec_update_info we consider lock->good_mask to decide whether or not to unset the "dirty" flag which was set by this write fop, even though the write has succeeded on all the bricks.

So the +1 on the dirty flag made by this write fop remains as it is.

Now, data heal gets the lock again and, after completing data heal, unsets the dirty flag by decreasing it by the _same_ number that was found at the start of healing. This does not include the increment made by the write fop in step 6.

So, after healing, a dirty flag remains set on the file. This flag will never be unset if shd is not enabled.
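
A sketch of this race, with the dirty-counter accounting spelled out (mount point /mnt/testvol, file name "bigfile" and the brick paths are placeholders):

    chmod 0777 /mnt/testvol/bigfile      # triggers client-side heal; heal samples trusted.ec.dirty = N
    dd if=/dev/zero of=/mnt/testvol/bigfile bs=512 count=1 oflag=append conv=notrunc
    #   the write fop adds +1 while heal is running   -> dirty = N + 1
    #   heal finishes and subtracts the sampled N     -> dirty = 1, and with shd off nothing clears it
    # after heal completes, version and size match everywhere but the dirty xattr remains set:
    for b in /bricks/brick*/bigfile; do getfattr -n trusted.ec.dirty -e hex "$b"; done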

--- Additional comment from Worker Ant on 2019-03-27 11:15:29 UTC ---

REVIEW: https://review.gluster.org/21744 (cluster/ec: Don't enqueue an entry if it is already healing) merged (#11) on master by Xavi Hernandez

Comment 1 Worker Ant 2019-03-27 11:25:40 UTC
REVIEW: https://review.gluster.org/22429 (cluster/ec: Don't enqueue an entry if it is already healing) posted (#1) for review on release-6 by Ashish Pandey

Comment 2 Ashish Pandey 2019-03-27 11:26:32 UTC
patch for rel-6
https://review.gluster.org/#/c/glusterfs/+/22429/

Comment 3 Worker Ant 2019-04-16 10:51:28 UTC
REVIEW: https://review.gluster.org/22429 (cluster/ec: Don't enqueue an entry if it is already healing) merged (#2) on release-6 by Shyamsundar Ranganathan

Comment 4 Shyamsundar 2019-04-22 13:33:06 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-6.1, please open a new bug report.

glusterfs-6.1 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-April/000124.html
[2] https://www.gluster.org/pipermail/gluster-users/

