Bug 1593224
| Summary: | [Disperse] : Client side heal is not removing dirty flag for some of the files. | | |
| --- | --- | --- | --- |
| Product: | [Community] GlusterFS | Reporter: | Ashish Pandey <aspandey> |
| Component: | disperse | Assignee: | Ashish Pandey <aspandey> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | mainline | CC: | bugs, jahernan, nchilaka, pkarampu, vavuthu |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1600918 1693223 1739446 1805050 (view as bug list) | Environment: | |
| Last Closed: | 2019-11-19 09:54:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1600918, 1693223, 1739446, 1805050 | | |
Description
Ashish Pandey
2018-06-20 10:14:25 UTC
REVIEW: https://review.gluster.org/21744 (cluster/ec: Don't enqueue an entry if it is already healing) posted (#2) for review on master by Ashish Pandey

There are two things we need to fix.

First, if there are a number of entries to be healed and a client touches all of them while SHD is off, heal will be triggered for all of the files, but not all of them will be placed in the queue. This is because, for a single file, a number of fops arrive and each of them places the same file in the heal queue, which fills the queue quickly, so new files never get a chance (see the sketch at the end of this report).

Second, when a file has started healing, we sometimes see that the dirty flag is not removed even though version and size have been healed and match. We need to find out why this happens. If SHD is off and a client accesses such a file, it will not find any discrepancies, since version and size are the same on all the bricks; heal will therefore not be triggered for this file and the dirty flag will remain as it is.

While debugging the failure of this patch and thinking about how to incorporate the comments given by Pranith and Xavi, I found that there are some design constraints to implementing the idea of not enqueueing an entry that is already healing. Consider a 2+1 config and the following scenario:

1 - Create a volume and disable the self-heal daemon.
2 - Create a file and write some data while all the bricks are UP.
3 - Kill one brick and write some data to the same file.
4 - Bring the brick UP.
5 - Now, to trigger heal, do "chmod 0666 file". This does a stat on the file, which finds that the brick is not healthy and triggers the heal.
6 - A synctask for the heal is created and started; it calls ec_heal_do, which in turn calls ec_heal_metadata and ec_heal_data.
7 - A setattr fop is also sent on the file to set the permissions.

Now, a sequence of steps could be like this:

a > stat sees the unhealthy file and triggers heal.
b > ec_heal_metadata takes the lock, heals the metadata and the metadata part of trusted.ec.version, and releases the lock on the file. [At this point setattr is waiting for the lock.]
c > setattr takes the lock and finds that the brick is still unhealthy, as the data version has not been healed and is mismatching. It marks dirty for the metadata version and unlocks the file.
d > ec_heal_data takes the lock and heals the data.

Now, if we restrict heal triggering to only one fop, then after step d the file will still carry the dirty flag and mismatched metadata versions. If we keep the heal requests from every fop in a queue and check after every heal whether heal is still needed, we end up triggering heal for every fop, which defeats the purpose of the patch.

Xavi, Pranith, please provide your comments. Is my understanding correct?

I have found a bug which is the actual cause of the dirty flag remaining set even after heal has happened and all the versions and sizes match. This bug is only visible when SHD is disabled, because SHD would clear the dirty flag if it has nothing to heal.

1 - Let's say we have disabled SHD.
2 - Create a file and then kill a brick.
3 - Write data, around 1GB, to the file; it will have to be healed after the brick comes UP.
4 - Bring the brick UP.
5 - Do "chmod 0777 file"; this will trigger heal.
6 - Immediately start writing to this file, appending 512 bytes using dd.

Now, while data heal was happening, the write from the mount (step 6) arrived and took the lock. It saw that the healing flag is set on the file's version xattr, so it sent the write to all the bricks. Before releasing the lock it also updated version and size on ALL the bricks, including the brick which is healing. However, in ec_update_info we consider lock->good_mask to decide whether to unset the "dirty" flag set by this write fop, so it is not unset even though the write succeeded on all the bricks, and the +1 on the dirty flag made by this write fop remains as it is. Data heal then takes the lock again and, after completing, unsets the dirty flag by decrementing it by the _same_ value that it found at the start of healing. That value does not include the increment made by the write fop in step 6. So, after healing, a dirty flag remains set on the file, and it will never be unset if SHD is not enabled.
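To make the accounting above concrete, here is a minimal standalone sketch. It is not the actual cluster/ec code: the single integer counter is an assumed stand-in for the trusted.ec.dirty counter on the brick that is being healed.

```c
/* Minimal sketch of the dirty-counter accounting described above.
 * NOT the real cluster/ec code: "dirty" is an assumed stand-in for
 * the trusted.ec.dirty counter on the brick being healed. */
#include <stdio.h>

int main(void)
{
    int dirty = 1;                  /* pending heal recorded while the brick was down */

    int dirty_seen_by_heal = dirty; /* heal remembers what it saw when it started */

    /* Step 6: the write from the mount bumps dirty on all bricks, but (as
     * described above) ec_update_info consults lock->good_mask and does not
     * clear this increment, even though the write succeeded everywhere. */
    dirty += 1;                     /* dirty == 2 */

    /* Heal completes and clears only the amount it saw at the start. */
    dirty -= dirty_seen_by_heal;    /* 2 - 1 == 1, not 0 */

    printf("dirty after heal = %d (expected 0)\n", dirty);
    return 0;
}
```

In this toy model the leftover 1 is the stale dirty flag that only SHD would later clear.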
REVIEW: https://review.gluster.org/21744 (cluster/ec: Don't enqueue an entry if it is already healing) merged (#11) on master by Xavi Hernandez

REVIEW: https://review.gluster.org/22907 (cluster/ec: Prevent double pre-op xattrops) posted (#1) for review on master by Pranith Kumar Karampuri

REVIEW: https://review.gluster.org/22907 (cluster/ec: Prevent double pre-op xattrops) merged (#4) on master by Pranith Kumar Karampuri

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
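For reference, here is a minimal standalone sketch of the enqueue deduplication idea behind the first patch above (https://review.gluster.org/21744). It is not the real ec heal code: heal_queue, heal_enqueue and QUEUE_MAX are hypothetical names, and the real implementation also has to track entries that a synctask has already picked up for healing.

```c
/* Minimal sketch of "don't enqueue an entry if it is already queued/healing".
 * All names here are hypothetical; this is NOT the real cluster/ec code. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define QUEUE_MAX 8                      /* pretend the heal queue is this small */

static char heal_queue[QUEUE_MAX][64];   /* gfids waiting to be healed */
static int  queue_len;

/* Returns true if the gfid was queued, false if it was dropped. */
static bool heal_enqueue(const char *gfid)
{
    for (int i = 0; i < queue_len; i++)
        if (strcmp(heal_queue[i], gfid) == 0)
            return false;   /* duplicate: many fops on one file add it only once */

    if (queue_len == QUEUE_MAX)
        return false;       /* full: without the duplicate check, one busy file
                               could fill the queue and starve other files */

    snprintf(heal_queue[queue_len++], sizeof(heal_queue[0]), "%s", gfid);
    return true;
}

int main(void)
{
    printf("%d\n", heal_enqueue("gfid-of-file-1"));  /* 1: queued  */
    printf("%d\n", heal_enqueue("gfid-of-file-1"));  /* 0: dropped */
    printf("%d\n", heal_enqueue("gfid-of-file-2"));  /* 1: queued  */
    return 0;
}
```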