Bug 1734252
Summary: | Heal not completing after geo-rep session is stopped on EC volumes. | ||
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Ashish Pandey <aspandey> |
Component: | disperse | Assignee: | Ashish Pandey <aspandey> |
Status: | CLOSED NEXTRELEASE | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | mainline | CC: | bugs, kiyer, nchilaka, rhs-bugs, sankarshan, storage-qa-internal |
Hardware: | x86_64 | ||
OS: | Linux | ||
Clone Of: | 1733531 | Environment: | |
Last Closed: | 2019-07-30 12:58:23 UTC | Type: | --- |
Bug Depends On: | 1733531 | ||
Comment 1
Ashish Pandey
2019-07-30 05:02:11 UTC
Here is the root cause of the issue.

When features.read-only is enabled, ro_fxattrop checks the following condition:

    if (is_readonly_or_worm_enabled(frame, this) && !allzero)
        STACK_UNWIND_STRICT(fxattrop, frame, -1, EROFS, NULL, xdata);

Here, is_readonly_or_worm_enabled(frame, this) returns "false" for shd if frame->root->pid < 0, and the frame used for healing is supposed to carry the pid "-6". In this case, however, frame->root->pid arrives with the value "0". The pid check therefore fails (0 < 0 is false), the function returns "true", and the volume is treated as read-only for the shd process as well.

Why is it happening?

When shd triggers heal for the file, it finds that there is nothing to heal, so it calls "ec_data_undo_pending" to remove the dirty flag for the data part, which in turn calls syncop_fxattrop -> SYNCOP. We do not pass a frame to SYNCOP; it takes the frame from the task:

    task = synctask_get();     \
    stb->task = task;          \
    if (task)                  \
        frame = task->opframe;

However, when creating the task we passed the frame as NULL:

    ec_launch_heal(ec_t *ec, ec_fop_data_t *fop)
    {
        int ret = 0;

        ret = synctask_new(ec->xl->ctx->env, ec_synctask_heal_wrap,
                           ec_heal_done, NULL, fop);
    ----------------------------

So synctask_create creates the task with a new frame, but it does not set frame->root->pid to -6, so the pid stays at "0". This is the value checked in "is_readonly_or_worm_enabled", which then reports read-only as TRUE, and the heal (fxattrop) fails with a "read-only file system" error.

When features.read-only is not enabled, the read-only xlator is not loaded and this condition is never checked. The fxattrop sent by "ec_data_undo_pending" then succeeds and removes the dirty flag for the data part.

REVIEW: https://review.gluster.org/23129 (cluster/ec: Create heal task with heal process id) posted (#1) for review on master by Ashish Pandey

REVIEW: https://review.gluster.org/23129 (cluster/ec: Create heal task with heal process id) merged (#2) on master by Amar Tumballi
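
The pid check described in the root cause can be reproduced with a small standalone model. This is only a sketch: the struct layouts and the model_* functions below are hypothetical stand-ins, not the real GlusterFS definitions, and the "fixed" path assumes (from the patch title only) that the fix makes the heal task's frame carry the heal pid.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Pid the comment says is set on frames used for healing. */
#define HEAL_PID (-6)

/* Hypothetical, heavily simplified stand-ins for the real frame types. */
struct call_root {
    int pid;
};

struct call_frame {
    struct call_root root;
};

/* Models the pid test described for is_readonly_or_worm_enabled():
 * internal clients (negative pid) bypass the read-only restriction. */
bool model_is_readonly(const struct call_frame *frame, bool read_only_on)
{
    assert(frame != NULL);
    if (frame->root.pid < 0)
        return false;
    return read_only_on;
}

/* Models the ro_fxattrop condition: returns 0 when the fop may proceed,
 * or -EROFS when it is refused. */
int model_ro_fxattrop(const struct call_frame *frame, bool read_only_on,
                      bool allzero)
{
    if (model_is_readonly(frame, read_only_on) && !allzero)
        return -EROFS;
    return 0;
}

/* Models task creation: when no frame is supplied (NULL), the task gets
 * a fresh frame whose pid is left at 0; otherwise it copies the caller's
 * frame, pid included. */
struct call_frame model_task_frame(const struct call_frame *given)
{
    struct call_frame fresh = { .root = { .pid = 0 } };
    return given ? *given : fresh;
}
```

Calling model_task_frame(NULL) corresponds to the behaviour reported in this bug: the resulting frame has pid 0, so model_ro_fxattrop refuses the operation when read-only is on. Passing a frame whose pid is HEAL_PID lets the same operation through.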