Bug 1242543
| Field | Value |
|---|---|
| Summary | replacing a offline brick fails with "replace-brick" command |
| Product | [Red Hat Storage] Red Hat Gluster Storage |
| Component | glusterd |
| Version | rhgs-3.1 |
| Hardware | Unspecified |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Severity | urgent |
| Priority | high |
| Reporter | spandura |
| Assignee | Anuradha <atalur> |
| QA Contact | spandura |
| CC | amukherj, annair, asrivast, atalur, ggarg, nlevinki, nsathyan, rcyriac, sasundar, smohan, spandura, vagarwal, vbellur |
| Target Milestone | --- |
| Target Release | RHGS 3.1.0 |
| Fixed In Version | glusterfs-3.7.1-10 |
| Doc Type | Bug Fix |
| Clones | 1242609 (view as bug list) |
| Last Closed | 2015-07-29 05:12:07 UTC |
| Type | Bug |
| Bug Blocks | 1202842, 1242609, 1242728 |
**Description** (spandura, 2015-07-13 14:26:25 UTC)
**Comment 2:**

I notice that the BitD and Scrubber are not up on all the nodes in the cluster. Also, could you upload sosreports or glusterd log files from all the nodes?

**Comment:**

The bug was caused by patch http://review.gluster.org/10101 (commit f9ebf5ab3cbec423f75e64c25385125d4b65e31b). In the downstream rhgs-3.1 branch I reverted this patch to see whether the failure still occurs: I could replace the brick successfully even when it was down, and saw the failure only after applying the mentioned patch to the branch. I'm RCA'ing it. Meanwhile, adding need-info on the author to confirm whether the suspicion is correct.

**Comment:**

(In reply to comment 2)
> I notice that the BitD and Scrubber are not up on all the nodes in the cluster.
> Also, could you upload sosreports or glusterd log files from all the nodes?

She enabled bitrot using:

```
root@mia [Jul-13-2015-18:50:00] >gluster volume set testvol features.bitrot on
volume set: success
```

Gluster does not support this command; the valid command is `gluster volume bitrot <VOLNAME> enable/disable`. Enabling bitrot via `gluster volume set testvol features.bitrot on` can crash the bitrot and scrubber daemons, so the user should use the following commands for bitrot:

```
# gluster v help | grep bitrot
volume bitrot <VOLNAME> {enable|disable} |
volume bitrot <volname> scrub-throttle {lazy|normal|aggressive} |
volume bitrot <volname> scrub-frequency {hourly|daily|weekly|biweekly|monthly} |
volume bitrot <volname> scrub {pause|resume} - Bitrot translator specific operation.
For more information about bitrot command type 'man gluster'
```

I will continue the RCA of the replace-brick failure.

**Comment:**

RCA'ed and patch posted upstream for review: http://review.gluster.org/#/c/11651/

**Comment 7 (Gaurav Kumar Garg):**

Without this patch (http://review.gluster.org/#/c/11651/), replace-brick commit force of a dead brick is successful. It seems to be a different issue: I am able to replace a dead brick successfully without facing any problem on both the upstream and downstream branches. Will analyze this issue further.
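For reference, a sketch of the supported bitrot workflow described above, using the volume name `testvol` from the report. These commands require a live gluster cluster, so this is illustrative only; the option values are taken from the `gluster v help` output quoted above:

```
# Enable bitrot detection the supported way
# (instead of `gluster volume set testvol features.bitrot on`):
gluster volume bitrot testvol enable

# Optionally tune the scrubber:
gluster volume bitrot testvol scrub-throttle lazy
gluster volume bitrot testvol scrub-frequency weekly

# Disable when no longer needed:
gluster volume bitrot testvol disable
```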
**Comment 8 (Atin Mukherjee):**

(In reply to Gaurav Kumar Garg from comment #7)
> Without this patch (http://review.gluster.org/#/c/11651/), replace-brick
> commit force of a dead brick is successful. It seems to be a different issue:
> I am able to replace a dead brick successfully without facing any problem on
> both the upstream and downstream branches. Will analyze this issue further.

Did you kill the brick with SIGTERM? I could easily reproduce this.

**Comment:**

In response to comment #4: Shwetha, I'm unable to get the SOS reports; it says "forbidden, do not have permission to access". I'm not sure whether Sas would be facing the same issue. Could you attach the glusterd logs to the bug?

**Comment:**

(In reply to Atin Mukherjee from comment #8)
> Did you kill the brick with SIGTERM? I could easily reproduce this.

Yes, I killed the brick using:

```
# kill -9 pidof_brick_process
```

**Comment:**

Patch on rhgs-3.1: https://code.engineering.redhat.com/gerrit/#/c/52938/

**Comment:**

Moving back to MODIFIED; it was moved to ON_QA by the errata tool.

**Comment:**

Verified the bug on glusterfs-3.7.1-10.el6rhs.x86_64. The bug is fixed; moving the bug to VERIFIED state.

**Comment:**

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html
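For readers trying to follow the thread, the reproduction scenario debated in comments #7 and #8 can be sketched roughly as below. The volume name, hostname, and brick paths are hypothetical, and the commands assume a live gluster volume; note that `kill -9` sends SIGKILL rather than SIGTERM:

```
# Kill the brick process for the brick to be replaced,
# as the commenter did (SIGKILL, not SIGTERM):
kill -9 <pid_of_brick_process>

# Then attempt to replace the now-offline brick:
gluster volume replace-brick testvol \
    server1:/bricks/brick1 server1:/bricks/brick1_new commit force
```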