Bug 1401869
| Summary: | Rebalance not happened, which triggered after adding couple of bricks. | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Byreddy <bsrirama> |
| Component: | replicate | Assignee: | Pranith Kumar K <pkarampu> |
| Status: | CLOSED ERRATA | QA Contact: | Nag Pavan Chilakam <nchilaka> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | rhgs-3.2 | CC: | amukherj, ksandha, rcyriac, rhinduja, rhs-bugs, spalai, storage-qa-internal |
| Target Milestone: | --- | ||
| Target Release: | RHGS 3.2.0 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | glusterfs-3.8.4-8 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-03-23 05:54:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1351528, 1400037 | ||
I was able to reproduce the issue. After adding logs, I saw that the blocking inodelk fails with EAGAIN:

```
[2016-12-06 12:16:04.329982] E [MSGID: 109118] [dht-helper.c:2081:dht_blocking_inodelk_cbk] 0-test1-dht: inodelk failed with Resource temporarily unavailable on subvol test1-replicate-0 [Resource temporarily unavailable]
[2016-12-06 12:16:04.330109] E [dht-rebalance.c:3348:gf_defrag_fix_layout] 0-test1-dht: Setxattr failed for /dir2
```

There is an issue in AFR: on receiving a BLOCKING inodelk request, AFR tries to take a non-blocking inodelk, and on failure it passes the error back to the parent translators instead of waiting. Pranith has already sent the patch for this: http://review.gluster.org/#/c/15984/. Moving the component to AFR.

*** Bug 1400037 has been marked as a duplicate of this bug. ***

QATP:
====
Added bricks and triggered a rebalance on a 2x2 => 3x2 volume while IO was in progress; saw no failures. Also removed bricks, which passed, and ran the same test with gnfs. Hence moving to verified.
Test version: 3.8.4-8

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
Description of problem:
=======================
Rebalance status showed failures after adding a couple of bricks to a 2x2 volume.

```
# gluster volume rebalance Dis-Rep status
        Node  Rebalanced-files     size  scanned  failures  skipped     status  run time in h:m:s
   ---------  ----------------  -------  -------  --------  -------  ---------  -----------------
   localhost                 0   0Bytes        1         1        0  completed              0:0:0
10.70.41.217                 0   0Bytes        0         0        0  completed             0:0:10
volume rebalance: Dis-Rep: success
```

Errors in glusterd log:
-----------------------
```
[2016-12-06 09:28:41.210721] E [MSGID: 106062] [glusterd-utils.c:9188:glusterd_volume_rebalance_use_rsp_dict] 0-glusterd: failed to get index
The message "E [MSGID: 106062] [glusterd-utils.c:9188:glusterd_volume_rebalance_use_rsp_dict] 0-glusterd: failed to get index" repeated 2 times between [2016-12-06 09:28:41.210721] and [2016-12-06 09:28:46.241538]
```

Errors in rebalance log:
------------------------
```
[2016-12-06 09:28:46.511955] I [MSGID: 109081] [dht-common.c:4006:dht_setxattr] 0-Dis-Rep-dht: fixing the layout of /linux-4.8.8
[2016-12-06 09:28:46.516510] E [dht-rebalance.c:3348:gf_defrag_fix_layout] 0-Dis-Rep-dht: Setxattr failed for /linux-4.8.8
[2016-12-06 09:28:46.525333] I [dht-rebalance.c:3884:gf_defrag_start_crawl] 0-DHT: crawling file-system completed
```

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.8.4-7.el7rhgs.x86_64

How reproducible:
=================
One time

Steps to Reproduce:
===================
1. Have a 2-node cluster.
2. Create a 2x2 volume.
3. Mount the volume using gnfs (v3) and untar the Linux kernel in the mount point.
4. Add a couple of bricks.
5. Trigger the rebalance: `gluster volume rebalance <vol-name> start`

Actual results:
===============
Rebalance failed (the status output above reports a failure on localhost).

Expected results:
=================
Rebalance should complete successfully.

Additional info:
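The reproduction steps above can be sketched as a shell session. This is a hedged outline, not a verified runbook: the hostnames (`node1`, `node2`), brick paths, kernel tarball name, and mount point are placeholders, and it assumes a two-node trusted pool with gnfs enabled (as RHGS 3.2 supported).

```shell
# Create and start a 2x2 distributed-replicate volume (placeholder hosts/paths)
gluster volume create Dis-Rep replica 2 \
    node1:/bricks/b1 node2:/bricks/b1 node1:/bricks/b2 node2:/bricks/b2
gluster volume start Dis-Rep

# On a client: mount over gnfs (NFSv3) and start IO
mount -t nfs -o vers=3 node1:/Dis-Rep /mnt/disrep
tar -xf linux-4.8.8.tar.xz -C /mnt/disrep &

# Grow the volume 2x2 -> 3x2 while the untar is in flight
gluster volume add-brick Dis-Rep node1:/bricks/b3 node2:/bricks/b3

# Trigger the rebalance and check for failures in the status table
gluster volume rebalance Dis-Rep start
gluster volume rebalance Dis-Rep status
```

With the fix (glusterfs-3.8.4-8), the failure column in the status output should stay at 0; on the affected build, the setxattr/inodelk errors shown above appear in the rebalance log instead.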