| Summary: | [GSS] - DHT hash layout corrupt | ||
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Bipin Kunal <bkunal> |
| Component: | distribute | Assignee: | Susant Kumar Palai <spalai> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Prasad Desala <tdesala> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | ||
| Version: | rhgs-3.1 | CC: | amukherj, bkunal, olim, rgowdapp, rhinduja, rhs-bugs, rnalakka, spalai, storage-qa-internal |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-12-07 16:19:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Bipin Kunal
2016-12-05 07:22:53 UTC
Bipin, can you update the brick sizes from all the nodes?

RCA: There was a bug with the weighted-rebalance option in version 3.7.1-11 where the sum of the sizes of all the bricks was stored in a 32-bit unsigned integer (uint32_t). For big clusters with large bricks like the current one, where each brick is 55 TB (totaling 55 TB * 80 = 4.4 PB), the value overflows, causing incorrect chunk computation and giving rise to an overflowing layout every few bricks.

We had hit a similar bug before: https://bugzilla.redhat.com/show_bug.cgi?id=1281946
This was fixed in 3.1.2.
Patch: https://code.engineering.redhat.com/gerrit/#/c/64630/

As a workaround, the customer can turn weighted-rebalance off and remount all the clients, or upgrade to 3.1.2.

For new directory creation, we saw that DHT was sometimes assigning a proper hash range, with the gs9 bricks getting no hash range as they are almost 100% full, but we did see hash layout corruption for a few of the new directories. We are not sure under what condition DHT stops giving a hash range to new directories; it might be min-free-disk, but we are not sure.

For now we applied the workaround of turning weighted-rebalance off, and so far it is working fine. We did a lookup from new mounts (all the old mounts were unmounted) and saw the layout getting rectified. We have started a recursive lookup on all the directories in order to fix the layout for all of them.

I will close this bug as the fix is already available in newer releases. This should be marked as closed, current release.
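
The overflow described in the RCA can be illustrated with a small model. This is only a sketch, not the actual DHT code: the function names, the size unit (MB), and the exact chunk formula are assumptions chosen to show how a summed size that wraps at 32 bits inflates the per-unit chunk and pushes hash ranges past 0xFFFFFFFF.

```python
UINT32_MAX = 0xFFFFFFFF


def build_layout(brick_sizes, total_bits):
    """Give each brick a slice of the 32-bit hash space proportional to its size.

    total_bits is the width of the accumulator holding the summed sizes:
    32 models the overflowing sum, 64 models a sufficiently wide one.
    (Hypothetical helper, not a GlusterFS function.)
    """
    # Wrap the sum the way a fixed-width unsigned integer would.
    total = sum(brick_sizes) & ((1 << total_bits) - 1)
    if total == 0:
        raise ValueError("summed size wrapped to zero")
    chunk = UINT32_MAX / total  # hash-space units per size unit
    layout = []
    start = 0
    for size in brick_sizes:
        span = int(chunk * size)
        stop = (start + span - 1) & UINT32_MAX  # range end, wrapped to 32 bits
        layout.append((start, stop))
        start = (stop + 1) & UINT32_MAX
    return layout


def count_wrapped(layout):
    """Count ranges whose end wrapped past 0xFFFFFFFF (stop < start)."""
    return sum(1 for start, stop in layout if stop < start)


# Assumed units: 55 TB per brick expressed in MB, 80 bricks.
BRICKS = [55_000_000] * 80

good = build_layout(BRICKS, 64)
bad = build_layout(BRICKS, 32)
print("wrapped ranges, 64-bit total:", count_wrapped(good))
print("wrapped ranges, 32-bit total:", count_wrapped(bad))
```

Under these assumptions the 64-bit total produces contiguous ranges that stay inside the hash space, while the 32-bit total wraps as early as the second brick, which matches the "overflowing layout every few bricks" symptom.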