Bug 1762438
Summary: | DHT- gluster rebalance status shows wrong data size after rebalance is completed successfully | ||
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Sanju <srakonde> |
Component: | glusterd | Assignee: | Sanju <srakonde> |
Status: | CLOSED NEXTRELEASE | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | mainline | CC: | amukherj, bmekala, bshetty, bugs, nbalacha, nchilaka, rhs-bugs, saraut, spalai, storage-qa-internal, vbellur |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | 1761486 | Environment: | |
Last Closed: | 2019-10-18 05:22:13 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1761486 | ||
Bug Blocks: |
Comment 1
Worker Ant
2019-10-16 18:15:33 UTC
Description of problem:
=======================
While rebalancing a gluster volume from 1*(4+2) to 2*(4+2) and then to 3*(4+2), the rebalance status shows an inconsistent (wrong) data size after the rebalance has completed. The status shows correct data until 4~6 hours after completion; after that the reported size is wrong. This was seen in both the 2*(4+2) and 3*(4+2) configurations. Although the actual size was in GB, after completion of the rebalance it is shown in PB.

(Made the initial description private as it contained internal data on server hostnames.)

RCA:

--- Additional comment from Sanju on 2019-10-16 15:28:22 IST ---

Looks like this is a day-1 issue. From gdb:

glusterd_store_retrieve_node_state (volinfo=volinfo@entry=0x555555e67fc0) at glusterd-store.c:2980
2980 volinfo->rebal.rebalance_data = atoi(value);
4: value = 0x5555557dfea0 "3145728000"
3: key = 0x5555557e41d0 "size"
2: volinfo->rebal.rebalance_data = 0
1: volinfo->volname = "test1", '\000' <repeats 250 times>
(gdb) n
3048 GF_FREE(key);
4: value = 0x5555557dfea0 "3145728000"
3: key = 0x5555557e41d0 "size"
2: volinfo->rebal.rebalance_data = 18446744072560312320
1: volinfo->volname = "test1", '\000' <repeats 250 times>
(gdb) p atoi(value)
$20 = -1149239296
(gdb) p/u atoi(value)
$21 = 3145728000

The issue is that atoi() returns a negative value because "3145728000" overflows int. When that negative int is assigned to the unsigned 64-bit rebalance_data field, it is sign-extended into a huge value (2^64 - 1149239296 = 18446744072560312320). The following gdb statements demonstrate this:

(gdb) set volinfo->rebal.rebalance_data=atoi("314572800")
(gdb) p volinfo->rebal.rebalance_data
$41 = 314572800
(gdb) set volinfo->rebal.rebalance_data=atoi("3145728000")
(gdb) p volinfo->rebal.rebalance_data
$42 = 18446744072560312320 <-- wrong value
(gdb)

REVIEW: https://review.gluster.org/23560 (glusterd: display correct rebalance data size after glusterd restart) merged (#4) on master by Atin Mukherjee
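The arithmetic behind the gdb output can be reproduced outside glusterd. The following is a minimal sketch in C, not the merged patch: `wrap_to_int32` and `parse_u64` are hypothetical helper names introduced here for illustration. The first models what a 32-bit int ends up holding when a larger value is squeezed into it (calling atoi() on an out-of-range string is undefined behaviour in C, so the sketch computes the wraparound explicitly instead). The second shows one possible way to read the stored "size" string directly into a 64-bit unsigned value with strtoull().

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: model the value a 32-bit two's-complement int
 * holds after a larger number is truncated into it. This mirrors what
 * the gdb session shows for atoi("3145728000") without relying on
 * undefined behaviour.
 */
static int64_t wrap_to_int32(uint64_t v)
{
    uint64_t low = v & UINT64_C(0xFFFFFFFF);

    if (low >= UINT64_C(0x80000000))
        return (int64_t)low - INT64_C(0x100000000);
    return (int64_t)low;
}

/*
 * Hypothetical helper: parse a non-negative decimal string into a
 * uint64_t. Returns 0 on success, -1 if the string is empty, does not
 * start with a digit, has trailing characters, or is out of range.
 */
static int parse_u64(const char *s, uint64_t *out)
{
    char *end = NULL;
    unsigned long long v;

    if (s == NULL || !isdigit((unsigned char)*s))
        return -1;

    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return -1;

    *out = (uint64_t)v;
    return 0;
}
```

With these helpers, `wrap_to_int32(3145728000)` gives -1149239296, and converting that to uint64_t gives 18446744072560312320, the same two numbers seen in the gdb session. `parse_u64("3145728000", &size)` stores 3145728000 unchanged, because the value never passes through a 32-bit int.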