Summary: Rebalance is causing glusterfs crash on client node
Product: [Community] GlusterFS
Component: distribute
Version: mainline
Status: CLOSED NEXTRELEASE
Severity: high
Priority: unspecified
Reporter: Nithya Balachandran <nbalacha>
Assignee: Nithya Balachandran <nbalacha>
CC: bugs, moagrawa, nbalacha, rhs-bugs, saraut, spalai, storage-qa-internal, tdesala, ubansal, vbellur
Keywords: Reopened
Hardware: Unspecified
OS: Unspecified
Clone Of: 1756325
Last Closed: 2020-03-03 07:46:57 UTC
Bug Depends On: 1756325, 1759141, 1760779
Bug Blocks: 1769315, 1786983, 1804522

An unsafe loop while processing fds in the dht rebalance check tasks caused the client process to crash as it was operating on an fd that had already been freed.

It looks like the fd has already been freed.

fd->inode is set to NULL in fd_destroy. fds are allocated from the mempools using mem_get.

Checking the pool header info:

(gdb) f 1
#1  0x00007f3923004af7 in fd_unref (fd=0x7f3910ccec28) at fd.c:515
515         LOCK(&fd->inode->lock);
(gdb) p *fd
$1 = {pid = 13340, flags = 33345, refcount = {lk = 0x7f3910ccec38 "\t", value = 9},
  inode_list = {next = 0x7f3910ccec40, prev = 0x7f3910ccec40}, inode = 0x0, lock = {
    spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
        __kind = -1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19 times>,
      __align = 0}}, _ctx = 0x7f3910c73b70, xl_count = 39, lk_ctx = 0x7f39100ede90,
  anonymous = false}
(gdb) p ((pooled_obj_hdr_t *)fd)-1
$2 = (pooled_obj_hdr_t *) 0x7f3910ccec00
(gdb) p sizeof(pooled_obj_hdr_t)
$3 = 40
(gdb) p/x sizeof(pooled_obj_hdr_t)
$4 = 0x28
(gdb) p *$2
$5 = {magic = 3735929054, next = 0x7f3910c3cec0, pool_list = 0x7f3910000960, power_of_two = 8, pool = 0x7f39100605c0}
(gdb) p/x *$2
$6 = {magic = 0xdeadc0de, next = 0x7f3910c3cec0, pool_list = 0x7f3910000960, power_of_two = 0x8, pool = 0x7f39100605c0}
(gdb)

$6->magic = 0xdeadc0de

#define GF_MEM_INVALID_MAGIC 0xDEADC0DE

In mem_put:

hdr->magic = GF_MEM_INVALID_MAGIC;

As fd_destroy calls mem_put, this indicates that the memory has already been freed.
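
The header check used in the gdb session above can be sketched in C as follows. This is a minimal illustration of the idea only, not the GlusterFS mempool code: every sketch_* name is hypothetical, the header is reduced to a magic field and a free-list link, and the "valid" magic value is chosen purely for illustration. Only the invalid value 0xDEADC0DE is taken from the comment above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the GlusterFS definitions. */
#define SKETCH_MAGIC_VALID   0xCAFEBABEu /* illustrative value (assumption) */
#define SKETCH_MAGIC_INVALID 0xDEADC0DEu /* value quoted in the bug comment */

/* Header stored immediately before each object handed to the caller.
 * The alignment request keeps the payload suitably aligned. */
typedef struct sketch_hdr {
    _Alignas(max_align_t) uint32_t magic;
    struct sketch_hdr *next; /* free-list link, used after a put */
} sketch_hdr_t;

static sketch_hdr_t *sketch_free_list = NULL;

/* Allocate header plus payload and return a pointer to the payload. */
static void *sketch_get(size_t size)
{
    sketch_hdr_t *hdr = calloc(1, sizeof(*hdr) + size);
    if (hdr == NULL)
        return NULL;
    hdr->magic = SKETCH_MAGIC_VALID;
    return hdr + 1;
}

/* Return an object to the pool: mark the header invalid and keep the
 * memory on a free list instead of handing it back to the allocator. */
static void sketch_put(void *obj)
{
    sketch_hdr_t *hdr = (sketch_hdr_t *)obj - 1;
    hdr->magic = SKETCH_MAGIC_INVALID;
    hdr->next = sketch_free_list;
    sketch_free_list = hdr;
}

/* Same step as "p ((pooled_obj_hdr_t *)fd)-1" in gdb: walk back one
 * header from the object pointer and inspect the magic value. */
static int sketch_is_live(const void *obj)
{
    const sketch_hdr_t *hdr = (const sketch_hdr_t *)obj - 1;
    return hdr->magic == SKETCH_MAGIC_VALID;
}
```

Because the sketch keeps returned objects on a free list, reading the header after sketch_put is still a valid memory access here, which is what makes the magic value usable as a post-mortem hint that an object was already released.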

To double-check, examine the memory header for fd->_ctx, which is allocated using GF_CALLOC:

(gdb) p fd->_ctx
$13 = (struct _fd_ctx *) 0x7f3910c73b70
(gdb) p *(((struct mem_header *)0x7f3910c73b70) -1)
$14 = {type = 269061216, size = 139883059273280, mem_acct = 0x0, magic = 0, padding = {0, 0, 0, 0, 0, 0, 0, 0}}
(gdb) p/x *(((struct mem_header *)0x7f3910c73b70) -1)
$15 = {type = 0x10098c60, size = 0x7f39100ede40, mem_acct = 0x0, magic = 0x0, padding = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}
(gdb)

The header struct members are invalid, which is consistent with fd->_ctx having been freed as well.
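
The report attributes the crash to an unsafe loop over fds but does not show the loop itself. The following C sketch illustrates one common way such a loop can be made safe when processing an entry may free it: read the link to the next entry before dropping the reference. It is an assumption-laden illustration, not the dht code or the actual fix; the sketch_fd_t type and all sketch_* functions are hypothetical, and locking is omitted entirely.

```c
#include <stdlib.h>

/* Hypothetical ref-counted entry on a singly linked list. */
typedef struct sketch_fd {
    int refcount;
    struct sketch_fd *next;
} sketch_fd_t;

/* Drop one reference and free the entry when the count reaches zero.
 * Returns 1 if the entry was destroyed, 0 otherwise. */
static int sketch_fd_unref(sketch_fd_t *fd)
{
    if (--fd->refcount > 0)
        return 0;
    free(fd);
    return 1;
}

/* Walk the list and drop one reference on every entry.
 * An unsafe version would advance with "fd = fd->next" after the
 * unref, reading from memory that may already have been freed.
 * Returns the number of entries destroyed. */
static int sketch_release_all(sketch_fd_t *head)
{
    int destroyed = 0;
    sketch_fd_t *fd = head;

    while (fd != NULL) {
        sketch_fd_t *next = fd->next; /* saved before fd may be freed */
        destroyed += sketch_fd_unref(fd);
        fd = next;
    }
    return destroyed;
}

/* Build a list of n entries, each holding a single reference.
 * Returns NULL if n <= 0 or if an allocation fails. */
static sketch_fd_t *sketch_list_new(int n)
{
    sketch_fd_t *head = NULL;

    for (int i = 0; i < n; i++) {
        sketch_fd_t *fd = calloc(1, sizeof(*fd));
        if (fd == NULL) {
            sketch_release_all(head);
            return NULL;
        }
        fd->refcount = 1;
        fd->next = head;
        head = fd;
    }
    return head;
}
```

In a multi-threaded process such as the glusterfs client, saving the next pointer is not sufficient on its own; the list would also need to be protected by a lock, or each entry pinned with an extra reference while it is in use.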