Bug 1450698
| Summary: | a core generated when running regression test /tests/bugs/glusterd/bug-1004744.t | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Zhou Zhengping <johnzzpcrystal> |
| Component: | distribute | Assignee: | Nithya Balachandran <nbalacha> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | mainline | CC: | bugs, moagrawa, nbalacha, srangana |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | glusterfs-4.1.3 (or higher) | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-08-29 03:18:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1452102 | ||
| Bug Blocks: | |||
Description
Zhou Zhengping
2017-05-14 21:59:18 UTC
Zhou Zhengping
5:40 AM
Patch Set 3:
http://build.gluster.org/job/centos6-regression/4583/consoleFull :
FAILED
18:52:49 1 test(s) generated core
18:52:49 ./tests/bugs/glusterd/bug-1004744.t
There seems to be a bug in the dht xlator when bricks are added: a race condition can occur when dht uses conf->subvolume_cnt as the bound of a for loop, like this:
local->call_cnt = conf->subvolume_cnt; //would be changed when add bricks
for (i = 0; i < conf->subvolume_cnt; i++) {
STACK_WIND_COOKIE (frame, dht_selfheal_dir_mkdir_lookup_cbk...
}
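To illustrate the suspected race, here is a minimal, single-threaded sketch. It is not GlusterFS code: `struct conf`, `struct local`, `wind_rereading`, `wind_snapshot` and the `grow_at` parameter are hypothetical stand-ins, and the add-brick operation is simulated by incrementing the count inside the loop. It only shows why reading the count once into a local variable keeps the expected call count and the number of winds in agreement; it is not the actual patch.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the dht conf and local structures. */
struct conf {
    int subvolume_cnt;
};

struct local {
    int call_cnt; /* number of callbacks the caller expects */
    int wound;    /* number of calls actually wound */
};

/* Pattern from the report: the loop bound is re-read from conf on every
 * iteration, so a concurrent add-brick (simulated at index grow_at)
 * changes how many calls are wound.
 * Returns the difference between wound calls and expected callbacks. */
static int wind_rereading(struct conf *conf, struct local *local, int grow_at)
{
    int i;

    local->call_cnt = conf->subvolume_cnt;
    local->wound = 0;
    for (i = 0; i < conf->subvolume_cnt; i++) {
        if (i == grow_at)
            conf->subvolume_cnt++; /* simulated add-brick */
        local->wound++;            /* stands in for STACK_WIND_COOKIE */
    }
    return local->wound - local->call_cnt;
}

/* Safer pattern: read the count once and use the local copy for both
 * the expected call count and the loop bound. */
static int wind_snapshot(struct conf *conf, struct local *local, int grow_at)
{
    int i;
    int call_cnt = conf->subvolume_cnt; /* single read */

    local->call_cnt = call_cnt;
    local->wound = 0;
    for (i = 0; i < call_cnt; i++) {
        if (i == grow_at)
            conf->subvolume_cnt++; /* simulated add-brick */
        local->wound++;
    }
    return local->wound - local->call_cnt;
}
```

With a starting count of 2 and the simulated add-brick at index 0, `wind_rereading` winds one more call than `call_cnt` accounts for, while `wind_snapshot` winds exactly `call_cnt` calls.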
It seems core is generated by this test case /tests/bugs/glusterd/bug-1004744.t.
(In reply to Mohit Agrawal from comment #2)
> It seems core is generated by this test case
> /tests/bugs/glusterd/bug-1004744.t.
From the core dump:
Program terminated with signal 11, Segmentation fault.
#0 0x00007fe7b7d69bf1 in dht_selfheal_dir_mkdir_lookup_done (frame=0x7fe7a8002d70, this=0x7fe7b80115b0)
at /home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-selfheal.c:1338
warning: Source file is more recent than executable.
1338 gf_msg_debug (this->name, 0,
(gdb) l
1333
1334 for (i = 0; i < layout->cnt; i++) {
1335 if (layout->list[i].err == ESTALE ||
1336 layout->list[i].err == ENOENT ||
1337 local->selfheal.force_mkdir) {
1338 gf_msg_debug (this->name, 0,
1339 "Creating directory %s on subvol %s",
1340 loc->path, layout->list[i].xlator->name);
1341
1342 STACK_WIND_COOKIE (frame, dht_selfheal_dir_mkdir_cbk,
(gdb) p loc->path
$1 = 0x0
(gdb) p *layout
$2 = {spread_cnt = -1379869184, cnt = 8382430, preset = 30064640, commit_hash = 8382392, gen = 41216, type = -768, ref = 32895, search_unhashed = _gf_false,
list = 0x7fe7ac00b330}
(gdb) p i
$3 = 469
(gdb) p loc
$4 = (loc_t *) 0x7fe7a8001d08
(gdb) p layout->list[i].xlator->name
Cannot access memory at address 0x0
(gdb) p *frame
$5 = {root = 0xfffffffffffffffd, parent = 0x0, frames = {next = 0x7fe7a8002d80, prev = 0x7fe7a8002d80}, local = 0x0, this = 0x0, ret = 0x0, ref_count = -1, lock = {
spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = -1476332528, __spins = 32743, __list = {__prev = 0x7,
__next = 0x7fe7a800f4d0}}, __size = '\000' <repeats 16 times>, "\020\364\000\250\347\177\000\000\a\000\000\000\000\000\000\000\320\364\000\250\347\177\000",
__align = 0}}, cookie = 0x0, complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x0,
wind_to = 0x0, unwind_from = 0x0, unwind_to = 0x0}
The frame and the layout variable have already been freed because the call has unwound.
This is similar to the issue reported in bug 1452102 and has been fixed by https://review.gluster.org/17343.
Marking this dependent on 1452102 and moving it to Modified.