Bug 1415038
Summary: [Ganesha + EC] : Input/Output Error while creating LOTS of smallfiles
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: distribute
Version: rhgs-3.2
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Reporter: Ambarish <asoman>
Assignee: Pranith Kumar K <pkarampu>
QA Contact: Ambarish <asoman>
CC: amukherj, asoman, aspandey, asrivast, bturner, jthottan, pkarampu, rcyriac, rhinduja, rhs-bugs, skoduri, storage-qa-internal
Target Milestone: ---
Target Release: RHGS 3.3.0
Fixed In Version: glusterfs-3.8.4-23
Doc Type: If docs needed, set a value
Story Points: ---
Cloned As: 1438411 (view as bug list)
Last Closed: 2017-09-21 04:30:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Blocks: 1417147, 1438411, 1438423, 1438424
Description
Ambarish
2017-01-20 03:42:30 UTC
On the nodes those clients are connected to (i.e., with VIP 192.168.97.152 & 192.168.97.150), I see ENOTCONN errors:

=== gqas005 ===
19/01/2017 14:34:52 : epoch 48930000 : gqas005.sbu.lab.eng.bos.redhat.com : ganesha.nfsd-28019[work-41] glusterfs_setattr2 :FSAL :CRIT :setattrs failed with error Transport endpoint is not connected

********** On gqac011 **********
Mount: 192.168.97.150:/butcher

[root@gqac011 W]# for i in {1..25};do mkdir dir$i;cp linux-4.9.4.tar.xz dir$i;cd dir$i;tar xvfJ linux-4.9.4.tar.xz >> /tar$i.log;cd ..;done
cp: error reading ‘linux-4.9.4.tar.xz’: Input/output error
cp: failed to extend ‘dir1/linux-4.9.4.tar.xz’: Input/output error

From ganesha-gfapi.log:

[2017-01-19 19:34:52.984372] I [MSGID: 122002] [ec-heal.c:2368:ec_heal_do] 0-butcher-disperse-0: /W/linux-4.9.4.tar.xz: name heal failed [Transport endpoint is not connected]

There are many ENOTCONN errors on gqas006 as well.

This is not necessarily a multi-volume issue: I could reproduce it with a single EC volume as well, with small-file creates and kernel untars. Adjusting the summary accordingly.
[root@gqas009 gluster]# grep -i mismatch /var/log/ganesha-gfapi.log
[2017-01-22 05:34:47.319714] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 0-testvol-disperse-7: Mismatching GFID's in loc
[2017-01-22 11:18:11.939976] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 10-testvol-disperse-7: Mismatching GFID's in loc
[2017-01-22 12:35:54.265400] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 10-testvol-disperse-7: Mismatching GFID's in loc
[2017-01-22 13:03:13.471747] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 10-testvol-disperse-4: Mismatching GFID's in loc
[2017-01-22 13:04:04.590037] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 10-testvol-disperse-4: Mismatching GFID's in loc
[2017-01-22 13:10:33.799298] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 10-testvol-disperse-4: Mismatching GFID's in loc
[2017-01-22 13:11:06.763801] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 10-testvol-disperse-5: Mismatching GFID's in loc
[2017-01-22 13:43:24.219911] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 10-testvol-disperse-7: Mismatching GFID's in loc
[2017-01-22 14:35:53.688066] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 10-testvol-disperse-6: Mismatching GFID's in loc
[2017-01-23 03:09:18.811044] W [MSGID: 122019] [ec-helpers.c:354:ec_loc_gfid_check] 10-testvol-disperse-7: Mismatching GFID's in loc

Thanks to Pranith & Ashish.
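For triage, the per-subvolume spread of these warnings can be tallied with a small helper. This is only a convenience sketch: the function name is illustrative, and the `testvol` volume name and log path match the output above.

```shell
# Count "Mismatching GFID" warnings per disperse subvolume in a
# ganesha-gfapi.log.  The graph-id prefix (e.g. "10-") is stripped
# so warnings from different graph generations are aggregated.
count_gfid_mismatches() {
    grep -i "Mismatching GFID" "$1" \
        | sed -n 's/.*\(testvol-disperse-[0-9][0-9]*\).*/\1/p' \
        | sort | uniq -c | sort -rn
}
# Usage: count_gfid_mismatches /var/log/ganesha-gfapi.log
```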
Below is a snippet of the debugging done on the core:
>>>>
(gdb) fr 7
#7 0x00007fde6c184df2 in ec_manager_lookup (fop=0x7fde575a5a80, state=<optimized out>) at ec-generic.c:837
837 ec_lookup_rebuild(fop->xl->private, fop, cbk);
(gdb) p *fop->req_frame->parent->parent->parent->parent
$17 = {root = 0x7fde75452350, parent = 0x7fde755344d4, frames = {next = 0x7fde755344e4, prev = 0x7fde7559c148}, local = 0x7fdc240cc420, this = 0x7fde6007ae20, ret = 0x7fde7a7de090 <default_lookup_cbk>, ref_count = 1, lock = {
spinlock = 1, mutex = {__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001", '\000' <repeats 38 times>, __align = 1}},
cookie = 0x7fde7555340c, complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7fde7a804d90 <__FUNCTION__.16554> "default_lookup_resume",
wind_to = 0x7fde7a803d30 "FIRST_CHILD(this)->fops->lookup", unwind_from = 0x0, unwind_to = 0x7fde7a8043cd "default_lookup_cbk"}
(gdb) p *fop->req_frame->parent->parent->parent->parent->parent
$18 = {root = 0x7fde75452350, parent = 0x7fde755929ac, frames = {next = 0x7fde755929bc, prev = 0x7fde7555341c}, local = 0x0, this = 0x7fde6007c300, ret = 0x7fde5eafbb30 <io_stats_lookup_cbk>, ref_count = 1, lock = {spinlock = 1,
mutex = {__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001", '\000' <repeats 38 times>, __align = 1}}, cookie = 0x7fde755344d4,
complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7fde5eb04ec0 <__FUNCTION__.16584> "io_stats_lookup", wind_to = 0x7fde5eb037b8 "FIRST_CHILD(this)->fops->lookup",
unwind_from = 0x0, unwind_to = 0x7fde5eb02997 "io_stats_lookup_cbk"}
(gdb) p *fop->req_frame->parent->parent->parent->parent
$19 = {root = 0x7fde75452350, parent = 0x7fde755344d4, frames = {next = 0x7fde755344e4, prev = 0x7fde7559c148}, local = 0x7fdc240cc420, this = 0x7fde6007ae20, ret = 0x7fde7a7de090 <default_lookup_cbk>, ref_count = 1, lock = {
spinlock = 1, mutex = {__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001", '\000' <repeats 38 times>, __align = 1}},
cookie = 0x7fde7555340c, complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7fde7a804d90 <__FUNCTION__.16554> "default_lookup_resume",
wind_to = 0x7fde7a803d30 "FIRST_CHILD(this)->fops->lookup", unwind_from = 0x0, unwind_to = 0x7fde7a8043cd "default_lookup_cbk"}
(gdb) p *fop->req_frame->parent->parent->parent->parent->this->name
$20 = 98 'b'
(gdb) p fop->req_frame->parent->parent->parent->parent->this->name
$21 = 0x7fde6007b9c0 "butcher-md-cache"
(gdb) p (mdc_local_t*)fop->req_frame->parent->parent->parent->parent->local
$22 = (mdc_local_t *) 0x7fdc240cc420
(gdb) p *$22->loc
Structure has no component named operator*.
(gdb) p $22->loc
$23 = {path = 0x7fdc24192880 "<gfid:9d7aeb82-3c80-4316-8128-bba593c65f5c>/..", name = 0x7fdc241928ac "..", inode = 0x7fde4d02a9ec, parent = 0x7fde4d355240, gfid = '\000' <repeats 15 times>,
pargfid = "\235z\353\202<\200C\026\201(\273\245\223\306_\\"}
(gdb) p *$22->loc.inode
$24 = {table = 0x7fde607fb710, gfid = '\000' <repeats 15 times>, lock = {spinlock = 1, mutex = {__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\001", '\000' <repeats 38 times>, __align = 1}}, nlookup = 0, fd_count = 0, ref = 28, ia_type = IA_INVAL, fd_list = {next = 0x7fde4d02aa44, prev = 0x7fde4d02aa44}, dentry_list = {next = 0x7fde4d02aa54,
prev = 0x7fde4d02aa54}, hash = {next = 0x7fde4d02aa64, prev = 0x7fde4d02aa64}, list = {next = 0x7fde4d28d9b8, prev = 0x7fde4d27befc}, _ctx = 0x7fdcac0191e0}
(gdb) p (mdc_local_t*)fop->req_frame->parent->parent->parent
$25 = (mdc_local_t *) 0x7fde7559c138
(gdb) p fop->req_frame->parent->parent->parent
$26 = (call_frame_t *) 0x7fde7559c138
(gdb) p *fop->req_frame->parent->parent->parent
$27 = {root = 0x7fde75452350, parent = 0x7fde7555340c, frames = {next = 0x7fde7555341c, prev = 0x7fde7558a6e4}, local = 0x7fde4d02a9ec, this = 0x7fde600784a0, ret = 0x7fde5ef22600 <mdc_lookup_cbk>, ref_count = 1, lock = {spinlock = 1,
mutex = {__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001", '\000' <repeats 38 times>, __align = 1}}, cookie = 0x7fde7559c138,
complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7fde5ef2856f <__FUNCTION__.15558> "mdc_lookup", wind_to = 0x7fde7a803d30 "FIRST_CHILD(this)->fops->lookup",
unwind_from = 0x0, unwind_to = 0x7fde5ef27709 "mdc_lookup_cbk"}
(gdb) p *fop->req_frame->parent->parent->parent->this->name
$28 = 98 'b'
(gdb) p fop->req_frame->parent->parent->parent->this->name
$29 = 0x7fde60079040 "butcher-quick-read"
(gdb) p fop->req_frame->parent->parent
$30 = (call_frame_t *) 0x7fde7558a6d4
(gdb) p *fop->req_frame->parent->parent
$31 = {root = 0x7fde75452350, parent = 0x7fde7559c138, frames = {next = 0x7fde7559c148, prev = 0x7fde754d2f30}, local = 0x7fde6009ad54, this = 0x7fde60076d80, ret = 0x7fde5f33ab30 <qr_lookup_cbk>, ref_count = 1, lock = {spinlock = 1,
mutex = {__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001", '\000' <repeats 38 times>, __align = 1}}, cookie = 0x7fde7558a6d4,
complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7fde5f33cae0 <__FUNCTION__.16084> "qr_lookup", wind_to = 0x7fde5f33c4c8 "FIRST_CHILD(this)->fops->lookup",
unwind_from = 0x0, unwind_to = 0x7fde5f33c7a4 "qr_lookup_cbk"}
(gdb) p *fop->req_frame->parent->parent->this->name
$32 = 98 'b'
(gdb) p fop->req_frame->parent->parent->this->name
$33 = 0x7fde60076c00 "butcher-io-cache"
(gdb) p ((ioc_local_t*)fop->req_frame->parent->parent->local)->file_loc
$34 = {path = 0x7fdc240798a0 "<gfid:9d7aeb82-3c80-4316-8128-bba593c65f5c>/..", name = 0x7fdc240798cc "..", inode = 0x7fde4d02a9ec, parent = 0x7fde4d355240, gfid = '\000' <repeats 15 times>,
pargfid = "\235z\353\202<\200C\026\201(\273\245\223\306_\\"}
(gdb) p *fop->req_frame->parent
$35 = {root = 0x7fde75452350, parent = 0x7fde7558a6d4, frames = {next = 0x7fde7558a6e4, prev = 0x7fde754ce3d4}, local = 0x7fde5de5b3cc, this = 0x7fde60070d00, ret = 0x7fde5f546c60 <ioc_lookup_cbk>, ref_count = 3, lock = {spinlock = 1,
mutex = {__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001", '\000' <repeats 38 times>, __align = 1}}, cookie = 0x7fde754d2f20,
complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7fde5f54ecd7 <__FUNCTION__.16334> "ioc_lookup", wind_to = 0x7fde7a803d30 "FIRST_CHILD(this)->fops->lookup",
unwind_from = 0x0, unwind_to = 0x7fde5f54df48 "ioc_lookup_cbk"}
(gdb) p fop->req_frame->parent->this->name
$36 = 0x7fde60062720 "butcher-dht"
(gdb) p (dht_local_t*)fop->req_frame->parent->local
$37 = (dht_local_t *) 0x7fde5de5b3cc
(gdb) p $37->loc
$38 = {path = 0x7fdc240420d0 "<gfid:9d7aeb82-3c80-4316-8128-bba593c65f5c>/..", name = 0x7fdc240420fc "..", inode = 0x7fde4d02a9ec, parent = 0x7fde4d355240, gfid = ";\202\020^@eC\024\240\306\b\365l,\270V",
pargfid = "\235z\353\202<\200C\026\201(\273\245\223\306_\\"}
(gdb) p /x $37->loc.gfid
$39 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56}
(gdb) fr 7
#7 0x00007fde6c184df2 in ec_manager_lookup (fop=0x7fde575a5a80, state=<optimized out>) at ec-generic.c:837
837 ec_lookup_rebuild(fop->xl->private, fop, cbk);
(gdb) p /x fop->loc[0].gfid
$40 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56}
(gdb) fr 6
#6 0x00007fde6c184b85 in ec_lookup_rebuild (ec=<optimized out>, fop=fop@entry=0x7fde575a5a80, cbk=cbk@entry=0x7fde56dd8aa8) at ec-generic.c:644
644 err = ec_loc_update(fop->xl, &fop->loc[0], cbk->inode, &cbk->iatt[0]);
(gdb) p /x cbk->iatt[0].ia_gfid
$41 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56}
(gdb) p /x $37->gfid
$42 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c, 0x2c, 0xb8, 0x56}
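Note the torn value in $40: the first six bytes of fop->loc[0].gfid are zeroed while the last ten still match the server-returned GFID in $41/$42, which is what a concurrent re-initialization of the shared loc would leave behind. The mismatch EC complains about can be demonstrated standalone with a check in the spirit of ec_loc_gfid_check. This is only a sketch; the types and function names are illustrative, not the glusterfs sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 16-byte GFID, the uuid format GlusterFS uses. */
typedef uint8_t gfid_t[16];

/* The two values from the core: fop->loc[0].gfid ($40) and the
 * GFID returned by the bricks in cbk->iatt[0].ia_gfid ($41). */
static const gfid_t loc_gfid_from_core = {
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x43, 0x14,
    0xa0, 0xc6, 0x08, 0xf5, 0x6c, 0x2c, 0xb8, 0x56
};
static const gfid_t iatt_gfid_from_core = {
    0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14,
    0xa0, 0xc6, 0x08, 0xf5, 0x6c, 0x2c, 0xb8, 0x56
};

/* Illustrative version of the check behind "Mismatching GFID's in
 * loc": a loc gfid is acceptable if it is still unset (all zero)
 * or byte-identical to the server-returned gfid.  A half-zeroed
 * value like $40 fails both tests. */
static int gfid_matches(const gfid_t loc_gfid, const gfid_t srv_gfid)
{
    static const gfid_t null_gfid = { 0 };

    if (memcmp(loc_gfid, null_gfid, sizeof(gfid_t)) == 0)
        return 1; /* unset: nothing to validate yet */
    return memcmp(loc_gfid, srv_gfid, sizeof(gfid_t)) == 0;
}
```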
(gdb) p *fop->req_frame
$43 = {root = 0x7fde75452350, parent = 0x7fde754d2f20, frames = {next = 0x7fde75526d94, prev = 0x7fde75546864}, local = 0x0, this = 0x7fde6006c320, ret = 0x7fde5fdafd90 <dht_lookup_dir_cbk>, ref_count = 0, lock = {spinlock = 1, mutex = {
__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001", '\000' <repeats 38 times>, __align = 1}}, cookie = 0x7fde75503c0c,
complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7fde5fdee590 <__FUNCTION__.18890> "dht_lookup_directory",
wind_to = 0x7fde5fde9528 "conf->subvolumes[i]->fops->lookup", unwind_from = 0x0, unwind_to = 0x7fde5fdeb3cd "dht_lookup_dir_cbk"}
(gdb) shell
[root@gqas013 tmp]# gluster v info butcher
Volume Name: butcher
Type: Distributed-Disperse
Volume ID: 5be704df-e1e1-43a2-8f67-5bf720efe9e5
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x (4 + 2) = 72
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick2: gqas014.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick3: gqas015.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick4: gqas005.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick5: gqas006.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick6: gqas008.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick7: gqas013.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick8: gqas014.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick9: gqas015.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick10: gqas005.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick11: gqas006.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick12: gqas008.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick13: gqas013.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick14: gqas014.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick15: gqas015.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick16: gqas005.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick17: gqas006.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick18: gqas008.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick19: gqas013.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick20: gqas014.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick21: gqas015.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick22: gqas005.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick23: gqas006.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick24: gqas008.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick25: gqas013.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick26: gqas014.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick27: gqas015.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick28: gqas005.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick29: gqas006.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick30: gqas008.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick31: gqas013.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick32: gqas014.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick33: gqas015.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick34: gqas005.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick35: gqas006.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick36: gqas008.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick37: gqas013.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick38: gqas014.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick39: gqas015.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick40: gqas005.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick41: gqas006.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick42: gqas008.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick43: gqas013.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick44: gqas014.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick45: gqas015.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick46: gqas005.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick47: gqas006.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick48: gqas008.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick49: gqas013.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick50: gqas014.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick51: gqas015.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick52: gqas005.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick53: gqas006.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick54: gqas008.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick55: gqas013.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick56: gqas014.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick57: gqas015.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick58: gqas005.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick59: gqas006.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick60: gqas008.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick61: gqas013.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick62: gqas014.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick63: gqas015.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick64: gqas005.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick65: gqas006.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick66: gqas008.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick67: gqas013.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick68: gqas014.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick69: gqas015.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick70: gqas005.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick71: gqas006.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick72: gqas008.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: on
nfs.disable: on
performance.readdir-ahead: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
[root@gqas013 tmp]# exit
(gdb) p *fop->req_frame
$44 = {root = 0x7fde75452350, parent = 0x7fde754d2f20, frames = {next = 0x7fde75526d94, prev = 0x7fde75546864}, local = 0x0, this = 0x7fde6006c320, ret = 0x7fde5fdafd90 <dht_lookup_dir_cbk>, ref_count = 0, lock = {spinlock = 1, mutex = {
__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001", '\000' <repeats 38 times>, __align = 1}}, cookie = 0x7fde75503c0c,
complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7fde5fdee590 <__FUNCTION__.18890> "dht_lookup_directory",
wind_to = 0x7fde5fde9528 "conf->subvolumes[i]->fops->lookup", unwind_from = 0x0, unwind_to = 0x7fde5fdeb3cd "dht_lookup_dir_cbk"}
<<<<
Pranith observed that loc->gfid had been set in the DHT layer but was modified by the time EC validated it in ec_lookup_cbk. He will be able to provide more details on the code flow in which this issue arises.
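The window exists because the loc_t (and its gfid) is shared down the whole translator stack, so a re-initialization by one layer is visible to every other layer still holding the pointer. One defensive pattern is to snapshot the gfid into fop-local storage when the fop is created and validate the callback against that copy. The following is only an illustration of that idea with stub types; it is not the actual upstream fix.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t gfid_t[16];

/* Minimal stand-in for the gfid-carrying part of a loc_t that is
 * shared by every translator in the graph. */
typedef struct {
    gfid_t gfid;
} loc_stub_t;

/* Fop-private state: keeps its own copy of the gfid taken at
 * creation time, so later writers to the shared loc cannot
 * invalidate the comparison done in the lookup callback. */
typedef struct {
    loc_stub_t *loc;   /* shared, may change under us */
    gfid_t gfid_copy;  /* private snapshot */
} fop_stub_t;

static void fop_init(fop_stub_t *fop, loc_stub_t *loc)
{
    fop->loc = loc;
    memcpy(fop->gfid_copy, loc->gfid, sizeof(gfid_t));
}

/* Validate the server-returned gfid against the snapshot, not
 * against the live (shared) loc->gfid. */
static int fop_gfid_valid(const fop_stub_t *fop, const gfid_t srv_gfid)
{
    return memcmp(fop->gfid_copy, srv_gfid, sizeof(gfid_t)) == 0;
}
```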
https://review.gluster.org/16986 - Patch upstream

I could not reproduce this in two attempts to recreate it on 3.8.4-25. Since it's a stress test, I'll be revisiting it during 3.3 and will reopen if I hit it again. Moving this to Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774