Bug 960368
| Summary: | Hypervisor mount crashed after rebalance | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | shylesh <shmohan> |
| Component: | glusterfs | Assignee: | Pranith Kumar K <pkarampu> |
| Status: | CLOSED ERRATA | QA Contact: | shylesh <shmohan> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | unspecified | CC: | amarts, grajaiya, pkarampu, rhs-bugs, sdharane, sgowda, surs, vbellur |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.4.0.6rhs-1 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | virt rhev integration |
| : | 961615 (view as bug list) | | |
| Last Closed: | 2013-09-23 22:35:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 923534, 961615 | | |
Description
shylesh
2013-05-07 06:25:44 UTC
Please provide access to the host where the crash was seen.

Looks like a flush call has come in on a freed fd:
(gdb) bt
#0 0x000000362780c170 in pthread_spin_lock () from /lib64/libpthread.so.0
#1 0x00007fa3304de4c3 in fd_ref (fd=0x2d54124) at fd.c:447
#2 0x00007fa32e8afd4e in fuse_resolve_fd (state=<value optimized out>) at fuse-resolve.c:461
#3 fuse_resolve (state=<value optimized out>) at fuse-resolve.c:622
#4 0x00007fa32e8b026e in fuse_resolve_all (state=<value optimized out>) at fuse-resolve.c:665
#5 0x00007fa32e8b02b8 in fuse_resolve_and_resume (state=0x7fa31c038cb0, fn=0x7fa32e8b87a0 <fuse_flush_resume>) at fuse-resolve.c:705
#6 0x00007fa32e8c66f8 in fuse_thread_proc (data=0x100fb70) at fuse-bridge.c:4562
#7 0x0000003627807851 in start_thread () from /lib64/libpthread.so.0
#8 0x00000036274e890d in clone () from /lib64/libc.so.6
(gdb) f 1
#1 0x00007fa3304de4c3 in fd_ref (fd=0x2d54124) at fd.c:447
447 LOCK (&fd->inode->lock);
(gdb) p fd->inode
$12 = (struct _inode *) 0xaaaaaaaa
(gdb) p *fd
$13 = {pid = 0, flags = 0, refcount = 0, inode_list = {next = 0x2d54134, prev = 0x2d54134}, inode = 0xaaaaaaaa, lock = 1, _ctx = 0x7fa31c03a090, xl_count = 32, lk_ctx = 0x7fa31c012230, anonymous = _gf_true} <=====ref count is zero
(gdb) f 5
#5 0x00007fa32e8b02b8 in fuse_resolve_and_resume (state=0x7fa31c038cb0, fn=0x7fa32e8b87a0 <fuse_flush_resume>) at fuse-resolve.c:705
705 fuse_resolve_all (state);
(gdb) p activefd
$15 = (fd_t *) 0x2d54124
(gdb) p *basefd
$17 = {pid = 8614, flags = 573442, refcount = 4, inode_list = {next = 0x7fa31b48f370, prev = 0x7fa31b48f370}, inode = 0x7fa31b48f338, lock = 0, _ctx = 0x7fa31c029d70, xl_count = 26, lk_ctx = 0x7fa31c00a5b0,
anonymous = _gf_false}
(gdb) p *basefd->inode
$18 = {table = 0x1482930, gfid = "K\362\247w\304\030G>\226\222\343q\343\270:\t", lock = 1, nlookup = 51, fd_count = 4294967247, ref = 1, ia_type = IA_IFREG, fd_list = {next = 0x1482d40, prev = 0x1482d40},
dentry_list = {next = 0x7fa31b1ee3c8, prev = 0x7fa31b1ee3c8}, hash = {next = 0x7fa31b1270a0, prev = 0x7fa31b1270a0}, list = {next = 0x7fa31b48f91c, prev = 0x7fa31b48f7e4}, _ctx = 0x7fa31c036100}
(gdb) p *(xlator_t *)fd->_ctx[2].xl_key
$30 = {name = 0x14a3790 "vstore", type = 0x14a37b0 "debug/io-stats", next = 0x14dc3a0, prev = 0x0, parents = 0x0, children = 0x14a6570, options = 0x7fa32ecbc8a0, dlhandle = 0x1041c30, fops = 0x7fa32b17a5c0,
cbks = 0x7fa32b17a860, dumpops = 0x0, volume_options = {next = 0x14a36c0, prev = 0x14a36c0}, fini = 0x7fa32af67f40 <fini>, init = 0x7fa32af683a0 <init>, reconfigure = 0x7fa32af68690 <reconfigure>,
mem_acct_init = 0x7fa32af68620 <mem_acct_init>, notify = 0x7fa32af70da0 <notify>, loglevel = GF_LOG_NONE, latencies = {{min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 46 times>},
history = 0x0, ctx = 0xff3010, graph = 0x14a5550, itable = 0x2c1d030, init_succeeded = 1 '\001', private = 0x14dd6e0, mem_acct = {num_types = 0, rec = 0x0}, winds = 1, switched = 0 '\000', local_pool = 0x0,
is_autoloaded = _gf_false}
(gdb) p *(xlator_t *)fd->_ctx[1].xl_key
$31 = {name = 0x149f340 "vstore-open-behind", type = 0x14a3770 "performance/open-behind", next = 0x14dba00, prev = 0x14dcd40, parents = 0x14a64d0, children = 0x14a3700, options = 0x7fa32ecbc814,
dlhandle = 0x1040bc0, fops = 0x7fa32b3824c0, cbks = 0x7fa32b382760, dumpops = 0x7fa32b382780, volume_options = {next = 0x14a36e0, prev = 0x14a36e0}, fini = 0x7fa32b17e650 <fini>,
init = 0x7fa32b17e690 <init>, reconfigure = 0x7fa32b17e7e0 <reconfigure>, mem_acct_init = 0x7fa32b17e860 <mem_acct_init>, notify = 0x7fa3304c1190 <default_notify>, loglevel = GF_LOG_NONE, latencies = {{
min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 46 times>}, history = 0x0, ctx = 0xff3010, graph = 0x14a5550, itable = 0x0, init_succeeded = 1 '\001', private = 0x149f370, mem_acct = {
num_types = 0, rec = 0x0}, winds = 0, switched = 0 '\000', local_pool = 0x0, is_autoloaded = _gf_false}
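For reference, here is a minimal standalone sketch (plain C, not glusterfs code; my_fd, my_fd_ref and my_fd_unref are illustrative names only) of the failure mode the dump above suggests: an unmatched unref drops the count to zero, the object is poisoned and freed, and a later ref dereferences the freed memory, which is what fuse_flush hit inside fd_ref.

#include <stdint.h>
#include <stdlib.h>

struct my_inode { int lock; };

struct my_fd {
        int              refcount;
        struct my_inode *inode;
};

static struct my_fd *
my_fd_ref (struct my_fd *fd)
{
        /* The real fd_ref first takes LOCK (&fd->inode->lock); on a freed,
         * poisoned fd that is the pthread_spin_lock() crash in frame #0. */
        fd->refcount++;
        return fd;
}

static void
my_fd_unref (struct my_fd *fd)
{
        if (--fd->refcount == 0) {
                /* Poison on destroy, the way 0xaaaaaaaa showed up in gdb. */
                fd->inode = (struct my_inode *) (uintptr_t) 0xaaaaaaaa;
                free (fd);
        }
}

int
main (void)
{
        struct my_inode *inode = calloc (1, sizeof (*inode));
        struct my_fd    *fd    = calloc (1, sizeof (*fd));

        fd->inode    = inode;
        fd->refcount = 2;       /* one ref held by fuse, one by an xlator */

        my_fd_unref (fd);       /* the xlator drops its ref: fine         */
        my_fd_unref (fd);       /* an unmatched extra unref frees the fd  */
        my_fd_ref (fd);         /* fuse flush arrives: use-after-free     */

        free (inode);
        return 0;
}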
Looks like an open-behind xlator issue.

I looked at the fd_unrefs present in the different xlators in the mount graph and saw one unmatched fd_unref in dht. I am able to recreate a crash at a different place, again in fd_ref but in the fuse_writev code path, using the following script:
#!/bin/bash
. $(dirname $0)/../include.rc
. $(dirname $0)/../volume.rc
#This test checks that an extra fd_unref does not happen in the dht
#rebalance-migration completion check code path
cleanup;
TEST glusterd
TEST pidof glusterd
TEST $CLI volume create $V0 $H0:$B0/${V0}0 $H0:$B0/${V0}1
TEST $CLI volume set $V0 performance.quick-read off
TEST $CLI volume set $V0 performance.io-cache off
TEST $CLI volume set $V0 performance.write-behind off
TEST $CLI volume set $V0 performance.stat-prefetch off
TEST $CLI volume set $V0 performance.read-ahead off
TEST $CLI volume start $V0
TEST glusterfs --volfile-id=/$V0 --volfile-server=$H0 $M0 --attribute-timeout=0 --entry-timeout=0
TEST touch $M0/1
#This rename creates a link file for 10 on the other subvolume.
TEST mv $M0/1 $M0/10
#Let's keep writing to the file, which will trigger the migration completion check
dd if=/dev/zero of=$M0/10 bs=1k &
bg_pid=$!
#Now rebalance force will migrate file '10'
TEST $CLI volume rebalance $V0 start force
EXPECT_WITHIN 60 "completed" rebalance_status_field $V0
#If the bug exists, the mount would have crashed by now
TEST ls $M0
kill -9 $bg_pid > /dev/null 2>&1
wait > /dev/null 2>&1
cleanup;
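This is not the actual dht patch, but a minimal sketch of the discipline that avoids this class of bug: an asynchronous check path takes its own reference before doing its work and releases exactly that reference, so it can never drop one owned by the caller. All names below (my_fd, migration_complete_check) are hypothetical.

#include <stdlib.h>

struct my_fd { int refcount; };

static struct my_fd *
my_fd_ref (struct my_fd *fd)
{
        fd->refcount++;
        return fd;
}

static void
my_fd_unref (struct my_fd *fd)
{
        if (--fd->refcount == 0)
                free (fd);
}

/* Hypothetical stand-in for a migration-completion check: it pins the fd
 * for the duration of its own work and drops only that pin. */
static void
migration_complete_check (struct my_fd *fd)
{
        struct my_fd *pinned = my_fd_ref (fd);   /* take our own reference */

        /* ... re-open the file on the destination subvolume, update the
         * fd context, etc. ... */

        my_fd_unref (pinned);                    /* release only our own   */
}

int
main (void)
{
        struct my_fd *fd = calloc (1, sizeof (*fd));

        fd->refcount = 1;                /* the caller's reference         */
        migration_complete_check (fd);   /* balanced: count is still 1     */
        my_fd_unref (fd);                /* the caller drops its reference */

        return 0;
}

With the unrefs balanced this way, the fuse-side reference stays valid until fuse itself releases the fd.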
Verified on 3.4.0.8rhs-1.el6rhs.x86_64.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html