Bug 1296134 - Rebalance crashed after detach tier.
Summary: Rebalance crashed after detach tier.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.1.2
Assignee: Bug Updates Notification Mailing List
QA Contact: Bhaskarakiran
URL:
Whiteboard:
Depends On:
Blocks: 1296611 1297309
Reported: 2016-01-06 12:12 UTC by Bhaskarakiran
Modified: 2016-11-23 23:12 UTC

Fixed In Version: glusterfs-3.7.5-16
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1296611 (view as bug list)
Environment:
Last Closed: 2016-03-01 06:06:58 UTC
Embargoed:


Attachments
core file (5.46 MB, application/zip)
2016-01-06 12:12 UTC, Bhaskarakiran


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:0193 0 normal SHIPPED_LIVE Red Hat Gluster Storage 3.1 update 2 2016-03-01 10:20:36 UTC

Description Bhaskarakiran 2016-01-06 12:12:28 UTC
Created attachment 1112126
core file

Description of problem:
=======================

I was trying to reproduce https://bugzilla.redhat.com/show_bug.cgi?id=1296048 (disabling uss and quota), did a detach tier, and hit the crash.

Backtrace:
=========

(gdb) t a a bt

Thread 25 (Thread 0x7f7a0ffff700 (LWP 29525)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f7a84a15f18 in syncenv_task (proc=proc@entry=0x7f7a8617c1d0) at syncop.c:607
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a8617c1d0) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a0ffff700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 24 (Thread 0x7f7a6e15d700 (LWP 26306)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7a76a0175a in gf_defrag_task (opaque=0x7f7a70027230) at dht-rebalance.c:2095
#2  0x00007f7a8383adc5 in start_thread (arg=0x7f7a6e15d700) at pthread_create.c:308
#3  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 23 (Thread 0x7f7a6d15b700 (LWP 26308)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7a76a0175a in gf_defrag_task (opaque=0x7f7a70027230) at dht-rebalance.c:2095
#2  0x00007f7a8383adc5 in start_thread (arg=0x7f7a6d15b700) at pthread_create.c:308
#3  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 22 (Thread 0x7f7a519f8700 (LWP 26461)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f7a84a15f18 in syncenv_task (proc=proc@entry=0x7f7a8617a010) at syncop.c:607
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a8617a010) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a519f8700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 21 (Thread 0x7f7a2cff9700 (LWP 28208)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f7a84a15f18 in syncenv_task (proc=proc@entry=0x7f7a8617be10) at syncop.c:607
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a8617be10) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a2cff9700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 20 (Thread 0x7f7a7afec700 (LWP 25758)):
#0  0x00007f7a83841e91 in do_sigwait (sig=0x7f7a7afebe1c, set=<optimized out>) at ../sysdeps/unix/sysv/linux/sigwait.c:61
#1  __sigwait (set=set@entry=0x7f7a7afebe20, sig=sig@entry=0x7f7a7afebe1c) at ../sysdeps/unix/sysv/linux/sigwait.c:99
#2  0x00007f7a84ea58bb in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2006
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a7afec700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 19 (Thread 0x7f7a52ffd700 (LWP 26312)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7a84a19c1b in syncop_setxattr (subvol=subvol@entry=0x7f7a7001e9b0, loc=loc@entry=0x7f7a52ffcd70, dict=dict@entry=0x7f7a81ef8994, flags=flags@entry=0, xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0)
    at syncop.c:1577
#2  0x00007f7a76a01249 in gf_defrag_migrate_single_file (opaque=opaque@entry=0x7f7a28007ca0) at dht-rebalance.c:1963
#3  0x00007f7a76a01936 in gf_defrag_task (opaque=0x7f7a70027230) at dht-rebalance.c:2125
#4  0x00007f7a8383adc5 in start_thread (arg=0x7f7a52ffd700) at pthread_create.c:308
#5  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 18 (Thread 0x7f7a79fea700 (LWP 25760)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f7a84a15f18 in syncenv_task (proc=proc@entry=0x7f7a86179c50) at syncop.c:607
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a86179c50) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a79fea700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 17 (Thread 0x7f7a537fe700 (LWP 26311)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7a84a19c1b in syncop_setxattr (subvol=subvol@entry=0x7f7a7001e9b0, loc=loc@entry=0x7f7a537fdd70, dict=dict@entry=0x7f7a81ef8994, flags=flags@entry=0, xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0)
    at syncop.c:1577
#2  0x00007f7a76a01249 in gf_defrag_migrate_single_file (opaque=opaque@entry=0x7f7a28005380) at dht-rebalance.c:1963
#3  0x00007f7a76a01936 in gf_defrag_task (opaque=0x7f7a70027230) at dht-rebalance.c:2125
#4  0x00007f7a8383adc5 in start_thread (arg=0x7f7a537fe700) at pthread_create.c:308
#5  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 16 (Thread 0x7f7a53fff700 (LWP 26310)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7a84a19c1b in syncop_setxattr (subvol=subvol@entry=0x7f7a7001e9b0, loc=loc@entry=0x7f7a53ffed70, dict=dict@entry=0x7f7a81ef8994, flags=flags@entry=0, xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0)
    at syncop.c:1577
#2  0x00007f7a76a01249 in gf_defrag_migrate_single_file (opaque=opaque@entry=0x7f7a600059d0) at dht-rebalance.c:1963
#3  0x00007f7a76a01936 in gf_defrag_task (opaque=0x7f7a70027230) at dht-rebalance.c:2125
#4  0x00007f7a8383adc5 in start_thread (arg=0x7f7a53fff700) at pthread_create.c:308
#5  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 15 (Thread 0x7f7a84e7c780 (LWP 25756)):
#0  0x00007f7a8383bef7 in pthread_join (threadid=140163968702208, thread_return=thread_return@entry=0x0) at pthread_join.c:92
#1  0x00007f7a84a33c28 in event_dispatch_epoll (event_pool=0x7f7a86168d10) at event-epoll.c:762
#2  0x00007f7a84ea27f7 in main (argc=33, argv=0x7ffc0dec5de8) at glusterfsd.c:2350

Thread 14 (Thread 0x7f7a6d95c700 (LWP 26307)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7a76a0175a in gf_defrag_task (opaque=0x7f7a70027230) at dht-rebalance.c:2095
#2  0x00007f7a8383adc5 in start_thread (arg=0x7f7a6d95c700) at pthread_create.c:308
#3  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 13 (Thread 0x7f7a2effd700 (LWP 26974)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f7a84a15f18 in syncenv_task (proc=proc@entry=0x7f7a8617af10) at syncop.c:607
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a8617af10) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a2effd700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 12 (Thread 0x7f7a6c95a700 (LWP 26309)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7a76a0175a in gf_defrag_task (opaque=0x7f7a70027230) at dht-rebalance.c:2095
#2  0x00007f7a8383adc5 in start_thread (arg=0x7f7a6c95a700) at pthread_create.c:308
#3  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 11 (Thread 0x7f7a6ffff700 (LWP 26287)):
#0  0x00007f7a831817a3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f7a84a33720 in event_dispatch_epoll_worker (data=0x7f7a70038350) at event-epoll.c:668
#2  0x00007f7a8383adc5 in start_thread (arg=0x7f7a6ffff700) at pthread_create.c:308
#3  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 10 (Thread 0x7f7a2ffff700 (LWP 26941)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f7a84a15f18 in syncenv_task (proc=proc@entry=0x7f7a8617a790) at syncop.c:607
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a8617a790) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a2ffff700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 9 (Thread 0x7f7a2f7fe700 (LWP 26973)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f7a84a15f18 in syncenv_task (proc=proc@entry=0x7f7a8617ab50) at syncop.c:607
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a8617ab50) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a2f7fe700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 8 (Thread 0x7f7a2e7fc700 (LWP 26976)):
#0  swapcontext () at ../sysdeps/unix/sysv/linux/x86_64/swapcontext.S:79
#1  0x00007f7a84a13d80 in synctask_yield (task=0x7f7a643b6020) at syncop.c:343
#2  0x00007f7a830d2110 in ?? () from /lib64/libc.so.6
#3  0x0000000000000000 in ?? ()

Thread 7 (Thread 0x7f7a77909700 (LWP 25763)):
#0  0x00007f7a831817a3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f7a84a33720 in event_dispatch_epoll_worker (data=0x7f7a861b6ac0) at event-epoll.c:668
#2  0x00007f7a8383adc5 in start_thread (arg=0x7f7a77909700) at pthread_create.c:308
#3  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 6 (Thread 0x7f7a7b7ed700 (LWP 25757)):
#0  0x00007f7a8384196d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f7a849f1924 in gf_timer_proc (ctx=0x7f7a8614a010) at timer.c:205
#2  0x00007f7a8383adc5 in start_thread (arg=0x7f7a7b7ed700) at pthread_create.c:308
#3  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 5 (Thread 0x7f7a527fc700 (LWP 26313)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7a84a19c1b in syncop_setxattr (subvol=subvol@entry=0x7f7a7001e9b0, loc=loc@entry=0x7f7a527fbd70, dict=dict@entry=0x7f7a81ef8994, flags=flags@entry=0, xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0)
    at syncop.c:1577
#2  0x00007f7a76a01249 in gf_defrag_migrate_single_file (opaque=opaque@entry=0x7f7a1400a830) at dht-rebalance.c:1963
#3  0x00007f7a76a01936 in gf_defrag_task (opaque=0x7f7a70027230) at dht-rebalance.c:2125
#4  0x00007f7a8383adc5 in start_thread (arg=0x7f7a527fc700) at pthread_create.c:308
#5  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 4 (Thread 0x7f7a7a7eb700 (LWP 25759)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f7a84a15f18 in syncenv_task (proc=proc@entry=0x7f7a86179890) at syncop.c:607
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a86179890) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a7a7eb700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 3 (Thread 0x7f7a50df5700 (LWP 26465)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f7a84a15f18 in syncenv_task (proc=proc@entry=0x7f7a8617a3d0) at syncop.c:607
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a8617a3d0) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a50df5700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 2 (Thread 0x7f7a2d7fa700 (LWP 27170)):
#0  list_empty (head=<optimized out>) at list.h:114
#1  syncenv_task (proc=proc@entry=0x7f7a8617ba50) at syncop.c:609
#2  0x00007f7a84a16c50 in syncenv_processor (thdata=0x7f7a8617ba50) at syncop.c:699
#3  0x00007f7a8383adc5 in start_thread (arg=0x7f7a2d7fa700) at pthread_create.c:308
#4  0x00007f7a831811cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 1 (Thread 0x7f7a2dffb700 (LWP 27152)):
#0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24
#1  0x00007f7a84a02e07 in fd_unref (fd=0x7f7a5c005b18) at fd.c:559
#2  0x00007f7a84a1d90e in syncop_close (fd=fd@entry=0x7f7a5c005b18) at syncop.c:2021
#3  0x00007f7a769fdd49 in dht_migrate_file (this=0x7f7a7001e9b0, loc=<optimized out>, from=0x7f7a7001da90, to=0x7f7a7001c5a0, flag=<optimized out>) at dht-rebalance.c:1644
#4  0x00007f7a84a13e02 in synctask_wrap (old_task=<optimized out>) at syncop.c:380
#5  0x00007f7a830d2110 in ?? () from /lib64/libc.so.6
#6  0x0000000000000000 in ?? ()


Version-Release number of selected component (if applicable):
=============================================================
3.7.5-14

How reproducible:
=================
seen once

Steps to Reproduce:
1. Create a 2x(4+2) EC volume and NFS-mount it on the client.
2. Disable uss and enable quota.
3. Start IO (Linux untar, file and directory creation).
4. Attach a 2x2 hot tier.
5. Run IO for some time and then detach the tier.

Actual results:
==============
Rebalance crash

Expected results:


Additional info:
================
Attaching the core file.

Comment 1 Bhaskarakiran 2016-01-06 12:14:41 UTC
Output from the rebalance log file:

pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
[2016-01-06 10:14:14.816071] W [MSGID: 114031] [client-rpc-fops.c:2325:client3_3_setattr_cbk] 0-disperse_vol1-client-13: remote operation failed [No such file or directory]
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2016-01-06 10:14:14
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.5
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f7a849d2002]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f7a849ee48d]
/lib64/libc.so.6(+0x35670)[0x7f7a830c0670]
/lib64/libpthread.so.0(pthread_spin_lock+0x0)[0x7f7a8383f210]
---------

Comment 5 Nithya Balachandran 2016-01-07 15:36:43 UTC
Analysis:

From the core:

(gdb) bt
#0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24
#1  0x00007f7a84a02e07 in fd_unref (fd=0x7f7a5c005b18) at fd.c:559
#2  0x00007f7a84a1d90e in syncop_close (fd=fd@entry=0x7f7a5c005b18) at syncop.c:2021
#3  0x00007f7a769fdd49 in dht_migrate_file (this=0x7f7a7001e9b0, loc=<optimized out>, from=0x7f7a7001da90, to=0x7f7a7001c5a0, 
    flag=<optimized out>) at dht-rebalance.c:1644
#4  0x00007f7a84a13e02 in synctask_wrap (old_task=<optimized out>) at syncop.c:380
#5  0x00007f7a830d2110 in ?? () from /lib64/libc.so.6
#6  0x0000000000000000 in ?? ()


loc is optimized out but tmp_loc is not.

(gdb) p tmp_loc
$5 = {path = 0x7f7a2000acb0 "/dirs/dir.2/testfile.919", name = 0x0, inode = 0x7f7a6e814e5c, parent = 0x0, 
  gfid = "\315\373\232\366\212\341OÙ :&R\273\\\330>", pargfid = '\000' <repeats 15 times>}
(gdb) 

From the rebalance log file:

[2016-01-06 10:14:14.801309] E [MSGID: 109023] [dht-rebalance.c:598:__dht_rebalance_create_dst_file] 0-disperse_vol1-tier-dht: /dirs/dir.2/testfile.919: file does not existson disperse_vol1-cold-dht (No such file or directory)

Examining the code shows that dst_fd is unrefed twice.

Once in __dht_rebalance_create_dst_file:

        if (dst_fd)
                *dst_fd = fd;
...
        if (-ret == ENOENT) {
                gf_msg (this->name, GF_LOG_ERROR, 0,
                        DHT_MSG_MIGRATE_FILE_FAILED, "%s: file does not exists" 
                        "on %s (%s)", loc->path, to->name, strerror (-ret));
                ret = -1;
                fd_unref (fd);
                goto out;
        }

and once again in dht_migrate_file () -> syncop_close (dst_fd)

The core dump does not show an invalid inode but I was able to reproduce the crash by setting ret to -ENOENT in gdb.
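
To illustrate the double unref described above, here is a minimal, self-contained C sketch. It is not the glusterfs code and not the posted patch: toy_fd_t, toy_fd_init, toy_fd_unref, toy_create_dst, toy_migrate and the clear_on_error flag are all hypothetical names, and the toy counts an unref on an already-released object instead of crashing. Clearing the caller's pointer on the error path is only one possible way to avoid the second unref and is assumed here for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a refcounted fd; NOT the real glusterfs fd_t. */
typedef struct toy_fd {
        int refcount;
} toy_fd_t;

/* Number of unrefs seen on an object whose refcount was already 0.
 * In real code such an unref touches freed memory and may crash. */
static int bad_unrefs = 0;

static void
toy_fd_init (toy_fd_t *fd)
{
        assert (fd != NULL);
        fd->refcount = 1;
}

static void
toy_fd_unref (toy_fd_t *fd)
{
        if (!fd)
                return;
        if (fd->refcount == 0) {
                bad_unrefs++;        /* double unref detected */
                return;
        }
        fd->refcount--;
}

/* Mimics the error path quoted above: the fd is handed to the caller
 * through dst_fd, and on ENOENT the same fd is unrefed before returning.
 * With clear_on_error set, the caller's pointer is reset as well
 * (hypothetical fix, assumed for illustration). */
static int
toy_create_dst (toy_fd_t *fd, toy_fd_t **dst_fd, int simulated_ret,
                int clear_on_error)
{
        if (dst_fd)
                *dst_fd = fd;

        if (-simulated_ret == ENOENT) {
                toy_fd_unref (fd);
                if (clear_on_error && dst_fd)
                        *dst_fd = NULL;
                return -1;
        }
        return 0;
}

/* Mimics the caller's cleanup, where dst_fd is closed if it is set.
 * Returns how many bad unrefs this call caused. */
static int
toy_migrate (int simulated_ret, int clear_on_error)
{
        toy_fd_t  storage;
        toy_fd_t *dst_fd = NULL;
        int       before = bad_unrefs;

        toy_fd_init (&storage);
        toy_create_dst (&storage, &dst_fd, simulated_ret, clear_on_error);

        if (dst_fd)
                toy_fd_unref (dst_fd);  /* stands in for syncop_close (dst_fd) */

        return bad_unrefs - before;
}
```

With these definitions, toy_migrate (-ENOENT, 0) reports one bad unref (the crash scenario), while toy_migrate (-ENOENT, 1) and toy_migrate (0, 0) report none.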

Comment 6 Nithya Balachandran 2016-01-08 10:44:17 UTC
Patch posted upstream.

Comment 8 Nithya Balachandran 2016-01-11 07:04:51 UTC
Downstream patch :
https://code.engineering.redhat.com/gerrit/#/c/65199/

Comment 9 Bhaskarakiran 2016-01-27 09:17:28 UTC
Verified this on 3.7.5-17 and didn't see the crash. Marking this as verified.

Comment 11 errata-xmlrpc 2016-03-01 06:06:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html

