Description Nag Pavan Chilakam
2015-10-20 10:50:56 UTC
+++ This bug was initially created as a clone of Bug #1273272 +++
Description of problem:
While executing the fs-sanity test suite, a segfault occurs on the tiered volume.
Version-Release number of selected component (if applicable):
glusterfs-3.7.5-1.el7.x86_64
nfs-ganesha-2.3-0.rc6.el7.centos.x86_64
How reproducible:
Seen once so far (first occurrence).
Steps to Reproduce:
1. Create a dist-rep volume with tiering enabled.
2. Export the volume through nfs-ganesha and mount it with vers=4.
3. Start executing the fs-sanity test suite.
Actual results:
[2015-10-19 16:14:14.701640] E [MSGID: 109037] [tier.c:951:tier_start] 0-vol2-tier-dht: Demotion failed!
[2015-10-19 16:16:00.720267] I [MSGID: 109038] [tier.c:586:tier_build_migration_qfile] 0-vol2-tier-dht: Failed to remove /var/run/gluster/vol2-tier-dht/demotequeryfile-vol2-tier-dht
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-10-19 16:16:00
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.5
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7fc0dab7c002]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7fc0dab9848d]
/lib64/libc.so.6(+0x35650)[0x7fc0d926a650]
/lib64/libc.so.6(+0x86811)[0x7fc0d92bb811]
/usr/lib64/libgfdb.so.0(gf_sql_query_function+0x101)[0x7fc0cc3e4001]
/usr/lib64/libgfdb.so.0(gf_sqlite3_find_unchanged_for_time+0xd0)[0x7fc0cc3e55d0]
/usr/lib64/libgfdb.so.0(find_unchanged_for_time+0x41)[0x7fc0cc3dfbd1]
/usr/lib64/glusterfs/3.7.5/xlator/cluster/tier.so(+0x57947)[0x7fc0cca0f947]
/lib64/libpthread.so.0(+0x7df5)[0x7fc0d99e4df5]
/lib64/libc.so.6(clone+0x6d)[0x7fc0d932b1ad]
---------
(gdb) bt
#0 __strlen_sse2 () at ../sysdeps/x86_64/strlen.S:31
#1 0x00007fc0cc3e4001 in gf_sql_query_function (prep_stmt=0x7fc0a042b288, query_callback=query_callback@entry=0x7fc0cca0d640 <tier_gf_query_callback>, _query_cbk_args=_query_cbk_args@entry=0x7fc0b17f9e70)
at gfdb_sqlite3_helper.c:1157
#2 0x00007fc0cc3e55d0 in gf_sqlite3_find_unchanged_for_time (db_conn=0x7fc0a0445440, query_callback=0x7fc0cca0d640 <tier_gf_query_callback>, query_cbk_args=0x7fc0b17f9e70, for_time=0x7fc0b17f9e60)
at gfdb_sqlite3.c:809
#3 0x00007fc0cc3dfbd1 in find_unchanged_for_time (_conn_node=<optimized out>, query_callback=0x7fc0cca0d640 <tier_gf_query_callback>, _query_cbk_args=0x7fc0b17f9e70, for_time=0x7fc0b17f9e60)
at gfdb_data_store.c:506
#4 0x00007fc0cca0f947 in tier_process_brick_cbk (args=<synthetic pointer>, local_brick=0x7fc0b8003680) at tier.c:501
#5 tier_build_migration_qfile (is_promotion=_gf_false, query_cbk_args=0x7fc0b17f9e70, args=0x7fc0c501ecc0) at tier.c:606
#6 tier_demote (args=0x7fc0c501ecc0) at tier.c:666
#7 0x00007fc0d99e4df5 in start_thread (arg=0x7fc0b17fa700) at pthread_create.c:308
#8 0x00007fc0d932b1ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
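Frame #0 of the backtrace above is strlen, called from gf_sql_query_function (gfdb_sqlite3_helper.c:1157). One plausible reading, which this report does not confirm, is that a NULL string from a query result column reached strlen. The sketch below only illustrates that general failure mode and a defensive check. The helpers safe_strlen and copy_column are hypothetical names and are not part of libgfdb or of the actual fix.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from libgfdb): strlen that tolerates NULL.
 * Calling strlen(NULL) directly is undefined behaviour and typically
 * ends in SIGSEGV, as in frame #0 of the backtrace. */
static size_t safe_strlen(const char *s)
{
    return s ? strlen(s) : 0;
}

/* Hypothetical stand-in for handling one text column of a query row.
 * Copies col into out only if col is non-NULL and fits, including the
 * terminating NUL. Returns 0 on success, -1 otherwise. On failure the
 * output buffer is left as an empty string. */
static int copy_column(const char *col, char *out, size_t out_size)
{
    size_t len;

    if (out == NULL || out_size == 0)
        return -1;

    len = safe_strlen(col);
    if (col == NULL || len + 1 > out_size) {
        out[0] = '\0';
        return -1;
    }

    memcpy(out, col, len + 1);
    return 0;
}
```

With a check of this kind, a row containing a NULL column would be reported as an error to the caller instead of crashing the demotion thread. Whether that matches the real root cause here is an assumption.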
# gluster volume status vol2
Status of volume: vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick 10.70.46.60:/rhs/brick2/cold-d2r21    49158     0          Y       16128
Brick 10.70.46.63:/rhs/brick2/cold-d2r11    49159     0          Y       30149
Brick 10.70.46.64:/rhs/brick2/cold-d1r21    49159     0          Y       23716
Brick 10.70.46.59:/rhs/brick2/cold-d1r11    49159     0          Y       19532
Cold Bricks:
Brick 10.70.46.59:/rhs/brick1/d1r11         49158     0          Y       19412
Brick 10.70.46.64:/rhs/brick1/d1r21         49158     0          Y       23610
Brick 10.70.46.63:/rhs/brick1/d2r11         49158     0          Y       30053
Brick 10.70.46.60:/rhs/brick1/d2r21         49157     0          Y       16026
Task Status of Volume vol2
------------------------------------------------------------------------------
Task : Tier migration
ID : 1dd3374a-f0d5-4aa0-8a79-c1a2776ed72e
Status : in progress
Expected results:
No core dump is expected.
Additional info:
Not sure which test in the test suite caused the issue.
--- Additional comment from Saurabh on 2015-10-20 02:39 EDT ---
We do not plan to support ganesha with tiering for 3.1.2; removing from the list.
Comment 6 Joseph Elwin Fernandes
2015-11-11 01:33:50 UTC
The code where we are seeing the core dump has since been changed by the following fix:
"tier/libgfdb: Replacing ASCII query file with binary"
https://code.engineering.redhat.com/gerrit/61006
Moving this to ON_QA.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2016-0193.html