Bug 1001919 - Quota-build3: All the bricks of a distributed-replicate volume are eventually crashed upon executing quota command
Keywords:
Status: CLOSED DUPLICATE of bug 1001631
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: Sudhir D
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-08-28 06:42 UTC by shylesh
Modified: 2013-09-12 07:37 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-12 07:37:04 UTC
Embargoed:



Description shylesh 2013-08-28 06:42:08 UTC
Description of problem:
After executing many quota commands, all the bricks of a distributed-replicate volume eventually crashed.

Version-Release number of selected component (if applicable):
[root@gqac024 tmp]# rpm -qa| grep gluster
glusterfs-libs-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-api-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-devel-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-api-devel-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-debuginfo-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-fuse-3.4.0.20rhsquota5-1.el6rhs.x86_64
glusterfs-server-3.4.0.20rhsquota5-1.el6rhs.x86_64


How reproducible:


Steps to Reproduce:
1. Start with a 4x2 distributed-replicate volume.
2. Randomly set usage limits on the root and on subdirectories, and run the "quota list" command in between.
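
The steps above can be sketched as a small script that only builds the command lines and does not run them. This is a rough illustration, not the exact sequence used when the crash was hit: the volume name comes from the volume info in this report, "/another" is taken from the real_path seen in the backtrace, and the size values, round count, and the helper name build_quota_commands are assumptions made for the example.

```python
import random

# Volume name as shown in the "Additional info" section of this report.
VOLUME = "dist-rep"


def build_quota_commands(paths, sizes, rounds, seed=0):
    """Build (but do not run) alternating limit-usage and list commands.

    paths and sizes are hypothetical inputs; the original report does not
    say which directories or limits were used.
    """
    rng = random.Random(seed)
    commands = []
    for _ in range(rounds):
        path = rng.choice(paths)
        size = rng.choice(sizes)
        # Set a usage limit on a randomly chosen directory.
        commands.append(
            ["gluster", "volume", "quota", VOLUME, "limit-usage", path, size]
        )
        # Then list the configured limits.
        commands.append(["gluster", "volume", "quota", VOLUME, "list"])
    return commands


if __name__ == "__main__":
    # Assumed directories and sizes, for illustration only.
    for argv in build_quota_commands(["/", "/another"], ["1GB", "10GB"], rounds=3):
        print(" ".join(argv))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; on a test cluster each line could be passed to subprocess.run from one of the server nodes.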


Actual results:
Eventually all the bricks of the volume crashed.


Additional info:

Volume Name: dist-rep
Type: Distributed-Replicate
Volume ID: 5926689f-441d-40e7-b65a-e890eb886b09
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gqac022.sbu.lab.eng.bos.redhat.com:/home/dr0
Brick2: gqac023.sbu.lab.eng.bos.redhat.com:/home/dr0
Brick3: gqac024.sbu.lab.eng.bos.redhat.com:/home/dr1
Brick4: gqac025.sbu.lab.eng.bos.redhat.com:/home/dr1
Brick5: gqac022.sbu.lab.eng.bos.redhat.com:/home/dr2
Brick6: gqac023.sbu.lab.eng.bos.redhat.com:/home/dr2
Brick7: gqac024.sbu.lab.eng.bos.redhat.com:/home/dr3
Brick8: gqac025.sbu.lab.eng.bos.redhat.com:/home/dr3
Options Reconfigured:
features.quota: on


cluster info
=========
gqac022.sbu.lab.eng.bos.redhat.com:
gqac023.sbu.lab.eng.bos.redhat.com:
gqac024.sbu.lab.eng.bos.redhat.com:
gqac025.sbu.lab.eng.bos.redhat.com:

mounted on 
============
gqac022.sbu.lab.eng.bos.redhat.com

mount point
==========
/mnt




Core was generated by `/usr/sbin/glusterfsd -s gqac024.sbu.lab.eng.bos.redhat.com --volfile-id dist-re'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000003178332d5f in __strlen_sse42 () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6_4.2.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.4.x86_64 libaio-0.3.107-10.el6.x86_64 libcom_err-1.41.12-14.el6_4.2.x86_64 libgcc-4.4.7-3.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 openssl-1.0.0-27.el6_4.2.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x0000003178332d5f in __strlen_sse42 () from /lib64/libc.so.6
#1  0x00007f463da5e629 in posix_make_ancestryfromgfid (this=0xa20d70, path=0x7f46374179b0 "", pathsize=4097, head=0x7f4620003350, 
    type=1, gfid=<value optimized out>, handle_size=64, priv_base_path=0xa4e4d0 "/home/dr3", itable=0xa595f0, parent=0x7f46374189b8, 
    xdata=0x7f4640c8564c) at posix-handle.c:155
#2  0x00007f463da583dd in posix_get_ancestry_directory (this=0xa20d70, real_path=<value optimized out>, loc=0x7f4640d0dc70, 
    dict=0x7f4640c85994, type=1, op_errno=<value optimized out>, xdata=0x7f4640c8564c) at posix.c:2748
#3  0x00007f463da5b216 in _posix_xattr_get_set (xattr_req=0x7f4640c8564c, key=<value optimized out>, data=0x7f4640aa9200, 
    xattrargs=0x7f4637418a80) at posix-helpers.c:319
#4  0x00007f4642478025 in dict_foreach (dict=0x7f4640c8564c, fn=0x7f463da5afc0 <_posix_xattr_get_set>, data=0x7f4637418a80)
    at dict.c:1109
#5  0x00007f463da5a8f5 in posix_lookup_xattr_fill (this=0xa20d70, real_path=0x7f4637418b20 "/home/dr3/another/", loc=0x7f4640d0dc70, 
    xattr_req=0x7f4640c8564c, buf=<value optimized out>) at posix-helpers.c:558
#6  0x00007f463da57401 in posix_lookup (frame=0x7f46412d3a68, this=0xa20d70, loc=0x7f4640d0dc70, xdata=0x7f4640c8564c) at posix.c:153
#7  0x00007f464247fefd in default_lookup (frame=0x7f46412d3a68, this=0xa225a0, loc=0x7f4640d0dc70, xdata=0x7f4640c8564c)
    at defaults.c:1253
#8  0x00007f463d42a8c2 in posix_acl_lookup (frame=0x7f46412d39bc, this=0xa23760, loc=0x7f4640d0dc70, xattr=<value optimized out>)
    at posix-acl.c:793
#9  0x00007f463d212892 in pl_lookup (frame=0x7f46412d3910, this=0xa248c0, loc=0x7f4640d0dc70, xdata=0x7f4640c8564c) at posix.c:2081
#10 0x00007f463cffeebc in iot_lookup_wrapper (frame=0x7f46412d3864, this=0xa258e0, loc=0x7f4640d0dc70, xdata=0x7f4640c8564c)
    at io-threads.c:346
#11 0x00007f4642494172 in call_resume_wind (stub=0x7f4640d0dc30) at call-stub.c:2312
#12 call_resume (stub=0x7f4640d0dc30) at call-stub.c:2645
#13 0x00007f463d0039f8 in iot_worker (data=0xa49f90) at io-threads.c:191
#14 0x0000003178a07851 in start_thread () from /lib64/libpthread.so.0
#15 0x00000031782e890d in clone () from /lib64/libc.so.6



bt full
=======

(gdb) bt full
#0  0x0000003178332d5f in __strlen_sse42 () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f463da5e629 in posix_make_ancestryfromgfid (this=0xa20d70, path=0x7f46374179b0 "", pathsize=4097, head=0x7f4620003350, 
    type=1, gfid=<value optimized out>, handle_size=64, priv_base_path=0xa4e4d0 "/home/dr3", itable=0xa595f0, parent=0x7f46374189b8, 
    xdata=0x7f4640c8564c) at posix-handle.c:155
        linkname = 0x7f4637417620 ""
        dir_handle = <value optimized out>
        dir_name = 0x0
        pgfidstr = <value optimized out>
        saveptr = <value optimized out>
        len = <value optimized out>
        inode = 0x0
        iabuf = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', 
            sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {
              read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, 
          ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, 
          ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}
        ret = -1
        tmp_gfid = '\000' <repeats 15 times>
#2  0x00007f463da583dd in posix_get_ancestry_directory (this=0xa20d70, real_path=<value optimized out>, loc=0x7f4640d0dc70, 
    dict=0x7f4640c85994, type=1, op_errno=<value optimized out>, xdata=0x7f4640c8564c) at posix.c:2748
        size = 0
        handle_size = 64
        value = 0x0
        priv = <value optimized out>
        head = 0x7f4620003350
        dirpath = '\000' <repeats 4096 times>
        inode = 0x0
        ret = -1
        __FUNCTION__ = "posix_get_ancestry_directory"
#3  0x00007f463da5b216 in _posix_xattr_get_set (xattr_req=0x7f4640c8564c, key=<value optimized out>, data=0x7f4640aa9200, 
    xattrargs=0x7f4637418a80) at posix-helpers.c:319
        filler = 0x7f4637418a80
        ret = -1
        databuf = 0x0
        _fd = -1
        loc = 0x0
        req_size = 0
        __FUNCTION__ = "_posix_xattr_get_set"
#4  0x00007f4642478025 in dict_foreach (dict=0x7f4640c8564c, fn=0x7f463da5afc0 <_posix_xattr_get_set>, data=0x7f4637418a80)
    at dict.c:1109
        __FUNCTION__ = "dict_foreach"
        ret = <value optimized out>
        pairs = <value optimized out>
        next = 0x7f4640b7a9fc
#5  0x00007f463da5a8f5 in posix_lookup_xattr_fill (this=0xa20d70, real_path=0x7f4637418b20 "/home/dr3/another/", loc=0x7f4640d0dc70, 
    xattr_req=0x7f4640c8564c, buf=<value optimized out>) at posix-helpers.c:558
        xattr = 0x7f4640c85994
        filler = {this = 0xa20d70, real_path = 0x7f4637418b20 "/home/dr3/another/", xattr = 0x7f4640c85994, stbuf = 0x7f4637418c00, 
          loc = 0x7f4640d0dc70, inode = 0x0, fd = 0, flags = 0, op_errno = 0}
#6  0x00007f463da57401 in posix_lookup (frame=0x7f46412d3a68, this=0xa20d70, loc=0x7f4640d0dc70, xdata=0x7f4640c8564c) at posix.c:153
        buf = {ia_ino = 9402059841903500362, ia_gfid = "\261TJ\032\070|F\247\202zӜ\251\237\234J", ia_dev = 64770, 
          ia_type = IA_IFDIR, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 1 '\001', 
              write = 1 '\001', exec = 1 '\001'}, group = {read = 1 '\001', write = 0 '\000', exec = 1 '\001'}, other = {
              read = 1 '\001', write = 0 '\000', exec = 1 '\001'}}, ia_nlink = 2, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 6, 
          ia_blksize = 4096, ia_blocks = 0, ia_atime = 1377670599, ia_atime_nsec = 276521591, ia_mtime = 1377670599, 
          ia_mtime_nsec = 276521591, ia_ctime = 1377670599, ia_ctime_nsec = 281521788}
        op_ret = 0
        entry_ret = 0
        op_errno = 0
        xattr = 0x0
        real_path = 0x7f4637418b20 "/home/dr3/another/"
        par_path = 0x0
        postparent = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', 
            sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {
              read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, 
          ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, 
          ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}
        gfidless = 1
        pgfid_xattr_key = 0x0
        nlink_samepgfid = 0
        __FUNCTION__ = "posix_lookup"
#7  0x00007f464247fefd in default_lookup (frame=0x7f46412d3a68, this=0xa225a0, loc=0x7f4640d0dc70, xdata=0x7f4640c8564c)
    at defaults.c:1253
        old_THIS = 0xa225a0
#8  0x00007f463d42a8c2 in posix_acl_lookup (frame=0x7f46412d39bc, this=0xa23760, loc=0x7f4640d0dc70, xattr=<value optimized out>)
    at posix-acl.c:793
        _new = 0x7f46412d3a68
        old_THIS = 0xa23760
        tmp_cbk = 0x7f463d42bf20 <posix_acl_lookup_cbk>
        ret = <value optimized out>
        my_xattr = 0x7f4640c8564c
        __FUNCTION__ = "posix_acl_lookup"
#9  0x00007f463d212892 in pl_lookup (frame=0x7f46412d3910, this=0xa248c0, loc=0x7f4640d0dc70, xdata=0x7f4640c8564c) at posix.c:2081
        _new = 0x7f46412d39bc
        old_THIS = 0xa248c0
        tmp_cbk = 0x7f463d212fd0 <pl_lookup_cbk>
        local = <value optimized out>
        __FUNCTION__ = "pl_lookup"
#10 0x00007f463cffeebc in iot_lookup_wrapper (frame=0x7f46412d3864, this=0xa258e0, loc=0x7f4640d0dc70, xdata=0x7f4640c8564c)
    at io-threads.c:346
        _new = 0x7f46412d3910
        old_THIS = 0xa258e0
        tmp_cbk = 0x7f463cffa4e0 <iot_lookup_cbk>
        __FUNCTION__ = "iot_lookup_wrapper"
#11 0x00007f4642494172 in call_resume_wind (stub=0x7f4640d0dc30) at call-stub.c:2312
No locals.
#12 call_resume (stub=0x7f4640d0dc30) at call-stub.c:2645
        old_THIS = 0xa258e0
        __FUNCTION__ = "call_resume"
#13 0x00007f463d0039f8 in iot_worker (data=0xa49f90) at io-threads.c:191
        conf = 0xa49f90
        this = <value optimized out>
        stub = <value optimized out>
        sleep_till = {tv_sec = 1377670722, tv_nsec = 0}
        ret = <value optimized out>
        pri = 0
        timeout = 0 '\000'
        bye = 0 '\000'
        sleep = {tv_sec = 0, tv_nsec = 0}
        __FUNCTION__ = "iot_worker"
#14 0x0000003178a07851 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#15 0x00000031782e890d in clone () from /lib64/libc.so.6
No symbol table info available.


Attached the sosreports.

Comment 3 Raghavendra G 2013-09-12 07:37:04 UTC
This should be fixed in v3.4.0.30rhs. If the release was made from the bigbend-quota downstream branch, the following patch fixes the issue:
https://code.engineering.redhat.com/gerrit/12036

Most likely a duplicate of BZ 1001631

*** This bug has been marked as a duplicate of bug 1001631 ***

