Bug 1227469 - should not spawn another migration daemon on graph switch
Summary: should not spawn another migration daemon on graph switch
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.1.0
Assignee: Dan Lambright
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On: 1226005 1259078
Blocks: 1202842
Reported: 2015-06-02 17:49 UTC by Dan Lambright
Modified: 2016-09-17 15:45 UTC (History)

Fixed In Version: glusterfs-3.7.1-2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1226005
Environment:
Last Closed: 2015-07-29 04:55:22 UTC
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2015:1495 | 0 | normal | SHIPPED_LIVE | Important: Red Hat Gluster Storage 3.1 update | 2015-07-29 08:26:26 UTC

Description Dan Lambright 2015-06-02 17:49:52 UTC
+++ This bug was initially created as a clone of Bug #1226005 +++

When a graph switch occurred on a rebalance daemon, gf_defrag_start() was called a second time. This led to multiple threads performing migration. When multiple threads try to move the same file, deadlocks can occur.
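
The idea behind the fix (keep the daemon's start state across a graph switch so the migration is not started twice) can be illustrated with a minimal sketch. This is not the actual GlusterFS patch, which is written in C; the class, method, and attribute names below are hypothetical and exist only for illustration.

```python
import threading


class RebalanceDaemon:
    """Toy stand-in for a rebalance/tier daemon (all names are hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._started = False          # start state, kept across graph switches
        self._stop = threading.Event()
        self.workers = []              # every migration thread ever spawned

    def _migrate(self):
        # Placeholder for the migration loop: idle until asked to stop.
        self._stop.wait()

    def start_migration(self):
        """Spawn the migration thread once; later calls do nothing."""
        with self._lock:
            if self._started:
                return False
            self._started = True
            worker = threading.Thread(target=self._migrate, daemon=True)
            self.workers.append(worker)
            worker.start()
            return True

    def on_graph_switch(self):
        # A new graph asks for the migration to start again. Because the
        # start state is remembered, no second thread is created.
        return self.start_migration()

    def stop(self):
        self._stop.set()
        for worker in self.workers:
            worker.join()


daemon = RebalanceDaemon()
first = daemon.start_migration()     # initial start spawns the thread
second = daemon.on_graph_switch()    # graph switch must not spawn another
worker_count = len(daemon.workers)
daemon.stop()
```

Without the `_started` check, each graph switch would append another worker, which is the situation in which two threads could try to move the same file.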

--- Additional comment from Anand Avati on 2015-05-28 14:01:39 EDT ---

REVIEW: http://review.gluster.org/10977 (When we did a graph switch on a rebalance daemon, a second call to gf_degrag_start() was done. This lead to multiple threads doing migration. When multiple threads try to move the same file there can be deadlocks.) posted (#1) for review on master by Dan Lambright (dlambrig)

--- Additional comment from Anand Avati on 2015-05-28 17:29:10 EDT ---

REVIEW: http://review.gluster.org/10977 (When we did a graph switch on a rebalance daemon, a second call to gf_degrag_start() was done. This lead to multiple threads doing migration. When multiple threads try to move the same file there can be deadlocks.) posted (#3) for review on master by Dan Lambright (dlambrig)

--- Additional comment from Anand Avati on 2015-05-29 15:56:51 EDT ---

REVIEW: http://review.gluster.org/10977 (When we did a graph switch on a rebalance daemon, a second call to gf_degrag_start() was done. This lead to multiple threads doing migration. When multiple threads try to move the same file there can be deadlocks.) posted (#4) for review on master by Dan Lambright (dlambrig)

--- Additional comment from Anand Avati on 2015-05-30 06:34:53 EDT ---

REVIEW: http://review.gluster.org/10977 (When we did a graph switch on a rebalance daemon, a second call to gf_degrag_start() was done. This lead to multiple threads doing migration. When multiple threads try to move the same file there can be deadlocks.) posted (#5) for review on master by Niels de Vos (ndevos)

--- Additional comment from Anand Avati on 2015-05-30 11:59:50 EDT ---

REVIEW: http://review.gluster.org/10977 (cluster/dht: maintain start state of rebalance daemon across graph switch.) posted (#6) for review on master by Dan Lambright (dlambrig)

--- Additional comment from Anand Avati on 2015-06-01 14:12:41 EDT ---

COMMIT: http://review.gluster.org/10977 committed in master by Vijay Bellur (vbellur) 
------
commit 3f11b8e8ec6d78ebe33636b64130d5d133729f2c
Author: Dan Lambright <dlambrig>
Date:   Thu May 28 14:00:37 2015 -0400

    cluster/dht: maintain start state of rebalance daemon across graph switch.
    
    When we did a graph switch on a rebalance daemon, a second call
    to gf_degrag_start() was done. This lead to multiple threads
    doing migration. When multiple threads try to move the same
    file there can be deadlocks.
    
    Change-Id: I931ca7fe600022f245e3dccaabb1ad004f732c56
    BUG: 1226005
    Signed-off-by: Dan Lambright <dlambrig>
    Reviewed-on: http://review.gluster.org/10977
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>

Comment 2 Nag Pavan Chilakam 2015-07-06 14:23:59 UTC
Hi Dan,
Can you please tell me what QE must do to verify this bug?

thanks,
nagpavan

Comment 3 Nag Pavan Chilakam 2015-07-15 06:57:47 UTC
### QE validation test case ###
1) Created a tier volume.
2) Checked that the tier rebalance process had started and noted its PID.

[root@nchilaka-tier-01 ~]# ps -ef|grep tier
root     15440 15015  0 17:33 pts/0    00:00:00 grep tier
root     30815     1  0 00:06 ?        00:00:08 /usr/sbin/glusterfs -s localhost --volfile-id rebalance/vol1 --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on --xlator-option *tier-dht.xattr-name=trusted.tier-gfid --xlator-option *dht.rebalance-cmd=6 --xlator-option *dht.node-uuid=d1a42b37-f19d-4b76-a676-a2443b0ccec8 --xlator-option *dht.commit-hash=2905257114 --socket-file /var/run/gluster/gluster-tier-bc3c0dd1-dfd0-4a58-9652-aa231333202c.sock --pid-file /var/lib/glusterd/vols/vol1/tier/d1a42b37-f19d-4b76-a676-a2443b0ccec8.pid -l /var/log/glusterfs/vol1-tier.log

3) Ran pstack on the PID to see the tier_start thread information.

[root@nchilaka-tier-01 ~]# pstack 30815|grep tier
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so

4) Turned some of the volume's performance options on and off several times to trigger graph switches.


Previous behavior: another tier_start thread was triggered alongside the existing one. This meant there were two tier rebalance threads, which could cause a deadlock.

Current behavior with the fix: there is still only one tier_start thread, so no deadlock occurs. The daemon process was also not killed; its PID remained the same.
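
The check in step 3 can be automated. The following is a minimal sketch, assuming pstack output in the format shown in this report (a "Thread N ..." header followed by "#n 0x... in symbol ()" frames). The function name and the trimmed sample are illustrative only and are not part of any GlusterFS tooling.

```python
import re


def count_threads_with_frame(pstack_output, symbol):
    """Count threads whose stack contains a frame for exactly `symbol`."""
    frame = re.compile(r"\bin\s+" + re.escape(symbol) + r"\s*\(")
    count = 0
    seen_in_current_thread = False
    for line in pstack_output.splitlines():
        if line.startswith("Thread "):
            # A new thread section begins; reset the per-thread flag.
            seen_in_current_thread = False
        elif frame.search(line) and not seen_in_current_thread:
            seen_in_current_thread = True
            count += 1
    return count


# Trimmed excerpt modelled on the pstack output in this report
# (thread numbers shortened for the example).
SAMPLE = """\
Thread 3 (Thread 0x7f642407d700 (LWP 30818)):
#0  0x00007f642c958aad in nanosleep () from /lib64/libc.so.6
#1  0x00007f642c958920 in sleep () from /lib64/libc.so.6
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so
#3  0x00007f642034fd27 in gf_defrag_start_crawl () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
Thread 2 (Thread 0x7f640fce8700 (LWP 30830)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034d995 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
Thread 1 (Thread 0x7f642e3ee740 (LWP 30815)):
#0  0x00007f642d02b2ad in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f642e40aef1 in main ()
"""

tier_threads = count_threads_with_frame(SAMPLE, "tier_start")
```

A result greater than 1 for `tier_start` after toggling volume options would indicate that a second migration thread was spawned.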




test version :
=============
[root@nchilaka-tier-01 ~]# gluster --version
glusterfs 3.7.1 built on Jul 12 2015 22:27:42
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@nchilaka-tier-01 ~]# rpm -qa|grep gluster
gluster-nagios-common-0.2.0-1.el6rhs.noarch
glusterfs-3.7.1-9.el6rhs.x86_64
glusterfs-cli-3.7.1-9.el6rhs.x86_64
gluster-nagios-addons-0.2.4-4.el6rhs.x86_64
glusterfs-libs-3.7.1-9.el6rhs.x86_64
glusterfs-client-xlators-3.7.1-9.el6rhs.x86_64
glusterfs-api-3.7.1-9.el6rhs.x86_64
glusterfs-server-3.7.1-9.el6rhs.x86_64
glusterfs-rdma-3.7.1-9.el6rhs.x86_64
python-gluster-3.7.1-9.el6rhs.x86_64
vdsm-gluster-4.16.20-1.2.el6rhs.noarch
glusterfs-fuse-3.7.1-9.el6rhs.x86_64
glusterfs-geo-replication-3.7.1-9.el6rhs.x86_64
[root@nchilaka-tier-01 ~]# 







CLI logs:
========



######################

bash-4.3$ ssh root.42.129
root.42.129's password: 
Last login: Tue Jul 14 21:30:15 2015 from dhcp35-163.lab.eng.blr.redhat.com
[root@nchilaka-tier-01 ~]# ps -ef|grep tier
root     15119 15015  0 17:30 pts/0    00:00:00 grep tier
root     30815     1  0 00:06 ?        00:00:08 /usr/sbin/glusterfs -s localhost --volfile-id rebalance/vol1 --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on --xlator-option *tier-dht.xattr-name=trusted.tier-gfid --xlator-option *dht.rebalance-cmd=6 --xlator-option *dht.node-uuid=d1a42b37-f19d-4b76-a676-a2443b0ccec8 --xlator-option *dht.commit-hash=2905257114 --socket-file /var/run/gluster/gluster-tier-bc3c0dd1-dfd0-4a58-9652-aa231333202c.sock --pid-file /var/lib/glusterd/vols/vol1/tier/d1a42b37-f19d-4b76-a676-a2443b0ccec8.pid -l /var/log/glusterfs/vol1-tier.log
[root@nchilaka-tier-01 ~]# pstack 30815
Thread 19 (Thread 0x7f642547f700 (LWP 30816)):
#0  0x00007f642d031fbd in nanosleep () from /lib64/libpthread.so.0
#1  0x00007f642df6168a in gf_timer_proc () from /usr/lib64/libglusterfs.so.0
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 18 (Thread 0x7f6424a7e700 (LWP 30817)):
#0  0x00007f642d032535 in sigwait () from /lib64/libpthread.so.0
#1  0x00007f642e40902b in glusterfs_sigwaiter ()
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 17 (Thread 0x7f642407d700 (LWP 30818)):
#0  0x00007f642c958aad in nanosleep () from /lib64/libc.so.6
#1  0x00007f642c958920 in sleep () from /lib64/libc.so.6
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so
#3  0x00007f642034fd27 in gf_defrag_start_crawl () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#4  0x00007f642df872b2 in synctask_wrap () from /usr/lib64/libglusterfs.so.0
#5  0x00007f642c8ef8f0 in ?? () from /lib64/libc.so.6
#6  0x0000000000000000 in ?? ()
Thread 16 (Thread 0x7f642367c700 (LWP 30819)):
#0  0x00007f642d02ea0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642df86d6b in syncenv_task () from /usr/lib64/libglusterfs.so.0
#2  0x00007f642df8bb80 in syncenv_processor () from /usr/lib64/libglusterfs.so.0
#3  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 15 (Thread 0x7f6421449700 (LWP 30820)):
#0  0x00007f642c994f63 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f642dfa38c1 in ?? () from /usr/lib64/libglusterfs.so.0
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 14 (Thread 0x7f6419fce700 (LWP 30824)):
#0  0x00007f642c994f63 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f642dfa38c1 in ?? () from /usr/lib64/libglusterfs.so.0
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 13 (Thread 0x7f640fce8700 (LWP 30830)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034d995 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 12 (Thread 0x7f640f2e7700 (LWP 30831)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034d995 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 11 (Thread 0x7f640e8e6700 (LWP 30832)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034d995 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x7f640dee5700 (LWP 30833)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034d995 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x7f640d4e4700 (LWP 30834)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034d995 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x7f640cae3700 (LWP 30835)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034d995 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7f63f3fff700 (LWP 30836)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034db13 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7f63f35fe700 (LWP 30837)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034db13 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7f63f2bfd700 (LWP 30838)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034db13 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7f63f21fc700 (LWP 30839)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034db13 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7f63f17fb700 (LWP 30840)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034db13 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f63f0dfa700 (LWP 30841)):
#0  0x00007f642d02e63c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f642034db13 in gf_defrag_task () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/distribute.so
#2  0x00007f642d02aa51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f642c99496d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f642e3ee740 (LWP 30815)):
#0  0x00007f642d02b2ad in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f642dfa353d in ?? () from /usr/lib64/libglusterfs.so.0
#2  0x00007f642e40aef1 in main ()
[root@nchilaka-tier-01 ~]# pstack 30815|grep tier
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so
[root@nchilaka-tier-01 ~]# pstack 30815tier
Process 30815tier not found.
[root@nchilaka-tier-01 ~]# pstack 30815|grep tier
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so
[root@nchilaka-tier-01 ~]# pstack 30815|grep tier_start
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so
[root@nchilaka-tier-01 ~]# pstack 30815|grep tier_start
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so
[root@nchilaka-tier-01 ~]# pstack 30815|grep tier_start
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so
[root@nchilaka-tier-01 ~]# ps -ef|grep tier
root     15354 15015  0 17:32 pts/0    00:00:00 grep tier
root     30815     1  0 00:06 ?        00:00:08 /usr/sbin/glusterfs -s localhost --volfile-id rebalance/vol1 --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on --xlator-option *tier-dht.xattr-name=trusted.tier-gfid --xlator-option *dht.rebalance-cmd=6 --xlator-option *dht.node-uuid=d1a42b37-f19d-4b76-a676-a2443b0ccec8 --xlator-option *dht.commit-hash=2905257114 --socket-file /var/run/gluster/gluster-tier-bc3c0dd1-dfd0-4a58-9652-aa231333202c.sock --pid-file /var/lib/glusterd/vols/vol1/tier/d1a42b37-f19d-4b76-a676-a2443b0ccec8.pid -l /var/log/glusterfs/vol1-tier.log
[root@nchilaka-tier-01 ~]# 
[root@nchilaka-tier-01 ~]# pstack 30815|grep tier
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so
[root@nchilaka-tier-01 ~]# pstack 30815|grep tier
#2  0x00007f6420121895 in tier_start () from /usr/lib64/glusterfs/3.7.1/xlator/cluster/tier.so
[root@nchilaka-tier-01 ~]# ps -ef|grep tier
root     15440 15015  0 17:33 pts/0    00:00:00 grep tier
root     30815     1  0 00:06 ?        00:00:08 /usr/sbin/glusterfs -s localhost --volfile-id rebalance/vol1 --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on --xlator-option *tier-dht.xattr-name=trusted.tier-gfid --xlator-option *dht.rebalance-cmd=6 --xlator-option *dht.node-uuid=d1a42b37-f19d-4b76-a676-a2443b0ccec8 --xlator-option *dht.commit-hash=2905257114 --socket-file /var/run/gluster/gluster-tier-bc3c0dd1-dfd0-4a58-9652-aa231333202c.sock --pid-file /var/lib/glusterd/vols/vol1/tier/d1a42b37-f19d-4b76-a676-a2443b0ccec8.pid -l /var/log/glusterfs/vol1-tier.log
[root@nchilaka-tier-01 ~]# 







Moving bug to verified








Comment 4 Nag Pavan Chilakam 2015-07-15 07:00:14 UTC
Got the information on how to verify from Rafi

Comment 5 Nag Pavan Chilakam 2015-07-15 09:55:02 UTC
The sosreports for the QE verification are available at:
[qe-admin@rhsqe-repo bug.1227469]$ pwd
/home/repo/sosreports/bug.1227469

Comment 6 errata-xmlrpc 2015-07-29 04:55:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html

