Bug 983061 - [RHS-RHOS] Cinder fuse client crashed in afr_fd_has_witnessed_unstable_write after remove-brick operation
Status: CLOSED DUPLICATE of bug 978802
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assigned To: Amar Tumballi
QA Contact: Sudhir D
Depends On:
Blocks:
 
Reported: 2013-07-10 08:03 EDT by Anush Shetty
Modified: 2013-12-18 19:09 EST
CC: 8 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-10 08:33:01 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Anush Shetty 2013-07-10 08:03:25 EDT
Description of problem: On a 6x2 Distributed-Replicate volume, we added 2 more bricks using add-brick to make it a 7x2 Distributed-Replicate cinder volume. We created 10 cinder volumes of 15G each and 10 Nova instances. Two pairs of replica bricks were then removed using remove-brick start, and the removal was committed. While trying to attach a cinder volume to an instance, the cinder fuse process crashed.
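A minimal sketch of that remove-brick sequence, with placeholder hosts and brick paths since the exact replica pair removed is not recorded in this report (status is polled until data migration completes before committing):

# gluster volume remove-brick cinder-vol <hostA>:/<brickN> <hostB>:/<brickM> start
# gluster volume remove-brick cinder-vol <hostA>:/<brickN> <hostB>:/<brickM> status
# gluster volume remove-brick cinder-vol <hostA>:/<brickN> <hostB>:/<brickM> commit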


Version-Release number of selected component (if applicable):
RHS: glusterfs-3.3.0.11rhs-1.el6rhs.x86_64
Cinder: openstack-cinder-2013.1.2-3.el6ost.noarch
Puddle repo:  http://download.lab.bos.redhat.com/rel-eng/OpenStack/Grizzly/2013-07-08.1/puddle.repo

How reproducible: Seen once so far; filing on first occurrence.

Steps to Reproduce:
1. Create 6x2 Distributed-Replicate volume
2. Configure cinder to use RHS 
3. Create cinder volumes
4. Remove brick operations on RHS volume
5. Attach a cinder volume to an instance (see the command sketch below)
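
A command-level sketch of steps 3 and 5, assuming the Grizzly-era CLI; the volume name, instance ID, volume ID and device path are placeholders (step 4 is sketched in the description above). The fuse crash was hit during the attach in step 5:

# cinder create --display-name cinder-vol-01 15
# nova volume-attach <instance-id> <cinder-volume-id> /dev/vdc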


Actual results:

Cinder fuse client crashed.

Expected results:

Attaching cinder volumes to the instances should continue to work seamlessly after remove-brick operations.

Additional info:

1. RHOS hostname: rhs-client28.lab.eng.blr.redhat.com

2. RHS nodes (hostname and IP address): 10.70.37.66, 10.70.37.173, 10.70.37.71, 10.70.37.158

3. RHS node from where the gluster commands were executed: 10.70.37.173

4. Volume info
# gluster volume info
 
Volume Name: cinder-vol
Type: Distributed-Replicate
Volume ID: 19f5abf1-5739-417a-bcff-e56d0a5baa74
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 10.70.37.66:/brick4/s1
Brick2: 10.70.37.173:/brick4/s2
Brick3: 10.70.37.66:/brick5/s3
Brick4: 10.70.37.173:/brick5/s4
Brick5: 10.70.37.71:/brick4/s7
Brick6: 10.70.37.158:/brick4/s8
Brick7: 10.70.37.71:/brick6/s11
Brick8: 10.70.37.158:/brick6/s12
Options Reconfigured:
storage.owner-gid: 165
storage.owner-uid: 165
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: on
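
For reference, the "Options Reconfigured" values above correspond to gluster volume set calls such as the following sketch (the order in which they were originally applied is not recorded here):

# gluster volume set cinder-vol storage.owner-uid 165
# gluster volume set cinder-vol storage.owner-gid 165
# gluster volume set cinder-vol performance.quick-read off
# gluster volume set cinder-vol network.remote-dio on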

5. Volume status
# gluster volume status cinder-vol
Status of volume: cinder-vol
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.37.66:/brick4/s1				24009	Y	2106
Brick 10.70.37.173:/brick4/s2				24009	Y	3243
Brick 10.70.37.66:/brick5/s3				24010	Y	2111
Brick 10.70.37.173:/brick5/s4				24010	Y	3249
Brick 10.70.37.71:/brick4/s7				24009	Y	2683
Brick 10.70.37.158:/brick4/s8				24009	Y	14982
Brick 10.70.37.71:/brick6/s11				24011	Y	2695
Brick 10.70.37.158:/brick6/s12				24011	Y	14992
NFS Server on localhost					38467	Y	15718
Self-heal Daemon on localhost				N/A	Y	15724
NFS Server on 10.70.37.66				38467	Y	4693
Self-heal Daemon on 10.70.37.66				N/A	Y	4699
NFS Server on 10.70.37.71				38467	Y	4660
Self-heal Daemon on 10.70.37.71				N/A	Y	4666
NFS Server on 10.70.37.158				38467	Y	25999
Self-heal Daemon on 10.70.37.158			N/A	Y	26005


6. Mount point on the client: 
/var/lib/cinder/volumes/cf55327cba40506e44b37f45f55af5e7
/var/lib/nova/mnt/cf55327cba40506e44b37f45f55af5e7
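
The fuse mounts can be confirmed on the client with generic checks like the following (using the hashed mount point above):

# grep glusterfs /proc/mounts
# df -hT /var/lib/nova/mnt/cf55327cba40506e44b37f45f55af5e7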

7. # tail /var/log/glusterfs/var-lib-nova-mnt-cf55327cba40506e44b37f45f55af5e7.log

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-07-10 15:53:01
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0.10rhs
/lib64/libc.so.6[0x362a232920]
/usr/lib64/glusterfs/3.3.0.10rhs/xlator/cluster/replicate.so(afr_fd_has_witnessed_unstable_write+0x32)[0x7f77902c4442]
/usr/lib64/glusterfs/3.3.0.10rhs/xlator/cluster/replicate.so(afr_fsync+0xa6)[0x7f77902eeaa6]
/usr/lib64/glusterfs/3.3.0.10rhs/xlator/cluster/distribute.so(dht_fsync+0x154)[0x7f77900883d4]
/usr/lib64/glusterfs/3.3.0.10rhs/xlator/performance/write-behind.so(wb_fsync+0x292)[0x7f778bdfbeb2]
/usr/lib64/glusterfs/3.3.0.10rhs/xlator/debug/io-stats.so(io_stats_fsync+0x14d)[0x7f778bbe1fcd]
/usr/lib64/libglusterfs.so.0(syncop_fsync+0x174)[0x3b97a50c94]
/usr/lib64/glusterfs/3.3.0.10rhs/xlator/mount/fuse.so(fuse_migrate_fd+0x36b)[0x7f779337d9ab]
/usr/lib64/glusterfs/3.3.0.10rhs/xlator/mount/fuse.so(fuse_handle_opened_fds+0xa4)[0x7f779337e674]
/usr/lib64/glusterfs/3.3.0.10rhs/xlator/mount/fuse.so(+0xe749)[0x7f779337e749]
/usr/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x3b97a4c332]
/lib64/libc.so.6[0x362a243b70]
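
The log above carries only the in-process signal-handler backtrace. If a core file was preserved, a fuller trace can be extracted with gdb; this is a generic sketch that assumes glusterfs-debuginfo is installed and the core path is substituted:

# gdb /usr/sbin/glusterfs /path/to/core.<pid>
(gdb) thread apply all bt full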
Comment 3 shilpa 2013-07-10 08:19:41 EDT
Reproduced the same bug.
Comment 4 Pranith Kumar K 2013-07-10 08:33:01 EDT

*** This bug has been marked as a duplicate of bug 978802 ***
