Bug 1193995 - [RHEV-RHS] Fuse mount process crashed, while using gluster volume as storage domain in RHEV
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: rhgs-3.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.0.4
Assignee: krishnan parthasarathi
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On: 1194525 1197118
Blocks: 1104459 1182947
Reported: 2015-02-18 17:43 UTC by SATHEESARAN
Modified: 2015-05-13 17:53 UTC (History)

Fixed In Version: glusterfs-3.6.0.48-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
RHEV-RHS Integration
Last Closed: 2015-03-26 06:36:08 UTC
Embargoed:


Attachments
fuse mount log file (215.14 KB, text/plain)
2015-02-18 18:08 UTC, SATHEESARAN


Links
System ID: Red Hat Product Errata RHBA-2015:0682
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Storage 3.0 enhancement and bug fix update #4
Last Updated: 2015-03-26 10:32:55 UTC

Description SATHEESARAN 2015-02-18 17:43:22 UTC
Description of problem:
-----------------------
The RHEV setup uses a gluster volume to store virtual machine images.
The gluster volume is fuse mounted on 2 RHEL 6.6 hypervisors, and application VMs are created on it.

After a few hours, the fuse mount process crashed on one of the hypervisors, and the VMs running on that hypervisor were paused.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs-3.6.0.45-1.el6rhs

How reproducible:
-----------------
Reproduction has not been attempted.

Steps to Reproduce:
-------------------
1. Create 2x2 distribute-replicate volume

2. Optimize the volume for virt-store
(i.e) gluster volume set <vol-name> group virt
      gluster volume set <vol-name> storage.owner-uid 36
      gluster volume set <vol-name> storage.owner-gid 36

3. Set up epoll configuration
(i.e) gluster volume set <vol-name> client.event-threads 2 
      gluster volume set <vol-name> server.event-threads 2

4. Start the volume. Use this volume as the Data Domain (storage backend for the image store) in RHEV.

5. Use 2 RHEL 6.6 machines as hypervisors.

6. Create 4 App VMs installed with RHEL 6.6. In my setup, there were 2 App VMs running on each hypervisor.

7. Continuously create and delete files inside the App VMs.
This is done to simulate I/O load on the VMs.

8. Check the status of the VMs after some time.
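
Steps 1 to 4 can be sketched as a small script. This is only an illustration, not the exact commands run on the affected setup: the volume name, host names and brick path are hypothetical placeholders, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of steps 1-4. VOL, H1..H4 and BRICK are hypothetical placeholders,
# not values taken from this bug report.
VOL="${VOL:-vmstore}"
BRICK="${BRICK:-/rhs/brick1/b1}"
H1="${H1:-server1}"
H2="${H2:-server2}"
H3="${H3:-server3}"
H4="${H4:-server4}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

setup_volume() {
    # Step 1: 2x2 distribute-replicate volume (4 bricks, replica 2).
    run gluster volume create "$VOL" replica 2 \
        "$H1:$BRICK" "$H2:$BRICK" "$H3:$BRICK" "$H4:$BRICK"

    # Step 2: optimize the volume for virt-store.
    run gluster volume set "$VOL" group virt
    run gluster volume set "$VOL" storage.owner-uid 36
    run gluster volume set "$VOL" storage.owner-gid 36

    # Step 3: epoll configuration.
    run gluster volume set "$VOL" client.event-threads 2
    run gluster volume set "$VOL" server.event-threads 2

    # Step 4: start the volume.
    run gluster volume start "$VOL"
}
```

Calling setup_volume with the defaults prints the seven gluster commands for review; running it with DRY_RUN=0 on a storage node would execute them.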

Actual results:
---------------
The fuse mount process on one of the hypervisors crashed.

Expected results:
-----------------
Everything should keep working, with no problems for either the App VMs or the storage domain.

Comment 2 SATHEESARAN 2015-02-18 17:56:40 UTC
No operations were performed on the volume. The volume is just fuse mounted and used for storing VM images.

I created 4 App VMs, left them running for around 10 hours (approx.), and then found this crash.

I noticed a continuous flow of error messages in the fuse mount logs, as follows:

<error_from_fuse_mount_logs>

[2015-02-18 14:45:49.728257] E [dht-helper.c:1345:dht_inode_ctx_get] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_readdirp_cbk+0x30c) [0x7f1dbfdd9b6c] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_layout_preset+0x5e) [0x7f1dbfdb1a0e] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x34) [0x7f1dbfdb3ca4]))) 0-Imstore1-dht: invalid argument: inode
[2015-02-18 14:45:49.728293] E [dht-helper.c:1364:dht_inode_ctx_set] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_readdirp_cbk+0x30c) [0x7f1dbfdd9b6c] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_layout_preset+0x5e) [0x7f1dbfdb1a0e] (-->/usr/lib64/glusterfs/3.6.0.45/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x52) [0x7f1dbfdb3cc2]))) 0-Imstore1-dht: invalid argument: inode

</error_from_fuse_mount_logs>

The above error messages repeated from the time this volume was first used as the image store until the mount process crashed.
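
To get a rough idea of how often these errors occur, the log can be summarized per reporting function. The following is a minimal sketch, assuming each log entry sits on a single line as in the original log file; the function name count_dht_inode_errors is made up for this example.

```shell
#!/bin/sh
# Count "invalid argument: inode" errors from dht-helper.c, grouped by the
# function that logged them. Prints "<count> <function>" per line.
# $1: path to a fuse mount log file.
count_dht_inode_errors() {
    grep -F 'invalid argument: inode' "$1" \
        | sed -n 's/^.* E \[dht-helper\.c:[0-9]*:\([A-Za-z_]*\)\].*$/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $1 " " $2 }'
}
```

It would be called with the path of the fuse mount log, for example the attached log file saved locally.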

Comment 3 SATHEESARAN 2015-02-18 17:59:53 UTC
Crash information as seen in the fuse mount logs:
--------------------------------------------------

pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2015-02-18 15:56:19
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.0.45
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f1dc9a947b6]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7f1dc9aaf3cf]
/lib64/libc.so.6[0x36d84329a0]
/usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so(+0x9594)[0x7f1dc550e594]
/usr/lib64/glusterfs/3.6.0.45/rpc-transport/socket.so(+0xad1d)[0x7f1dc550fd1d]
/usr/lib64/libglusterfs.so.0(+0x77d1c)[0x7f1dc9aebd1c]
/lib64/libpthread.so.0[0x36d88079d1]
/lib64/libc.so.6(clone+0x6d)[0x36d84e8b6d]
---------
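
Three frames in the backtrace above carry only a module path and an offset (two in socket.so, one in libglusterfs.so.0). As a sketch, such frames can be turned into addr2line invocations; resolving them to function names would additionally need the matching debuginfo packages installed, which is an assumption here. The helper name frames_to_addr2line is made up for this example.

```shell
#!/bin/sh
# Read backtrace lines on stdin and print one addr2line command for every
# frame of the form "/path/to/module(+0xOFFSET)[0xADDRESS]".
# Frames that already show a symbol name, or no offset at all, are skipped.
frames_to_addr2line() {
    sed -n 's|^\(/[^(]*\)(+\(0x[0-9a-fA-F]*\))\[0x[0-9a-fA-F]*\].*$|addr2line -f -e \1 \2|p'
}
```

For instance, feeding the crash section of the mount log through frames_to_addr2line yields commands that can be reviewed and then run on the hypervisor where the crash happened.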

Comment 4 SATHEESARAN 2015-02-18 18:08:26 UTC
Created attachment 993260
fuse mount log file

Attaching the fuse mount log file

Comment 8 SATHEESARAN 2015-03-17 10:31:23 UTC
Tested with glusterfs-3.6.0.50-1.el6rhs, following the steps mentioned in comment 0.

I am not seeing any fuse mount crash.
Marking this bug as verified.
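
Before re-running the test on another setup, it may help to check that the installed build is at least the "Fixed In Version" (glusterfs-3.6.0.48-1). The sketch below relies on GNU sort -V, which only approximates RPM version ordering; the helper name version_ge is made up for this example.

```shell
#!/bin/sh
# version_ge A B: succeeds when version string A sorts at or after B
# under GNU "sort -V" (an approximation of RPM version comparison).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use on a hypervisor (not executed here):
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' glusterfs)
#   version_ge "$installed" "3.6.0.48-1" && echo "fix should be present"
```

With the versions from this report, the build under test (3.6.0.50-1.el6rhs) passes the check and the build that crashed (3.6.0.45-1.el6rhs) does not.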

Comment 10 errata-xmlrpc 2015-03-26 06:36:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0682.html

