Bug 1005272 - dd on fuse mount hung
Status: CLOSED DUPLICATE of bug 923809
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
2.1
Unspecified Unspecified
unspecified Severity high
: ---
: ---
Assigned To: Amar Tumballi
Sudhir D
:
Depends On:
Blocks:
 
Reported: 2013-09-06 10:44 EDT by spandura
Modified: 2013-12-18 19:09 EST (History)
4 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-09-10 02:34:38 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description spandura 2013-09-06 10:44:25 EDT
Description of problem:
=========================
On a pure replicate volume (1 x 2), while running dd on a file from fuse mounts, one of the dd processes hung.

Brick log messages:
===================
[2013-09-06 12:06:07.514045] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-09-06 12:06:07.514155] E [server-helpers.c:779:server_alloc_frame] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103) [0x7f480a910773] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x245) [0x7f480a910625] (-->/usr/lib64/glusterfs/3.4.0.31rhs/xlator/protocol/server.so(server3_3_finodelk+0x8a) [0x7f4804c3471a]))) 0-server: invalid argument: conn


Version-Release number of selected component (if applicable):
===============================================================
glusterfs 3.4.0.31rhs built on Sep  5 2013 08:23:59

How reproducible:
==================
Executed the test case only once.

Steps to Reproduce:
=====================
1. Create a replicate volume (1 x 2). Start the volume.

2. Create 4 fuse mounts. From all the mounts, start dd on a file: "dd if=/dev/urandom of=./test_file1 bs=1K count=20480000"

3. While dd in progress, bring down a brick.

4. Bring back brick online while dd is still in progress.
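The steps above can be sketched as a shell script. This is a hedged reconstruction, not the reporter's actual script: the volume name, hostnames, brick paths, and mount points below are placeholders, and it assumes a two-node gluster cluster with passwordless ssh to the brick host.

```shell
#!/bin/sh
# Sketch of the reproduction steps; all names are illustrative placeholders.
VOL=testvol
NODE1=server1   # hosts brick b0
NODE2=server2   # hosts brick b1

# 1. Create and start a 1 x 2 replicate volume.
gluster volume create $VOL replica 2 \
    $NODE1:/bricks/${VOL}_b0 $NODE2:/bricks/${VOL}_b1
gluster volume start $VOL

# 2. Create 4 fuse mounts and start dd on each in the background.
for i in 1 2 3 4; do
    mkdir -p /mnt/gm$i
    mount -t glusterfs $NODE2:/$VOL /mnt/gm$i
    dd if=/dev/urandom of=/mnt/gm$i/test_file1 bs=1K count=20480000 &
done

# 3. While dd is in progress, bring down one brick by killing its process.
ssh $NODE1 "pkill -f /bricks/${VOL}_b0"

# 4. Bring the brick back online while dd is still running.
gluster volume start $VOL force

wait   # expected: all four dd processes complete
```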

Actual results:
=====================
dd on 3 of the 4 mounts completed successfully.

dd on one of the mounts hung.

Expected results:
==================
dd should complete successfully on all mounts.

Additional info:
===================

root@fan [Sep-06-2013-14:42:49] >gluster v info
 
Volume Name: vol_dis_1_rep_2
Type: Replicate
Volume ID: f5c43519-b5eb-4138-8219-723c064af71c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: fan.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b0
Brick2: mia.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b1
Options Reconfigured:
cluster.self-heal-daemon: on
performance.write-behind: on
performance.stat-prefetch: off
server.allow-insecure: on
root@fan [Sep-06-2013-14:42:53] >


root@fan [Sep-06-2013-14:42:53] >gluster v status
Status of volume: vol_dis_1_rep_2
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick fan.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b0	49152	Y	29411
Brick mia.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b1	49152	Y	3625
NFS Server on localhost					2049	Y	2996
Self-heal Daemon on localhost				N/A	Y	3006
NFS Server on mia.lab.eng.blr.redhat.com		2049	Y	3637
Self-heal Daemon on mia.lab.eng.blr.redhat.com		N/A	Y	3645
 
There are no active volume tasks
root@fan [Sep-06-2013-14:43:09] >
Comment 1 spandura 2013-09-06 10:53:11 EDT
Mount process on which dd hung
================================
root@darrel [Sep-06-2013-14:48:41] >ps -ef | grep gm4
root     18605     1  4 10:14 ?        00:13:01 /usr/sbin/glusterfs --volfile-id=/vol_dis_1_rep_2 --volfile-server=mia /mnt/gm4
root     20028 19597  0 14:48 pts/0    00:00:00 grep gm4


SOS Reports and statedumps:
===========================
http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1005272/
Comment 3 Pranith Kumar K 2013-09-10 02:34:38 EDT
We were able to figure out why the mount could have hung after looking at the logs in the bug https://bugzilla.redhat.com/show_bug.cgi?id=1005272. Similar logs are present in the sosreports:

[2013-03-20 06:24:48.320459] E [server-helpers.c:763:server_alloc_frame] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x93) [0x333160a8a3] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x293) [0x333160a733] (-->/usr/lib64/glusterfs/3.3.0.6rhs/xlator/protocol/server.so(server_finodelk+0xf8) [0x7f88bde218d8]))) 0-server: invalid argument: conn
[2013-03-20 06:24:48.320563] E [server-helpers.c:763:server_alloc_frame] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x93) [0x333160a8a3] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x293) [0x333160a733] (-->/usr/lib64/glusterfs/3.3.0.6rhs/xlator/protocol/server.so(server_finodelk+0xf8) [0x7f88bde218d8]))) 0-server: invalid argument: conn
[2013-03-20 06:24:48.320690] E [server-helpers.c:763:server_alloc_frame] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x93) [0x333160a8a3] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x293) [0x333160a733] (-->/usr/lib64/glusterfs/3.3.0.6rhs/xlator/protocol/server.so(server_finodelk+0xf8) [0x7f88bde218d8]))) 0-server: invalid argument: conn

marking 1005272 as duplicate of this bug.

*** This bug has been marked as a duplicate of bug 923809 ***
