Bug 994956

Summary: AFR: readdirp hangs on failure
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: spandura
Component: glusterfs
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED ERRATA
QA Contact: spandura
Severity: urgent
Docs Contact:
Priority: high
Version: 2.1
CC: amarts, nsathyan, pkarampu, rhs-bugs, shaines, surs, vbellur
Target Milestone: ---
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.4.0.19rhs-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 994959 (view as bug list)
Environment:
Last Closed: 2013-09-23 22:29:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
SOS Reports and Statedumps none

Description spandura 2013-08-08 09:55:04 UTC
Description of problem:
=======================
On a 1 x 2 replicate volume, running dbench on 2 fuse mounts failed. Subsequently, removing the directories from the mount point hung.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs 3.4.0.18rhs built on Aug  7 2013 08:13:16

How reproducible:


Steps to Reproduce:
===================
1. Create a replicate volume ( 1 X 2 ). Start the volume.

2. Create 2 fuse mounts. 

3. From both fuse mounts, execute:
"dbench -s -F -S -x --one-byte-write-fix --stat-check 10"

// dbench failed.

4. From both mount points, execute:
rm -rf *

Actual results:
=================
rm on the mount point hangs.

Expected results:
=================
rm shouldn't hang. 

Additional info:
==============

root@hicks [Aug-08-2013-15:24:16] >gluster v status
Status of volume: vol_rep
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick hicks:/rhs/bricks/b0				49152	Y	9987
Brick king:/rhs/bricks/b1				49152	Y	9512
NFS Server on localhost					2049	Y	10001
Self-heal Daemon on localhost				N/A	Y	10005
NFS Server on king					2049	Y	9524
Self-heal Daemon on king				N/A	Y	9531
 
There are no active volume tasks
root@hicks [Aug-08-2013-15:24:18] >
root@hicks [Aug-08-2013-15:24:19] >gluster v info
 
Volume Name: vol_rep
Type: Replicate
Volume ID: f08d5970-e230-4164-a90c-72f5173314a8
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: hicks:/rhs/bricks/b0
Brick2: king:/rhs/bricks/b1

Comment 3 spandura 2013-08-08 10:12:46 UTC
Created attachment 784293 [details]
SOS Reports and Statedumps

Comment 4 Amar Tumballi 2013-08-12 07:11:00 UTC
https://code.engineering.redhat.com/gerrit/#/c/11240

Comment 5 spandura 2013-08-14 10:08:36 UTC
Verified the fix, using the steps to reproduce above, on the build:
===============================
glusterfs 3.4.0.19rhs built on Aug 14 2013 00:11:42

Bug is fixed. Moving it to verified state.

Comment 6 Scott Haines 2013-09-23 22:29:53 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html