Bug 1006866

Summary: [RFE] AFR : "rm -rf *" from fuse mount failed with I/O error when changing replicate count from 2 to 3
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: spandura
Component: glusterfs
Assignee: Ravishankar N <ravishankar>
Status: CLOSED WORKSFORME
QA Contact: spandura
Severity: medium
Priority: medium    
Version: 2.1
CC: nsathyan, ravishankar, spandura, vagarwal, vbellur
Keywords: FutureFeature
Target Release: RHGS 2.1.2   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Enhancement
Last Closed: 2013-11-26 10:19:17 UTC
Type: Bug

Description spandura 2013-09-11 12:17:09 UTC
Description of problem:
=======================
On a replicate volume with 2 bricks (1 x 2), added a brick to the volume to increase the replica count to 3. There were a few files pending self-heal when the add-brick was performed. Ran "rm -rf *" from the fuse mount point; rm failed with "Input/output error".
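
For reference, the replica count is raised with the "replica N" variant of add-brick. A minimal sketch, assuming placeholder volume and brick names (not the exact ones used in this test):

# hypothetical names; raises the replica count from 2 to 3
gluster volume add-brick VOLNAME replica 3 server3:/rhs/bricks/new_brick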

Version-Release number of selected component (if applicable):
===============================================================
glusterfs 3.4.0.33rhs built on Sep  8 2013 13:20:26

How reproducible:
=================
Executed the case only once. 

Steps to Reproduce:
===================
1. Create a replicate volume (1 x 2). Start the volume.

2. Create a fuse mount. 

3. From fuse mount create files and directories. 

4. Bring down brick1. 

5. Modify existing files and create more files and directories. (we had around 10k files to self-heal)

6. Bring brick1 back online. 

7. Add one more brick to the replicate volume to make it a 3-way replica (a command sketch of these steps follows the list). 

8. From the fuse mount, perform "rm -rf *".
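
A minimal command sketch of steps 1-8, assuming two servers (server1, server2) with the third brick placed on server1; all host, volume, and brick names below are placeholders, not the exact ones from this test:

# Step 1: create and start a 1 x 2 replicate volume
gluster volume create VOLNAME replica 2 server1:/rhs/bricks/b0 server2:/rhs/bricks/b1
gluster volume start VOLNAME

# Step 2: fuse-mount the volume
mount -t glusterfs server1:/VOLNAME /mnt/VOLNAME

# Step 4: bring down brick1 by killing its brick process
# (PID taken from 'gluster volume status')
kill -KILL <brick1-pid>

# Step 6: bring the downed brick back online
gluster volume start VOLNAME force

# (optional) list entries still pending self-heal
gluster volume heal VOLNAME info

# Step 7: add a third brick, raising the replica count to 3
gluster volume add-brick VOLNAME replica 3 server1:/rhs/bricks/b2

# Step 8: attempt the removal that produced the EIO
cd /mnt/VOLNAME && rm -rf *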

Actual results:
==================
root@darrel [Sep-11-2013-11:56:41] >rm -rf *
rm: cannot remove `F_l1_dir.9/l2_dir.4/test.1': Input/output error
rm: cannot remove `F_l1_dir.9/l2_dir.4/test.2': Input/output error
rm: cannot remove `F_l1_dir.9/l2_dir.4/test.3': Input/output error
rm: cannot remove `F_l1_dir.9/l2_dir.4/test.4': Input/output error
rm: cannot remove `F_l1_dir.9/l2_dir.4/test.5': Input/output error

Mount log message:
==================
[2013-09-11 11:58:35.234282] W [client-rpc-fops.c:1631:client3_3_entrylk_cbk] 4-vol_dis_1_rep_2-client-2: remote operation failed: No such file or directory
[2013-09-11 11:58:35.234736] I [afr-self-heal-common.c:2086:afr_sh_post_nb_entrylk_missing_entry_sh_cbk] 4-vol_dis_1_rep_2-replicate-0: Non blocking entrylks failed.
[2013-09-11 11:58:35.234751] E [afr-self-heal-common.c:2843:afr_log_self_heal_completion_status] 4-vol_dis_1_rep_2-replicate-0:  gfid or missing entry self heal  failed, on <gfid:c40da2de-80d3-4169-978e-062cc6c27558>/F_l1_dir.9/l2_dir.4/test.1
[2013-09-11 11:58:35.234771] W [fuse-bridge.c:567:fuse_entry_cbk] 0-glusterfs-fuse: 3398295: LOOKUP() <gfid:c40da2de-80d3-4169-978e-062cc6c27558>/F_l1_dir.9/l2_dir.4/test.1 => -1 (Input/output error)

Expected results:
==================
rm should not report an "Input/output error". 

Additional info:
===================

root@fan [Sep-11-2013-12:16:22] >gluster v status
Status of volume: vol_dis_1_rep_2
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick fan.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_
rep_2_b0						49155	Y	9695
Brick mia.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_
rep_2_b1						49154	Y	13997
Brick fan:/rhs/bricks/vol_dis_1_rep_2_b2		49156	Y	14919
NFS Server on localhost					2049	Y	14931
Self-heal Daemon on localhost				N/A	Y	14938
NFS Server on mia.lab.eng.blr.redhat.com		2049	Y	14247
Self-heal Daemon on mia.lab.eng.blr.redhat.com		N/A	Y	14254
 
There are no active volume tasks
root@fan [Sep-11-2013-12:16:25] >gluster v info
 
Volume Name: vol_dis_1_rep_2
Type: Replicate
Volume ID: 0d88d703-d0e1-4206-93cd-ad1279741a1b
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: fan.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b0
Brick2: mia.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b1
Brick3: fan:/rhs/bricks/vol_dis_1_rep_2_b2
Options Reconfigured:
cluster.self-heal-daemon: on

Comment 3 Ravishankar N 2013-10-23 11:21:50 UTC
Unable to hit the issue on 3.4.0.36rhs. Requesting QA to check whether it is reproducible on the latest build.

Comment 4 spandura 2013-11-12 09:45:22 UTC
Tried to recreate the issue on build "glusterfs 3.4.0.35.1u2rhs built on Oct 21 2013 14:00:58". Unable to recreate it.

Comment 5 Vivek Agarwal 2013-11-26 10:19:17 UTC
Per comment 4, the issue is no longer seen; closing as WORKSFORME.