Bug 1005227

Summary: AFR : Observed "1,96,81,301" ACTIVE locks in brick process statedump
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: spandura
Component: replicate
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED EOL
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 2.1
CC: pkarampu, rhs-bugs, storage-qa-internal, vagarwal, vbellur
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-03 17:13:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description spandura 2013-09-06 13:17:46 UTC
Description of problem:
========================
On a 1 x 2 replicate volume, dd was run on a file from 4 gluster mount points. While dd was running, one of the bricks went offline and then came back online. Once the self-heal completed, and with dd still in progress, a statedump of the volume was taken.

Observed "1,96,81,301" (19,681,301) ACTIVE locks in the brick process statedump.

Is this behavior acceptable?
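One hedged way to break such a count down by lock state, assuming the dump file name quoted under "Actual results" below and assuming lock entries carry "(ACTIVE)" / "(BLOCKED)" markers (the marker names are an assumption based on typical glusterfs statedump output, not confirmed by this report):

# Hedged sketch: tally lock entries per state in the brick statedump.
# File name taken from this report; the (ACTIVE)/(BLOCKED) markers are assumed.
grep -oE '\((ACTIVE|BLOCKED)\)' rhs-bricks-vol_dis_1_rep_2_b0.29411.dump.1378469030 | sort | uniq -c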

Version-Release number of selected component (if applicable):
================================================================
glusterfs 3.4.0.31rhs built on Sep  5 2013 08:23:16

How reproducible:
===================
Executed the case only once. 

Steps to Reproduce:
====================
1. Create a 1 x 2 replicate volume and start it:
root@fan [Sep-06-2013-13:08:00] >gluster v info
 
Volume Name: vol_dis_1_rep_2
Type: Replicate
Volume ID: f5c43519-b5eb-4138-8219-723c064af71c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: fan.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b0
Brick2: mia.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_rep_2_b1
Options Reconfigured:
cluster.self-heal-daemon: on
performance.write-behind: on
performance.stat-prefetch: off
server.allow-insecure: on

2. Create 4 fuse mounts. 

3. Start dd on a file from all the mounts: " dd if=/dev/urandom of=./test_file1 bs=1K count=20480000" 

4. While dd is in progress, bring one of the bricks offline.

5. After some time, while dd is still in progress, bring the brick back online {gluster volume start <volume_name> force}.

6. Once the self-heal completes (check the mount logs for successful completion), take a statedump of the volume. A hedged command sketch of these steps follows.
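A minimal command sketch of steps 2-6, under stated assumptions: the mount point and brick PID below are illustrative, the report does not say how the brick was taken offline (killing the brick process is one possible way), and the statedump directory is a placeholder.

# On each of the 4 clients (mount point /mnt/client1 is an assumption):
mount -t glusterfs fan.lab.eng.blr.redhat.com:/vol_dis_1_rep_2 /mnt/client1
cd /mnt/client1 && dd if=/dev/urandom of=./test_file1 bs=1K count=20480000 &

# Take one brick offline; killing the brick process (PID from "gluster volume status")
# is one possible method, the report does not state which was used.
kill -9 29411

# Bring the brick back online while dd is still running.
gluster volume start vol_dis_1_rep_2 force

# After self-heal completes, trigger a statedump and count ACTIVE locks in the
# brick's dump file (dump directory and file name pattern are assumptions).
gluster volume statedump vol_dis_1_rep_2
grep "ACTIVE" <statedump-dir>/rhs-bricks-vol_dis_1_rep_2_b0.*.dump.* | wc -l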

Actual results:
=================
[root@rhsqe-repo locks_in_transit]# grep "ACTIVE" rhs-bricks-vol_dis_1_rep_2_b0.29411.dump.1378469030 | wc -l 
"19681301"

Expected results:
====================
TBD

Additional info:
=================
Statedumps: http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/locks_in_transit/


root@fan [Sep-06-2013-13:15:06] >gluster v status
Status of volume: vol_dis_1_rep_2
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick fan.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_
rep_2_b0						49152	Y	29411
Brick mia.lab.eng.blr.redhat.com:/rhs/bricks/vol_dis_1_
rep_2_b1						49152	Y	3625
NFS Server on localhost					2049	Y	2996
Self-heal Daemon on localhost				N/A	Y	3006
NFS Server on mia.lab.eng.blr.redhat.com		2049	Y	3637
Self-heal Daemon on mia.lab.eng.blr.redhat.com		N/A	Y	3645
 
There are no active volume tasks

Comment 3 Vivek Agarwal 2015-12-03 17:13:30 UTC
Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release for which you requested a review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.