Bug 1340032

Summary: Files fail to heal after the arbiter and one data brick were rebooted
Product: [Community] GlusterFS
Component: arbiter
Version: mainline
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Keywords: Triaged
Hardware: Unspecified
OS: Unspecified
Reporter: Karan Sandha <ksandha>
Assignee: Ravishankar N <ravishankar>
QA Contact:
Docs Contact:
CC: bugs, ravishankar
Target Milestone: ---
Target Release: ---
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1361518 (view as bug list)
Environment:
Last Closed: 2018-11-19 07:49:14 UTC
Type: Bug
Bug Blocks: 1361518    
Attachments:
  Logs1
  Logs2
  Logs3
  Client Log
  New Output and Getfattr

Description Karan Sandha 2016-05-26 10:41:18 UTC
Description of problem:
Created a script to write 50 files of 5 MB each; while the files were being created, two of the three nodes (the arbiter and one data brick) were rebooted while the remaining data brick stayed up.

Version-Release number of selected component (if applicable):


How reproducible:
1 time

Steps to Reproduce:
1. Create a 1x3 arbiter volume named "core".
2. Mount it on the client over FUSE at /mnt/core.
3. Run the following script from the mount point (it ran for about one minute):
for (( i=1;i<=50;i++ )) 
do 
dd if=/dev/urandom of=corefile$i bs=5M count=5 status=progress
done

4. While the script is running, reboot the arbiter node and one of the data-brick nodes.
5. Files 15 to 26 were only touched: no data was written to them, leaving 0-byte files.
6. The files on the arbiter and data brick could not be healed:

[root@dhcp43-192 core]# gluster volume heal core info
Brick dhcp43-157.lab.eng.blr.redhat.com:/rhs/brick1/core
Status: Connected
Number of entries: 0

Brick dhcp43-192.lab.eng.blr.redhat.com:/rhs/brick1/core
/corefile16 
/corefile17 
/corefile18 
/corefile19 
/corefile20 
/corefile21 
/corefile22 
/corefile23 
/corefile24 
/corefile25 
/corefile26 
Status: Connected
Number of entries: 11

Brick dhcp43-153.lab.eng.blr.redhat.com:/rhs/brick1/core
/corefile16 
/corefile17 
/corefile18 
/corefile19 
/corefile20 
/corefile21 
/corefile22 
/corefile23 
/corefile24 
/corefile25 
/corefile26 
Status: Connected
Number of entries: 11
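
For reference, the AFR changelog xattrs of an affected entry can be inspected directly on the bricks to see the pending markers. A minimal sketch, assuming the brick paths shown above and corefile16 as the entry to examine (the attached "New Output and Getfattr" output was gathered along these lines):

# Run as root on each brick node; dumps all xattrs (including trusted.afr.*) in hex.
getfattr -d -m . -e hex /rhs/brick1/core/corefile16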



Actual results:
The files were not healed.

Expected results:
The files should have been healed.

Additional info:
logs kept at rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/<bug>

Comment 1 Karan Sandha 2016-05-31 06:18:57 UTC
Created attachment 1163038 [details]
Logs1

Comment 2 Karan Sandha 2016-05-31 06:19:55 UTC
Created attachment 1163039 [details]
Logs2

Comment 3 Karan Sandha 2016-05-31 06:20:47 UTC
Created attachment 1163040 [details]
Logs3

Comment 4 Karan Sandha 2016-05-31 06:22:24 UTC
Created attachment 1163041 [details]
Client Log

Comment 5 Karan Sandha 2016-06-07 12:11:03 UTC
Created attachment 1165588 [details]
New Output and Getfattr

Comment 6 Karan Sandha 2016-06-07 12:38:33 UTC
Steps To Reproduce:
1) Create a 1x3 arbiter volume with bricks B1, B2, B3 (B3 being the arbiter).
2) Bring down B1.
3) Create 50 files of 500 MB each on the FUSE mount from the client.
4) After about 30 files have been created, bring up B1 and bring down B3.


Check gluster volume heal info and list the files on the bricks: there will be multiple 0-byte files, and heal info will show multiple files pending heal. (A rough command-line sketch of this reproduction follows.)
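
A rough command-line sketch of the above reproduction; the host names (node1/node2/node3) and brick paths are placeholders, not the exact ones used in this report:

# Create and start a 1x3 arbiter volume; the brick on node3 (B3) is the arbiter.
gluster volume create core replica 3 arbiter 1 \
    node1:/rhs/brick1/core node2:/rhs/brick1/core node3:/rhs/brick1/core
gluster volume start core

# On the client: mount the volume over FUSE.
mount -t glusterfs node1:/core /mnt/core

# Bring down B1 (e.g. kill its brick process on node1), then write from the client:
cd /mnt/core
for (( i=1; i<=50; i++ )); do
    dd if=/dev/urandom of=corefile$i bs=1M count=500
done

# After roughly 30 files are written: bring B1 back up
# (gluster volume start core force) and kill the arbiter brick process on node3.

# Observe the resulting state:
gluster volume heal core info      # shows multiple entries pending heal
ls -l /rhs/brick1/core             # on the bricks: multiple 0-byte files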

Comment 7 Vijay Bellur 2016-06-20 08:18:01 UTC
REVIEW: http://review.gluster.org/14769 (afr: Do not mark arbiter as data source during newentry_mark) posted (#1) for review on master by Ravishankar N (ravishankar)

Comment 8 Vijay Bellur 2016-06-24 11:42:49 UTC
REVIEW: http://review.gluster.org/14769 (afr: Do not mark arbiter as data source during newentry_mark) posted (#2) for review on master by Ravishankar N (ravishankar)

Comment 9 Ravishankar N 2016-06-24 11:44:56 UTC
Moved BZ state by mistake

Comment 11 Ravishankar N 2018-11-19 07:49:14 UTC
All similar issues due to stale entry creation and healing are tracked in
https://github.com/gluster/glusterfs/issues/502