Bug 1340032 - Files not able to heal after arbiter and data bricks were rebooted
Summary: Files not able to heal after arbiter and data bricks were rebooted
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: GlusterFS
Classification: Community
Component: arbiter
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1361518
Reported: 2016-05-26 10:41 UTC by Karan Sandha
Modified: 2018-11-19 07:49 UTC
CC: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1361518 (view as bug list)
Environment:
Last Closed: 2018-11-19 07:49:14 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments
Logs1 (7.63 MB, application/x-xz) - 2016-05-31 06:18 UTC, Karan Sandha
Logs2 (7.70 MB, application/x-xz) - 2016-05-31 06:19 UTC, Karan Sandha
Logs3 (7.78 MB, application/x-xz) - 2016-05-31 06:20 UTC, Karan Sandha
Client Log (18.04 MB, application/x-xz) - 2016-05-31 06:22 UTC, Karan Sandha
New Output and Getfattr (15.90 KB, application/vnd.oasis.opendocument.text) - 2016-06-07 12:11 UTC, Karan Sandha

Description Karan Sandha 2016-05-26 10:41:18 UTC
Description of problem:
Created a script that writes 50 files of 5 MB each; during the creation, rebooted two of the three nodes (the arbiter brick and one data brick) while the remaining data brick stayed up.

Version-Release number of selected component (if applicable):


How reproducible:
1 time

Steps to Reproduce:
1. Create a 1x3 arbiter volume named "core" (a sketch of the setup commands follows the steps).
2. Mount it on the client using FUSE at /mnt/core.
3. Run this script for 1 minute:
for (( i=1;i<=50;i++ )) 
do 
dd if=/dev/urandom of=corefile$i bs=5M count=5 status=progress
done

4. Rebooted the arbiter node and one of the data brick nodes while the third brick stayed up.
5. Files 15 to 26 were only touched; no data was written (0-byte files).
6. Files on the arbiter and the data brick could not be healed.
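
For reference, a minimal sketch of the setup in steps 1 and 2, assuming three hypothetical hosts node1-node3 (the brick path mirrors the heal info output below; this is not taken verbatim from the test setup):

# On one of the server nodes: create and start the 1x3 arbiter volume.
gluster volume create core replica 3 arbiter 1 \
    node1:/rhs/brick1/core node2:/rhs/brick1/core node3:/rhs/brick1/core
gluster volume start core

# On the client: FUSE-mount the volume and run the dd loop from step 3 inside it.
mount -t glusterfs node1:/core /mnt/core
cd /mnt/core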

[root@dhcp43-192 core]# gluster volume heal core info
Brick dhcp43-157.lab.eng.blr.redhat.com:/rhs/brick1/core
Status: Connected
Number of entries: 0

Brick dhcp43-192.lab.eng.blr.redhat.com:/rhs/brick1/core
/corefile16 
/corefile17 
/corefile18 
/corefile19 
/corefile20 
/corefile21 
/corefile22 
/corefile23 
/corefile24 
/corefile25 
/corefile26 
Status: Connected
Number of entries: 11

Brick dhcp43-153.lab.eng.blr.redhat.com:/rhs/brick1/core
/corefile16 
/corefile17 
/corefile18 
/corefile19 
/corefile20 
/corefile21 
/corefile22 
/corefile23 
/corefile24 
/corefile25 
/corefile26 
Status: Connected
Number of entries: 11
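
The attached "New Output and Getfattr" file captures the on-brick state; as a rough sketch, the AFR changelog xattrs of one of the pending files can be inspected directly on each brick (run on the brick hosts against the backend path, not the FUSE mount; corefile16 is just one of the entries listed above):

# trusted.afr.core-client-* hold the pending data/metadata/entry counts each
# brick accuses the others of; trusted.afr.dirty marks an in-flight write.
getfattr -d -m . -e hex /rhs/brick1/core/corefile16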



Actual results:
The files weren't healed.

Expected results:
The files should have been healed.

Additional info:
logs kept at rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/<bug>

Comment 1 Karan Sandha 2016-05-31 06:18:57 UTC
Created attachment 1163038 [details]
Logs1

Comment 2 Karan Sandha 2016-05-31 06:19:55 UTC
Created attachment 1163039 [details]
Logs2

Comment 3 Karan Sandha 2016-05-31 06:20:47 UTC
Created attachment 1163040 [details]
Logs3

Comment 4 Karan Sandha 2016-05-31 06:22:24 UTC
Created attachment 1163041 [details]
Client Log

Comment 5 Karan Sandha 2016-06-07 12:11:03 UTC
Created attachment 1165588 [details]
New Output and Getfattr

Comment 6 Karan Sandha 2016-06-07 12:38:33 UTC
Steps to Reproduce:
1) Create a 1x3 arbiter volume.
2) Bricks are B1, B2, B3 (B3 is the arbiter).
3) Bring down B1.
4) Create 50 files of 500 MB each on the FUSE mount from the client.
5) After about 30 files have been created,
6) bring B1 back up and bring down B3.

Check gluster volume heal info and list the files on the bricks: there will be multiple 0-byte files, and heal info shows multiple files pending heal. A sketch of these commands follows below.
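
A rough sketch of these steps, assuming the brick processes are taken down by killing the brick PIDs reported by volume status and brought back with a forced volume start (file names, counts and PID placeholders are illustrative):

# On the B1 node: find and kill B1's brick process so the brick goes down.
gluster volume status core        # note the brick PIDs
kill <PID-of-B1-brick>

# On the client: start writing the 50 files of 500 MB each.
for i in $(seq 1 50); do
    dd if=/dev/urandom of=/mnt/core/bigfile$i bs=1M count=500
done

# After roughly 30 files are done: bring B1 back and take B3 (the arbiter) down.
gluster volume start core force   # restarts the downed brick process
kill <PID-of-B3-brick>

# Check the result.
gluster volume heal core info
ls -l /rhs/brick1/core            # on each brick host; expect multiple 0-byte files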

Comment 7 Vijay Bellur 2016-06-20 08:18:01 UTC
REVIEW: http://review.gluster.org/14769 (afr: Do not mark arbiter as data source during newentry_mark) posted (#1) for review on master by Ravishankar N (ravishankar@redhat.com)

Comment 8 Vijay Bellur 2016-06-24 11:42:49 UTC
REVIEW: http://review.gluster.org/14769 (afr: Do not mark arbiter as data source during newentry_mark) posted (#2) for review on master by Ravishankar N (ravishankar@redhat.com)

Comment 9 Ravishankar N 2016-06-24 11:44:56 UTC
Moved BZ state by mistake

Comment 11 Ravishankar N 2018-11-19 07:49:14 UTC
All similar issues due to stale entry creation and healing are tracked in
https://github.com/gluster/glusterfs/issues/502

