Bug 837882 - Self-heal gives many "No such file or directory" errors
Status: CLOSED DUPLICATE of bug 835423
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.3.0
Hardware: x86_64 Linux
Priority: unspecified, Severity: unspecified
Assigned To: Pranith Kumar K
Reported: 2012-07-05 13:06 EDT by jaw171
Modified: 2012-07-26 12:50 EDT (History)

Doc Type: Bug Fix
Last Closed: 2012-07-26 12:50:10 EDT
Type: Bug


Attachments
self-heal.log (47.49 KB, text/x-log)
2012-07-05 13:06 EDT, jaw171

Description jaw171 2012-07-05 13:06:49 EDT
Created attachment 596449
self-heal.log

Description of problem:
On a GlusterFS 3.3 replica volume spanning two RHEL 6.3 servers, with four bricks per server, the self-heal log shows errors after one server was down for some time and then brought back online.

Version-Release number of selected component (if applicable):
glusterfs-server-3.3.0-1.el6.x86_64 (from the RPMs on gluster.org)

How reproducible:


Steps to Reproduce:
1. Create a replica volume with a count of 2
2. Add files to the volume
3. Take one server offline
4. Change data in the volume
5. Bring the dead server back up
6. Watch the self-heal log
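
For anyone trying to reproduce this, steps 1, 2 and 6 can be sketched roughly as below. This is only an illustration, not the exact commands I ran: the volume name, host names, brick path and mount point are all made up, and the self-heal daemon log path is assumed to be the default /var/log/glusterfs/glustershd.log. The script only prints the commands; taking a server down and changing data (steps 3 to 5) are left as manual steps.

```python
import shlex


def repro_commands(volume, host_a, host_b, brick_path, mount_point):
    """Build the commands for steps 1, 2 and 6 as argument lists.

    All arguments are hypothetical placeholders; nothing is executed here.
    """
    return [
        # Step 1: create and start a replica-2 volume, one brick per server.
        ["gluster", "volume", "create", volume, "replica", "2",
         f"{host_a}:{brick_path}", f"{host_b}:{brick_path}"],
        ["gluster", "volume", "start", volume],
        # Step 2: mount the volume so files can be added through the client.
        ["mount", "-t", "glusterfs", f"{host_a}:/{volume}", mount_point],
        # Step 6: follow the self-heal daemon log (assumed default location).
        ["tail", "-f", "/var/log/glusterfs/glustershd.log"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of running them.
    for cmd in repro_commands("testvol", "server1", "server2",
                              "/brick/0", "/mnt/testvol"):
        print(shlex.join(cmd))
```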
  
Actual results:
"No such file or directory" errors

Expected results:
Self-heal pushes the changed files from the healthy server to the other one without error.

Additional info:
# gluster volume info
 
Volume Name: vol_home
Type: Distributed-Replicate
Volume ID: 4147f773-f2d2-4e91-bff3-b5ec7da69a47
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: storage1.frank.sam.pitt.edu:/brick/0
Brick2: storage4.frank.sam.pitt.edu:/brick/0
Brick3: storage1.frank.sam.pitt.edu:/brick/1
Brick4: storage4.frank.sam.pitt.edu:/brick/1
Brick5: storage1.frank.sam.pitt.edu:/brick/2
Brick6: storage4.frank.sam.pitt.edu:/brick/2
Brick7: storage1.frank.sam.pitt.edu:/brick/3
Brick8: storage4.frank.sam.pitt.edu:/brick/3
Options Reconfigured:
nfs.rpc-auth-allow: 10.201.*.*,127.*
auth.allow: 10.201.*.*,127.*
performance.io-cache: off
cluster.min-free-disk: 5
performance.cache-size: 128000000
features.quota: on
nfs.disable: on
features.limit-usage: *snipped, lots of quotas*


Part of the self-heal log is attached. When I run 'find <gluster-mount> -noleaf -print0 | xargs --null stat >/dev/null', more of the same errors appear in the self-heal log:


[2012-07-05 09:49:12.326243] E [afr-self-heald.c:287:_remove_stale_index] 0-vol_home-replicate-3: 087ca940-a10b-46cc-94dc-c1de2ce08d91: Failed to remove index on vol_home-client-6 - No such file or directory
[2012-07-05 09:49:12.335160] I [afr-self-heald.c:282:_remove_stale_index] 0-vol_home-replicate-3: Removing stale index for a8c87651-a428-4119-8e22-c09d13d2aaaf on vol_home-client-6
[2012-07-05 09:49:12.345011] W [client3_1-fops.c:592:client3_1_unlink_cbk] 0-vol_home-client-6: remote operation failed: No such file or directory
[2012-07-05 09:49:12.345142] E [afr-self-heald.c:287:_remove_stale_index] 0-vol_home-replicate-3: a8c87651-a428-4119-8e22-c09d13d2aaaf: Failed to remove index on vol_home-client-6 - No such file or directory
[2012-07-05 09:49:12.362025] I [afr-self-heald.c:282:_remove_stale_index] 0-vol_home-replicate-3: Removing stale index for f2c01834-7e76-465d-a1b7-04e9328f6557 on vol_home-client-6
[2012-07-05 09:49:12.362395] W [client3_1-fops.c:592:client3_1_unlink_cbk] 0-vol_home-client-6: remote operation failed: No such file or directory
[2012-07-05 09:49:12.362534] E [afr-self-heald.c:287:_remove_stale_index] 0-vol_home-replicate-3: f2c01834-7e76-465d-a1b7-04e9328f6557: Failed to remove index on vol_home-client-6 - No such file or directory
[2012-07-05 09:49:12.373466] I [afr-self-heald.c:282:_remove_stale_index] 0-vol_home-replicate-3: Removing stale index for 090cf79b-562e-4a9d-9f4a-4393c09f9a4b on vol_home-client-6
[2012-07-05 09:49:12.387391] W [client3_1-fops.c:592:client3_1_unlink_cbk] 0-vol_home-client-6: remote operation failed: No such file or directory
[2012-07-05 09:49:12.387558] E [afr-self-heald.c:287:_remove_stale_index] 0-vol_home-replicate-3: 090cf79b-562e-4a9d-9f4a-4393c09f9a4b: Failed to remove index on vol_home-client-6 - No such file or directory
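
To get a feel for how many of these messages there are, the log can be tallied by severity and reporting function. The following is a minimal sketch, assuming every line follows the layout seen above ([timestamp] LEVEL [file:line:function] translator: message); the regular expression and the function names parse_line and summarize are my own and not part of GlusterFS.

```python
import re
from collections import Counter

# Layout assumed from the excerpt above:
# [timestamp] LEVEL [source.c:line:function] translator-name: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<source>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<xlator>\S+):\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count error (E) and warning (W) lines per (level, function) pair."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["level"] in ("E", "W"):
            counts[(record["level"], record["function"])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py /path/to/self-heal.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for (level, function), count in summarize(handle).most_common():
            print(f"{level} {function}: {count}")
```

Lines that do not match the assumed layout are skipped rather than reported, so the counts are a lower bound.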
Comment 1 Junaid 2012-07-18 04:37:08 EDT
Changing the component to AFR.
Comment 2 Pranith Kumar K 2012-07-26 12:50:10 EDT

*** This bug has been marked as a duplicate of bug 835423 ***
