Bug 1153904 - self heal info logs are filled with messages reporting ENOENT while self-heal is going on
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.5.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On: 1122511
Blocks:
 
Reported: 2014-10-17 05:49 UTC by Ravishankar N
Modified: 2014-11-21 16:14 UTC
CC: 6 users

Fixed In Version: glusterfs-3.5.3
Clone Of: 1122511
Environment:
Last Closed: 2014-11-21 16:03:20 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Ravishankar N 2014-10-17 05:49:44 UTC
Description of problem:
------------------------------------------------------------------------------
The self-heal info logs were filled with messages of the following kind:

------------------------------------------------------------------------------
[2014-07-23 11:49:27.227425] W [client-rpc-fops.c:2758:client3_3_lookup_cbk] 0-vol1-client-1: remote operation failed: No such file or directory. Path: b870247b-b05e-4502-a43b-442ff79ef1fc (b870247b-b05e-4502-a43b-442ff79ef1fc)
------------------------------------------------------------------------------

Such messages are seen for a large number of files when the heal info command is run for a volume while heal is in progress. The logs were seen to grow to 40 GB in size; the heal info command was being run every minute. ENOENT errors should not be logged by the client translator.

Version-Release number of selected component (if applicable):
glusterfs-3.6.0.24-1.el6rhs.x86_64

Steps to Reproduce:
1. Create a 1x2 replica volume and fuse mount it.
2. Kill brick2.
3. Create 5 files from the fuse mount: touch /mnt/fuse_mnt/file{1..5}
4. Bring brick2 back up. This triggers AFR entry self-heal.
5. The glustershd.log contains the ENOENT message given in the bug description, one message per file to be healed:

---------------
[2014-10-15 11:12:57.405428] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-testvol-client-1: remote operation failed: No such file or directory. Path: <gfid:760b4427-2fb9-4a67-9f55-e8e8d78e452f> (760b4427-2fb9-4a67-9f55-e8e8d78e452f)
[2014-10-15 11:12:57.406150] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-testvol-client-1: remote operation failed: No such file or directory. Path: <gfid:95bd7270-f92a-4a13-be2f-fa09c8a80b76> (95bd7270-f92a-4a13-be2f-fa09c8a80b76)
[2014-10-15 11:12:57.406744] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-testvol-client-1: remote operation failed: No such file or directory. Path: <gfid:108a66df-377c-4dd4-8da7-e094d7cec9de> (108a66df-377c-4dd4-8da7-e094d7cec9de)
[2014-10-15 11:12:57.412424] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-testvol-client-1: remote operation failed: No such file or directory. Path: <gfid:748d98d1-edd9-4d7d-aee9-83784b25a24d> (748d98d1-edd9-4d7d-aee9-83784b25a24d)
[2014-10-15 11:12:57.412993] W [client-rpc-fops.c:2761:client3_3_lookup_cbk] 0-testvol-client-1: remote operation failed: No such file or directory. Path: <gfid:565d0ff7-a006-436f-844a-1ddfbda48a82> (565d0ff7-a006-436f-844a-1ddfbda48a82)
-------------------------

6. If the above test is repeated with the self-heal daemon turned off, then after step 4, run `gluster v heal <volname> info`. This also triggers entry self-heal, filling glfsheal-<volname>.log with the same messages as above. (A consolidated command-line sketch of these steps is shown below.)
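
For reference, a minimal command-line sketch of steps 1-6, assuming a single test host named server1, brick directories /bricks/brick1 and /bricks/brick2, and a volume named testvol (all placeholders; adjust to your setup):

------------------------------------------------------------------------------
#!/bin/bash
# Sketch of the reproduction steps above; run as root on the test host.

# 1. Create a 1x2 replica volume and fuse mount it.
gluster volume create testvol replica 2 \
        server1:/bricks/brick1 server1:/bricks/brick2 force
gluster volume start testvol
mkdir -p /mnt/fuse_mnt
mount -t glusterfs server1:/testvol /mnt/fuse_mnt

# (Only for the step-6 variant: disable the self-heal daemon first.)
# gluster volume set testvol cluster.self-heal-daemon off

# 2. Kill brick2 (the glusterfsd process serving /bricks/brick2).
pkill -9 -f 'glusterfsd.*brick2'

# 3. Create 5 files from the fuse mount.
touch /mnt/fuse_mnt/file{1..5}

# 4. Bring brick2 back up; this triggers AFR entry self-heal.
gluster volume start testvol force

# 5. glustershd.log now shows the ENOENT warnings quoted above.
# 6. With the self-heal daemon off, run heal info instead; the same
#    messages then land in glfsheal-testvol.log.
gluster volume heal testvol info
------------------------------------------------------------------------------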

Comment 1 Anand Avati 2014-10-17 05:53:23 UTC
REVIEW: http://review.gluster.org/8937 (protocol/client: change log level for lookup) posted (#1) for review on release-3.5 by Ravishankar N (ravishankar)

Comment 2 Anand Avati 2014-10-21 15:41:29 UTC
COMMIT: http://review.gluster.org/8937 committed in release-3.5 by Niels de Vos (ndevos) 
------
commit 946eecfff75d7c9f4df3890ce57311386fe6e994
Author: Ravishankar N <root@ravi3.(none)>
Date:   Fri Oct 17 10:58:06 2014 +0000

    protocol/client: change log level for lookup
    
    Problem:
    On 3.5 branch, http://review.gluster.org/8294 causes the server to return ENOENT
    (as opposed to ESTALE in master branch) if file does not exist. When AFR does
    entry self-heals (either from the mount or shd or by the `heal info` command), it
    does a gfid-lookup with loc.name == NULL, causing the corresponding log file to be
    flooded with messages like this:
    
    [2014-10-15 11:12:57.405428] W [client-rpc-fops.c:2761:client3_3_lookup_cbk]
    0-testvol-client-1: remote operation failed: No such file or directory. Path:
    <gfid:760b4427-2fb9-4a67-9f55-e8e8d78e452f>
    (760b4427-2fb9-4a67-9f55-e8e8d78e452f)
    
    Fix:
    Change log level for ENOENT and ESTALE errors to DEBUG
    
    Change-Id: Ideb88d9cb609d077e02efe703cd28155985d7513
    BUG: 1153904
    Signed-off-by: Ravishankar N <root@ravi3.(none)>
    Reviewed-on: http://review.gluster.org/8937
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-by: Niels de Vos <ndevos>

Comment 3 Niels de Vos 2014-11-05 09:25:13 UTC
The second Beta for GlusterFS 3.5.3 has been released [1]. Please verify if the release solves this bug report for you. In case the glusterfs-3.5.3beta2 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions have been made available on [2] to make testing easier.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019359.html
[2] http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.5.3beta2/
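
One simple way to verify, assuming the default log location /var/log/glusterfs/glustershd.log, is to re-run the reproduction steps from the description on the beta and confirm that no new lookup warnings are added (with the fix, the gfid-lookup ENOENT/ESTALE failures are logged at DEBUG level):

------------------------------------------------------------------------------
# Count the lookup warnings in the self-heal daemon log before and after a
# heal; the count should not increase once the fix is in place.
grep -c "client3_3_lookup_cbk.*remote operation failed: No such file or directory" \
        /var/log/glusterfs/glustershd.log
------------------------------------------------------------------------------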

Comment 4 Niels de Vos 2014-11-21 16:03:20 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.3, please reopen this bug report.

glusterfs-3.5.3 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/announce/2014-November/000042.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

