Bug 1755344 - glustershd.log getting flooded with "W [inode.c:1017:inode_find] (-->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0xe3f9) [0x7fd09b0543f9] -->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0xe19c) [0x7fd09b05419 TABLE NOT FOUND"
Summary: glustershd.log getting flooded with "W [inode.c:1017:inode_find] (-->/usr/li...
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Xavi Hernandez
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1754790
 
Reported: 2019-09-25 09:47 UTC by Xavi Hernandez
Modified: 2019-09-26 14:00 UTC
CC: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1754790
Environment:
Last Closed: 2019-09-26 14:00:42 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Gluster.org Gerrit 23481 0 None Merged cluster/ec: prevent filling shd log with "table not found" messages 2019-09-26 14:00:41 UTC

Description Xavi Hernandez 2019-09-25 09:47:37 UTC
Description of problem:
------------------------
The shd log file is getting flooded with the following message:

[2019-09-24 05:43:48.883399] W [inode.c:1017:inode_find] (-->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0xe3f9) [0x7f3b378513f9] -->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0xe19c) [0x7f3b3785119c] -->/lib64/libglusterfs.so.0(inode_find+0x92) [0x7f3b4a748112] ) 0-test-disperse-6: table not found


Version-Release number of selected component (if applicable):

How reproducible:

Seen consistently.

Steps to Reproduce:

Access a file while self-heal is repairing it.

Actual results:

The shd log is flooded with the log message above, and was even log-rotated within just 15 hours.

Comment 1 Xavi Hernandez 2019-09-25 09:50:00 UTC
The problem appears when an inodelk contention notification is received by the self-heal daemon. In this case, the function that manages it (ec_upcall() in ec.c) does this:

        case GF_UPCALL_INODELK_CONTENTION:
            lc = upcall->data;
            if (strcmp(lc->domain, ec->xl->name) != 0) {
                /* The lock is not owned by EC, ignore it. */
                return _gf_true;
            }
            inode = inode_find(((xlator_t *)ec->xl->graph->top)->itable,
                               upcall->gfid);

In the case of the self-heal daemon, ec->xl->graph->top corresponds to the debug/io-stats xlator, which doesn't have an inode table. This is otherwise harmless because self-heal doesn't use eager locking, so there is no need to handle inodelk contention notifications: locks are released as soon as possible regardless of whether there is contention or not.

Normal client mounts do have an inode table on the top xlator, so this problem is not observed there.
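The fix therefore amounts to guarding the inode_find() call so it is skipped when the top xlator has no inode table. A minimal, self-contained sketch of that check (the struct definitions here are hypothetical mocks for illustration; the real types live in libglusterfs, and the exact shape of the merged patch may differ):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mocks of the relevant structures, just enough to show the
 * guard; real definitions are in libglusterfs headers. */
typedef struct inode_table { int dummy; } inode_table_t;
typedef struct xlator { inode_table_t *itable; } xlator_t;

/* Return the top xlator's inode table, or NULL when there is none.
 * For the self-heal daemon the top xlator is debug/io-stats, which has
 * no inode table, so the caller can skip inode_find() entirely instead
 * of triggering the "table not found" warning. */
static inode_table_t *
ec_top_itable(xlator_t *top)
{
    if (top == NULL || top->itable == NULL)
        return NULL; /* shd case: nothing to look up, nothing to log */
    return top->itable;
}
```

With this guard, a normal client mount (whose top xlator carries an inode table) still resolves the inode, while the shd path returns early and the log stays quiet.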

I'll send a patch to prevent filling the logs in this case.

Comment 2 Worker Ant 2019-09-25 10:08:24 UTC
REVIEW: https://review.gluster.org/23481 (cluster/ec: prevent filling shd log with "table not found" messages) posted (#1) for review on master by Xavi Hernandez

Comment 3 Worker Ant 2019-09-26 14:00:42 UTC
REVIEW: https://review.gluster.org/23481 (cluster/ec: prevent filling shd log with "table not found" messages) merged (#2) on master by Pranith Kumar Karampuri

