Bug 1580315 - gluster volume status inode getting timed out after 30 minutes with no output/error
Summary: gluster volume status inode getting timed out after 30 minutes with no output...
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Amar Tumballi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1578703
Reported: 2018-05-21 09:13 UTC by Sanoj Unnikrishnan
Modified: 2019-03-22 12:46 UTC (History)

Fixed In Version:
Clone Of: 1578703
Environment:
Last Closed: 2019-03-22 12:46:14 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Links
System              ID     Private  Priority  Status     Summary                                  Last Updated
Gluster.org Gerrit  20045  0        None      Abandoned  Fix inode table dump                     2019-02-06 09:00:52 UTC
Gluster.org Gerrit  22347  0        None      Merged     inode: don't dump the whole table to CLI 2019-03-20 07:16:14 UTC
Gluster.org Gerrit  22389  0        None      Merged     inode: fix unused vars                   2019-03-22 12:46:13 UTC

Description Sanoj Unnikrishnan 2018-05-21 09:13:11 UTC
Description of problem:
=======================

The command "gluster volume status <volname> inode" times out without printing any output or error message.

[root@dhcp43-18 ~]# time gluster v status alpha inode --timeout=86400

real    30m1.513s
user    0m0.311s
sys     0m0.374s
[root@dhcp43-18 ~]#


Snippet of glusterd log:

[2018-05-10 09:07:39.375476] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.70.43.4. Please check log file for details.
[2018-05-10 09:07:39.842184] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.70.43.9. Please check log file for details.
[2018-05-10 09:07:39.842512] E [MSGID: 106152] [glusterd-syncop.c:1641:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2018-05-10 09:07:39.842735] W [glusterd-locks.c:845:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe1379) [0x7f7d16d41379] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe09ca) [0x7f7d16d409ca] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe8935) [0x7f7d16d48935] ) 0-management: Lock for vol alpha not held


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.12.2-8.el7rhgs.x86_64


How reproducible:
=================

2/2


Steps to Reproduce:
===================

1. Create a volume
2. FUSE mount the volume
3. Untar one or two kernel tarballs on the mount
4. Run "time gluster volume status volname inode --timeout=86400"


Actual results:
===============

The command times out after 30 minutes.


Expected results:
=================

The command should succeed and print the expected output. If the command fails, an appropriate error message must be displayed.




--- Additional comment from Sanoj Unnikrishnan on 2018-05-16 04:55:34 EDT ---


The issue was seen while untarring a kernel tarball.
Untarring a kernel tarball creates approximately 60k inodes, which fills the lru list.

In the function inode_table_dump_to_dict:
Dumping the 16384 inodes on the lru list while holding the itable lock can take a few minutes; we should not hold the lock for that long.

Workaround:
Lower inode-lru-limit before running "gluster v status <volname> inode", and reset it after the command completes.
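
The workaround could be scripted roughly as below. This is only a sketch under assumptions: the option is taken to be the volume option network.inode-lru-limit (the comment above only says "inode-lru-limit"), the temporary limit is an arbitrary illustrative value, and the helper names workaround_commands and run_workaround are made up for this example.

```python
import subprocess

# Assumption: the option meant above is the volume option
# "network.inode-lru-limit"; the report only says "inode-lru-limit".
LRU_OPTION = "network.inode-lru-limit"


def workaround_commands(volume, temporary_limit):
    """Return the three gluster invocations, in order, as argv lists."""
    return [
        ["gluster", "volume", "set", volume, LRU_OPTION, str(temporary_limit)],
        ["gluster", "volume", "status", volume, "inode"],
        # "volume reset" puts the option back to its default. If the volume
        # had a custom limit, set that value again instead of resetting.
        ["gluster", "volume", "reset", volume, LRU_OPTION],
    ]


def run_workaround(volume, temporary_limit, runner=subprocess.run):
    """Lower the limit, run the status command, then always restore the limit."""
    lower, status, restore = workaround_commands(volume, temporary_limit)
    runner(lower, check=True)
    try:
        return runner(status, check=True)
    finally:
        # Runs even if the status command fails or times out.
        runner(restore, check=True)
```

For example, run_workaround("alpha", 1024) would lower the limit on volume alpha, run the status command, and reset the option afterwards. The runner parameter exists only so the sequence can be exercised without a live cluster.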


Possible solution:

Dumping the lru and purge inodes does not seem very useful from a debuggability perspective, so we can skip them.

The active-list inodes can be listed by iterating the active list while holding an inode ref on each entry, without holding the itable lock during the dump.
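
To illustrate the idea, here is a toy model in Python, not the GlusterFS C code. The classes, field names, and dict keys are hypothetical; the sketch only shows the locking pattern: take the table lock briefly to pin (ref) each active inode, build the dump with the lock released, then drop the pins. The lru list is never walked.

```python
import threading


class Inode:
    def __init__(self, gfid):
        self.gfid = gfid
        self.ref = 1  # an inode on the active list holds at least one ref


class InodeTable:
    def __init__(self):
        self.lock = threading.Lock()  # stands in for the itable lock
        self.active = []              # inodes currently referenced
        self.lru = []                 # cached, unreferenced inodes (not dumped)

    def unref(self, inode):
        """Drop one reference; move the inode to the lru list at zero."""
        with self.lock:
            inode.ref -= 1
            if inode.ref == 0 and inode in self.active:
                self.active.remove(inode)
                self.lru.append(inode)


def dump_active_inodes(table):
    # Short critical section: remember each active inode and its current
    # ref count, then pin it so it cannot go away while we dump it.
    with table.lock:
        pinned = []
        for inode in table.active:
            pinned.append((inode, inode.ref))
            inode.ref += 1

    # Slow part (building the output) runs without the table lock.
    try:
        dump = {"active_size": len(pinned)}
        for i, (inode, ref) in enumerate(pinned):
            dump[f"active{i}.gfid"] = inode.gfid
            dump[f"active{i}.ref"] = ref
        return dump
    finally:
        # Release the pins taken above.
        for inode, _ in pinned:
            table.unref(inode)
```

With this shape, the time spent under the lock grows only with the cost of taking a ref per active inode, not with the cost of serializing every cached inode.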

Comment 1 Worker Ant 2018-05-21 09:38:16 UTC
REVIEW: https://review.gluster.org/20045 (Fix inode table dump WIP TODO: Fix xml output) posted (#1) for review on master by sanoj-unnikrishnan

Comment 2 Worker Ant 2019-03-13 04:18:16 UTC
REVIEW: https://review.gluster.org/22347 (inode: don't dump the whole table to CLI) posted (#1) for review on master by Amar Tumballi

Comment 3 Worker Ant 2019-03-20 07:16:15 UTC
REVIEW: https://review.gluster.org/22347 (inode: don't dump the whole table to CLI) merged (#3) on master by Atin Mukherjee

Comment 4 Worker Ant 2019-03-21 04:30:09 UTC
REVIEW: https://review.gluster.org/22389 (inode: fix unused vars) posted (#1) for review on master by Atin Mukherjee

Comment 5 Worker Ant 2019-03-22 12:46:14 UTC
REVIEW: https://review.gluster.org/22389 (inode: fix unused vars) merged (#2) on master by Atin Mukherjee

