Bug 1578703

Summary: gluster volume status inode getting timed out after 30 minutes with no output/error
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Vinayak Papnoi <vpapnoi>
Component: core
Assignee: Sheetal Pamecha <spamecha>
Status: CLOSED ERRATA
QA Contact: Vinayak Papnoi <vpapnoi>
Severity: high
Docs Contact:
Priority: low
Version: rhgs-3.4
CC: amukherj, nchilaka, rgowdapp, rhs-bugs, sheggodu, spamecha, storage-qa-internal, vpapnoi
Target Milestone: ---
Keywords: EasyFix, ZStream
Target Release: RHGS 3.5.0
Flags: spamecha: needinfo-
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-6.0-1
Doc Type: Bug Fix
Doc Text:
Previously, running "gluster volume status <volname> inode" output the entire inode table, which could time out and create performance issues. The output of this command is now more streamlined, and the original information can now be obtained by performing a statedump.
Story Points: ---
Clone Of:
Clones: 1580315 (view as bug list)
Environment:
Last Closed: 2019-10-30 12:19:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1580315    
Bug Blocks: 1503143, 1696807    

Description Vinayak Papnoi 2018-05-16 08:32:48 UTC
Description of problem:
=======================

The command "gluster volume status inode" is getting timed out but without any error message / output

[root@dhcp43-18 ~]# time gluster v status alpha inode --timeout=86400

real    30m1.513s
user    0m0.311s
sys     0m0.374s
[root@dhcp43-18 ~]#


Snippet of glusterd log:

[2018-05-10 09:07:39.375476] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.70.43.4. Please check log file for details.
[2018-05-10 09:07:39.842184] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.70.43.9. Please check log file for details.
[2018-05-10 09:07:39.842512] E [MSGID: 106152] [glusterd-syncop.c:1641:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2018-05-10 09:07:39.842735] W [glusterd-locks.c:845:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe1379) [0x7f7d16d41379] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe09ca) [0x7f7d16d409ca] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe8935) [0x7f7d16d48935] ) 0-management: Lock for vol alpha not held


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.12.2-8.el7rhgs.x86_64


How reproducible:
=================

2/2


Steps to Reproduce:
===================

1. Create a volume
2. FUSE mount the volume
3. Untar one or two kernel tarballs on the mount
4. Run "time gluster volume status <volname> inode --timeout=86400" (see the shell sketch below)
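
A minimal shell sketch of these steps, assuming a three-brick replica volume named "alpha" (the volume name from the report); the server names, brick paths, mount point, and tarball location are illustrative:

# 1. Create and start a volume (hypothetical servers and brick paths)
gluster volume create alpha replica 3 server{1..3}:/bricks/alpha/brick
gluster volume start alpha

# 2. FUSE mount the volume
mount -t glusterfs server1:/alpha /mnt/alpha

# 3. Untar a kernel tarball on the mount (creates roughly 60k inodes)
cd /mnt/alpha && tar xf /tmp/linux-4.16.tar.xz

# 4. Time the status command
time gluster volume status alpha inode --timeout=86400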


Actual results:
===============

The command times out after 30 minutes without producing any output.


Expected results:
=================

The command should succeed with the proper output. If the command fails, an appropriate error message must be displayed.


Additional info:

Comment 2 Sanoj Unnikrishnan 2018-05-16 08:55:34 UTC
The issue was seen while untarring a kernel tarball.
Untarring a kernel tarball creates approximately 60k inodes, so the lru list fills up to its limit.

In the function inode_table_dump_to_dict:
Dumping the 16384 inodes sitting on the lru list while holding the itable lock can take a few minutes; we should not hold the lock for that long.

Workaround:
Change inode-lru-limit to a lower value before running "gluster v status <volname> inode" and reset it after the command is run (see the sketch below).
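
A sketch of that workaround, assuming the limit is exposed as the volume option network.inode-lru-limit; the value 1024 is just an example:

# Lower the server-side inode lru limit before querying
# (option name network.inode-lru-limit is assumed here)
gluster volume set alpha network.inode-lru-limit 1024

gluster volume status alpha inode

# Restore the default afterwards
gluster volume reset alpha network.inode-lru-limit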


Possible solution:

Dumping the lru and purge inodes does not seem very useful from a debuggability perspective, so we can avoid it.

The inodes on the active list can be listed by iterating the active list while holding a ref on each inode, without holding the itable lock.
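
If the full inode table (including the lru and purge lists) is still needed, the Doc Text above points at a statedump instead of the CLI status command. A minimal sketch, assuming statedumps land in the default /var/run/gluster directory:

# Ask the brick processes of the volume to dump only their inode tables
gluster volume statedump alpha inode

# Inspect the resulting dump files on each brick node
ls -l /var/run/gluster/*.dump.*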

Comment 3 Sanoj Unnikrishnan 2018-05-17 07:26:03 UTC
Amar / Raghavendra,

Do we need to print the lru and purge lists in "gluster v status <volname> inode"?
Has it ever been useful? Can we do away with it?

The issue is that we cannot simply take a ref and iterate these lists, since taking a ref would activate the inodes.


For iterating the active list, we can do something like the following:

/* Take a ref on every inode on the active list under the itable lock;
 * *start_inode and *end_inode will point to the first and last inodes
 * of the list. */

void
inode_active_list_ref (inode_table_t *itable, inode_t **start_inode,
                       inode_t **end_inode)
{
        inode_t *inode = NULL;

        pthread_mutex_lock (&itable->lock);
        {
                list_for_each_entry (inode, &itable->active, list) {
                        __inode_ref (inode);
                }
                *start_inode = list_first_entry (&itable->active, inode_t, list);
                *end_inode = list_last_entry (&itable->active, inode_t, list);
        }
        pthread_mutex_unlock (&itable->lock);
}


/* Caller: the refs keep the inodes alive, so the costly dumping can be
 * done outside the itable lock. */

inode_active_list_ref (itable, &start_inode, &end_inode);
tmp_inode = start_inode;
while (1) {
        /* This is the costly operation, hence do it outside the itable lock. */
        inode_dump_to_dict (tmp_inode);
        if (tmp_inode == end_inode)
                break;
        tmp_inode = list_entry (tmp_inode->list.next, inode_t, list);
}
inode_active_list_unref (itable, &start_inode, &end_inode);

Comment 6 Sanoj Unnikrishnan 2018-05-17 09:15:59 UTC
Resetting the needinfo that I accidentally cancelled.

Comment 7 Amar Tumballi 2018-05-17 09:41:08 UTC
I am OK with not having the details of the purge list; only the active list is good enough. But for the purge list and the lru list, let's print the size.

Comment 14 Amar Tumballi 2019-03-13 04:46:23 UTC
https://review.gluster.org/22347/

Comment 22 errata-xmlrpc 2019-10-30 12:19:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3249