Bug 1716875 - Inode Unref Assertion failed: inode->ref
Summary: Inode Unref Assertion failed: inode->ref
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: GlusterFS
Classification: Community
Component: gluster-smb
Version: 4.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Assignee: Anoop C S
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-06-04 10:37 UTC by ryan
Modified: 2019-07-19 06:33 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-19 06:33:36 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)
Client log from Gluster VFS client showing high RAM usage (30.51 KB, text/plain)
2019-06-04 10:37 UTC, ryan

Description ryan 2019-06-04 10:37:46 UTC
Created attachment 1577014 [details]
Client log from Gluster VFS client showing high RAM usage

Description of problem:
Samba is using huge amounts of memory (5 GB) per client thread.
Upon checking the Gluster client logs, they are filled with messages such as:
[2019-04-24 07:44:33.607834] E [inode.c:484:__inode_unref] (-->/lib64/libglusterfs.so.0(gf_dirent_entry_free+0x2b) [0x7ff0a24d555b] -->/lib64/libglusterfs.so.0(inode_unref+0x21) [0x7ff0a24b9921] -->/lib64/libglusterfs.so.0(+0x35156) [0x7ff0a24b9156] ) 0-: Assertion failed: inode->ref
[2019-04-30 13:16:47.169047] E [timer.c:37:gf_timer_call_after] (-->/lib64/libglusterfs.so.0(+0x33bec) [0x7ff09d875bec] -->/lib64/libgfrpc.so.0(+0xde88) [0x7ff09dd7ae88] -->/lib64/libglusterfs.so.0(gf_timer_call_after+0x229) [0x7ff09d875fa9] ) 0-timer: Either ctx is NULL or ctx cleanup started [Invalid argument]
[2019-05-28 17:47:28.655550] E [MSGID: 140003] [nl-cache.c:777:nlc_init] 0-mcv01-nl-cache: Initing the global timer wheel failed
[2019-05-28 17:47:28.655873] E [MSGID: 101019] [xlator.c:720:xlator_init] 0-mcv01-nl-cache: Initialization of volume 'mcv01-nl-cache' failed, review your volfile again
[2019-05-28 17:47:28.655887] E [MSGID: 101066] [graph.c:367:glusterfs_graph_init] 0-mcv01-nl-cache: initializing translator failed
[2019-05-28 17:47:28.655894] E [MSGID: 101176] [graph.c:738:glusterfs_graph_activate] 0-graph: init failed
[2019-05-28 17:47:28.655972] E [MSGID: 104007] [glfs-mgmt.c:744:glfs_mgmt_getspec_cbk] 0-glfs-mgmt: failed to fetch volume file (key:mcv01) [Invalid argument]
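
For context, the first log line reports __inode_unref hitting "Assertion failed: inode->ref", i.e. an unref on an inode whose reference count has already reached zero. Below is a minimal, self-contained sketch of that refcounting invariant (illustrative only, not GlusterFS source; the my_inode_* names are made up), showing how a double unref trips the assertion while the opposite bug, a missing unref, keeps objects alive and would show up as steadily growing client memory:

/* Sketch of the refcount invariant behind "Assertion failed: inode->ref":
 * unref must never run on an object whose refcount is already zero. */
#include <assert.h>
#include <stdlib.h>

struct my_inode {          /* illustrative stand-in for inode_t */
    int ref;               /* number of outstanding references */
};

static struct my_inode *my_inode_new(void)
{
    struct my_inode *in = calloc(1, sizeof(*in));
    in->ref = 1;           /* creator holds the first reference */
    return in;
}

static void my_inode_ref(struct my_inode *in)
{
    in->ref++;
}

static void my_inode_unref(struct my_inode *in)
{
    assert(in->ref > 0);   /* mirrors the inode->ref check the log reports */
    if (--in->ref == 0)
        free(in);          /* last reference dropped: destroy */
}

int main(void)
{
    struct my_inode *in = my_inode_new();
    my_inode_ref(in);      /* e.g. a directory-entry scan takes a ref */
    my_inode_unref(in);    /* ...and drops it when done */
    my_inode_unref(in);    /* creator drops the last ref: freed here */
    /* A further my_inode_unref(in) would be the double-unref case
     * that the assertion in the log is guarding against. */
    return 0;
}
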

Version-Release number of selected component (if applicable):
Gluster 4.1.7


How reproducible:
Unknown

Steps to Reproduce:
Unsure how to reproduce; this has only been seen in one environment so far.

Actual results:
All system memory and swap are exhausted. SMBD processes do not get killed off when the main SMB service is stopped, whereas they usually do.

Expected results:
System resources are freed up and errors are not present in logs.

Comment 1 Anoop C S 2019-07-04 06:48:46 UTC
Is this still the case with updated packages?

Comment 2 ryan 2019-07-04 10:54:24 UTC
Hi Anoop,

I'll leave some IO going for a day or two and will get back to you with the results.
This will be with the latest package available in the 4.1 branch.

Best,
Ryan

Comment 3 Anoop C S 2019-07-16 06:37:13 UTC
(In reply to ryan from comment #2)
> Hi Anoop,
> 
> I'll leave some IO going for a day or two and will get back to you with the
> results.
> This will be with the latest package available in the 4.1 branch.

Anything new to report here based on your latest tests?

Comment 4 ryan 2019-07-16 08:34:07 UTC
Hi Anoop,

I've left IOMeter from 10 clients performing workloads on the system for around 2 days and have not been able to recreate the issue.

Best,
Ryan

Comment 5 Anoop C S 2019-07-19 06:33:36 UTC
(In reply to ryan from comment #4)
> Hi Anoop,
> 
> I've left IOMeter from 10 clients performing workloads on the system for
> around 2 days and have not been able to recreate the issue.

In that case I would prefer to close the bug report for now. Please feel free to reopen if the issue re-occurs.

