Bug 1679892 - assertion failure log in glusterd.log file when a volume start is triggered
Summary: assertion failure log in glusterd.log file when a volume start is triggered
Keywords:
Status: CLOSED DUPLICATE of bug 1700865
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Sanju
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: glusterfs-6.0 1732875
 
Reported: 2019-02-22 07:45 UTC by Atin Mukherjee
Modified: 2019-07-24 15:04 UTC
CC: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-10 11:26:04 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Atin Mukherjee 2019-02-22 07:45:41 UTC
Description of problem:

[2019-02-22 07:38:28.772914] E [MSGID: 101191] [event-epoll.c:765:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2019-02-22 07:38:32.322872] I [glusterd-utils.c:6305:glusterd_brick_start] 0-management: starting a fresh brick process for brick /tmp/b1
[2019-02-22 07:38:32.420144] I [MSGID: 106142] [glusterd-pmap.c:290:pmap_registry_bind] 0-pmap: adding brick /tmp/b1 on port 49152
[2019-02-22 07:38:32.420635] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2019-02-22 07:38:32.491504] E [mem-pool.c:351:__gf_free] (-->/usr/local/lib/glusterfs/6.0alpha/xlator/mgmt/glusterd.so(+0x4842e) [0x7fc95a8f742e] -->/usr/local/lib/glusterfs/6.0alpha/xlator/mgmt/glusterd.so(+0x4821a) [0x7fc95a8f721a] -->/usr/local/lib/libglusterfs.so.0(__gf_free+0x22d) [0x7fc96042ccfd] ) 0-: Assertion failed: mem_acct->rec[header->type].size >= header->size
[2019-02-22 07:38:32.492228] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2019-02-22 07:38:32.493431] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-gfproxyd: setting frame-timeout to 600
[2019-02-22 07:38:32.494848] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2019-02-22 07:38:32.495530] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: nfs already stopped
[2019-02-22 07:38:32.495655] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: nfs service is stopped
[2019-02-22 07:38:32.495728] I [MSGID: 106599] [glusterd-nfs-svc.c:81:glusterd_nfssvc_manager] 0-management: nfs/server.so xlator is not installed

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. On a 3-node cluster, create a replica 3 volume and start it (a sketch of the commands follows).
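For reference, a minimal reproduction sketch. The volume name and host names are assumptions; the log above shows /tmp/b1 as the brick path.

# Run from any node of the 3-node cluster. 'force' is needed because the
# bricks live under /tmp on the root partition.
gluster volume create testvol replica 3 \
    node1:/tmp/b1 node2:/tmp/b1 node3:/tmp/b1 force
gluster volume start testvol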

Actual results:
The assertion failure and 'Failed to dispatch handler' errors shown above are seen in glusterd.log.

Expected results:

No errors should be seen in the glusterd log. The assertion failure also suggests possible memory corruption, which makes this issue more severe.

Additional info:
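For context, the failing assertion comes from the per-xlator memory accounting that mem-pool.c maintains for GF_MALLOC/GF_FREE. Below is a minimal sketch of the check, not the actual GlusterFS source: struct and function names other than mem_acct, rec, header->type and header->size are assumptions, and the real code takes locks and keeps additional bookkeeping.

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mem_acct_rec {
    size_t   size;        /* bytes currently accounted for this type */
    uint64_t num_allocs;  /* live allocations of this type */
};

struct mem_acct {
    uint32_t num_types;
    struct mem_acct_rec *rec;  /* one record per allocation type */
};

struct mem_header {
    uint32_t type;             /* allocation type recorded at alloc time */
    size_t   size;             /* size recorded at alloc time */
    struct mem_acct *mem_acct; /* owning xlator's accounting table */
};

/* On free, the size recorded in the header is subtracted from the per-type
 * counter.  The assertion fires when that counter is already smaller than
 * the size being released, i.e. the accounting for this type went out of
 * sync (double free, header corruption, or an allocation charged against a
 * different type). */
void mem_acct_release(struct mem_header *header)
{
    struct mem_acct *mem_acct = header->mem_acct;

    assert(mem_acct->rec[header->type].size >= header->size);

    mem_acct->rec[header->type].size -= header->size;
    mem_acct->rec[header->type].num_allocs--;
}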

Comment 1 Atin Mukherjee 2019-03-12 05:02:52 UTC
I no longer see this happening in the latest testing of the release-6 branch. I will keep this bug open for some time, but I am removing it from the 6.0 blocker list.

Comment 2 Sanju 2019-03-12 05:24:17 UTC
I still see the assertion failure message in glusterd.log:

[2019-03-12 05:19:06.206695] E [mem-pool.c:351:__gf_free] (-->/usr/local/lib/glusterfs/6.0rc0/xlator/mgmt/glusterd.so(+0x48133) [0x7f264602c133] -->/usr/local/lib/glusterfs/6.0rc0/xlator/mgmt/glusterd.so(+0x47f0a) [0x7f264602bf0a] -->/usr/local/lib/libglusterfs.so.0(__gf_free+0x22d) [0x7f265263ac9d] ) 0-: Assertion failed: mem_acct->rec[header->type].size >= header->size

I will update the bug with the root cause as soon as possible.

Thanks,
Sanju

Comment 3 Atin Mukherjee 2019-06-09 05:39:11 UTC
Ping! Any progress on this? Is it still seen with the latest master?

Comment 4 Sanju 2019-06-10 11:26:04 UTC
https://review.gluster.org/#/c/glusterfs/+/22600/ has removed this assert condition, so we no longer see this assertion failure in the log.

Susant is working on this issue and https://bugzilla.redhat.com/show_bug.cgi?id=1700865 is tracking it. So, I'm closing this bug.

Thanks,
Sanju

*** This bug has been marked as a duplicate of bug 1700865 ***

