Bug 1580361

Summary: rpc: ABRT, SEGV in rpcsvc_handle_disconnect->glusterd_rpcsvc_notify
Product: [Community] GlusterFS
Reporter: Kaleb KEITHLEY <kkeithle>
Component: glusterd
Assignee: bugs <bugs>
Status: CLOSED INSUFFICIENT_DATA
Severity: unspecified
Priority: unspecified
Version: mainline
CC: amukherj, bugs, jeff, kkeithle
Hardware: x86_64
OS: Linux
Last Closed: 2018-12-05 06:55:05 UTC
Type: Bug

Description Kaleb KEITHLEY 2018-05-21 11:07:19 UTC

Comment 1 Kaleb KEITHLEY 2018-05-21 11:09:46 UTC
See https://retrace.fedoraproject.org/faf/reports/2170288/

It includes a rudimentary backtrace.

Comment 2 Jeff Darcy 2018-05-21 12:11:20 UTC
I see there's some RDMA code in the stack trace. Any chance this is RDMA-specific, e.g. some kind of ordering/serialization assumption fulfilled by the socket code but not by the RDMA code? That wouldn't necessarily mean it's a flaw in the RDMA code or should be fixed there, but might provide a useful hint. Or maybe it's pure coincidence.
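To make Comment 2's hypothesis concrete: below is a minimal, hypothetical sketch (this is not actual GlusterFS code; every name in it is invented) of the kind of ordering assumption a disconnect-notify path can trip over when one transport delivers events in an order another transport never produces.

/* Hypothetical sketch only -- not GlusterFS code; all names are invented. */
#include <stdio.h>
#include <stdlib.h>

typedef enum { EVENT_ACCEPT, EVENT_DISCONNECT } event_t;

struct conn_state { int ref; };            /* stand-in for per-connection state */
struct transport  { struct conn_state *priv; };

/* Handler that assumes ACCEPT always precedes DISCONNECT (true for a
 * socket-style transport; assumed here to be violated by another one). */
static void notify_assuming_order(struct transport *t, event_t ev)
{
    switch (ev) {
    case EVENT_ACCEPT:
        t->priv = calloc(1, sizeof(*t->priv));
        if (t->priv)
            t->priv->ref = 1;
        break;
    case EVENT_DISCONNECT:
        t->priv->ref--;                    /* SEGVs if no ACCEPT ever ran */
        free(t->priv);
        t->priv = NULL;
        break;
    }
}

/* Defensive variant: tolerates a DISCONNECT with no prior ACCEPT, and a
 * duplicate DISCONNECT, by checking the state it is about to touch. */
static void notify_defensive(struct transport *t, event_t ev)
{
    switch (ev) {
    case EVENT_ACCEPT:
        if (!t->priv) {
            t->priv = calloc(1, sizeof(*t->priv));
            if (t->priv)
                t->priv->ref = 1;
        }
        break;
    case EVENT_DISCONNECT:
        if (t->priv) {
            free(t->priv);
            t->priv = NULL;
        }
        break;
    }
}

int main(void)
{
    struct transport t = { NULL };

    /* Expected ordering: both handlers survive this. */
    notify_assuming_order(&t, EVENT_ACCEPT);
    notify_assuming_order(&t, EVENT_DISCONNECT);

    /* Unexpected ordering: DISCONNECT with no prior ACCEPT. The defensive
     * handler is a no-op; notify_assuming_order() would dereference NULL
     * here, which is the shape of the reported crash. */
    notify_defensive(&t, EVENT_DISCONNECT);

    printf("ok\n");
    return 0;
}

If something along these lines is what's happening, the usual fix is a guard in the notify handler (as in notify_defensive above) rather than trying to force every transport to serialize events identically.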

Comment 3 Atin Mukherjee 2018-06-18 03:40:29 UTC
Do we have a reproducer for this?

Comment 4 Kaleb KEITHLEY 2018-06-18 11:34:07 UTC
No, _I_ don't have a reproducer.

This is an automated ABRT report coming from boxes running community gluster installed from the CentOS Storage SIG repos.

I'm simply relaying the report that gets forwarded to me as the Gluster packager in Fedora. (CentOS ABRTs get sent to Fedora.)

As you can see at the link posted in Comment 1, though, it has occurred 67 times, so it seems like it should be easy to reproduce.

Comment 5 Atin Mukherjee 2018-10-05 02:38:48 UTC
Kaleb - do you still see this happening? I'm looking for a core file (specifically the complete backtrace) and the test that was run to trigger the crash, so that we can begin the investigation.
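For reference, if a core from one of the affected boxes is still around, a complete backtrace can usually be captured with gdb along these lines (the binary path and core path below are assumptions; adjust to the actual locations):

gdb /usr/sbin/glusterd /path/to/core
(gdb) set pagination off
(gdb) thread apply all bt full

Attaching that output here would give us something to start from.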

Comment 6 Shyamsundar 2018-10-23 14:54:00 UTC
Release 3.12 has been EOL'd and this bug was still found to be in the NEW state; moving the version to mainline so it can be triaged and appropriate action taken.

Comment 7 Atin Mukherjee 2018-12-05 06:55:05 UTC
There is not sufficient data to debug this. Closing it.