Bug 764889 (GLUSTER-3157) - race in rpc reply submit in glusterd
Summary: race in rpc reply submit in glusterd
Keywords:
Status: CLOSED NOTABUG
Alias: GLUSTER-3157
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: krishnan parthasarathi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-07-12 09:22 UTC by Anand Avati
Modified: 2015-11-03 23:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-07-11 06:44:55 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Anand Avati 2011-07-12 09:22:17 UTC
A reference to the reply message must be held until churning of the reply buffer is complete.

Comment 1 Jeff Darcy 2011-10-18 21:56:18 UTC
Thinking that this might be related to problems I had seen using the SSL/multi-threaded transport code with glusterd, I did some tests to see if I could find a repeatable test case.  I was unsuccessful.  With only the multi-threading part enabled, everything seemed to run fine.  With the SSL part also enabled, I ran into some problems but none attributable to this (mostly they had to do with mismatches between code paths that were trying to use SSL and other code paths that still weren't).

That doesn't mean there's not a bug here, just that it's one I haven't been able to hit.  From my understanding of the socket code, it is possible that a reply will be enqueued for later transmission, and could be freed before it's actually transmitted.  Same with requests BTW.  In either case it would require that the socket be blocked (e.g. window full) which implies a level of activity I wouldn't expect to see.

Also, I looked at how iobrefs are handled in server_submit_reply and glusterd_submit_reply (at KP's suggestion).  AFAICT these functions rightly expect that the transport will take refs if it needs to hang on to the reply for later, and the socket transport (didn't look at RDMA) does so.  Avati, can you elaborate on what sequence of events you're concerned about that would lead to either a premature free or a memory leak?
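
For illustration only, here is a minimal single-threaded sketch of the ownership contract described above: the submitter drops its reference as soon as submit returns, and the transport takes its own reference when it has to queue the reply because the socket is blocked. All names here (reply_buf, transport, transport_submit, transport_flush, submit_reply, live_bufs) are hypothetical and are not the GlusterFS iobuf/iobref or rpc-transport API. The sketch also assumes a one-slot queue and a plain int refcount, which a multi-threaded transport would have to protect with a lock or atomics.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted reply buffer (NOT the GlusterFS iobuf/iobref API). */
struct reply_buf {
    int    refs;      /* number of owners still holding the buffer */
    size_t len;       /* payload length in bytes */
    char   data[64];  /* payload, truncated to fit in this sketch */
};

static int live_bufs = 0;  /* buffers allocated and not yet freed */

static struct reply_buf *reply_buf_new(const char *msg)
{
    struct reply_buf *b = calloc(1, sizeof(*b));
    if (!b)
        return NULL;
    b->refs = 1;  /* the creator holds the first reference */
    b->len = strlen(msg);
    if (b->len >= sizeof(b->data))
        b->len = sizeof(b->data) - 1;
    memcpy(b->data, msg, b->len);
    live_bufs++;
    return b;
}

static struct reply_buf *reply_buf_ref(struct reply_buf *b)
{
    b->refs++;
    return b;
}

static void reply_buf_unref(struct reply_buf *b)
{
    assert(b->refs > 0);
    if (--b->refs == 0) {  /* last owner gone: now it is safe to free */
        free(b);
        live_bufs--;
    }
}

/* Hypothetical transport with a single-slot pending queue. */
struct transport {
    int               writable;    /* 0 models a blocked socket (window full) */
    struct reply_buf *pending;     /* reply queued for later transmission */
    size_t            bytes_sent;  /* stands in for the actual write */
};

/* Send immediately if possible; otherwise queue the reply and take a
 * reference so it outlives the caller's unref. */
static int transport_submit(struct transport *t, struct reply_buf *b)
{
    if (t->writable) {
        t->bytes_sent += b->len;
        return 0;
    }
    if (t->pending)
        return -1;  /* one slot only in this sketch */
    t->pending = reply_buf_ref(b);
    return 0;
}

/* Called when the socket becomes writable again. */
static void transport_flush(struct transport *t)
{
    if (!t->writable || !t->pending)
        return;
    t->bytes_sent += t->pending->len;
    reply_buf_unref(t->pending);  /* drop the transport's reference */
    t->pending = NULL;
}

/* Submitter side: build the reply, hand it over, drop our own reference. */
static int submit_reply(struct transport *t, const char *msg)
{
    struct reply_buf *b = reply_buf_new(msg);
    if (!b)
        return -1;
    int ret = transport_submit(t, b);
    reply_buf_unref(b);  /* safe only because the transport took its own ref */
    return ret;
}
```

If transport_submit queued the bare pointer without calling reply_buf_ref, the unref in submit_reply would free the buffer while it was still pending, which is the premature-free case discussed above. Conversely, a transport that took a reference and never released it after flushing would leak the buffer.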

