Bug 762114 (GLUSTER-382)

Summary: Data can be lost before it is read in ib_verbs_receive.
Product: [Community] GlusterFS
Component: ib-verbs
Reporter: Raghavendra G <raghavendra>
Assignee: Raghavendra G <raghavendra>
Status: CLOSED NOTABUG
Severity: low
Priority: low
Version: mainline
CC: anush, gluster-bugs
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix

Description Raghavendra G 2009-11-13 21:32:32 UTC
- The following sequence can occur:
    1. The thread executing ib_verbs_recv_completion_proc (thr 1) stores the
       buffer pointer and notifies the upper translators of a POLLIN event.
    2. The thread waiting for events on the socket (thr 2) calls
       transport_receive, but has not yet read the data in ib_verbs_receive.
    3. thr 1 receives a work completion event for another work request and
       overwrites the buffer pointer.
    4. thr 2 reads from the new pointer, thereby missing the data in the
       buffer that the overwritten pointer referred to.
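
The interleaving reported above can be replayed on a single thread to show what would be lost. This is only a minimal sketch, not GlusterFS code: struct slot, completion_store, receive_read and first_buffer_is_lost are hypothetical names, and the single pointer stands in for the buffer pointer mentioned in the report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the transport's private state: one pointer slot. */
struct slot {
    const char *data_ptr;   /* buffer most recently stored by the completion side */
};

/* Steps 1 and 3: the completion side stores a buffer pointer unconditionally. */
static void completion_store(struct slot *s, const char *buf)
{
    assert(s != NULL);
    s->data_ptr = buf;
}

/* Step 4: the receive side takes whatever the slot points at and clears it. */
static const char *receive_read(struct slot *s)
{
    const char *buf;

    assert(s != NULL);
    buf = s->data_ptr;
    s->data_ptr = NULL;
    return buf;
}

/* Replays steps 1-4 in order; returns 1 if the first buffer is never read. */
int first_buffer_is_lost(void)
{
    static const char first[] = "first";
    static const char second[] = "second";
    struct slot s = { NULL };

    completion_store(&s, first);    /* step 1: store, then notify POLLIN   */
                                    /* step 2: reader woken, has not read  */
    completion_store(&s, second);   /* step 3: next completion overwrites  */
    return receive_read(&s) != first;   /* step 4: reader sees "second"    */
}
```

In this model the reader only ever sees the second buffer, which is the data loss the report describes. It holds only if the store in step 3 can run before the read in step 4.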

Comment 1 Anand Avati 2009-11-16 05:41:35 UTC
PATCH: http://patches.gluster.com/patch/2227 in master (transport/ib-verbs: synchronize ib_verbs_recv_completion_proc with ib_verbs_receive so that the former doesn't overwrite the pointer from which latter reads.)

Comment 2 Anand Avati 2009-11-19 07:52:08 UTC
PATCH: http://patches.gluster.com/patch/2225 in release-2.0 (transport/ib-verbs: synchronize ib_verbs_recv_completion_proc with ib_verbs_receive so that the former doesn't overwrite the pointer from which latter reads.)

Comment 3 Raghavendra G 2010-01-13 20:19:41 UTC
This bug is invalid because all of the events below happen in a single thread, i.e., the thread executing ib_verbs_recv_completion_proc. The POLLIN event is delivered through xlator_notify, not through sockets, so by the time xlator_notify returns, transport_receive has already been called and the data has been read.

The patches that were meant to fix this bug are harmless: ib_verbs_recv_completion_proc never enters pthread_cond_wait, because priv->data_ptr is always NULL at the point where it is checked.
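
The reasoning above can be sketched as a small single-threaded model. It is an illustration under assumptions, not the actual transport code: struct transport, receive_data, notify_pollin, on_completion and count_waits_after_two_completions are hypothetical names, and the guard added by the patches is modelled as a counter rather than a real pthread_cond_wait call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the transport's private state. */
struct transport {
    const char *data_ptr;   /* single-slot buffer pointer                  */
    const char *last_read;  /* what the receive path consumed most recently */
    int would_wait;         /* times the guard would have blocked           */
};

/* Receive path: consume the buffer and clear the slot. */
static void receive_data(struct transport *t)
{
    t->last_read = t->data_ptr;
    t->data_ptr = NULL;
}

/* Synchronous notify: the receive path runs inline in the caller's thread. */
static void notify_pollin(struct transport *t)
{
    receive_data(t);
}

/* Completion path with an assumed guard: wait while the slot is still full. */
static void on_completion(struct transport *t, const char *buf)
{
    assert(t != NULL);
    if (t->data_ptr != NULL)
        t->would_wait++;    /* where a real guard would block */
    t->data_ptr = buf;
    notify_pollin(t);       /* returns only after the data is consumed */
}

/* Delivers two completions back to back and reports how often the guard hit. */
int count_waits_after_two_completions(void)
{
    struct transport t = { NULL, NULL, 0 };

    on_completion(&t, "first");
    on_completion(&t, "second");
    return t.would_wait;
}
```

Because the notification returns only after the slot has been cleared, the guard condition is false on every completion in this model, which matches the claim that the wait is never entered.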

(In reply to comment #0)
> - The following sequence can occur:
>     1. The thread executing ib_verbs_recv_completion_proc (thr 1) stores the
>        buffer pointer and notifies the upper translators of a POLLIN event.
>     2. The thread waiting for events on the socket (thr 2) calls
>        transport_receive, but has not yet read the data in ib_verbs_receive.
>     3. thr 1 receives a work completion event for another work request and
>        overwrites the buffer pointer.
>     4. thr 2 reads from the new pointer, thereby missing the data in the
>        buffer that the overwritten pointer referred to.