Bug 762114 (GLUSTER-382) - Data can be lost before it is read in ib_verbs_receive.
Summary: Data can be lost before it is read in ib_verbs_receive.
Keywords:
Status: CLOSED NOTABUG
Alias: GLUSTER-382
Product: GlusterFS
Classification: Community
Component: ib-verbs
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Raghavendra G
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-11-13 21:32 UTC by Raghavendra G
Modified: 2010-01-29 12:07 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTNR
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Raghavendra G 2009-11-13 21:32:32 UTC
- A race is possible in which:
    1. The thread executing ib_verbs_recv_completion_proc (thr 1) stores the
       buffer pointer and notifies the upper translators of a POLLIN event.
    2. The thread waiting for events on the socket (thr 2) calls
       transport_receive, but has not yet read the data in ib_verbs_receive.
    3. thr 1 receives a work completion event for another work request and
       overwrites the buffer pointer.
    4. thr 2 reads from the new pointer, thereby missing the data stored in
       the buffer that the overwritten pointer referred to.
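
The interleaving above can be replayed deterministically in a single thread. This is only a minimal sketch of the suspected scenario, not the ib-verbs code: struct demo_priv and the demo_* functions are hypothetical stand-ins, and the only thing taken from the report is the idea of one shared buffer pointer that the completion side overwrites unconditionally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the transport's private state; only the single
 * "last received buffer" slot matters for this illustration. */
struct demo_priv {
    const char *data_ptr;   /* buffer published by the completion side */
};

/* Steps 1 and 3: the completion side publishes a buffer, unconditionally
 * replacing whatever pointer was stored before. */
static void demo_completion_store(struct demo_priv *priv, const char *buf)
{
    assert(priv != NULL);
    priv->data_ptr = buf;
}

/* Step 4: the receive side consumes whatever the slot currently holds. */
static const char *demo_receive(struct demo_priv *priv)
{
    assert(priv != NULL);
    const char *buf = priv->data_ptr;
    priv->data_ptr = NULL;
    return buf;
}

/* Replays the reported ordering and returns the buffer the reader ends up
 * with. The first buffer is never seen by the reader. */
static const char *demo_replay_race(void)
{
    struct demo_priv priv = { NULL };

    demo_completion_store(&priv, "first");   /* thr 1: store, then POLLIN  */
    /* step 2: thr 2 is inside the receive path but has not read yet       */
    demo_completion_store(&priv, "second");  /* thr 1: next work completion */
    return demo_receive(&priv);              /* thr 2: reads the new pointer */
}
```

In this replay the reader only ever observes the second buffer, which is the data loss the report describes. Whether the real code can reach this ordering depends on the two steps actually running in different threads.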

Comment 1 Anand Avati 2009-11-16 05:41:35 UTC
PATCH: http://patches.gluster.com/patch/2227 in master (transport/ib-verbs: synchronize ib_verbs_recv_completion_proc with ib_verbs_receive so that the former doesn't overwrite the pointer from which latter reads.)

Comment 2 Anand Avati 2009-11-19 07:52:08 UTC
PATCH: http://patches.gluster.com/patch/2225 in release-2.0 (transport/ib-verbs: synchronize ib_verbs_recv_completion_proc with ib_verbs_receive so that the former doesn't overwrite the pointer from which latter reads.)

Comment 3 Raghavendra G 2010-01-13 20:19:41 UTC
This bug is invalid because all of the events listed below happen in a single thread, i.e., the thread executing ib_verbs_recv_completion_proc. The POLLIN event is delivered through xlator_notify, not through sockets, so by the time xlator_notify returns, transport_receive has already been called and the data has already been read.

The patches that were meant to fix this bug are harmless, since ib_verbs_recv_completion_proc never enters pthread_cond_wait (priv->data_ptr is always NULL at the point where it is checked).

(In reply to comment #0)
> - A race is possible in which:
>     1. The thread executing ib_verbs_recv_completion_proc (thr 1) stores the
>        buffer pointer and notifies the upper translators of a POLLIN event.
>     2. The thread waiting for events on the socket (thr 2) calls
>        transport_receive, but has not yet read the data in ib_verbs_receive.
>     3. thr 1 receives a work completion event for another work request and
>        overwrites the buffer pointer.
>     4. thr 2 reads from the new pointer, thereby missing the data stored in
>        the buffer that the overwritten pointer referred to.
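
The reasoning in this comment can be sketched as follows. This is an illustration under stated assumptions, not the actual transport code: struct sync_priv and the sync_* functions are hypothetical, the notification is modelled as a plain function call (standing in for xlator_notify), and the receive side is assumed to reset the pointer to NULL once it has read the data. The would_wait counter marks the spot where the patched code would call pthread_cond_wait.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the transport's private state. */
struct sync_priv {
    const char *data_ptr;  /* buffer published by the completion side       */
    int would_wait;        /* times the slot was still busy at the check    */
    int delivered;         /* buffers consumed by the receive side          */
};

/* Receive side: consume the buffer and clear the slot (assumed behaviour). */
static void sync_receive(struct sync_priv *priv)
{
    assert(priv != NULL);
    if (priv->data_ptr != NULL) {
        priv->delivered++;
        priv->data_ptr = NULL;
    }
}

/* Stand-in for the synchronous notification: a direct call in the same
 * thread, so the receive has finished by the time this returns. */
static void sync_notify_pollin(struct sync_priv *priv)
{
    sync_receive(priv);
}

/* Completion side: check the slot, publish the buffer, notify. */
static void sync_completion(struct sync_priv *priv, const char *buf)
{
    assert(priv != NULL);
    if (priv->data_ptr != NULL) {
        /* The patched code would block in pthread_cond_wait here. */
        priv->would_wait++;
    }
    priv->data_ptr = buf;
    sync_notify_pollin(priv);
}

/* Processes n completions back to back and returns the final state. */
static struct sync_priv sync_run(int n)
{
    struct sync_priv priv = { NULL, 0, 0 };
    for (int i = 0; i < n; i++) {
        sync_completion(&priv, "payload");
    }
    return priv;
}
```

Because every buffer is consumed before the next completion is handled, the slot is always empty at the check, so the wait branch is never taken and no buffer is skipped.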

