| Summary: | race-condition in accessing fdctx in protocol/client | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Raghavendra G <raghavendra> |
| Component: | protocol | Assignee: | Raghavendra G <raghavendra> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 2.0.3 | CC: | gluster-bugs, shehjart, vinayak |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | --- |
| Regression: | RTNR | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Shehjar Tikoo
2009-07-10 04:57:12 UTC
In protocol/client, fdctx is accessed by two sets of procedures: protocol_client_mark_fd_bad falls in one set, whereas the other set consists of all fops which receive an fd as an argument. The way these fdctxs are obtained differs between the two sets. In the former, fdctx is accessed through conf->saved_fds, which is a list of the fdctxs of fds representing opened/created files. In the latter, fdctx is obtained directly from the fd through fd_ctx_get(). There can therefore be race conditions between two threads, each executing a procedure from a different set. As an example, consider the following scenario:
A flush operation times out, so the polling thread executes protocol_client_mark_fd_bad while the fuse thread executes client_release. This can happen because, immediately after the reply for the flush is written to fuse, a release on the same fd can be sent to glusterfs while the polling thread is still doing cleanup. Consider the following sequence of events:
1. fuse thread does fd_ctx_get (fd).
2. polling thread gets the same fdctx but through conf->saved_fds.
3. Both threads go ahead, do list_del (fdctx), and eventually free fdctx.
In other situations the same sequence of events might occur, and threads executing fops other than flush in the second set might access an fdctx that was freed in protocol_client_mark_fd_bad.
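To make the window concrete, here is a minimal standalone C model of the same pattern. This is illustrative code only, not the protocol/client sources; the struct, globals, and thread names are invented for the example.

/* Illustrative model of the race: one thread looks a context up and uses it,
 * another unlinks and frees it, with nothing ordering the two. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct fdctx_model {
        int remote_fd;                      /* stands in for the remote fd number */
};

static struct fdctx_model *shared_ctx;      /* reachable from both access paths */

/* Models the fuse thread: look up the ctx, then use it (like fd_ctx_get (fd)). */
static void *fop_thread (void *arg)
{
        (void) arg;
        struct fdctx_model *ctx = shared_ctx;

        /* Window: the cleanup thread may free ctx between the load above
         * and the use below. */
        if (ctx)
                printf ("remote fd = %d\n", ctx->remote_fd);   /* possible use-after-free */
        return NULL;
}

/* Models the polling thread in protocol_client_mark_fd_bad: unlink and free
 * (like fd_ctx_del + list_del + FREE (fdctx)). */
static void *cleanup_thread (void *arg)
{
        (void) arg;
        struct fdctx_model *ctx = shared_ctx;

        shared_ctx = NULL;
        free (ctx);
        return NULL;
}

int main (void)
{
        pthread_t t1, t2;

        shared_ctx = calloc (1, sizeof (*shared_ctx));
        shared_ctx->remote_fd = 42;

        /* Nothing synchronizes the two threads, so the fop can dereference
         * memory that the cleanup thread has already freed. */
        pthread_create (&t1, NULL, fop_thread, NULL);
        pthread_create (&t2, NULL, cleanup_thread, NULL);
        pthread_join (t1, NULL);
        pthread_join (t2, NULL);
        return 0;
}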
As a fix, I propose that all procedures in both sets use conf->mutex to synchronize access to fdctx. More specifically:
procedures in set 2:
====================
LOCK (conf->mutex);
{
        fdctx = this_fd_ctx_fd (fd);
        get remote fd no;
        do other things requiring fdctx
}
UNLOCK (conf->mutex);
procedures in set 1 (protocol_client_mark_fd_bad):
==================================================
LOCK (conf->mutex);
{
        list_for_each_entry in conf->saved_fds {
                fd_ctx_del (fdctx->fd);
                list_del (fdctx);
                FREE (fdctx);
        }
}
UNLOCK (conf->mutex);
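As a rough illustration of the set-2 side of this proposal, a fop could fetch everything it needs from fdctx inside one conf->mutex critical section and only then drop the lock. The helper below is a sketch only: the type and helper names (client_conf_t, client_fd_ctx_t, this_fd_ctx_fd, remote_fd) follow the pseudocode above or are assumed, not taken from an actual patch.

/* Sketch only: types and helpers follow the pseudocode above and are not
 * the actual protocol/client definitions. */
static int64_t
client_get_remote_fd_locked (client_conf_t *conf, fd_t *fd)
{
        client_fd_ctx_t *fdctx     = NULL;
        int64_t          remote_fd = -1;

        pthread_mutex_lock (&conf->mutex);
        {
                fdctx = this_fd_ctx_fd (fd);          /* look up the ctx for this fd */
                if (fdctx)
                        remote_fd = fdctx->remote_fd; /* copy out while still locked */
        }
        pthread_mutex_unlock (&conf->mutex);

        return remote_fd;
}

Because protocol_client_mark_fd_bad tears the fdctx down under the same mutex, a fop either sees a valid fdctx or none at all, never a half-freed one.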
(In reply to comment #0)

> 3. Both threads go ahead, do list_del (fdctx), and eventually free fdctx.

The solution lies in forcing both accesses to happen through the same data path, i.e. we need to force both accesses to happen through the fd_t type, which is the correct way to access the fdctx. One way to do this is to store a list of fd_ts in saved_fds, instead of storing fdctxs directly.

> procedures in set 2:
> LOCK (conf->mutex); { ... } UNLOCK (conf->mutex);

Once both sets of accesses are synchronized through fd_t->lock, there will be no need for such a big critical section.

> procedures in set 1 (protocol_client_mark_fd_bad):
> LOCK (conf->mutex); { list_for_each_entry in conf->saved_fds { ... } } UNLOCK (conf->mutex);

The fact that the lock is not being taken for this loop is a bug. This needs to be fixed anyway.
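A sketch of the shape of that change follows. The structures are illustrative only (the real client conf and fd-ctx definitions differ, and struct list_head / fd_t come from the libglusterfs headers); the point is simply that saved_fds would hold fd_t pointers so that both paths reach the ctx via fd_ctx_get() on the fd.

/* Illustrative structures only, not the real protocol/client definitions. */
struct saved_fd_entry {
        struct list_head  list;        /* position in conf->saved_fds           */
        fd_t             *fd;          /* the fd itself, with a reference held  */
};

struct client_conf_sketch {
        pthread_mutex_t   mutex;
        struct list_head  saved_fds;   /* list of saved_fd_entry, not of fdctx  */
};

/* protocol_client_mark_fd_bad would then walk saved_fds and, for each entry,
 * look the ctx up through the fd exactly as a fop does, so both paths share
 * one lookup and teardown protocol on the fd. */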
(In reply to comment #2)

> The solution lies in forcing both accesses to happen through the same data path, i.e. through the fd_t type. One way to do this is to store a list of fd_ts in saved_fds, instead of storing fdctxs directly.

Yes, accessing through a common data path is neater than what is being done now, but that only lets us use fd->lock instead of conf->mutex. Also, since conf->mutex is used for more than synchronizing access to the saved fdctxs, it cannot be removed even if we use fd->lock, so we do not gain much here.

The problem is really about when fdctx should be freed. The race is between two threads executing the following critical sections:
1. get fdctx and use the information stored in it.
2. get fdctx, delete it from the fd and free it.

In other words, even after one thread has obtained fdctx, it may be freed by the other thread, and the first thread can then access fdctx after it has been freed.

The solution proposed earlier fixes this by locking the whole of critical sections 1 and 2, making each of them atomic. If critical section 1 is felt to be too big, we can introduce references to fdctx, so that the critical sections become:
1. get fdctx, increment fdctx->refcount.
2. get fdctx, delete it from fd->ctx and decrement fdctx->refcount.

fdctx then gets freed only when fdctx->refcount drops to 0. Comments?
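A sketch of how that refcount scheme could look is below. The field names, list membership, and who holds the initial reference are assumptions made for illustration, not the submitted patch.

/* Sketch of the refcount idea; names are illustrative. The list in
 * conf->saved_fds is assumed to hold one reference on each fdctx. */

/* Path 1 (a fop): take a reference under conf->mutex, use fdctx outside it. */
pthread_mutex_lock (&conf->mutex);
{
        fdctx = this_fd_ctx_fd (fd);
        if (fdctx)
                fdctx->refcount++;
}
pthread_mutex_unlock (&conf->mutex);

/* ... use fdctx->remote_fd without holding the lock ... then drop the ref: */
pthread_mutex_lock (&conf->mutex);
{
        fdctx->refcount--;
        if (fdctx->refcount == 0)
                FREE (fdctx);
}
pthread_mutex_unlock (&conf->mutex);

/* Path 2 (protocol_client_mark_fd_bad): unlink, then drop the list's ref. */
pthread_mutex_lock (&conf->mutex);
{
        fd_ctx_del (fdctx->fd);        /* unlink from the fd, as in the pseudocode */
        list_del (fdctx);              /* unlink from conf->saved_fds              */
        fdctx->refcount--;
        if (fdctx->refcount == 0)
                FREE (fdctx);          /* freed only when the last user drops it   */
}
pthread_mutex_unlock (&conf->mutex);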
(In reply to comment #3)

> fdctx then gets freed only when fdctx->refcount drops to 0. Comments?

Yes, that looks correct. I think it is time for a patch to see whether the implementation also ends up looking correct. Thanks.

A patch has been submitted for review at http://patches.gluster.com/patch/758/

*** Bug 239 has been marked as a duplicate of this bug. ***
PATCH: http://patches.gluster.com/patch/1380 in master (client-protocol: fix race-condition encountered while accessing fdctx)
PATCH: http://patches.gluster.com/patch/1381 in release-2.0 (client-protocol: fix race-condition encountered while accessing fdctx)