Bug 762615 (GLUSTER-883)

Summary: Data corruption due to thread unsafe reads and writes
Product: [Community] GlusterFS
Reporter: Shehjar Tikoo <shehjart>
Component: posix
Assignee: Shehjar Tikoo <shehjart>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: urgent
Version: mainline
CC: gluster-bugs
Hardware: All   
OS: Linux   
Doc Type: Bug Fix

Description Shehjar Tikoo 2010-05-04 07:30:45 UTC
Until a few weeks ago, io-threads channelled all operations on a given inode through the same thread, so that writes could not overtake each other and race.

Recent changes in io-threads make this race condition very easy to hit. io-threads now spawns new threads to handle requests when it sees the queue length growing beyond a certain point. The queue can grow because of a slow disk, because of O_SYNC or O_DIRECT writes, or for other reasons. The logic has also changed to allow multiple threads to serve fops for the same file. The race occurs in posix, where the lseek-readv or lseek-writev sequence is not a critical section: the lseek of thread1 can be followed by the lseek of thread2 before thread1 issues its write, so both threads end up writing at thread2's offset. This results in data corruption.
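
One common way to close this kind of window is to stop relying on the shared file position and pass the offset with every call instead. The following is only a sketch of that idea under my own assumptions, not the code from the eventual fix: `positional_writev` and `demo_interleaved_seek` are hypothetical names, and the demo replays the bad interleaving in a single thread (a stray lseek lands between choosing the offset and writing) rather than using real io-threads workers.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper, not the GlusterFS implementation: write every
 * buffer in `vec` starting at `offset` using pwrite(), which takes the
 * offset as an argument and never reads or updates the file position
 * shared by all threads using this fd. */
ssize_t
positional_writev (int fd, const struct iovec *vec, int count, off_t offset)
{
        ssize_t total = 0;

        for (int i = 0; i < count; i++) {
                const char *buf  = vec[i].iov_base;
                size_t      left = vec[i].iov_len;

                while (left > 0) {
                        ssize_t ret = pwrite (fd, buf, left, offset);
                        if (ret < 0) {
                                if (errno == EINTR)
                                        continue;   /* interrupted, retry */
                                return -1;
                        }
                        if (ret == 0)
                                return total;       /* no progress, give up */
                        buf    += ret;
                        left   -= (size_t) ret;
                        offset += ret;
                        total  += ret;
                }
        }
        return total;
}

/* Replays the problematic interleaving in one thread: "another thread"
 * moves the shared file position just before we write. Returns 0 if the
 * data still landed at offset 0, -1 otherwise. */
int
demo_interleaved_seek (void)
{
        char         path[] = "/tmp/pwritev-demo-XXXXXX";
        char         readback[11] = {0};
        struct iovec vec[2] = {
                { .iov_base = "hello ", .iov_len = 6 },
                { .iov_base = "world",  .iov_len = 5 },
        };
        int fd = mkstemp (path);

        assert (fd >= 0);
        unlink (path);                      /* file disappears on close */

        lseek (fd, 4096, SEEK_SET);         /* the other thread's lseek */
        ssize_t written = positional_writev (fd, vec, 2, 0);
        ssize_t got     = pread (fd, readback, sizeof (readback), 0);
        close (fd);

        return (written == 11 && got == 11 &&
                memcmp (readback, "hello world", 11) == 0) ? 0 : -1;
}
```

With a plain lseek followed by writev, the same interleaving would have placed the data at offset 4096. The alternative is to keep lseek-writev and hold a per-fd lock around the pair, which preserves a single vectored syscall at the cost of serialising all I/O on that descriptor.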

Comment 1 Anand Avati 2010-05-05 08:36:08 UTC
PATCH: http://patches.gluster.com/patch/3224 in master (posix: Support thread-safe vectored writes/reads)