Bug 140383

Summary: BLKFLSBUF ioctl can cause other reads to fail
Product: Red Hat Enterprise Linux 4 Reporter: sheryl sage <sheryl.sage>
Component: kernel Assignee: Jeff Moyer <jmoyer>
Status: CLOSED ERRATA QA Contact:
Severity: high Docs Contact:
Priority: medium    
Version: 4.0 CC: bjohnson, davej, kanderso, linux26port, rkenna, shillman, tburke
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2005-06-08 15:12:55 UTC Type: ---
Bug Blocks: 146015, 147461    
Attachments:
  Description                 Flags
  Fix invalidate page race    none

Description sheryl sage 2004-11-22 18:37:29 UTC

Description of problem:
BLKFLSBUF ioctl can cause other reads to fail

The test case (TC) runs randio I/O (read and verify-write). Because of
verify-write, the I/Os go to the block device (with the old raw
interface, ioctls are not allowed on the raw device). The reopen_dev()
function in randio.c issues the BLKFLSBUF ioctl, which causes the
other reads to fail.
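
For context, a user-space program typically issues this ioctl along the following lines. This is only a minimal sketch, not the code from randio.c (which is not shown in this report): the helper name flush_block_buffers() is hypothetical, and on a real block device the call generally needs sufficient privileges. BLKFLSBUF itself is defined in <linux/fs.h>.

```c
#include <fcntl.h>
#include <linux/fs.h>   /* BLKFLSBUF */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from randio.c): open the given device and
 * ask the kernel to flush and invalidate its buffer cache via BLKFLSBUF.
 * Returns 0 on success, -1 on any failure.
 */
int flush_block_buffers(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* BLKFLSBUF takes no meaningful argument; 0 is passed by convention. */
    int rc = ioctl(fd, BLKFLSBUF, 0);
    if (rc < 0)
        perror("ioctl(BLKFLSBUF)");

    close(fd);
    return rc < 0 ? -1 : 0;
}
```

Calling flush_block_buffers("/dev/sdm1") in a loop would correspond roughly to the blockdev --flushbufs loop in the reproducer.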


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
To reproduce the problem, run the following scripts in parallel.

1) script1.sh
     # Reader: repeatedly read 100 x 1M blocks from the block device.
     while true
     do
        dd if=/dev/sdm1 of=/dev/null bs=1M count=100
     done


2) script2.sh
     # Flusher: repeatedly issue BLKFLSBUF on the same device.
     while true
     do
        blockdev --flushbufs /dev/sdm1
     done

Additional info:

If the caller of invalidate_complete_page() passes a page with a
reference count of 2, the caller is the only one that currently holds
a reference to it. That is the only case where it is OK to mark the
page not-uptodate.
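
The rule above can be illustrated with a small user-space model. This is a sketch only: struct toy_page and both functions are hypothetical names invented for illustration, the model ignores locking and the rest of the page cache, and it is neither the kernel code nor the eventual patch.

```c
#include <stdbool.h>

/* Toy stand-in for a page-cache page; NOT the kernel's struct page. */
struct toy_page {
    int  count;     /* reference count */
    bool uptodate;  /* models the uptodate page flag */
};

/*
 * Models the behaviour described in this report: the uptodate flag is
 * cleared unconditionally, even if someone else is still using the page.
 */
bool toy_invalidate_unconditional(struct toy_page *page)
{
    page->uptodate = false;
    return true;
}

/*
 * Models the rule stated above: only invalidate when the count is exactly 2,
 * i.e. nobody besides the caller is using the page. Otherwise leave the
 * page untouched and report that it was skipped.
 */
bool toy_invalidate_if_unshared(struct toy_page *page)
{
    if (page->count != 2)
        return false;
    page->uptodate = false;
    return true;
}
```

With a count of 3 (for example, a concurrent reader still holding the page), the guarded variant leaves the page uptodate, whereas the unconditional variant clears the flag underneath that reader.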

The problem is easily reproducible, and the cause is quite obvious
once you find out where things go wrong. I think the fix should be
committed to GA right away.

Comment 1 Bob Johnson 2004-12-06 17:07:53 UTC
Sheryl, do you have production code that causes this?


Comment 5 sheryl sage 2004-12-09 23:21:36 UTC
BLKFLSBUF does invalidate_bdev() -> invalidate_inode_pages() ->
invalidate_complete_page() which clears the uptodate page flag even 
if others are looking at this page.

Comment 6 sheryl sage 2004-12-10 09:06:49 UTC
Sorry, no, we do not have a patch, but we are requesting one from
Red Hat that fixes the problem.

Comment 7 sheryl sage 2004-12-14 16:59:54 UTC
Any status?

Comment 8 sheryl sage 2004-12-15 19:14:27 UTC
1.15/mm/truncate.c may have the fix to this problem.
	

Comment 10 Jeff Moyer 2005-02-21 21:07:36 UTC
Created attachment 111279 [details]
Fix invalidate page race

This is the patch that was committed upstream to address this problem. I have
tested it, and verified that the reproducer runs without errors.  I also tested
the data copied (via dd) to ensure that it is correct.

Comment 11 Jay Turner 2005-04-16 18:51:26 UTC
Fix confirmed in kernel-2.6.9-6.39.EL.  Moving to PROD_READY.

Comment 12 Tim Powers 2005-06-08 15:12:55 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-420.html