Bug 134338

Summary: Intolerable Disk I/O Performance under 64-bit VM: fix I/O buffers
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: s390x
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Reinhard Buendgen <buendgen>
Assignee: Pete Zaitcev <zaitcev>
CC: bjohnson, davej, riel
Doc Type: Bug Fix
Last Closed: 2004-12-08 16:42:40 UTC
Attachments:
dasd fixed buffers patch (attachment 106643, flags: none)
dasd fixed buffers patch (attachment 106644, flags: none)

Description Reinhard Buendgen 2004-10-01 14:44:28 UTC

Description of problem:
Linux guests running under 64-bit z/VM that do a lot of I/O suffer
severe performance problems; performance degradations of up to 98%
have been observed.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Run many Linux guests under 64-bit z/VM.
2. Do heavy DASD I/O on multiple Linux systems.

Actual Results:  z/VM is virtually dead

Additional info:

z/VM memory management requires locality of I/O buffers.
Introducing this kind of locality on the Linux side fixes the problem.

Comment 2 Matthew Dobson 2004-10-06 18:36:51 UTC
Could you please give us some more detail about this bug? I.e.: how
many Linux guests are you running, what exactly do you mean by "heavy
DASD I/O", and could you go into more detail about the remark in your
additional info that locality fixes the problem?

Thanks!

Comment 3 Klaus Bergmann 2004-10-12 10:38:29 UTC
The problem occurs with any number of guests. Substantial disk I/O
load can be generated with IOzone or TPC-C. This is the problem:
by design, all I/O under z/VM has to go through pages below the 2GB
bar. Larger z/VM guests, such as 64-bit Linux guests, use pages above
the 2GB bar, so z/VM has to copy those pages below the 2GB bar in
order to execute the I/O. These additional copy steps are not a
problem by themselves; the problem is the number of pages involved.
If their total size exceeds the 2GB of storage below the bar, z/VM
starts paging into expanded storage (xstor) and later to disk (again
through the 2GB area). In such situations z/VM is more or less dead.

Comment 5 Matthew Dobson 2004-10-13 18:24:41 UTC
Well, this problem doesn't sound like it is a *Linux* problem.  It
sounds like a problem with z/VM.  It should be reported to the
zSeries developers, as I doubt there is much of anything Red Hat can
do about it... :)

Comment 9 Pete Zaitcev 2004-11-11 20:55:33 UTC
Martin came back with a satisfactory explanation of why my counter
will not work. I dunno about RC, although the patch is enabled by a
kernel parameter, so it should not be too dangerous.
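
For illustration only (the report does not name the parameter, so the
name "fixedbuffers" and the variable below are made up): gating a
change like this behind a module parameter usually looks roughly like
the following sketch.

/* Hypothetical sketch of gating the fixed-buffer code path behind a
 * module parameter. "fixedbuffers" is an illustrative name, not the
 * parameter actually used by the RHEL kernel. */
#include <linux/moduleparam.h>

static int dasd_use_fixed_buffers;	/* off by default */
module_param_named(fixedbuffers, dasd_use_fixed_buffers, int, 0444);
MODULE_PARM_DESC(fixedbuffers,
		 "Use a fixed pool of I/O pages per DASD device (0=off, 1=on)");

With a guard like this, the new code path stays disabled unless the
administrator explicitly turns it on, which is why it "should not be
too dangerous".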


Comment 10 Jan Glauber 2004-11-13 17:52:59 UTC
Created attachment 106643 [details]
dasd fixed buffers patch

Comment 11 Jan Glauber 2004-11-13 17:53:10 UTC
Created attachment 106644 [details]
dasd fixed buffers patch

Comment 12 Pete Zaitcev 2004-11-23 04:26:16 UTC
Modified in 2.6.9-1.751_EL. Requestor, please test and close.


Comment 13 Pete Zaitcev 2004-11-23 04:28:15 UTC
Regarding comment #9, here's an explanation for posterity:

From: Martin Schwidefsky

> And finally, it seems to me that the block layer is perfectly capable
> of doing this bouncing for you, and it will do it with right GFP, too,
> so you do not have to lean on atomics so much. Please tell me why the
> following will not work:

Fixed I/O buffers are NOT bounce buffers. This is a common
misunderstanding we have faced quite often already. The problem is the
bounce buffering in z/VM, not in the Linux image running on z/VM. To
ease the problem for z/VM, Linux has to use as few I/O pages as
possible. Then z/VM doesn't have to shuffle pages from above the 2G
bar to below the 2G bar all the time. If you define bounce buffers
with q->bounce_pfn=0x80000000, Linux will happily use up to 2G for its
I/O pages. This is not what we want. With the fixed I/O buffers patch
we use 240*2 I/O pages per device (if the slab allocation doesn't
fail), which is about 2MB per device.
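
To make the distinction concrete, here is a minimal sketch of the
fixed-pool idea described above. All names (dasd_fixed_pool,
dasd_fixed_pool_init, dasd_fixed_pool_get) are made up for
illustration; the real implementation is in the attached patches. The
point it demonstrates is that request data is copied through the same
small set of pages, so z/VM keeps seeing roughly 2MB of distinct I/O
pages per device instead of pages scattered across a large guest.

/* Minimal sketch (assumed names, not the actual patch): keep a small,
 * fixed set of pages per device and route all I/O data through them. */
#include <linux/bitops.h>
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/spinlock.h>
#include <linux/string.h>

#define DASD_FIXED_PAGES (240 * 2)	/* about 2MB per device, as above */

struct dasd_fixed_pool {
	void *page[DASD_FIXED_PAGES];
	unsigned long in_use[BITS_TO_LONGS(DASD_FIXED_PAGES)];
	spinlock_t lock;
};

/* Allocate the pool once at device setup; on failure the driver would
 * simply fall back to using the request's own pages. */
static int dasd_fixed_pool_init(struct dasd_fixed_pool *pool)
{
	int i;

	spin_lock_init(&pool->lock);
	memset(pool->in_use, 0, sizeof(pool->in_use));
	for (i = 0; i < DASD_FIXED_PAGES; i++) {
		pool->page[i] = (void *) __get_free_page(GFP_KERNEL);
		if (!pool->page[i])
			return -ENOMEM;
	}
	return 0;
}

/* Grab one page from the pool; the caller copies request data into it
 * before starting the channel program and copies it back afterwards.
 * NULL means the pool is exhausted and the original page is used. */
static void *dasd_fixed_pool_get(struct dasd_fixed_pool *pool)
{
	unsigned long flags;
	void *page = NULL;
	int i;

	spin_lock_irqsave(&pool->lock, flags);
	i = find_first_zero_bit(pool->in_use, DASD_FIXED_PAGES);
	if (i < DASD_FIXED_PAGES) {
		set_bit(i, pool->in_use);
		page = pool->page[i];
	}
	spin_unlock_irqrestore(&pool->lock, flags);
	return page;
}

The benefit described above comes from reuse: the set of pages handed
to z/VM for I/O stays small and constant, so z/VM's copying working
set never grows large enough to trigger the paging storm.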


Comment 14 Tim Powers 2005-06-08 15:12:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-420.html