Bug 53898 - kernel IO bottlenecks
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.2
Hardware: i386 Linux
Priority: medium Severity: medium
Assigned To: Doug Ledford
QA Contact: Brock Organ
Keywords: FutureFeature
Reported: 2001-09-20 19:03 EDT by Sudhir Shetty
Modified: 2007-04-18 12:37 EDT (History)
Doc Type: Enhancement
Last Closed: 2002-02-13 15:22:38 EST

Attachments
Kernel profile (20.16 KB, text/plain)
2001-09-20 19:07 EDT, Sudhir Shetty
Gunzip of test program (4.30 KB, application/octet-stream)
2001-09-20 19:11 EDT, Sudhir Shetty
kernelprofile (32.51 KB, text/plain)
2001-10-09 16:44 EDT, Rogelio Noriega
iostat (90.57 KB, text/plain)
2001-10-09 16:47 EDT, Rogelio Noriega

Description Sudhir Shetty 2001-09-20 19:03:40 EDT
Description of Problem:
There are serious kernel I/O bottlenecks in the 2.4.x kernel
that hurt performance for enterprise applications such
as Oracle. The result is high system time under enterprise workloads,
which makes the results unacceptable for TPC benchmarks.
The bottlenecks are:
a) Bounce buffer allocation for RAM <= 4 GB. See the attached kernel
profile for a test configuration.
b) __make_request, due to contention on the global io_request_lock.

Version-Release number of selected component (if applicable):
Kernel : 2.4.x

How Reproducible:


Steps to Reproduce:
1. System (4-Proc, 4 GB, 4 megaraid controllers-PERC3/DC)
2. Boot with profile=2
3. Run the testdevices program.
   The compressed tar attachment includes a Makefile, the
   source file tio.c and the executable 'tio'.
   The parameters to 'tio' are the size of each read and
   the run time in seconds. 'testdevices' is the driver script for it.
   On line 11 of this driver script you can modify the
   size of the read (multiblock) and the time.
   Currently these are set to 512k and 5 minutes
   respectively.

   Usage:

  ./testdevices /dev raw1 raw2 raw3 raw4
   where raw1, raw2, raw3 & raw4 are raw partitions created on 4
   different volumes (controllers), i.e., one process/controller

  ./testdevices /dev raw1 raw1 raw1 raw1
             four processes/controller doing reads

4. While testdevices is running, use iostat and readprofile to
   identify I/O and kernel bottlenecks.
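
For readers without the attachment, the kind of loop step 3 describes can be sketched roughly as follows. This is not the attached tio.c: the function name timed_read, the 4096-byte buffer alignment and the wrap-to-start behaviour at end of device are all assumptions made for illustration. It takes the same two parameters the report mentions (read size and run time in seconds).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for tio (NOT the attached program): read
 * block_size bytes at a time from path for roughly `seconds` seconds.
 * Returns the total number of bytes read, or -1 on error. */
long long timed_read(const char *path, size_t block_size, int seconds)
{
    assert(block_size > 0 && seconds > 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Raw devices generally need aligned buffers; 4096 is an assumed
     * alignment chosen for this sketch. */
    void *buf = NULL;
    if (posix_memalign(&buf, 4096, block_size) != 0) {
        close(fd);
        return -1;
    }

    long long total = 0;
    time_t deadline = time(NULL) + seconds;
    while (time(NULL) < deadline) {
        ssize_t n = read(fd, buf, block_size);
        if (n < 0) {                      /* read error */
            total = -1;
            break;
        }
        if (n == 0) {                     /* end of device: wrap around */
            if (lseek(fd, 0, SEEK_SET) < 0) {
                total = -1;
                break;
            }
            continue;
        }
        total += n;
    }

    free(buf);
    close(fd);
    return total;
}
```

A call such as timed_read("/dev/raw1", 512 * 1024, 300) would correspond to the 512k and 5 minute settings mentioned above, assuming /dev/raw1 is one of the raw partitions. Running one such reader per device while sampling with readprofile matches the shape of the test, though the real driver script may differ.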

Actual Results:
See attached kernel profiles.

Expected Results:


Additional Information:
Candidate patches:
a) Jens Axboe's bounce buffer patch seems to fix issue a) above.
b) Experimental patches exist for the global io_request_lock.
Comment 1 Sudhir Shetty 2001-09-20 19:07:00 EDT
Created attachment 32298
Kernel profile
Comment 2 Sudhir Shetty 2001-09-20 19:11:25 EDT
Created attachment 32299
Gunzip of test program
Comment 3 Ben LaHaise 2001-09-21 16:46:38 EDT
After doing some poking around inside the scsi layer, it appears that sd.c ends
up calling b_end_io with the io_request_lock held.  This results in any highmem
bounce buffer copies being serialized for scsi requests.  I don't think this
requires splitting the io_request_lock just yet, just bug fixing.
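
The effect described in this comment can be illustrated with a user-space analogy. The sketch below is not kernel code: request_lock merely stands in for io_request_lock, and complete_request, bounce_copy and run_completions are hypothetical names. It counts how many threads are ever inside the copy at once, depending on whether the copy is done before or after the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <string.h>

#define NREQ 4
#define BUF_SIZE (64 * 1024)

/* Stand-in for the global io_request_lock (user-space analogy only). */
static pthread_mutex_t request_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int copiers;      /* threads currently inside the copy */
static atomic_int max_copiers;  /* highest concurrency observed */

struct request {
    char bounce[BUF_SIZE];      /* data that landed in the bounce buffer */
    char dest[BUF_SIZE];        /* final destination page */
    int copy_under_lock;        /* 1: copy while the lock is held */
};

/* Copy the bounce buffer to its destination and record concurrency. */
static void bounce_copy(struct request *rq)
{
    int now = atomic_fetch_add(&copiers, 1) + 1;
    int seen = atomic_load(&max_copiers);
    while (now > seen &&
           !atomic_compare_exchange_weak(&max_copiers, &seen, now))
        ;                       /* retry until the maximum is recorded */
    memcpy(rq->dest, rq->bounce, BUF_SIZE);
    atomic_fetch_sub(&copiers, 1);
}

/* Completion path: bookkeeping needs the lock, the copy need not. */
static void *complete_request(void *arg)
{
    struct request *rq = arg;
    assert(rq != NULL);
    pthread_mutex_lock(&request_lock);
    if (rq->copy_under_lock) {
        bounce_copy(rq);        /* serialized behind the global lock */
        pthread_mutex_unlock(&request_lock);
    } else {
        pthread_mutex_unlock(&request_lock);
        bounce_copy(rq);        /* copies may now overlap */
    }
    return NULL;
}

/* Complete NREQ requests in parallel; return the maximum number of
 * simultaneous copies seen, or -1 on thread or data errors. */
int run_completions(int copy_under_lock)
{
    static struct request reqs[NREQ];
    pthread_t tid[NREQ];
    int started = 0, ok = 1;

    atomic_store(&copiers, 0);
    atomic_store(&max_copiers, 0);
    for (int i = 0; i < NREQ; i++) {
        memset(reqs[i].bounce, 'a' + i, BUF_SIZE);
        memset(reqs[i].dest, 0, BUF_SIZE);
        reqs[i].copy_under_lock = copy_under_lock;
    }
    for (int i = 0; i < NREQ; i++) {
        if (pthread_create(&tid[i], NULL, complete_request, &reqs[i]) != 0) {
            ok = 0;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; ok && i < NREQ; i++)
        if (memcmp(reqs[i].dest, reqs[i].bounce, BUF_SIZE) != 0)
            ok = 0;
    return ok ? atomic_load(&max_copiers) : -1;
}
```

With copy_under_lock set to 1 the observed concurrency can never exceed one, which is the serialization the comment describes. With it set to 0 the copies are free to overlap, although whether they actually do depends on scheduling.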
Comment 4 Michael K. Johnson 2001-09-26 19:14:28 EDT
In testing we have found several bugs in Jens's highmem nobounce patch,
and we have made progress fixing them.

The io_request_lock stuff will take longer to fix because it requires
more auditing.
Comment 5 Rogelio Noriega 2001-10-09 16:44:23 EDT
Created attachment 33661
kernelprofile
Comment 6 Rogelio Noriega 2001-10-09 16:47:18 EDT
Created attachment 33662
iostat
Comment 7 Rogelio Noriega 2001-10-09 16:49:03 EDT
FYI: attached are the kernelprofile and iostat logs for kernel 2.4.9-0.18smp.
Comment 8 Michael K. Johnson 2001-11-30 09:00:28 EST
a) is dealt with.
b) is being worked on but is a longer-term effort, because the changes
   are initially destabilizing and will require much more work,
   not only to complete but also to stabilize.
Comment 9 Arjan van de Ven 2002-02-13 15:22:30 EST
Our advanced server release is fixing most of these issues. How much is still
visible in that beta (and with the latest kernel drop after that)?
Comment 10 Sudhir Shetty 2002-02-26 16:57:11 EST
This is closed based on feedback from Oracle (TPC-R benchmarking efforts).
