Bug 62993 - Significant performance drop in software RAID5 2.4.9->2.4.18
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-04-08 22:47 UTC by Need Real Name
Modified: 2007-04-18 16:41 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-06-08 11:14:03 UTC
Embargoed:


Description Need Real Name 2002-04-08 22:47:15 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020310

Description of problem:
Between the 2.4.9-21 kernel and the 2.4.18-0.13 (current rawhide)
kernel, changes in the kernel have caused RAID performance to drop.
Here are two benchmarks:

bonnie++-1.02a, software raid, 2 cpus, 7 SCSI drives, kernel 2.4.9-21smp, raid5,
128K chunk size, ext2 filesystem, 4k block size, stride=32
Version  1.02a      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
green            4G  6946  99 50638  76 22538  36  6572  97 67987  53 269.6   4
green            4G  6988  99 49106  74 21629  35  6567  97 67920  54 271.4   4
green            4G  6990 100 48919  73 22249  36  6570  97 68117  54 276.7   5
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
green            16   859  99 +++++ +++ +++++ +++   751  99 +++++ +++  3073  99
green            16   798 100 +++++ +++ +++++ +++   817  99 +++++ +++  2911 100
green            16   791 100 +++++ +++ +++++ +++   797  99 +++++ +++  2839 100


bonnie++-1.02a, software raid, 2 cpus, 7 SCSI drives, kernel 2.4.18-0.13smp,
raid5, 128K chunk size, ext2 filesystem, 4k block size, stride=32
Version  1.02a      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
green            4G  6968  99 46785  54 21278  36  6542  97 63970  53 285.7   3
green            4G  6983 100 47339  56 21373  36  6539  97 63141  51 282.1   4
green            4G  6979  99 49556  60 21912  37  6543  97 61775  50 278.9   4
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
green            16   750  99 +++++ +++ +++++ +++   768 100 +++++ +++  2883  99
green            16   745  99 +++++ +++ +++++ +++   755  99 +++++ +++  2817 100
green            16   800  99 +++++ +++ +++++ +++   848  99 +++++ +++  3083  99


Note that block writes dropped from ~49 MB/sec to ~47 MB/sec, and
reads dropped from ~68 MB/sec to ~63 MB/sec.
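
For reference, the size of the drop can be checked directly from the
bonnie++ rows above. This is only a sketch: the figures are the block
write and block read K/sec columns copied from the two tables, and the
names runs, percent_change and compare are made up for illustration.

```python
from statistics import mean

# Block write / block read throughput (K/sec), copied from the two
# bonnie++ tables above, three runs per kernel.
runs = {
    "2.4.9-21smp": {
        "write": [50638, 49106, 48919],
        "read": [67987, 67920, 68117],
    },
    "2.4.18-0.13smp": {
        "write": [46785, 47339, 49556],
        "read": [63970, 63141, 61775],
    },
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def compare(metric, base="2.4.9-21smp", other="2.4.18-0.13smp"):
    """Return (mean on base kernel, mean on other kernel, percent change)."""
    old = mean(runs[base][metric])
    new = mean(runs[other][metric])
    return old, new, percent_change(old, new)


for metric in ("write", "read"):
    old, new, delta = compare(metric)
    print(f"block {metric}: {old:.0f} -> {new:.0f} K/sec ({delta:+.1f}%)")
```

Averaged over the three runs this works out to a drop of roughly 3% in
block writes and roughly 7% in block reads.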

This is one configuration, but I have tried varying many of 
the above listed parameters, and performance is worse across the 
board.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1.  Use this /etc/raidtab
raiddev /dev/md0
          raid-level      5
          nr-raid-disks   7
          nr-spare-disks  1
          persistent-superblock 1
          parity-algorithm        left-symmetric
          chunk-size      128
          device          /dev/sda1
          raid-disk       0
          device          /dev/sdb1
          raid-disk       1
          device          /dev/sdc1
          raid-disk       2
          device          /dev/sdd1
          raid-disk       3
          device          /dev/sde1
          raid-disk       4
          device          /dev/sdf1
          raid-disk       5
          device          /dev/sdg1
          raid-disk       6
          device          /dev/sdh1
          spare-disk      0

2.  mkraid --force /dev/md0
3.  mke2fs -b 4096 -R stride=32 /dev/md0
4.  mount /dev/md0 /wherever
5.  bonnie++ -d /wherever -s 4096 -m green -r 768 -x 3 -u 0:0
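
The stride=32 value in step 3 follows from the chunk size and the
filesystem block size used above. A minimal sketch of that arithmetic,
assuming the raidtab chunk-size is in kilobytes as described; the
function names are hypothetical:

```python
def ext2_stride(chunk_kib, block_bytes):
    """Number of filesystem blocks that fit in one RAID chunk."""
    chunk_bytes = chunk_kib * 1024
    if chunk_bytes % block_bytes:
        raise ValueError("chunk size is not a multiple of the block size")
    return chunk_bytes // block_bytes


def raid5_data_disks(nr_raid_disks):
    """RAID5 uses one chunk per stripe for parity, leaving n - 1 for data."""
    return nr_raid_disks - 1


# Values from the raidtab and mke2fs steps above.
stride = ext2_stride(chunk_kib=128, block_bytes=4096)
full_stripe_kib = raid5_data_disks(7) * 128

print(f"stride = {stride} blocks")
print(f"data per full stripe = {full_stripe_kib} KiB")
```

With 7 active disks and 128K chunks, a full stripe carries 6 data chunks,
so a write has to cover 768K of data to avoid a read-modify-write of the
parity.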
	

Actual Results:  
2.4.18-0.13 kernel performed worse than 2.4.9-21

Expected Results:  
Same performance, or better performance than 2.4.9-21

Additional info:

Comment 1 Arjan van de Ven 2002-04-09 09:31:57 UTC
Note that some of the rawhide kernels have debugging enabled which makes them
slower anyway; 2.4.18-0.16/18 should have this disabled.

Can you give any info on what CPU this is?

Comment 2 Need Real Name 2002-04-10 02:05:21 UTC
For the record, the system is a dual P-III 500:

processor : 0
vendor_id : GenuineIntel
cpu family    : 6
model     : 7
model name    : Pentium III (Katmai)
stepping : 3
cpu MHz        : 501.145
cache size    : 512 KB
fdiv_bug : no
hlt_bug     : no
f00f_bug : no
coma_bug : no
fpu     : yes
fpu_exception : yes
cpuid level    : 2
wp     : yes
flags     : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 mmx fxsr sse
bogomips : 999.42

Processor 1 is identical

Other hardware info (/proc/pci):

PCI devices found:
  Bus  0, device   0, function  0:
    Host bridge: Intel Corporation 440GX - 82443GX Host bridge (rev 0).
      Master Capable.  Latency=64.
      Prefetchable 32 bit memory at 0xf8000000 [0xfbffffff].
  Bus  0, device   1, function  0:
    PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge (rev 0).
      Master Capable.  Latency=64.  Min Gnt=136.
  Bus  0, device   7, function  0:
    ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 2).
  Bus  0, device   7, function  1:
    IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 1).
      Master Capable.  Latency=64.
      I/O at 0xffa0 [0xffaf].
  Bus  0, device   7, function  2:
    USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 1).
      Master Capable.  Latency=64.
      I/O at 0xef80 [0xef9f].
  Bus  0, device   7, function  3:
    Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 2).
      IRQ 9.
  Bus  0, device  15, function  0:
    Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 48).
      IRQ 9.
      Master Capable.  Latency=64.  Min Gnt=10.Max Lat=10.
      I/O at 0xec00 [0xec7f].
      Non-prefetchable 32 bit memory at 0xffafef80 [0xffafefff].
  Bus  0, device  20, function  0:
    SCSI storage controller: Adaptec AHA-2940U2/W (rev 0).
      IRQ 10.
      Master Capable.  Latency=64.  Min Gnt=39.Max Lat=25.
      I/O at 0xe800 [0xe8ff].
      Non-prefetchable 64 bit memory at 0xffaff000 [0xffafffff].
  Bus  1, device   0, function  0:
    VGA compatible controller: ATI Technologies Inc 3D Rage Pro AGP 1X/2X (rev 92).
      IRQ 11.
      Master Capable.  Latency=64.  Min Gnt=8.
      Prefetchable 32 bit memory at 0xf5000000 [0xf5ffffff].
      I/O at 0xd800 [0xd8ff].
      Non-prefetchable 32 bit memory at 0xff9ff000 [0xff9fffff]. 


NOW...

I have tried the 2.4.18-0.18 kernel as recommended (just noticed it) 
and here are the results:

Version  1.02a      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
green            4G  6968  99 52300  65 22415  38  6547  97 66463  55 289.3   4
green            4G  6999  99 54028  68 22460  39  6542  97 63083  52 277.1   4
green            4G  6996  99 53888  68 22597  39  6543  97 62977  52 287.8   4
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
green            16   725  99 +++++ +++ +++++ +++   795 100 +++++ +++  2718  99
green            16   781 100 +++++ +++ +++++ +++   839 100 +++++ +++  2938 100
green            16   761 100 +++++ +++ +++++ +++   771 100 +++++ +++  2879  99


It significantly improves the write performance, even beyond 2.4.9-21, 
which is great!  The read performance still lags, though, as with 
2.4.18-0.13.  Personally, I need write performance more than 
read performance, so you can mark this issue as resolved for me, but
others may be more concerned with squeezing every last ounce of read
speed out of their hardware (and I certainly wouldn't mind either).
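
To put numbers on this, the three 2.4.18-0.18 rows above can be parsed
and compared with the 2.4.9-21 averages. This is a rough sketch only:
the column positions are taken from the bonnie++ header shown above, the
baseline values are the three-run means from the original description
rounded to whole K/sec, and parse_row and column_means are illustrative
names.

```python
# The three 2.4.18-0.18smp result rows, copied from the table above.
ROWS_2_4_18_0_18 = """\
green            4G  6968  99 52300  65 22415  38  6547  97 66463  55 289.3   4
green            4G  6999  99 54028  68 22460  39  6542  97 63083  52 277.1   4
green            4G  6996  99 53888  68 22597  39  6543  97 62977  52 287.8   4
"""

# Rounded three-run means for 2.4.9-21smp from the description (K/sec).
BASELINE = {"block_write": 49554, "block_read": 68008}


def parse_row(line):
    """Pick the block write and block read K/sec fields out of one row."""
    fields = line.split()
    # Layout: machine, size, then (K/sec, %CP) pairs for per-chr write,
    # block write, rewrite, per-chr read, block read, then seeks.
    return {"block_write": int(fields[4]), "block_read": int(fields[10])}


def column_means(text):
    """Average each extracted column over all non-empty rows."""
    rows = [parse_row(line) for line in text.splitlines() if line.strip()]
    return {key: sum(r[key] for r in rows) / len(rows) for key in rows[0]}


means = column_means(ROWS_2_4_18_0_18)
for key, value in means.items():
    delta = (value - BASELINE[key]) / BASELINE[key] * 100.0
    print(f"{key}: {value:.0f} K/sec vs {BASELINE[key]} K/sec ({delta:+.1f}%)")
```

On these figures block writes come out about 8% above 2.4.9-21 while
block reads remain about 6% below it.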

Perhaps the kernel changes deliberately trade read performance for
write performance, but if the drop in reads is unintentional, the
developers will probably want to know.

Thanks.
