Bug 620053 - Multi-threaded I/O behaves as if serialized in some situations. [NEEDINFO]
Summary: Multi-threaded I/O behaves as if serialized in some situations.
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Assignee: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Keywords:
Depends On:
Blocks:
 
Reported: 2010-07-31 16:39 UTC by Kirby Zhou
Modified: 2014-06-02 13:23 UTC (History)
1 user

Clone Of:
Last Closed: 2014-06-02 13:23:21 UTC
pm-rhel: needinfo? (kirbyzhou)


Attachments (Terms of Use)
testio (4.31 KB, text/plain), 2010-07-31 16:39 UTC, Kirby Zhou
strace log (86.14 KB, application/x-gzip-compressed), 2010-07-31 17:00 UTC, Kirby Zhou

Description Kirby Zhou 2010-07-31 16:39:25 UTC
Created attachment 435768 [details]
testio

Description of problem:

DEV1 is a block device with SOME partitions.
DEV2 is a block device without any DATA.

Each time, just after the system reboots, random access to DEV2 from multiple threads behaves as if the I/O were serialized; it is very slow. At the same time DEV1 is fast.

Normally, accessing a block device from 10 threads concurrently should be faster than from a single thread when the total number of I/O requests is the same. But a blank block device behaves differently.

Even stranger: if I insert a sleep between each I/O, the throughput goes up; then when I remove the sleep, the throughput stays as high as DEV1's.
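The actual test program is the attached testio.cpp; as a rough, hypothetical stand-in for the workload described above (several concurrent readers issuing random 4 KiB reads against one target), a shell sketch might look like this. The function name and parameters here are illustrative, not testio's real options:

```shell
# Hypothetical approximation of the workload: NTHREADS concurrent readers,
# each issuing COUNT random 4 KiB reads from TARGET. Not testio's real
# interface; block offsets are picked naively with $RANDOM.
random_readers() {
  target=$1 nthreads=$2 count=$3 blocks=$4
  for t in $(seq 1 "$nthreads"); do
    (
      i=0
      while [ "$i" -lt "$count" ]; do
        # one 4 KiB read at a random block offset
        dd if="$target" of=/dev/null bs=4096 count=1 \
           skip=$(( RANDOM % blocks )) 2>/dev/null
        i=$((i + 1))
      done
    ) &
  done
  wait
}

# Example against a scratch file instead of a real device:
dd if=/dev/zero of=/tmp/scratch.img bs=4096 count=256 2>/dev/null
random_readers /tmp/scratch.img 10 20 256
```

With a real device in place of the scratch file this reproduces only the shape of the access pattern, not testio's timing output or its -s sleep option.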

Version-Release number of selected component (if applicable):

kernel-2.6.18-194.el5
kernel-2.6.18-194.8.1.el5
kernel-2.6.18-164.15.1.el5
glibc-2.5-49.el5_5.2

How reproducible:

100% on my 2 servers.
The two have different RAID controllers.

Steps to Reproduce:

]# g++ testio.cpp -o testio -g -O2 -pthread
]# dd if=/dev/zero of=/dev/DEV2 bs=1M count=1024
]# reboot
...
]# time ./testio -r /dev/DEV1 10 4096 0 1024000000 4096 32768 100
..........<<<<<<<<<<
real    0m1.457s
user    0m0.001s
sys     0m0.021s

]# time ./testio -r /dev/DEV2 10 4096 0 1024000000 4096 32768 100                
..........<<<<<<<<<<
real    0m11.596s
user    0m0.004s
sys     0m0.035s

# Now, I insert the sleep of 50ms, speed goes up
]# time ./testio -r -s50000 /dev/DEV2 10 4096 0 1024000000 4096 32768 100 
real    0m6.091s
user    0m0.001s
sys     0m0.019s
# Now, I remove the sleep, speed keeps high
]# time ./testio -r /dev/DEV2 10 4096 0 1024000000 4096 32768 100                
..........<<<<<<<<<<
real    0m1.416s
user    0m0.002s
sys     0m0.022s

Expected results:


Additional info:

I have tried booting with a Debian 5 live CD; everything works OK.
I have tried swapping DEV1 and DEV2; the blank one is always the slow one.
It seems that the threads are sometimes serialized. You can check with the strace log:

strace -f -s0 -etrace=pread ./testio -r /dev/DEV2 10 4096 0 1024000000 4096 32768 100 2>&1 | fgrep -v '<...' | cut -d ' ' -f1-4 | uniq -c | less

When it is slow:

     10 [pid  4993] pread(10,
     10 [pid  4995] pread(12,
      8 [pid  4994] pread(11,
      8 [pid  4986] pread(3,
      7 [pid  4987] pread(4,
      9 [pid  4988] pread(5,
      9 [pid  4989] pread(6,
      9 [pid  4990] pread(7,
      8 [pid  4991] pread(8,
      8 [pid  4992] pread(9,
      8 [pid  4993] pread(10,
      9 [pid  4995] pread(12,
     10 [pid  4994] pread(11,
      9 [pid  4986] pread(3,
      9 [pid  4987] pread(4,
      9 [pid  4988] pread(5,
     10 [pid  4989] pread(6,
      7 [pid  4990] pread(7,
      8 [pid  4991] pread(8,
      9 [pid  4992] pread(9,
     10 [pid  4993] pread(10,
      6 [pid  4995] pread(12,
      8 [pid  4994] pread(11,
      7 [pid  4986] pread(3,
      8 [pid  4987] pread(4,

When it is fast:

      1 [pid  5151] pread(9,
      1 [pid  5152] pread(10,
      1 [pid  5147] pread(5,
      1 [pid  5149] pread(7,
      1 [pid  5150] pread(8,
      1 [pid  5147] pread(5,
      1 [pid  5151] pread(9,
      1 [pid  5145] pread(3,
      1 [pid  5153] pread(11,
      1 [pid  5150] pread(8,
      1 [pid  5146] pread(4,
      1 [pid  5154] pread(12,
      1 [pid  5150] pread(8,
      1 [pid  5149] pread(7,
      1 [pid  5147] pread(5,
      1 [pid  5146] pread(4,
      1 [pid  5149] pread(7,
      1 [pid  5152] pread(10,
      1 [pid  5150] pread(8,
      1 [pid  5148] pread(6,
      1 [pid  5147] pread(5,
      1 [pid  5149] pread(7,
      1 [pid  5151] pread(9,
      1 [pid  5152] pread(10,
      2 [pid  5151] pread(9,
      1 [pid  5147] pread(5,
      1 [pid  5152] pread(10,
      1 [pid  5151] pread(9,
      1 [pid  5147] pread(5,
      1 [pid  5150] pread(8,
      1 [pid  5147] pread(5,
      1 [pid  5145] pread(3,

You can see that in the slow situation the threads DO NOT run concurrently: one thread fires about 10 I/Os, then another takes over. In the fast situation, the threads run concurrently.
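The reading of the `uniq -c` counts above can be sanity-checked on canned input: `uniq -c` collapses consecutive identical lines into run counts, so a serialized trace (one pid repeated many times in a row) collapses into a few large counts, while an interleaved trace yields many counts of 1:

```shell
# Serialized: one "thread" at a time -> few output lines with large counts
printf '%s\n' A A A B B B | uniq -c
#   3 A
#   3 B

# Interleaved: threads alternate -> many output lines, each count 1
printf '%s\n' A B A B A B | uniq -c | wc -l
# 6
```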

Comment 1 Kirby Zhou 2010-07-31 16:42:36 UTC
You can run './testio -r /dev/DEV2 10 4096 0 1024000000 4096 32768 100' many times; it is always slow.

However, once you have run './testio -r -s50000 /dev/DEV2 10 4096 0 1024000000 4096 32768 100', the former command becomes fast.

Comment 2 Kirby Zhou 2010-07-31 16:47:21 UTC
If your memory is very large, you can increase 1024000000 (~1 GB) to 1024000000000 (~1 TB) or larger.

DEV1 and DEV2 must be separate RAID arrays,
for example /dev/sdc or /dev/cciss/c0d2.

Comment 3 Kirby Zhou 2010-07-31 17:00:15 UTC
Created attachment 435772 [details]
strace log

The order is:

DEV1fast.log
DEV2slow1.log
DEV2slow2.log
DEV2sleep.log
DEV2fast.log

Comment 4 RHEL Product and Program Management 2014-03-07 13:55:54 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 5 RHEL Product and Program Management 2014-06-02 13:23:21 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

