Bug 761801 (GLUSTER-69) - poor stripe read/write performance with a stripe of 4 machines
Summary: poor stripe read/write performance with a stripe of 4 machines
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-69
Product: GlusterFS
Classification: Community
Component: stripe
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Amar Tumballi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-06-25 07:02 UTC by Basavanagowda Kanur
Modified: 2013-12-19 00:03 UTC
CC: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Basavanagowda Kanur 2009-06-25 07:02:53 UTC
[Migrated from savannah BTS] - bug 26418 [https://savannah.nongnu.org/bugs/?26418]

Sat 02 May 2009 12:12:23 AM GMT, original submission by Erick Tryzelaar <erickt>:

This appears to be related to bug #26402. I'm writing and reading a 1GB file of random data to a stripe of 4 idle machines and measuring how they perform. I've found that, without any of the performance translators, I have to raise the stripe block-size considerably to get the throughput I'd expect:

server.vol:

volume posix
  type storage/posix
  option directory /tmp/gluster
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume io-threads
  type performance/io-threads
  option thread-count 16
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.io-threads.allow *
  subvolumes io-threads
end-volume

client.vol:

volume machine01
  type protocol/client
  option transport-type tcp
  option remote-host machine01
  option remote-subvolume io-threads
end-volume

volume machine02
  type protocol/client
  option transport-type tcp
  option remote-host machine02
  option remote-subvolume io-threads
end-volume

volume machine03
  type protocol/client
  option transport-type tcp
  option remote-host machine03
  option remote-subvolume io-threads
end-volume

volume machine04
  type protocol/client
  option transport-type tcp
  option remote-host machine04
  option remote-subvolume io-threads
end-volume

volume stripe
  type cluster/stripe
  option block-size *:512KB
  subvolumes machine01 machine02 machine03 machine04
end-volume
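
For context, the stripe translator round-robins block-size chunks of each file across its subvolumes, so a large sequential copy is, roughly speaking, broken into block-size pieces that are serviced by the four servers in turn. A minimal sketch of that mapping (illustrative only, not GlusterFS code; names and the simple round-robin assumption are mine):

/* Illustrative only: maps a file offset to a stripe subvolume, assuming
 * plain round-robin striping with the settings from client.vol above. */
#include <stdio.h>

int main(void)
{
    const long long block_size  = 512 * 1024; /* option block-size *:512KB */
    const int       child_count = 4;          /* machine01..machine04      */

    long long offsets[] = { 0, 512 * 1024, 3 * 512 * 1024, 100 * 1024 * 1024 };
    for (int i = 0; i < 4; i++) {
        long long chunk = offsets[i] / block_size;     /* which stripe chunk */
        int       child = (int)(chunk % child_count);  /* which subvolume    */
        printf("offset %lld -> machine%02d (chunk %lld)\n",
               offsets[i], child + 1, chunk);
    }
    return 0;
}

Under that assumption, a 1GB streaming copy at a 512KB block-size is split into roughly 2048 chunk-sized pieces, while a 64MB block-size needs only about 16, which would be consistent with the throughput numbers below.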

My test is just to stream the 1GB file to and from gluster with this:

rm /mnt/glusterfs/giant-1gb
cat /tmp/gluster-data/giant-1gb | pv > /mnt/glusterfs/giant-1gb
sleep 5
cat /mnt/glusterfs/giant-1gb | pv > /dev/null

Here are the results I had without the performance translators:

| block-size | write | read |
| 64MB | 64MB/s | 113MB/s |
| 32MB | 64MB/s | 113MB/s |
| 24MB | 65MB/s | 113MB/s |
| 20MB | 64MB/s | 113MB/s |
| 16MB | 64MB/s | 105MB/s |
| 8MB | 62MB/s | 73MB/s |
| 4MB | 62MB/s | 55MB/s |
| 1MB | 49MB/s | 32MB/s |
| 512KB | 54MB/s | 38MB/s |
| 256KB | 51MB/s | 45MB/s |
| 128KB | 48MB/s | 40MB/s |

While there is some noise in the results, I was able to reproduce this range of bandwidth repeatably. Read performance starts to drop somewhere around a block-size of 8-16MB; write performance drops off later, as the writes are held back by the slow disks.

With read-ahead and write-behind (default options) in the client.vol file:

volume read-ahead
  type performance/read-ahead
  subvolumes stripe
end-volume

volume write-behind
  type performance/write-behind
  subvolumes read-ahead
end-volume

I'm getting:

| block-size | write | read |
| 64MB | 108MB/s | 113MB/s |
| 32MB | 106MB/s | 105MB/s |
| 16MB | 77MB/s | 87MB/s |
| 8MB | 89MB/s | 88MB/s |
| 4MB | 74MB/s | 105MB/s |
| 1MB | 63MB/s | 68MB/s |
| 512KB | 67MB/s | 55MB/s |
| 256KB | 61MB/s | 42MB/s |
| 128KB | 48MB/s | 56MB/s |

Overall performance at the smaller block-sizes is better, but it appears noisier. The test results weren't as repeatable as before; I'm guessing the caches are not being used consistently.

Is this expected behavior?

--------------------------------------------------------------------------------
Sat 02 May 2009 12:57:04 AM GMT, comment #1 by Erick Tryzelaar <erickt>:

I did find something interesting with the read-ahead page-count setting. If I set it to 1, read performance stabilizes at almost exactly 90MB/s. I also only need a write-behind cache of 4MB to saturate my link in most cases (a sketch of these options follows the tables below):

| block-size | write | read |
| 64MB | 113MB/s | 90MB/s |
| 32MB | 113MB/s | 90MB/s |
| 4MB | 113MB/s | 90MB/s |
| 256KB | 113MB/s | 90MB/s |
| 128KB | 113MB/s | 90MB/s |

This is very repeatable; every copy is within +-2MB/s of these figures. With a page-count of 2:

| block-size | write | read | read variation |
| 256MB | 113MB/s | 113MB/s | |
| 64MB | 113MB/s | 102MB/s | |
| 32MB | 105MB/s | 105MB/s | |
| 4MB | 113MB/s | 80MB/s | +-10MB/s |
| 256KB | 113MB/s | 50MB/s | +-5MB/s |
| 128KB | 113MB/s | 35MB/s | +-5MB/s |

The results are much noisier, and the page-count affects read performance quite dramatically.
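
For reference, the client-side translator settings described above would look roughly like this in client.vol (the option names page-count and cache-size are assumptions from memory and may differ between GlusterFS releases):

volume read-ahead
  type performance/read-ahead
  # page-count 1 gave the stable ~90MB/s reads in the first table
  option page-count 1
  subvolumes stripe
end-volume

volume write-behind
  type performance/write-behind
  # a 4MB write-behind cache was enough to saturate the link here
  option cache-size 4MB
  subvolumes read-ahead
end-volume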

Comment 1 Amar Tumballi 2009-07-13 21:03:18 UTC
I suspect this could have been caused by the bug we had in write-behind (O_RDONLY being 0), which used to disable write-behind for open()'d files. We need to rerun our benchmarks with a newer version to see whether this is still the case.
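
For anyone unfamiliar with that class of bug: O_RDONLY is defined as 0 on Linux, so a test of the form (flags & O_RDONLY) can never be true, and code that tries to detect read-only opens that way silently misbehaves. A minimal illustration of the pitfall (not the actual write-behind code):

/* Not the write-behind code; only shows why O_RDONLY == 0 makes
 * bitwise flag tests misleading. */
#include <fcntl.h>
#include <stdio.h>

static int looks_read_only_buggy(int flags)
{
    return (flags & O_RDONLY) != 0;         /* always false: O_RDONLY is 0 */
}

static int looks_read_only_correct(int flags)
{
    return (flags & O_ACCMODE) == O_RDONLY; /* compare the access-mode bits */
}

int main(void)
{
    int flags = O_RDONLY;
    printf("buggy:   %d\n", looks_read_only_buggy(flags));   /* prints 0 */
    printf("correct: %d\n", looks_read_only_correct(flags)); /* prints 1 */
    return 0;
}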

Comment 2 Amar Tumballi 2009-12-05 17:43:52 UTC
This bug needs benchmarking against the latest stripe code, which can be done after the 3.0.0 release; the current focus of testing is the stability of the new codebase. Removing this bug from the dependency list of bug 762118.

I will work on this after the release.

