Bug 840659 - Performance is considerably slower than running on local disk and SAN
Summary: Performance is considerably slower than running on local disk and SAN
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Amar Tumballi
QA Contact: Sudhir D
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-07-16 20:28 UTC by Patrick Brennan
Modified: 2013-12-19 00:08 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-24 02:06:01 UTC
Embargoed:


Attachments
output from sosreport and log files from /var/log/glusterfs (4.53 MB, application/x-gzip)
2012-07-16 20:28 UTC, Patrick Brennan

Description Patrick Brennan 2012-07-16 20:28:43 UTC
Created attachment 598522
output from sosreport and log files from /var/log/glusterfs

Running our tests, we obtained the following results:

Here are the results of the various application oriented tests that we ran over the last couple of days. “san” refers to our current gfs2 san setup. “scratch” refers to local disks on machines. “san-san” means that the input was on “san” and the output was also generated directly on “san”. We tested the six configurations below since those made the most sense. All numbers are in minutes. Although the goal was to run 5 iterations for all tests, the limited time available didn’t allow us to do that. However, we don’t think the numbers would have been significantly different based on the trends that we saw.

"gphysical" is RHS 2.0 running on physical systems. "gvirtual" is RHS running on RHEL.

Here are the descriptions of the various tests:

(1) Speech: Extract features after doing speech recognition on 100 audio files. Run as parallelized jobs on the Oracle Grid Engine.

(2) Parsing: Run the Stanford parser on 800 essays. Run as parallelized jobs on the Oracle Grid Engine.

(3) LM [on grid]: Build a statistical language model from 7 million English sentences. Run as parallelized jobs on the Oracle Grid Engine.

(4) LM [off grid]: Same as LM [on grid], except that it does not use the Oracle Grid Engine but runs on a single core on a single machine.

(5) SVN checkout: Check out an 80,000-file repository on the various file systems.

(All numbers are in minutes)

(1) Speech (5 iterations)

                      Mean    Std. Dev.
san-san               74.5    8.7
san-scratch           67.5    5.2
scratch-san           72.7    4.7
scratch-scratch       67.7    4.3
gphysical-gphysical   92.8    7.6
gvirtual-gvirtual     92.2    2.8

(2) Parsing (3 iterations)

                      Mean    Std. Dev.
san-san               126.7   8.3
san-scratch           128.3   15.3
scratch-san           122.3   9
scratch-scratch       131.6   19.4
gphysical-gphysical   128.6   3.6
gvirtual-gvirtual     132.6   8.6

(3) LM [on grid] (3 iterations)

                      Mean    Std. Dev.
san-san               20.2    0.5
san-scratch           20.2    0.6
scratch-san           19.5    0.2
scratch-scratch       19.2    0.4
gphysical-gphysical   21.1    0.2
gvirtual-gvirtual     21.6    0.6

(4) LM [off grid] (3 iterations)

                      Mean    Std. Dev.
san-san               116.6   0.7
san-scratch           117.2   1.1
scratch-san           116.5   2.0
scratch-scratch       113.9   4.0
gphysical-gphysical   112.1   1.2
gvirtual-gvirtual     112.7   3.1

(5) SVN checkout (2 iterations)

                      Mean    Std. Dev.
scratch               0.8     0.2
san                   55.5    4.9
gphysical             66.0    7.1
gvirtual              65.5    2.1
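
To put the reported means on a common scale, the following is a minimal Python sketch that expresses each Gluster configuration as a percentage slower (or faster, if negative) than the SAN baseline. The numbers are copied from the tables above. The choice of "san-san" as the baseline for tests (1) to (4), the shortened configuration keys, and the helper names `percent_slower` and `slowdown_table` are assumptions made for illustration only. Because only means are compared, the result says nothing about whether a difference is significant given the reported standard deviations and the small iteration counts.

```python
# Mean run times in minutes, copied from the result tables.
# For tests (1)-(4) the "san" entry is the san-san row and the gluster
# entries are the gphysical-gphysical / gvirtual-gvirtual rows (assumption:
# these are the like-for-like rows to compare).
MEANS = {
    "Speech":        {"san": 74.5,  "gphysical": 92.8,  "gvirtual": 92.2},
    "Parsing":       {"san": 126.7, "gphysical": 128.6, "gvirtual": 132.6},
    "LM [on grid]":  {"san": 20.2,  "gphysical": 21.1,  "gvirtual": 21.6},
    "LM [off grid]": {"san": 116.6, "gphysical": 112.1, "gvirtual": 112.7},
    "SVN checkout":  {"san": 55.5,  "gphysical": 66.0,  "gvirtual": 65.5},
}


def percent_slower(candidate, baseline):
    """Return how much slower candidate is than baseline, in percent.

    A negative value means the candidate was faster than the baseline.
    """
    return (candidate - baseline) / baseline * 100.0


def slowdown_table(means=MEANS, baseline="san"):
    """Map each test to {configuration: percent slower than the baseline}."""
    table = {}
    for test, by_config in means.items():
        base = by_config[baseline]
        table[test] = {
            config: round(percent_slower(value, base), 1)
            for config, value in by_config.items()
            if config != baseline
        }
    return table


if __name__ == "__main__":
    for test, row in slowdown_table().items():
        cells = ", ".join(f"{cfg}: {pct:+.1f}%" for cfg, pct in row.items())
        print(f"{test:<14} {cells}")
```

Running the script prints one line per test, which makes it easier to see which workloads show a large gap relative to SAN and which are within a few percent.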

Comment 2 Amar Tumballi 2012-08-24 02:06:01 UTC
Hi Patrick,

Thanks for taking the effort to run some comparisons with local disk and SAN storage. Normally, when you compare GlusterFS with another filesystem, you should consider other distributed filesystems (or distributed solutions that provide access through NFS, i.e. NAS), such as Ceph or Isilon.

A local-disk comparison with GlusterFS is ruled out right away, as the use cases are totally different.

With SAN (in this case gfs2), you can say it is network based, but you won't have multiple clients accessing the SAN drive; if you need that, you have to export it through NFS, which can incur a further performance hit.

I am closing this bug as WONTFIX, as this comparison is not truly apples to apples with respect to GlusterFS.

