Bug 840659

Summary: Performance is considerably slower than running on local disk and SAN
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Patrick Brennan <pbrennan>
Component: glusterfs
Assignee: Amar Tumballi <amarts>
Status: CLOSED WONTFIX
QA Contact: Sudhir D <sdharane>
Severity: high
Docs Contact:
Priority: unspecified
Version: 2.0
CC: amarts, bengland, gluster-bugs, kbarfiel, ssaha, vraman
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-08-24 02:06:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments: output from sosreport and log files from /var/log/glusterfs (flags: none)

Description Patrick Brennan 2012-07-16 20:28:43 UTC
Created attachment 598522 [details]
output from sosreport and log files from /var/log/glusterfs

Running the tests, we obtained the following results:

Here are the results of the various application-oriented tests that we ran over the last couple of days. "san" refers to our current GFS2 SAN setup. "scratch" refers to local disks on the machines. "san-san" means that the input was on "san" and the output was also generated directly on "san". We tested the six configurations below, since those made the most sense. All numbers are in minutes. Although the goal was to run 5 iterations for all tests, the limited time available didn't allow us to do that; however, we don't think the numbers would have been significantly different, based on the trends that we saw.

"gphysical" is RHS 2.0 running on physical systems. "gvirtual" is RHS running on RHEL.

Here are the descriptions of the various tests:

(1) Speech: Extract features after doing speech recognition on 100 audio files. Run as parallelized jobs on the Oracle Grid Engine.

(2) Parsing: Run the Stanford parser on 800 essays. Run as parallelized jobs on the Oracle Grid Engine.

(3) LM [on grid]: Build a statistical language model from 7 million English sentences. Run as parallelized jobs on the Oracle Grid Engine.

(4) LM [off grid]: Same as LM [on grid], except it does not use the Oracle Grid Engine but runs on a single core on a single machine.

(5) SVN checkout: Check out an 80,000-file repository on the various file systems.
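For reference, the kind of timing harness behind numbers like these can be sketched in POSIX shell. This is an illustrative assumption, not the harness actually used in the tests; the repository URL and mount point in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical sketch of a benchmark driver: run a command N times,
# record wall-clock seconds per run, and report mean and std dev via awk.

run_trials() {
    n=$1; shift
    times=""
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1        # run one iteration of the workload
        end=$(date +%s)
        times="$times $((end - start))"
        i=$((i + 1))
    done
    # Compute mean and (population) standard deviation of the samples.
    echo "$times" | awk '{
        for (j = 1; j <= NF; j++) { sum += $j; sumsq += $j * $j }
        mean = sum / NF
        printf "mean=%.1f stddev=%.1f\n", mean, sqrt(sumsq / NF - mean * mean)
    }'
}

# Example (placeholder URL and mount point): 2 iterations of an SVN checkout.
# run_trials 2 svn checkout file:///repos/project /mnt/gluster/wc
```

The same wrapper works for any of the workloads above; only the command being timed changes.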

 

(All numbers are in minutes.)


(1) Speech (5 iterations)

                    Mean    Std. Dev.
san-san             74.5    8.7
san-scratch         67.5    5.2
scratch-san         72.7    4.7
scratch-scratch     67.7    4.3
gphysical-gphysical 92.8    7.6
gvirtual-gvirtual   92.2    2.8

(2) Parsing (3 iterations)

                    Mean    Std. Dev.
san-san             126.7   8.3
san-scratch         128.3   15.3
scratch-san         122.3   9.0
scratch-scratch     131.6   19.4
gphysical-gphysical 128.6   3.6
gvirtual-gvirtual   132.6   8.6

(3) LM [on grid] (3 iterations)

                    Mean    Std. Dev.
san-san             20.2    0.5
san-scratch         20.2    0.6
scratch-san         19.5    0.2
scratch-scratch     19.2    0.4
gphysical-gphysical 21.1    0.2
gvirtual-gvirtual   21.6    0.6

(4) LM [off grid] (3 iterations)

                    Mean    Std. Dev.
san-san             116.6   0.7
san-scratch         117.2   1.1
scratch-san         116.5   2.0
scratch-scratch     113.9   4.0
gphysical-gphysical 112.1   1.2
gvirtual-gvirtual   112.7   3.1

(5) SVN checkout (2 iterations)

           Mean    Std. Dev.
scratch    0.8     0.2
san        55.5    4.9
gphysical  66.0    7.1
gvirtual   65.5    2.1

Comment 2 Amar Tumballi 2012-08-24 02:06:01 UTC
Hi Patrick,

Thanks for taking the effort to run some comparisons against local disk and SAN storage. Normally, when comparing GlusterFS with another filesystem, you should consider other distributed filesystems, or distributed solutions that provide access through NFS, i.e. NAS (such as Ceph, Isilon, ...).

A local-disk comparison with GlusterFS is ruled out right away, as the use cases are totally different.

With SAN (in this case GFS2), you can say it is network-based, but you won't have multiple clients accessing the SAN drive; if that is needed, you have to export it through NFS, which can incur a further performance hit.

I am closing this bug as WONTFIX, as this comparison is not truly apples to apples with respect to GlusterFS.