Red Hat Bugzilla – Bug 840659
Performance is considerably slower than running on local disk and SAN
Last modified: 2013-12-18 19:08:25 EST
Created attachment 598522 [details]
output from sosreport and log files from /var/log/glusterfs
Running the tests, we obtained the following results:
Here are the results of the various application-oriented tests that we ran over the last couple of days. "san" refers to our current GFS2 SAN setup. "scratch" refers to local disks on machines. "san-san" means that the input was on "san" and the output was also generated directly on "san". We tested the six configurations below since those made the most sense. All numbers are in minutes. Although the goal was to run 5 iterations for all tests, the limited time available didn't allow us to do that. However, we don't think the numbers would have been significantly different, based on the trends that we saw.
"gphysical" is RHS 2.0 running on physical systems. "gvirtual" is RHS running on RHEL.
Here are the descriptions of the various tests:
(1) Speech: Extract features after doing speech recognition on 100 audio files. Run as parallelized jobs on the Oracle Grid Engine.
(2) Parsing: Run the Stanford parser on 800 essays. Run as parallelized jobs on the Oracle Grid Engine.
(3) LM [on grid]: Build a statistical language model from 7 million English sentences. Run as parallelized jobs on the Oracle Grid Engine.
(4) LM [off grid]: Same as LM [on grid], except it does not use the Oracle Grid Engine but runs on a single core on a single machine.
(5) SVN checkout: Check out an 80,000-file repository on the various file systems.
(All numbers are in minutes)
(1) Speech (5 iterations)
                      Mean   Std. Dev.
san-san               74.5   8.7
san-scratch           67.5   5.2
scratch-san           72.7   4.7
scratch-scratch       67.7   4.3
gphysical-gphysical   92.8   7.6
gvirtual-gvirtual     92.2   2.8
(2) Parsing (3 iterations)
                      Mean    Std. Dev.
san-san               126.7   8.3
san-scratch           128.3   15.3
scratch-san           122.3   9.0
scratch-scratch       131.6   19.4
gphysical-gphysical   128.6   3.6
gvirtual-gvirtual     132.6   8.6
(3) LM [on grid] (3 iterations)
                      Mean   Std. Dev.
san-san               20.2   0.5
san-scratch           20.2   0.6
scratch-san           19.5   0.2
scratch-scratch       19.2   0.4
gphysical-gphysical   21.1   0.2
gvirtual-gvirtual     21.6   0.6
(4) LM [off grid] (3 iterations)
                      Mean    Std. Dev.
san-san               116.6   0.7
san-scratch           117.2   1.1
scratch-san           116.5   2.0
scratch-scratch       113.9   4.0
gphysical-gphysical   112.1   1.2
gvirtual-gvirtual     112.7   3.1
(5) SVN checkout (2 iterations)
            Mean   Std. Dev.
scratch     0.8    0.2
san         55.5   4.9
gphysical   66.0   7.1
gvirtual    65.5   2.1
Thanks for taking the effort to run some comparisons with local disk and SAN storage. Normally, when comparing GlusterFS with some other filesystem, consider other distributed filesystems (or distributed solutions which provide access through NFS, i.e. NAS), like Ceph/Isilon/...
A local-disk comparison with GlusterFS is ruled out right away, as the use cases are totally different.
With SAN (in this case GFS2), you can say it's network-based, but you won't have multiple clients accessing the SAN drive; if needed, you would have to export it through NFS, which can take a further performance hit.
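As a rough illustration of the extra hop described above, re-exporting a SAN-backed mount over NFS might look like the following /etc/exports entry (the mount path, client name, and options here are assumptions for illustration, not taken from this report):

```
# /etc/exports -- hypothetical re-export of a GFS2 mount over NFS
/mnt/gfs2  client.example.com(rw,sync,no_root_squash)
```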
I will be closing this bug as WONTFIX, as this comparison is not truly apples to apples with respect to GlusterFS.