Bug 2059206 - NFS Ganesha slow small-file performance on RHCS 5.0
Summary: NFS Ganesha slow small-file performance on RHCS 5.0
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: NFS-Ganesha
Version: 5.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 5.2
Assignee: Frank Filz
QA Contact: Vidushi Mishra
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-02-28 14:00 UTC by mcurrier
Modified: 2022-06-01 15:27 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-01 15:27:33 UTC
Embargoed:


Links:
Red Hat Issue Tracker RHCEPH-3582 (Last Updated: 2022-02-28 14:02:47 UTC)

Description mcurrier 2022-02-28 14:00:07 UTC
Description of problem:

In recent project testing of CephFS with the smallfile workload, we found
that NFS Ganesha performance is very slow compared to native CephFS. See

https://docs.google.com/document/d/1IuOnChnWJxvMtkDHX_tNJCZsFJYFzoS-B1sxWb5EjE4/edit

for details.

NFS achieves 3.75% of CephFS performance for creates
NFS achieves 14.5% of CephFS performance for reads



Version-Release number of selected component (if applicable):

RHCS 5.0
ceph version 16.2.0-146.el8cp pacific (stable)

How reproducible:


Steps to Reproduce:
1. Configure NFS Ganesha with a CephFS export (a rough sketch of the setup follows below).
2. Run the smallfile workload over 4 NFS mount points.
3. See https://docs.google.com/document/d/1bYoYfuemHQOEOElv-PsGZhjh5PoaTZwEG1In47uZGa0/edit# for details.
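
As a rough sketch of steps 1-2, assuming a cephadm-managed Pacific cluster with an existing CephFS file system named "cephfs": the cluster name, pseudo path, mount points, and smallfile parameters below are illustrative placeholders, not the exact values from the test run, and the `ceph nfs` syntax varies somewhat across Pacific point releases.

    # Deploy an NFS Ganesha service and export CephFS through it
    # (early-Pacific syntax; "mynfs", hosts, and paths are placeholders).
    ceph nfs cluster create cephfs mynfs "1 host1"
    ceph nfs export create cephfs cephfs mynfs /cephfs

    # On the client(s), mount the export; the test used 4 NFS mount points.
    mount -t nfs -o vers=4.1 host1:/cephfs /mnt/nfs1

    # Drive the smallfile benchmark against the NFS mount points
    # (thread count, file count, and file size in KB are illustrative).
    python3 smallfile_cli.py --operation create --threads 8 \
        --files 10000 --file-size 4 \
        --top /mnt/nfs1,/mnt/nfs2,/mnt/nfs3,/mnt/nfs4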

Actual results:


Expected results:


Additional info:

Comment 1 Yaniv Kaul 2022-02-28 15:08:32 UTC
Raising severity and priority to High/High, as the numbers are significantly different. I wonder if we have previous-release data to compare against. After all, OSP uses NFS (in both the Manila and Manila CSI use cases) and should have complained already.

Comment 2 Frank Filz 2022-03-01 18:42:48 UTC
Some time ago I did performance testing with Ganesha V3.3, using vdbench with multiple clients. I was unable to do much metadata testing, though vdbench does report on creates while setting up the files. It doesn't create many files (really one file per running thread), and they are larger files, but it is still a useful throughput benchmark.

A report I did:

https://docs.google.com/document/d/1hq_3-o8FEYB9ChDTBLbVRMLa7WztOFoknS82v1FyF3Y/edit?usp=sharing

There is a link to more raw data in that report. I saved off ALL the raw data.

From what I saw there, yes, Ganesha's CephFS performance is not great, but it was better than 50% of Ganesha's FSAL_VFS throughput on an XFS file system, and FSAL_VFS in turn beat knfsd's throughput on the same XFS file system.

Looking at the raw data, I see that the create rate and create response time are comparable to FSAL_VFS, but that doesn't represent a huge create load.

We would definitely need to examine this further to see what might be going on here.

Comment 3 Frank Filz 2022-03-04 22:27:40 UTC
I also just realized that with RHCS 5.0 this is Ganesha V2.5, which really is ancient; significant performance enhancements are available in V3.x.
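
For anyone double-checking which Ganesha a given deployment is running, a quick way (assuming the standard nfs-ganesha package name; on RHCS 5 the daemon runs in a cephadm-managed container, so exec into that container first):

    # Query the installed package...
    rpm -q nfs-ganesha
    # ...or ask the daemon itself to print its version.
    ganesha.nfsd -v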

Comment 4 Ben England 2022-03-07 18:42:50 UTC
Thx Frank, that sounds very reasonable. So getting an up-to-date NFS Ganesha sounds like step 1. But that has a big impact on OpenStack (and soon ODF), so we have to check with those teams before just doing it. cc'ing Giulio Fidente (OpenStack) and Michael Adam (ODF); they can add whoever else would be interested in this.

Comment 8 Ben England 2022-03-08 14:41:11 UTC
Also added the word "small-file" to the title so that it is clearer what this bug is about.

