Bug 1743595

Summary: Increased performance for Samba vfs_glusterfs when using pthreadpool
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Guenther Deschner <gdeschner>
Component: samba
Assignee: Guenther Deschner <gdeschner>
Status: CLOSED ERRATA
QA Contact: Vivek Das <vdas>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.4
CC: amanzane, amukherj, anoopcs, bkunal, dkochuka, mduasope, olim, pgurusid, puebele, rcyriac, rhs-smb, skandark, skourdi, vdas
Target Milestone: ---
Keywords: Performance
Target Release: RHGS 3.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: samba-4.9.8-108.el7rhgs
Doc Type: Enhancement
Doc Text:
Asynchronous I/O operations were impeded by a bottleneck in the workflow at the point of notification of successful completion. The bottleneck has been removed and asynchronous I/O operations now perform better.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-10-30 12:18:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1696809
Attachments: pthreadpool and profiling patch (flags: none)

Description Guenther Deschner 2019-08-20 09:25:52 UTC
Samba can provide much better I/O performance when using async I/O (enabled by default in RHGS) *and* when converted to the internal pthreadpool framework. Currently Samba does not use the pthreadpool framework and thus sometimes performs even worse than Samba exporting a FUSE mount.

We have patches in place, successfully tested in a high-performance customer setup. They will be submitted upstream very soon. We then need to make these patches available to all customers, because their positive performance impact is very significant.

The need for this change was discovered while working on a Gluster FUSE bug at the same customer; see bug #1715427 for further details.

Comment 4 Guenther Deschner 2019-08-22 15:29:46 UTC
Created attachment 1607030 [details]
pthreadpool and profiling patch

Comment 15 Poornima G 2019-08-27 17:36:29 UTC
Observation from the tests run at the customer site:
The throughput on a FUSE-mount-exported Gluster volume is better than on a VFS (gfapi)-exported Gluster volume, but only when AIO is enabled.

RCA:
The AIO (asynchronous I/O) implementation in the VFS gluster module is different from the AIO implementation in the VFS default module. In VFS gluster we use the async APIs provided by libgfapi, and the callback of an async API is executed in another thread. Since Samba is largely single-threaded, executing the async callback in another thread was causing crashes. Hence we had to register another event to notify the main thread of async completion, as a workaround for the callback running in another thread. But this reduced performance, because it delays the notification of AIO completion (two trips through the event loop are needed to mark a request complete). The solution was to change the AIO implementation to use a pthread pool rather than the async APIs provided by libgfapi.
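
To make the single-wakeup benefit concrete, here is a minimal, self-contained sketch of the pthread-pool pattern described above. This is illustrative only and not the actual Samba vfs_glusterfs code: the worker thread stands in for a pthreadpool job performing the blocking glfs_pread(), and a plain pipe stands in for Samba's event-loop signalling, so the completion callback runs on the main thread after one poll() wakeup.

    /*
     * Illustrative sketch only -- not the Samba implementation.
     * A worker thread performs the blocking I/O and signals completion
     * through a pipe; the single-threaded main loop runs the callback
     * after a single poll() wakeup. Compile with: cc -pthread sketch.c
     */
    #include <poll.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    struct aio_job {
            char buf[64];
            ssize_t result;
            int notify_fd;          /* write end of the completion pipe */
    };

    /* Worker: runs the blocking I/O off the main thread. */
    static void *aio_worker(void *arg)
    {
            struct aio_job *job = arg;

            /* Stand-in for the blocking call, e.g. glfs_pread(). */
            strcpy(job->buf, "payload");
            job->result = (ssize_t)strlen(job->buf);

            /* One write == one event-loop wakeup on the main thread. */
            char c = 0;
            write(job->notify_fd, &c, 1);
            return NULL;
    }

    int main(void)
    {
            int pipefd[2];
            pipe(pipefd);

            struct aio_job job = { .notify_fd = pipefd[1] };
            pthread_t tid;
            pthread_create(&tid, NULL, aio_worker, &job);

            /* Main thread: single event loop, as in the Samba main process. */
            struct pollfd pfd = { .fd = pipefd[0], .events = POLLIN };
            poll(&pfd, 1, -1);

            char c;
            read(pipefd[0], &c, 1);  /* consume the notification */

            /* Completion callback runs safely on the main thread. */
            printf("read completed: %zd bytes (%s)\n", job.result, job.buf);

            pthread_join(tid, NULL);
            close(pipefd[0]);
            close(pipefd[1]);
            return 0;
    }

In the old scheme the libgfapi callback thread could only schedule a notifying event, so completion needed a second trip through the event loop; with the pool, the single notification write is enough.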

Comment 39 sameer kandarkar 2019-10-11 10:19:45 UTC
Hello Team,

Customer installed the hotfix and provided an update that implementation of the hotfix was successful.


-----------
This week we have been installing the hotfix as delivered for the performance issue when using VFS (removing the need for the FUSE mount / SMB export).
Implementation of that update has been successful.
The only issue encountered was that cvltgelgln01 had the test fix installed, which made installation of the hotfix impossible (it reported nothing to do).
Rolling back the test fix resolved this, so after that the node was also updated successfully.
We have removed all additional logging settings as well as all additionally created shares and volumes, so the system is now production ready.
I've performed a number of performance tests yesterday and today.
See attached the results of those.
You can see that on both the local and the CTDB IP address we see the expected improvement; this test was even better than the ones performed with the test fix, confirming that the hotfix works for the issue at hand.
The only exception seems to be writes when using the hostname used for round-robin DNS. I consider this a result of more load from Commvault on the system at the time of the test (morning vs. afternoon for all other tests), and it is still a slight improvement over the earlier FUSE mount tests.

Since we now really need the storage in our backup environment, we will be moving production backup load to these systems in the coming days/weeks, so unfortunately we won't be able to perform any tests on the system that involve installing additional software packages or interfering with the production status of the system. So we won't be able to reproduce the results and logging via the script delivered.
-----------


Customer has a query:

A final question: since we need to install this hotfix on top of a specific version of gluster, and a newer version is or might be available when adding new nodes to the cluster, we have the idea that we could downgrade gluster after the initial install (since we use gdeploy for this, it will come with the most recent version, to our knowledge).
Is this indeed a proper way to add nodes (until a permanent fix is available and we have upgraded to a version with that fix)?

Regards,
Sameer

Comment 49 errata-xmlrpc 2019-10-30 12:18:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3253