Samba can provide much better I/O performance when async I/O is used (enabled by default in RHGS) *and* Samba's Gluster VFS module is converted to the internal pthreadpool framework. Currently the module does not use the pthreadpool framework, and as a result it sometimes performs even worse than Samba exporting a FUSE mount.
We have patches in place that were successfully tested in a high-performance customer setup. They will be submitted upstream very soon. After that, we need to make these patches available to all customers, because their positive performance impact is very significant.
The need for this change was discovered while working on a Gluster FUSE bug at the same customer; see bug #1715427 for further details.
Created attachment 1607030
pthreadpool and profiling patch
Observations from the tests run at the customer site:
Throughput on a Gluster volume exported through a FUSE mount is better than on a Gluster volume exported through VFS (gfapi) only when AIO is enabled.
The AIO (asynchronous I/O) implementation in the VFS gluster module differs from the one in the VFS default module. In VFS gluster, we use the async APIs provided by libgfapi, and the callback of an async API call is executed in another thread. Since Samba is largely single-threaded, executing the async callback in another thread resulted in crashes. We therefore had to register an additional event to notify the main thread of the async callback, as a workaround for the callback running in another thread. This reduced performance, because it delays the notification of AIO completion (two event loop iterations are needed to mark the completion). The solution was to change the AIO implementation to use the pthreadpool rather than the async APIs provided by libgfapi.
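The thread pool pattern described above can be sketched roughly as follows. This is an illustration in Python only, not Samba's pthreadpool code and not the attached patch: `blocking_pread` and `run_reads` are hypothetical names, and `os.pread` (Unix only) stands in for the synchronous read a pool worker would issue. The point it tries to show is that workers perform the blocking I/O, while all completion handling stays on the single calling thread.

```python
import os
import queue
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


def blocking_pread(fd, length, offset):
    # Hypothetical stand-in for a synchronous read issued by a pool worker.
    return os.pread(fd, length, offset)


def run_reads(path, requests, workers=4):
    """Run (offset, length) reads in a pool; finish each on the calling thread."""
    done = queue.Queue()   # hands finished jobs from workers to the caller
    completed = {}         # offset -> (data, id of thread that completed it)
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for offset, length in requests:
                fut = pool.submit(blocking_pread, fd, length, offset)
                # May run on a worker thread, so it only enqueues the future.
                fut.add_done_callback(lambda f, off=offset: done.put((off, f)))
            for _ in requests:
                off, fut = done.get()
                # Completion logic runs here, on the caller's thread only.
                completed[off] = (fut.result(), threading.get_ident())
    finally:
        os.close(fd)
    return completed


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0123456789abcdef")
    try:
        results = run_reads(tmp.name, [(0, 4), (4, 4), (10, 6)])
        for off, (data, _tid) in sorted(results.items()):
            print(off, data)
    finally:
        os.unlink(tmp.name)
```

Because the done-callback does nothing except enqueue the finished future, no request state is touched outside the calling thread, which is the property the single-threaded server design relies on.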
The customer installed the hotfix and reported that the implementation was successful:
This week we have been installing the hotfix as delivered for the performance issue when using VFS (which removes the need for the FUSE mount / SMB export).
Implementation of that update has been successful.
The only issue encountered was that cvltgelgln01 still had the test fix installed, which made installing the hotfix impossible (it reported nothing to do).
Rolling back the test fix resolved this, and that node was then updated successfully as well.
We have removed all additional logging settings as well as all additionally created shares and volumes, so this system is now production ready.
I performed a number of performance tests yesterday and today.
The results are attached.
You can see the expected improvement on both the local and the CTDB IP address; these tests were even better than the ones performed with the test fix. This confirms that the hotfix works for the issue at hand.
The only exception seems to be writes when using the hostname used for round-robin DNS. I consider this a result of more load from Commvault on the system at the time of the test (morning vs. afternoon for all other tests). It is still a slight improvement over the earlier FUSE mount tests.
Since we now really need the storage in our backup environment, we will be moving production backup load to these systems in the next days/weeks, so unfortunately we won't be able to perform any tests on the system that involve installing additional software packages, or interfering with the production status of the system. So we won't be able to reproduce the results and logging via the script delivered.
Customer has a query:
A final question: since this hotfix has to be installed on top of a specific version of Gluster, and a newer version is or might be available when we add new nodes to the cluster, our idea is to downgrade Gluster after the initial install (we use gdeploy for this and, to our knowledge, it installs the most recent version).
Is this indeed a proper way to add nodes (until a permanent fix is available and we have upgraded to a version with that fix)?
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.