As we know, we have always tried to improve small-file performance in glusterfs. After long-duration testing we found that memory allocation is a big area that affects small-file performance significantly in glusterfs. Gluster uses its own thread-based mempool to allocate/deallocate memory blocks. In testing we observed that it performs well compared to the glibc thread-based pool, but it does not perform well compared to the tcmalloc pool, so we decided to move to tcmalloc as the default option instead of using the glibc pool.

I executed a smallfile perf test case for 4.8M files of 64 KB (more than 20 times the usual daily operation) on the latest devel branch. I set up a 12x3 (NVMe) volume and configured 4 event threads on 12 physical machines; no other configurable option was enabled.

Hardware details: 12 physical machines (6 clients, 6 servers); every machine has 64 CPUs, 32 GB RAM, and a 10 GbE NIC.

The smallfile tool was used to run the operations; the total number of files is 4.8M.

date
for i in {1..5}
do
  /root/cleanup.sh; ./smallfile_cli.py --operation create --threads 16 --file-size 64 --files 50000 --top /mnt/test --host-set client01.perf.cloud,client02.perf.cloud,client03.perf.cloud,client04.perf.cloud,client05.perf.cloud,client06.perf.cloud
  /root/cleanup.sh; ./smallfile_cli.py --operation ls-l --threads 16 --file-size 64 --files 50000 --top /mnt/test --host-set client01.perf.cloud,client02.perf.cloud,client03.perf.cloud,client04.perf.cloud,client05.perf.cloud,client06.perf.cloud
  /root/cleanup.sh; ./smallfile_cli.py --operation chmod --threads 16 --file-size 64 --files 50000 --top /mnt/test --host-set client01.perf.cloud,client02.perf.cloud,client03.perf.cloud,client04.perf.cloud,client05.perf.cloud,client06.perf.cloud
  /root/cleanup.sh; ./smallfile_cli.py --operation stat --threads 16 --file-size 64 --files 50000 --top /mnt/test --host-set client01.perf.cloud,client02.perf.cloud,client03.perf.cloud,client04.perf.cloud,client05.perf.cloud,client06.perf.cloud
  /root/cleanup.sh; ./smallfile_cli.py --operation read --threads 16 --file-size 64 --files 50000 --top /mnt/test --host-set client01.perf.cloud,client02.perf.cloud,client03.perf.cloud,client04.perf.cloud,client05.perf.cloud,client06.perf.cloud
  /root/cleanup.sh; ./smallfile_cli.py --operation append --threads 16 --file-size 64 --files 50000 --top /mnt/test --host-set client01.perf.cloud,client02.perf.cloud,client03.perf.cloud,client04.perf.cloud,client05.perf.cloud,client06.perf.cloud
  /root/cleanup.sh; ./smallfile_cli.py --operation mkdir --threads 16 --file-size 64 --files 50000 --top /mnt/test --host-set client01.perf.cloud,client02.perf.cloud,client03.perf.cloud,client04.perf.cloud,client05.perf.cloud,client06.perf.cloud
  /root/cleanup.sh; ./smallfile_cli.py --operation rmdir --threads 16 --file-size 64 --files 50000 --top /mnt/test --host-set client01.perf.cloud,client02.perf.cloud,client03.perf.cloud,client04.perf.cloud,client05.perf.cloud,client06.perf.cloud
  /root/cleanup.sh; ./smallfile_cli.py --operation cleanup --threads 16 --file-size 64 --files 50000 --top /mnt/test --host-set client01.perf.cloud,client02.perf.cloud,client03.perf.cloud,client04.perf.cloud,client05.perf.cloud,client06.perf.cloud
done
date

We got a significant performance improvement; the data is available at the upstream link https://github.com/gluster/glusterfs/issues/2771.
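For anyone reproducing the comparison, here is a minimal sketch of how to confirm that a glusterfs process is actually running on top of tcmalloc. This is an illustration, not part of the runs above; the library path, soname, and binary location are assumptions for a typical RHEL-like install:

# Check whether a running brick process has libtcmalloc mapped.
pid=$(pidof glusterfsd | awk '{print $1}')
grep -m1 -o 'libtcmalloc[^ ]*' /proc/${pid}/maps || echo "tcmalloc not loaded"

# The same can be checked against the installed binary:
ldd /usr/sbin/glusterfsd | grep tcmalloc

# For a quick A/B trial without rebuilding, tcmalloc can also be preloaded
# (library path is an assumption):
# LD_PRELOAD=/usr/lib64/libtcmalloc.so.4 glusterfsd ...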
Thanks Mohit for working on this patch, and thanks Xavi for the suggestions for improvement.

This patch allows glusterfs to utilize 'tcmalloc', moving away from the traditional 'mempool' that glusterfs has used for a long time. This brings in a new underlying feature that is transparent to customers, but it is a significant change in how memory allocation is managed. So this change falls under RFE, as the core memory management/allocation methodology is swapped for a better one optimized for performance, even though no change is exposed at the user level. As per our 'RHGS and Layered Product batch update model - 2.2'[1], we agreed not to include any RFE as part of a maintenance release.

Also, this patch is useful for small-file workloads on glusterfs. Small-file workloads do not suit glusterfs well, and this fact was well advertised to our customers. Providing a solution to this problem is really good, but it will come as an added advantage for existing customers who were running small-file workloads against the general RHGS recommendation. This has a direct customer impact. Irrespective of the workload classification, it will definitely improve performance, but there is still a risk involved with the unknown.

On the other hand, the patch was merged on Sep 15, 2021 and, as far as I checked, hasn't made it into any glusterfs upstream release so far. I see that the next release, glusterfs-9.5, is scoped for Dec 30, 2021, and I believe that is the release which will carry this change to upstream users. The logic here is that we would need a good amount of soak time in the community before we could make a decision about including the patch in RHGS 3.5.z.

With this in mind, I would like to retarget this bug for RHGS 3.5.8 and revisit the verdict of including it in RHGS 3.5.z with that information.

@Mohit, @Sunil - What do you suggest?

[1] - https://docs.google.com/document/d/1KvdyoI8-BNJJuADBkN0OVTW4sMZ4dX6LqxUXPJtK8ME/edit
Same response as the iobuf patch, although this one is even less intrusive and could be useful by itself.
After a conversation with Mohit, I understood that 'tcmalloc' requires the 'gperftools' package. This particular package is not available in the RHGS-specific repos on either RHEL 7 or RHEL 8.

On RHEL 7, the RHEL 7 Server repo contains the 'gperftools' package, but as far as I checked, on RHEL 8 the package is not available in BaseOS or AppStream.

If tcmalloc is used, glusterfs will have a hard dependency on this new package, gperftools. So the glusterfs package should have a hard dependency on the 'gperftools' package on both RHEL 7 and RHEL 8.

On the other hand, as per the RHGS batch update model[1], we agreed not to include new packages.

@Sunil, what are your thoughts?

[1] - https://docs.google.com/document/d/1KvdyoI8-BNJJuADBkN0OVTW4sMZ4dX6LqxUXPJtK8ME/edit
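For reference, a rough sketch of how the dependency could be checked and expressed follows. The package names gperftools-libs / gperftools-devel are assumptions based on the usual Fedora/EPEL packaging split; the actual spec change may differ:

# Check whether the runtime library is available in the enabled RHEL 8 repos:
dnf repoquery --available 'gperftools*'
dnf info gperftools-libs

# In the glusterfs spec file the hard dependency would roughly look like
# (illustrative only, assumed subpackage names):
#   BuildRequires: gperftools-devel
#   Requires:      gperftools-libs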
(In reply to SATHEESARAN from comment #4)
> After conversation with Mohit, I understood that 'tcmalloc' requires
> 'gperftools' package.
> This particular package is not available in RHGS specific repos both in RHEL
> 7 and RHEL 8.
>
> In the case of RHEL 7, RHEL 7 server repo contains this package 'gperftools'
> but as I
> checked in RHEL 8 this package is not available with baseos or appstream.
>
> In the case of using tcmalloc, glusterfs should have a hard dependency of
> this new package - gperftools.
>
> So glusterfs package should have a hard dependency on 'gperftools' package
> both in RHEL 7 and RHEL 8.
>
> On the other hand, as per RHGS Batch update model[1] understanding we agreed
> not to include new packages.

It's not a new package; it has been used for years by the Ceph team.

>
> @Sunil, What are your thoughts ?
>
>
> [1] -
> https://docs.google.com/document/d/1KvdyoI8-
> BNJJuADBkN0OVTW4sMZ4dX6LqxUXPJtK8ME/edit
(In reply to SATHEESARAN from comment #4)
> After conversation with Mohit, I understood that 'tcmalloc' requires
> 'gperftools' package.
> This particular package is not available in RHGS specific repos both in RHEL
> 7 and RHEL 8.
>
> In the case of RHEL 7, RHEL 7 server repo contains this package 'gperftools'
> but as I
> checked in RHEL 8 this package is not available with baseos or appstream.
>
> In the case of using tcmalloc, glusterfs should have a hard dependency of
> this new package - gperftools.
>
> So glusterfs package should have a hard dependency on 'gperftools' package
> both in RHEL 7 and RHEL 8.

Yes, we need to include this new package. This was brought up during the program call, and ticket CLOUDBLD-8110 has been raised to track it.

>
> On the other hand, as per RHGS Batch update model[1] understanding we agreed
> not to include new packages.
>
> @Sunil, What are your thoughts ?
>
>
> [1] -
> https://docs.google.com/document/d/1KvdyoI8-
> BNJJuADBkN0OVTW4sMZ4dX6LqxUXPJtK8ME/edit
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (glusterfs bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:4840