Description of problem:

When using a large number of storage domains, acquiring the host id (add_lockspace) in all storage domains takes a lot of time. The issue is worse when using a larger io timeout in sanlock.

Sanlock bug 1902468 adds the max_worker_threads configuration. Based on my tests, 50 workers is a good default, speeding up host activation and deactivation.

Here are some examples using a system with 40 storage domains.

|------------------------|------------|-----------|------------|
| operation              | io_timeout | 8 workers | 50 workers |
|------------------------|------------|-----------|------------|
| activate host          |         10 |       123 |         47 |
|                        |         20 |       224 |         71 |
| deactivate host        |         10 |        16 |          8 |
|                        |         20 |        16 |          5 |
|------------------------|------------|-----------|------------|

Vdsm should configure max_worker_threads in /etc/sanlock/sanlock.conf when configuring sanlock (lib/vdsm/tool/configurators/sanlock.py).

How to test:

Clean shutdown:
1. Add 40 storage domains
2. Deactivate the host in the DC
3. Wait until all lockspaces are removed
4. Activate the host
5. Measure the time until all lockspaces are added

Unclean shutdown:
1. While the host is active, perform a hard poweroff. If the host is a VM, kill the VM.
2. Activate the host
3. Measure the time until all lockspaces are added

When the number of workers is higher than the number of storage domains, activation and deactivation are 2-3 times faster. Here are some measurements, setting the value manually.

To check lockspace status, run this as root:

    watch sanlock client status

When activating a host, you will see all lockspaces with the ADD marker. Wait until no lockspace has the ADD marker.

When deactivating a host, you will see all lockspaces with the REM marker. Wait until the lockspaces are removed.
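The configuration step proposed in the description could be sketched roughly as follows. This is only an illustration, not the actual code in lib/vdsm/tool/configurators/sanlock.py: the helper name set_option is hypothetical, and it assumes sanlock.conf uses plain "name = value" lines with "#" comments. It works on the file content as a string, so it can be tried without touching /etc/sanlock/sanlock.conf.

```python
import re


def set_option(conf_text, name, value):
    """Return conf_text with "name = value" set.

    Hypothetical helper for illustration only. An existing uncommented
    assignment is replaced in place; otherwise a new line is appended.
    Commented-out lines ("# name = ...") are left untouched.
    """
    line = f"{name} = {value}"
    # Match only uncommented assignments; [ \t]* avoids swallowing
    # preceding blank lines, which \s* would do in MULTILINE mode.
    pattern = re.compile(
        r"^[ \t]*" + re.escape(name) + r"[ \t]*=.*$", re.MULTILINE)
    if pattern.search(conf_text):
        return pattern.sub(lambda m: line, conf_text)
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + line + "\n"


# Example: raise the worker count in a sample configuration.
sample = "# sanlock configuration\nmax_worker_threads = 8\n"
print(set_option(sample, "max_worker_threads", 50))
```

A real configurator would also need to read and write the file atomically and decide what to do when an administrator has already set a different value; neither is covered here.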
Sanlock should provide the new configuration in RHEL 8.3.z.
This is a rather simple change with a significant performance improvement. I'm not sure when the required sanlock version will be available, but it is likely to be available for 4.4.5.
Tested on:

Engine : https://rhev-red-03.rdu2.scalelab.redhat.com
40 storage domains (iSCSI connectivity)

Host : F02-h25-000-r620.rdu2.scalelab.redhat.com
Version :
vdsm-4.40.40-1.el8ev.x86_64
Rhv-release-4.4.4-7-001.noarch
sanlock-3.8.2-1.el8.x86_64
max_worker_threads = 50

+---------------------------------+-----------------+-----------+
| F02-h25-000-r620 - vdsm-4.40.40 |                 | Duration  |
+---------------------------------+-----------------+-----------+
| Scenario                        |                 |           |
| Clean shutdown                  | deactivate host | 17.76 s   |
|                                 | activate host   | 139 s     |
| Unclean shutdown                | activate host   | 140 s     |
+---------------------------------+-----------------+-----------+

Host : F01-h08-000-1029u.rdu2.scalelab.redhat.com
Version :
vdsm-4.40.37-1.el8ev.x86_64
Rhv-release-4.4.4-2-001.noarch
sanlock-3.8.2-1.el8.x86_64

+----------------------------------+-----------------+-----------+
| F01-h08-000-1029u - vdsm-4.40.37 |                 | Duration  |
+----------------------------------+-----------------+-----------+
| Scenario                         |                 |           |
| Clean shutdown                   | deactivate host | 19.51 s   |
|                                  | activate host   | 136 s     |
| Unclean shutdown                 | activate host   | 975 s     |
+----------------------------------+-----------------+-----------+

P.S. Once we get the official sanlock build, we will be able to test it.
Hey Nir, please provide the correct sanlock installation procedure:

[root@f01-h08-000-1029u ~]# yum localinstall sanlock-lib-3.8.2-2.el8.x86_64.rpm
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Last metadata expiration check: 2:01:48 ago on Sun 10 Jan 2021 11:13:48 AM UTC.
Error:
 Problem: problem with installed package python3-sanlock-3.8.2-1.el8.x86_64
  - package python3-sanlock-3.8.2-1.el8.x86_64 requires sanlock-lib = 3.8.2-1.el8, but none of the providers can be installed
  - sanlock-lib-3.8.2-1.el8.i686 has inferior architecture
  - cannot install both sanlock-lib-3.8.2-2.el8.x86_64 and sanlock-lib-3.8.2-1.el8.x86_64
  - conflicting requests
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

[root@f01-h08-000-1029u ~]# yum install http://download.eng.bos.redhat.com/brewroot/vol/rhel-8/packages/sanlock/3.8.2/2.el8/x86_64/sanlock-3.8.2-2.el8.x86_64.rpm
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Last metadata expiration check: 2:01:55 ago on Sun 10 Jan 2021 11:13:48 AM UTC.
sanlock-3.8.2-2.el8.x86_64.rpm                906 kB/s | 154 kB     00:00
Error:
 Problem: conflicting requests
  - nothing provides sanlock-lib = 3.8.2-2.el8 needed by sanlock-3.8.2-2.el8.x86_64
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

[root@f01-h08-000-1029u ~]# rpm -e sanlock-3.8.2-1.el8.x86_64
error: Failed dependencies:
        sanlock >= 2.4 is needed by (installed) libvirt-lock-sanlock-6.6.0-7.module+el8.3.0+8424+5ea525c5.x86_64
        sanlock >= 3.7.3 is needed by (installed) ovirt-hosted-engine-ha-2.4.5-1.el8ev.noarch
        sanlock >= 2.8 is needed by (installed) ovirt-hosted-engine-setup-2.4.8-1.el8ev.noarch

Currently, all of the above methods fail on package conflicts or dependency errors.
(In reply to Tzahi Ashkenazi from comment #5)

Try this:

    cd /etc/yum.repos.d
    wget http://brew-task-repos.usersys.redhat.com/repos/official/sanlock/3.8.2/2.el8/sanlock-3.8.2-2.el8.repo
    dnf upgrade

Note: you must put the host into maintenance before upgrading sanlock.
Tested on:

Engine : https://rhev-red-03.rdu2.scalelab.redhat.com
40 storage domains (iSCSI connectivity)

Host : F02-h25-000-r620.rdu2.scalelab.redhat.com
Version :
vdsm-4.40.40-1.el8ev.x86_64
Rhv-release-4.4.4-7-001.noarch
max_worker_threads = 50

Comparison between sanlock 3.8.2-1 and sanlock 3.8.2-2:

+----------------------------------+-----------------+-----------------+-----------------+
| F02-h25-000-r620 - VDSM-4.40.40  |                 | Duration                          |
+----------------------------------+-----------------+-----------------+-----------------+
| Scenario                         |                 | sanlock 3.8.2-1 | sanlock 3.8.2-2 |
| Clean shutdown                   | deactivate host | 17.76 s         | 6.56 s          |
|                                  | activate host   | 139 s           | 56.74 s         |
| Unclean shutdown                 | activate host   | 140 s           | 70.72 s         |
+----------------------------------+-----------------+-----------------+-----------------+
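For reference, the speedup implied by the comparison above can be computed directly from the reported durations. The snippet below is only a small arithmetic sketch; the durations are copied from the table, while the dictionary layout and the speedup helper are illustrative names, not part of any tool.

```python
# Durations in seconds, copied from the comparison table:
# (sanlock 3.8.2-1, sanlock 3.8.2-2)
measurements = {
    "clean shutdown, deactivate host": (17.76, 6.56),
    "clean shutdown, activate host": (139.0, 56.74),
    "unclean shutdown, activate host": (140.0, 70.72),
}


def speedup(before, after):
    """Return how many times faster `after` is compared to `before`."""
    return before / after


for scenario, (old, new) in measurements.items():
    print(f"{scenario}: {speedup(old, new):.2f}x faster")
```

With these numbers the ratios come out at roughly 2.7x, 2.4x and 2.0x, in line with the 2-3 times improvement mentioned in the description.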
This bugzilla is included in oVirt 4.4.4 release, published on December 21st 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.4 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.