Bug 2055905
Summary: | Virtual Machine with large number of Direct LUNs failed to migrate without adjusting vars/migration_listener_timeout | |
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Germano Veit Michel <gveitmic>
Component: | vdsm | Assignee: | Milan Zamazal <mzamazal>
Status: | CLOSED ERRATA | QA Contact: | Qin Yuan <qiyuan>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 4.4.9 | CC: | aefrat, ahadas, apinnick, emarcus, lsurette, mavital, michal.skrivanek, mzamazal, sfishbai, srevivo, ycui
Target Milestone: | ovirt-4.5.1 | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | vdsm-4.50.1.1 | Doc Type: | Bug Fix
Doc Text: | The default migration timeout period was increased to enable VMs with many direct LUN disks, which require more preparation time on the destination host, to be migrated. The migration_listener_prepare_disk_timeout and max_migration_listener_timeout VDSM options were added to enable the default migration timeout period to be extended when needed. | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2022-07-14 12:41:13 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
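The two options named in the Doc Text live under VDSM's vars configuration section. As a minimal sketch of how they might be tuned on the destination host, assuming a standard /etc/vdsm/vdsm.conf.d/ drop-in (the file name and values below are illustrative only, and vdsmd needs a restart to pick them up):

```ini
# Hypothetical drop-in: /etc/vdsm/vdsm.conf.d/99-migration-timeouts.conf
# (illustrative values; restart vdsmd on the destination host afterwards)
[vars]
# extra preparation time granted per direct LUN disk, in seconds
migration_listener_prepare_disk_timeout = 30
# upper bound on the total preparation timeout, in seconds
max_migration_listener_timeout = 800
```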
Description Germano Veit Michel 2022-02-17 21:58:45 UTC
moving to virt for considering better timeouts. e.g. do we have the number of drives at hand?

Germano, could you please attach the relevant parts from the logs here?

(In reply to Arik from comment #2)
> Germano, could you please attach the relevant parts from the logs here?

Of course. Migration of this VM 8bd921f9-21d1-4945-8088-1adb37abf02e starting at 2022-02-09 08:37:36,721-0500

Also, would be nice to have this out of debug level and into error, so the problem is clearer from vdsm logs without debug.

6018 self.log.debug('Timeout while waiting for path preparation')

As discussed with Nir, the issue is indeed likely to be caused by many udev reloads, which are quite slow with so many devices on the host. If this is the case then the proper solution would be not to reload udev configuration after each rule change but only once all the rules are changed. Then timeout adjustments wouldn't be needed. It would probably require a non-trivial amount of work though.

(In reply to Milan Zamazal from comment #7)
> As discussed with Nir, the issue is indeed likely to be caused by many udev
> reloads, which are quite slow with so many devices on the host. If this is
> the case then the proper solution would be not to reload udev configuration
> after each rule change but only once all the rules are changed. Then timeout
> adjustments wouldn't be needed. It would probably require a non-trivial
> amount of work though.

Yes, if each LUN took a lot less time (around 80% faster) it would complete in the 30s. The _loadCorrectedTimeout() is still a bit off though, the number of udev devices and LUNs on the VM have greater influence than the "load" (number of VMs) on the host.

(In reply to Milan Zamazal from comment #7)
> As discussed with Nir, the issue is indeed likely to be caused by many udev
> reloads, which are quite slow with so many devices on the host. If this is
> the case then the proper solution would be not to reload udev configuration
> after each rule change but only once all the rules are changed. Then timeout
> adjustments wouldn't be needed. It would probably require a non-trivial
> amount of work though.

OTOH it's not difficult to improve the _loadCorrectedTimeout to account for number of drives. Just add 2s/disk in waitForMigrationDestinationPrepare and we have a great improvement for large VMs, while for most common cases this won't make any difference. Perhaps with some upper bound.

It's not difficult to extend the timeout by some value per disk. The difficult part is what the value should be. 1, 2, 3 seconds, all is equally good or bad. And since it depends on the destination host, we cannot measure easily how long it takes to reload udev rules. But if you think 2 seconds is a good value (and obviously better than 0) then yes, we can lengthen the timeout this way.

(In reply to Milan Zamazal from comment #11)
> It's not difficult to extend the timeout by some value per disk. The
> difficult part is what the value should be. 1, 2, 3 seconds, all is equally
> good or bad. And since it depends on the destination host, we cannot measure
> easily how long it takes to reload udev rules. But if you think 2 seconds is
> a good value (and obviously better than 0) then yes, we can lengthen the
> timeout this way.

when they're equally good or bad then just pick one;) With some upper bound, say 5 minutes.

(In reply to Michal Skrivanek from comment #12)
> when they're equally good or bad then just pick one;) With some upper bound,
> say 5 minutes.

OK, will do.
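The approach the thread converges on, extending the destination-side preparation timeout by a fixed amount per disk with an upper bound, can be summarized with a small sketch. The function name and defaults below are illustrative only (the 2 seconds per disk and the upper bound come from the discussion above), not vdsm's actual implementation:

```python
def prepare_timeout(base_timeout, num_disks, per_disk=2, max_timeout=600):
    """Illustrative sketch: grow the path-preparation timeout with the
    number of disks, but never beyond an upper bound."""
    return min(base_timeout + num_disks * per_disk, max_timeout)

# e.g. a VM with 100 direct LUNs and a 36 s base timeout would get
# min(36 + 100 * 2, 600) = 236 seconds instead of the plain 36.
```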
Verified with: vdsm-4.50.1.1-1.el8ev.x86_64

Steps:
1. Migrate a VM without direct LUNs from host1 to host2, and check the preparation timeout in vdsm.log on host2.
2. Attach 20 direct LUNs to the VM, migrate it from host1 to host2, and check vdsm.log on host2 to see whether the preparation timeout = (timeout in step 1 + 20*2) s.
3. Set vars/migration_listener_prepare_disk_timeout to 30 on host2, migrate the VM from host1 to host2, and check vdsm.log on host2 to see whether the preparation timeout = max_migration_listener_timeout = 600 s.
4. Set vars/max_migration_listener_timeout to 800 on host2, migrate the VM from host1 to host2, and check vdsm.log on host2 to see whether the preparation timeout = (timeout in step 1 + 20*30) s.

Results:
1. Migrating the VM without direct LUNs succeeded; the preparation timeout is 36.0 s:
2022-06-14 14:16:09,015+0300 DEBUG (jsonrpc/5) [virt.vm] (vmId='a3c4c8c2-cb63-4451-8a13-c1f0277815eb') migration destination: waiting 36.0s for path preparation (vm:5805)
2. Migrating the VM with 20 direct LUNs attached succeeded; the preparation timeout is 76.0 s, equal to (36.0 + 20*2) s:
2022-06-14 14:56:56,398+0300 DEBUG (jsonrpc/2) [virt.vm] (vmId='a3c4c8c2-cb63-4451-8a13-c1f0277815eb') migration destination: waiting 76.0s for path preparation (vm:5805)
3. Migrating the VM with 20 direct LUNs attached and vars/migration_listener_prepare_disk_timeout=30 succeeded; the preparation timeout is 600 s, which is the default max_migration_listener_timeout:
2022-06-14 15:08:41,154+0300 DEBUG (jsonrpc/5) [virt.vm] (vmId='a3c4c8c2-cb63-4451-8a13-c1f0277815eb') migration destination: waiting 600s for path preparation (vm:5805)
4. Migrating the VM with 20 direct LUNs attached, vars/migration_listener_prepare_disk_timeout=30, and vars/max_migration_listener_timeout=800 succeeded; the preparation timeout is 636.0 s, equal to (36.0 + 20*30) s:
2022-06-14 15:18:06,290+0300 DEBUG (jsonrpc/0) [virt.vm] (vmId='a3c4c8c2-cb63-4451-8a13-c1f0277815eb') migration destination: waiting 636.0s for path preparation (vm:5805)

According to these results, the bug fix works as expected: when migrating a VM with direct LUNs attached, the preparation timeout is extended by (number of direct LUNs * migration_listener_prepare_disk_timeout) seconds and does not exceed max_migration_listener_timeout.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHV RHEL Host (ovirt-host) [ovirt-4.5.1] update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5583

Due to QE capacity, we are not going to cover this issue in our automation.
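As a cross-check of the arithmetic in the results above, a short sketch (the helper below is hypothetical and not part of vdsm) reproduces the four observed preparation timeouts:

```python
def expected_timeout(base, num_luns, per_disk, max_timeout):
    # Same capped formula as verified above: base timeout plus a per-disk
    # extension, never exceeding the configured maximum (illustrative only).
    return min(base + num_luns * per_disk, max_timeout)

print(expected_timeout(36.0, 0, 2, 600))    # step 1 -> 36.0
print(expected_timeout(36.0, 20, 2, 600))   # step 2 -> 76.0
print(expected_timeout(36.0, 20, 30, 600))  # step 3 -> 600 (capped)
print(expected_timeout(36.0, 20, 30, 800))  # step 4 -> 636.0
```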