Description of problem:
With the introduction of 'migration_max_time_per_gib_mem', we have noticed some migrations failing with an error similar to the following:

"The migration took 133 seconds which is exceeding the configured maximum time for migrations of 128 seconds. The migration will be aborted."

On closer inspection, it appears that the start time for this parameter may be captured too soon. In this specific case, the actual data transfer had only been underway for 10 seconds when the limit derived from 'migration_max_time_per_gib_mem' was exceeded. The reason is that multiple migrations had been started simultaneously, and the default value of 3 for 'max_outgoing_migrations' limited concurrent migrations to the point that some VMs took around 120 seconds just to acquire the migration semaphore.

Actual results:
The timer for 'migration_max_time_per_gib_mem' appears to start when the migration request is initiated, so time spent waiting for the migration semaphore counts against the limit. It should probably be started once the semaphore has been acquired.
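To illustrate the suggested behaviour, here is a minimal Python sketch, not the actual vdsm code. The names `run_migration`, `migration_semaphore`, `transfer`, and `clock` are hypothetical, and the check is done once after the transfer rather than periodically during it. It only shows the difference between measuring from the request and measuring from semaphore acquisition.

```python
import threading
import time

# Hypothetical stand-in for the limit set by 'max_outgoing_migrations'
# (default 3, per the report above).
migration_semaphore = threading.BoundedSemaphore(3)


def run_migration(transfer, max_time, clock=time.monotonic,
                  semaphore=migration_semaphore):
    """Run transfer() under the semaphore and report whether it overran.

    Returns (seconds_since_request, seconds_since_acquire, timed_out),
    where timed_out is judged from the moment the semaphore was acquired.
    """
    requested_at = clock()
    with semaphore:
        # Start the timer only here, so queueing time is not counted.
        started_at = clock()
        transfer()
        now = clock()
    since_request = now - requested_at
    since_acquire = now - started_at
    return since_request, since_acquire, since_acquire > max_time


if __name__ == "__main__":
    # Replay the numbers from the report with a fake clock: request at 0 s,
    # semaphore acquired at 123 s (133 s total minus 10 s of transfer),
    # check at 133 s, limit of 128 s.
    ticks = iter([0, 123, 133])
    result = run_migration(lambda: None, 128, clock=lambda: next(ticks))
    print(result)
```

With these numbers, measuring from the request gives 133 seconds, which exceeds the 128-second limit, while measuring from semaphore acquisition gives 10 seconds, which does not.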
Merged to master; posted for 3.4.
Merged to ovirt-3.4 branch as http://gerrit.ovirt.org/gitweb?p=vdsm.git;a=commit;h=cc38b94d0eaa67d2b1811d4fdc84563c2e6d489e
- Build av9.2 is installed.
- Reduced the timeout in order to reproduce the problem with 1 GB of RAM.
- Bug is fixed.
- Migration time did not expire; multiple migrations passed (tested on 6 VMs, as the customer described).
- Located the code fix in vm.py.
This is an automated message.

oVirt 3.4.2 has been released:
* should fix your issue
* should be available at your local mirror within two days.

If problems still persist, please make note of it in this bug report.