Bug 1097298 - The start time for 'migration_max_time_per_gib_mem' appears to be calculated too early.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.3
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.4.2
Assignee: Vinzenz Feenstra [evilissimo]
QA Contact: Gil Klein
URL:
Whiteboard: virt
Depends On:
Blocks: 1090109 1097332 1114269
Reported: 2014-05-13 13:55 UTC by Vinzenz Feenstra [evilissimo]
Modified: 2014-06-30 09:25 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1090109
Environment:
Last Closed: 2014-06-11 06:44:31 UTC
oVirt Team: ---
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 27135 0 None None None Never
oVirt gerrit 27637 0 None MERGED virt: Capture migration start time after the semaphore was accquired Never

Description Vinzenz Feenstra [evilissimo] 2014-05-13 13:55:42 UTC
Description of problem:

Since the introduction of 'migration_max_time_per_gib_mem', we have noticed some migrations failing with an error similar to the following:

"The migration took 133 seconds which is exceeding the configured maximum time for migrations of 128 seconds. The migration will be aborted."


On closer inspection, it appears that the start time for this parameter may be captured too soon. In this specific case, the actual data transfer had been underway for only 10 seconds when 'migration_max_time_per_gib_mem' was exceeded. The cause was that multiple migrations had been started simultaneously, and the default value of 3 for 'max_outgoing_migrations' limited concurrent migrations to the point that some VMs took around 120 seconds just to acquire the migration semaphore.

Expected results:

The timer for 'migration_max_time_per_gib_mem' should probably be started once the semaphore has been acquired, as opposed to when the migration request was initiated.
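
The difference between the two start points can be illustrated with a minimal sketch. This is not the vdsm code: the names timed_migration, exceeds_limit and migration_slots are hypothetical, and the check is done after the transfer only to keep the example short. It assumes a semaphore sized like 'max_outgoing_migrations' and measures the elapsed time both from the request and from the moment the semaphore is acquired.

```python
import threading
import time

# Hypothetical stand-in for the limit set by 'max_outgoing_migrations'.
migration_slots = threading.BoundedSemaphore(3)


def timed_migration(transfer, slots=migration_slots, clock=time.monotonic):
    """Run transfer() while holding a migration slot.

    Returns (since_request, since_acquire): the elapsed seconds measured
    from the migration request and from the moment the slot was acquired.
    """
    requested_at = clock()      # too early: includes time spent queueing
    with slots:
        acquired_at = clock()   # proposed start: the slot is now held
        transfer()
        finished_at = clock()
    return finished_at - requested_at, finished_at - acquired_at


def exceeds_limit(elapsed, max_time):
    """Return True if the elapsed time is over the configured maximum."""
    return elapsed > max_time
```

With figures close to those in this report (roughly two minutes of queueing followed by 10 seconds of transfer, against a limit of 128 seconds), the time measured from the request exceeds the limit, while the time measured from the acquisition of the semaphore does not.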

Comment 1 Michal Skrivanek 2014-05-13 15:15:49 UTC
merged to master, posted in 3.4

Comment 2 Vinzenz Feenstra [evilissimo] 2014-05-14 11:48:38 UTC
Merged to ovirt-3.4 branch as http://gerrit.ovirt.org/gitweb?p=vdsm.git;a=commit;h=cc38b94d0eaa67d2b1811d4fdc84563c2e6d489e

Comment 3 Eldad Marciano 2014-05-25 11:46:09 UTC
- Build av9.2 is installed.
- Reduced the timeout in order to reproduce the problem with 1 GB of RAM.
- Bug is fixed.
- Migration time did not expire; multiple migrations pass (tested on 6 VMs, as the customer described).
- Located the code fix in vm.py.

Comment 4 Sandro Bonazzola 2014-06-11 06:44:31 UTC
This is an automated message

oVirt 3.4.2 has been released:
 * should fix your issue
 * should be available at your local mirror within two days.

If problems still persist, please make note of it in this bug report.