Bug 1419557 - Switching to post-copy should catch exceptions
Summary: Switching to post-copy should catch exceptions
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.18.15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.1.1
Target Release: 4.19.5
Assignee: Milan Zamazal
QA Contact: Israel Pinto
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-06 13:50 UTC by Milan Zamazal
Modified: 2017-04-21 09:31 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-04-21 09:31:01 UTC
oVirt Team: Virt
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: planning_ack+
rule-engine: devel_ack+
mavital: testing_ack+


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 71708 0 ovirt-4.1 MERGED virt: Handle the result of switching to post-copy properly 2017-02-06 16:05:55 UTC

Description Milan Zamazal 2017-02-06 13:50:13 UTC
Description of problem:

When switching to post-copy migration, the result value of the corresponding libvirt call is examined. However, the call raises an exception rather than returning an error code on failure. That exception should be caught and handled appropriately.
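
The difference between checking a return value and catching an exception can be sketched as follows. This is a minimal, self-contained illustration and not the actual Vdsm code: `FakeLibvirtError` stands in for `libvirt.libvirtError`, and `FakeDomain`, `start_post_copy`, `switch_checking_result` and `switch_catching_exception` are hypothetical names introduced only for this example.

```python
import logging

log = logging.getLogger("virt.vm")


class FakeLibvirtError(Exception):
    """Stand-in for libvirt.libvirtError (assumption for this sketch)."""


class FakeDomain:
    """Hypothetical domain whose post-copy switch can be made to fail."""

    def __init__(self, fail):
        self._fail = fail

    def start_post_copy(self):
        # Like the libvirt Python bindings, signal failure by raising
        # an exception instead of returning a negative value.
        if self._fail:
            raise FakeLibvirtError("post-copy switch failed")
        return 0


def switch_checking_result(dom):
    # Problematic pattern: the return value is examined, but on failure
    # the call never returns, so the exception escapes to the caller.
    return dom.start_post_copy() >= 0


def switch_catching_exception(dom):
    # Suggested pattern: catch the exception, log it, and report failure.
    try:
        dom.start_post_copy()
    except FakeLibvirtError as e:
        log.warning("Failed to switch to post-copy migration: %s", e)
        return False
    return True
```

With a failing domain, `switch_checking_result` propagates the exception (which would show up as a traceback in the log), while `switch_catching_exception` logs a warning and returns `False` so the caller can react.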

How reproducible:

I don't know how to reproduce the bug easily in a real situation. But it can be artificially reproduced with modified Vdsm sources.

Steps to Reproduce:
1. Modify Vdsm sources: Make the _post_copy_flag method in migration.py always return 0.
2. Start a busy migration with post-copy schedule.
3. Wait until the switch to post-copy happens.

Actual results:

A traceback appears in vdsm.log and the migration continues "wildly".

Expected results:

The failure is logged in vdsm.log and the next migration schedule (abort) is executed.
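
The expected behavior can be sketched roughly as below. This is a simplified model under stated assumptions, not the Vdsm scheduling code: `run_schedule`, `switch_to_post_copy` and `abort_migration` are hypothetical names, and the real migration schedule has more to it (timing and further action types) than this loop shows.

```python
import logging

log = logging.getLogger("virt.vm")


def switch_to_post_copy():
    # Hypothetical handler: simulate a failed switch that is logged
    # and reported as a plain False result instead of a traceback.
    log.warning("Failed to switch to post-copy migration")
    return False


def abort_migration():
    # Hypothetical handler: aborting is assumed to succeed here.
    return True


def run_schedule(steps):
    """Run the (name, action) steps in order, stopping at the first
    action that reports success. Return the names of executed steps."""
    executed = []
    for name, action in steps:
        executed.append(name)
        if action():
            break
    return executed


executed = run_schedule([
    ("postcopy", switch_to_post_copy),
    ("abort", abort_migration),
])
```

In this model, a failed post-copy switch does not stop the schedule; the next action (abort) is executed.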

Additional info:

Comment 1 Israel Pinto 2017-02-13 15:30:53 UTC
Verified with:
Engine: 4.1.1-0.1.el7
Host:
OS Version: RHEL - 7.3 - 7.el7
Kernel Version: 3.10.0 - 550.el7.x86_64
KVM Version: 2.6.0 - 28.el7_3.3.1
LIBVIRT Version: libvirt-2.0.0-10.el7_3.4
VDSM Version: vdsm-4.19.5-1.el7ev

Steps to Reproduce:
1. Modify Vdsm sources: Make the _post_copy_flag method in migration.py always return 0.
2. Start a busy migration with post-copy schedule.
3. Wait until the switch to post-copy happens.

Results:
The error is written to the log, and the migration continues and finishes in post_copy mode.
From the log:
2017-02-13 13:21:05,781 INFO  (migmon/fe35b83e) [vdsm.api] START switch_migration_to_post_copy args=(<virt.vm.Vm object at 0x360dd90>,) kwargs={} (api:37)
2017-02-13 13:21:05,781 INFO  (migmon/fe35b83e) [virt.vm] (vmId='fe35b83e-62f5-4641-b0df-84bd4af2a10b') Switching to post-copy migration (vm:1578)
2017-02-13 13:21:05,781 INFO  (migmon/fe35b83e) [virt.vm] (vmId='fe35b83e-62f5-4641-b0df-84bd4af2a10b') Stopping connection (guestagent:430)
2017-02-13 13:21:05,782 INFO  (migmon/fe35b83e) [virt.vm] (vmId='fe35b83e-62f5-4641-b0df-84bd4af2a10b') Starting connection (guestagent:245)
2017-02-13 13:21:05,784 INFO  (migmon/fe35b83e) [vdsm.api] FINISH switch_migration_to_post_copy return=False (api:43)
2017-02-13 13:21:05,784 WARN  (migmon/fe35b83e) [virt.vm] (vmId='fe35b83e-62f5-4641-b0df-84bd4af2a10b') Failed to switch to post-copy migration (migration:820)
2017-02-13 13:21:05,784 INFO  (migmon/fe35b83e) [virt.vm] (vmId='fe35b83e-62f5-4641-b0df-84bd4af2a10b') Migration Progress: 280 seconds elapsed, 99% of data processed, total data: 1096MB, processed data: 565MB, remaining data: 10MB, transfer speed 2MBps, zero pages: 209008MB, compressed: 0MB, dirty rate: 1503, memory iteration: 57 (migration:787)

