Bug 867347
| Summary: | engine: we re-migrate a vm after cancel migration by user and vm completes the migration |
|---|---|
| Product: | Red Hat Enterprise Virtualization Manager |
| Component: | ovirt-engine |
| Version: | 3.1.0 |
| Status: | CLOSED CURRENTRELEASE |
| Severity: | urgent |
| Priority: | urgent |
| Keywords: | TestBlocker |
| Target Milestone: | --- |
| Target Release: | --- |
| Hardware: | x86_64 |
| OS: | Linux |
| Whiteboard: | virt |
| Fixed In Version: | si23 |
| Doc Type: | Bug Fix |
| Reporter: | Dafna Ron <dron> |
| Assignee: | Omer Frenkel <ofrenkel> |
| QA Contact: | Dafna Ron <dron> |
| CC: | bazulay, dyasny, hateya, iheim, lpeer, michal.skrivanek, ofrenkel, Rhev-m-bugs, sgrinber, yeylon, ykaul |
| Type: | Bug |
| Last Closed: | 2012-12-04 20:04:45 UTC |
| Bug Depends On: | 873792 |
Tested in si24: there is still an issue if I cancel the migration right after the migration starts (full logs will be attached).

2012-11-05 16:59:43,613 INFO [org.ovirt.engine.core.bll.CancelMigrateVmCommand] (pool-4-thread-46) [f19fb51] Running command: CancelMigrateVmCommand internal: false. Entities affected : ID: 299b1397-3b99-4477-8106-9b57ecf36b16 Type: VM
2012-11-05 16:59:43,615 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CancelMigrateVDSCommand] (pool-4-thread-46) [f19fb51] START, CancelMigrateVDSCommand(HostName = gold-vdsd, HostId = ce3fab0c-2734-11e2-83bc-001a4a169741, vmId=299b1397-3b99-4477-8106-9b57ecf36b16), log id: 34d3c74c
2012-11-05 16:59:43,700 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CancelMigrateVDSCommand] (pool-4-thread-46) [f19fb51] FINISH, CancelMigrateVDSCommand, log id: 34d3c74c
2012-11-05 16:59:44,061 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-61) vds::refreshVmList vm id 299b1397-3b99-4477-8106-9b57ecf36b16 is migrating to vds gold-vdsc ignoring it in the refresh until migration is done
2012-11-05 16:59:46,114 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-64) vds::refreshVmList vm id 299b1397-3b99-4477-8106-9b57ecf36b16 is migrating to vds gold-vdsc ignoring it in the refresh until migration is done
2012-11-05 16:59:48,157 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-54) vds::refreshVmList vm id 299b1397-3b99-4477-8106-9b57ecf36b16 is migrating to vds gold-vdsc ignoring it in the refresh until migration is done
2012-11-05 16:59:50,201 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-69) vds::refreshVmList vm id 299b1397-3b99-4477-8106-9b57ecf36b16 is migrating to vds gold-vdsc ignoring it in the refresh until migration is done
2012-11-05 16:59:52,264 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-60) vds::refreshVmList vm id 299b1397-3b99-4477-8106-9b57ecf36b16 is migrating to vds gold-vdsc ignoring it in the refresh until migration is done
2012-11-05 16:59:54,597 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (QuartzScheduler_Worker-77) START, DestroyVDSCommand(HostName = gold-vdsd, HostId = ce3fab0c-2734-11e2-83bc-001a4a169741, vmId=299b1397-3b99-4477-8106-9b57ecf36b16, force=false, secondsToWait=0, gracefully=false), log id: 45f73d5d
2012-11-05 16:59:54,668 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (QuartzScheduler_Worker-77) FINISH, DestroyVDSCommand, log id: 45f73d5d
2012-11-05 16:59:54,668 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-77) Setting VM migrate 299b1397-3b99-4477-8106-9b57ecf36b16 to status unknown
2012-11-05 16:59:54,669 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-77) vm migrate running in db and not running in vds - add to rerun treatment. vds gold-vdsd

Created attachment 638698 [details]
logs
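
The engine log above shows the sequence at the heart of this bug: CancelMigrateVmCommand and CancelMigrateVDSCommand both finish successfully, yet once the source host stops reporting the VM, VdsUpdateRunTimeInfo adds it to "rerun treatment" and the migration is restarted. The following is a minimal sketch of the kind of guard that path needs, written in Python purely as pseudocode (ovirt-engine itself is Java, and every name in the sketch is hypothetical):

```python
# Illustrative sketch only -- ovirt-engine is Java; all names here are hypothetical.
from dataclasses import dataclass


@dataclass
class MigrationRecord:
    vm_id: str
    cancelled_by_user: bool = False  # would be set when CancelMigrateVmCommand runs


def handle_vm_gone_from_source(vm_id: str, migrations: dict, rerun_queue: list) -> None:
    """Called when monitoring sees a VM 'running in db and not running in vds'."""
    rec = migrations.get(vm_id)
    if rec is not None and rec.cancelled_by_user:
        # The user cancelled this migration: resolve the VM's real state,
        # but do not schedule an automatic re-migration.
        return
    # Behaviour seen in the engine log above: queue the VM for a migration rerun.
    rerun_queue.append(vm_id)
```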
It looks like the migration cancel returned successfully to the engine, but despite that, the migration completed successfully (i.e. the cancel did not take effect); see the sketch after the log excerpt below.
Relevant vdsm log (vm-id is 299b1397-3b99-4477-8106-9b57ecf36b16):
Thread-1800::DEBUG::2012-11-05 16:59:40,548::BindingXMLRPC::894::vds::(wrapper) client [10.35.97.65]::call vmMigrate with ({'src': 'gold-vdsd.qa.lab.tlv.redhat.com', 'dst': 'gold-vdsc.qa.lab.tlv.redhat.com:54321', 'vmId': '299b1397-3b99-4477-8106-9b57ecf36b16', 'method': 'online'},) {} flowID [28cae918]
Thread-1800::DEBUG::2012-11-05 16:59:40,548::API::421::vds::(migrate) {'src': 'gold-vdsd.qa.lab.tlv.redhat.com', 'dst': 'gold-vdsc.qa.lab.tlv.redhat.com:54321', 'vmId': '299b1397-3b99-4477-8106-9b57ecf36b16', 'method': 'online'}
Thread-1801::DEBUG::2012-11-05 16:59:40,549::vm::125::vm.Vm::(_setupVdsConnection) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Destination server is: gold-vdsc.qa.lab.tlv.redhat.com:54321
Thread-1800::DEBUG::2012-11-05 16:59:40,550::BindingXMLRPC::900::vds::(wrapper) return vmMigrate with {'status': {'message': 'Migration process starting', 'code': 0}}
Thread-1801::DEBUG::2012-11-05 16:59:40,550::vm::127::vm.Vm::(_setupVdsConnection) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Initiating connection with destination
Thread-1801::DEBUG::2012-11-05 16:59:40,612::libvirtvm::214::vm.Vm::(_getCpuStats) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::CPU stats not available
Thread-1801::DEBUG::2012-11-05 16:59:40,613::libvirtvm::267::vm.Vm::(_getDiskStats) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Disk hdc stats not available
Thread-1801::DEBUG::2012-11-05 16:59:40,613::libvirtvm::267::vm.Vm::(_getDiskStats) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Disk hda stats not available
Thread-1801::DEBUG::2012-11-05 16:59:40,614::libvirtvm::296::vm.Vm::(_getDiskLatency) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Disk hdc latency not available
Thread-1801::DEBUG::2012-11-05 16:59:40,614::libvirtvm::296::vm.Vm::(_getDiskLatency) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Disk hda latency not available
Thread-1801::DEBUG::2012-11-05 16:59:40,616::vm::173::vm.Vm::(_prepareGuest) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::migration Process begins
Thread-1801::DEBUG::2012-11-05 16:59:40,632::vm::237::vm.Vm::(run) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::migration semaphore acquired
Thread-1801::DEBUG::2012-11-05 16:59:41,090::libvirtvm::454::vm.Vm::(_startUnderlyingMigration) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::starting migration to qemu+tls://gold-vdsc.qa.lab.tlv.redhat.com/system
Thread-1803::DEBUG::2012-11-05 16:59:41,091::libvirtvm::352::vm.Vm::(run) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::migration downtime thread started
Thread-1804::DEBUG::2012-11-05 16:59:41,092::libvirtvm::380::vm.Vm::(run) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::starting migration monitor thread
Thread-1805::DEBUG::2012-11-05 16:59:42,210::BindingXMLRPC::894::vds::(wrapper) client [10.35.97.65]::call vmMigrationCancel with ('299b1397-3b99-4477-8106-9b57ecf36b16',) {} flowID [f19fb51]
Thread-1805::DEBUG::2012-11-05 16:59:42,255::BindingXMLRPC::900::vds::(wrapper) return vmMigrationCancel with {'status': {'message': 'Migration process stopped', 'code': 0}}
Thread-1808::DEBUG::2012-11-05 16:59:46,893::task::588::TaskManager.Task::(_updateState) Task=`39deac96-cab6-40d9-8e43-b4298135e714`::moving from state init -> state preparing
Thread-1808::INFO::2012-11-05 16:59:46,893::logUtils::37::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Thread-1808::INFO::2012-11-05 16:59:46,894::logUtils::39::dispatcher::(wrapper) Run and protect: repoStats, Return response: {u'af5bb9ad-ebe7-4ade-8d0d-a6882d11af44': {'delay': '0.0111539363861', 'lastCheck': '8.9', 'code': 0, 'valid': True}, u'5140b72b-0eac-4d4c-b475-5724bc9c27b7': {'delay': '0.00832009315491', 'lastCheck': '7.8', 'code': 0, 'valid': True}, u'2d52459b-ccd1-4a99-bff3-1264ca77a2c4': {'delay': '0.00824379920959', 'lastCheck': '7.5', 'code': 0, 'valid': True}, u'940726d2-b911-49ae-aed5-cda5791bb4b6': {'delay': '0.00918412208557', 'lastCheck': '8.0', 'code': 0, 'valid': True}, u'72ec1321-a114-451f-bee1-6790cbca1bc6': {'delay': '0.00177216529846', 'lastCheck': '2.2', 'code': 0, 'valid': True}}
Thread-1808::DEBUG::2012-11-05 16:59:46,894::task::1172::TaskManager.Task::(prepare) Task=`39deac96-cab6-40d9-8e43-b4298135e714`::finished: {u'af5bb9ad-ebe7-4ade-8d0d-a6882d11af44': {'delay': '0.0111539363861', 'lastCheck': '8.9', 'code': 0, 'valid': True}, u'5140b72b-0eac-4d4c-b475-5724bc9c27b7': {'delay': '0.00832009315491', 'lastCheck': '7.8', 'code': 0, 'valid': True}, u'2d52459b-ccd1-4a99-bff3-1264ca77a2c4': {'delay': '0.00824379920959', 'lastCheck': '7.5', 'code': 0, 'valid': True}, u'940726d2-b911-49ae-aed5-cda5791bb4b6': {'delay': '0.00918412208557', 'lastCheck': '8.0', 'code': 0, 'valid': True}, u'72ec1321-a114-451f-bee1-6790cbca1bc6': {'delay': '0.00177216529846', 'lastCheck': '2.2', 'code': 0, 'valid': True}}
Thread-1808::DEBUG::2012-11-05 16:59:46,895::task::588::TaskManager.Task::(_updateState) Task=`39deac96-cab6-40d9-8e43-b4298135e714`::moving from state preparing -> state finished
Thread-1808::DEBUG::2012-11-05 16:59:46,895::resourceManager::809::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-1808::DEBUG::2012-11-05 16:59:46,896::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-1808::DEBUG::2012-11-05 16:59:46,896::task::978::TaskManager.Task::(_decref) Task=`39deac96-cab6-40d9-8e43-b4298135e714`::ref 0 aborting False
Thread-1809::DEBUG::2012-11-05 16:59:46,934::libvirtvm::267::vm.Vm::(_getDiskStats) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Disk hdc stats not available
Thread-1809::DEBUG::2012-11-05 16:59:46,934::libvirtvm::267::vm.Vm::(_getDiskStats) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Disk hda stats not available
Thread-1809::DEBUG::2012-11-05 16:59:46,934::libvirtvm::296::vm.Vm::(_getDiskLatency) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Disk hdc latency not available
Thread-1809::DEBUG::2012-11-05 16:59:46,935::libvirtvm::296::vm.Vm::(_getDiskLatency) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Disk hda latency not available
Thread-27::DEBUG::2012-11-05 16:59:48,027::__init__::1164::Storage.Misc.excCmd::(_log) '/bin/dd iflag=direct if=/dev/af5bb9ad-ebe7-4ade-8d0d-a6882d11af44/metadata bs=4096 count=1' (cwd None)
Thread-27::DEBUG::2012-11-05 16:59:48,037::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000685711 s, 6.0 MB/s\n'; <rc> = 0
Thread-26::DEBUG::2012-11-05 16:59:48,870::__init__::1164::Storage.Misc.excCmd::(_log) '/bin/dd iflag=direct if=/dev/940726d2-b911-49ae-aed5-cda5791bb4b6/metadata bs=4096 count=1' (cwd None)
Thread-26::DEBUG::2012-11-05 16:59:48,879::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000657354 s, 6.2 MB/s\n'; <rc> = 0
Thread-23::DEBUG::2012-11-05 16:59:49,089::__init__::1164::Storage.Misc.excCmd::(_log) '/bin/dd iflag=direct if=/dev/5140b72b-0eac-4d4c-b475-5724bc9c27b7/metadata bs=4096 count=1' (cwd None)
Thread-23::DEBUG::2012-11-05 16:59:49,098::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000583391 s, 7.0 MB/s\n'; <rc> = 0
Thread-24::DEBUG::2012-11-05 16:59:49,406::__init__::1164::Storage.Misc.excCmd::(_log) '/bin/dd iflag=direct if=/dev/2d52459b-ccd1-4a99-bff3-1264ca77a2c4/metadata bs=4096 count=1' (cwd None)
Thread-24::DEBUG::2012-11-05 16:59:49,419::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.00533608 s, 768 kB/s\n'; <rc> = 0
Thread-1804::INFO::2012-11-05 16:59:51,095::libvirtvm::412::vm.Vm::(run) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Migration Progress: 9 seconds elapsed, 52% of data processed, 52% of mem processed
libvirtEventLoop::DEBUG::2012-11-05 16:59:51,529::libvirtvm::2679::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::event Suspended detail 0 opaque None
libvirtEventLoop::DEBUG::2012-11-05 16:59:52,257::libvirtvm::2679::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::event Stopped detail 3 opaque None
Thread-1801::DEBUG::2012-11-05 16:59:52,260::libvirtvm::367::vm.Vm::(cancel) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::canceling migration downtime thread
Thread-1801::DEBUG::2012-11-05 16:59:52,261::libvirtvm::417::vm.Vm::(stop) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::stopping migration monitor thread
Thread-1803::DEBUG::2012-11-05 16:59:52,261::libvirtvm::364::vm.Vm::(run) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::migration downtime thread exiting
Thread-1801::DEBUG::2012-11-05 16:59:52,275::vm::969::vm.Vm::(setDownStatus) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Changed state to Down: Migration succeeded
Thread-1801::DEBUG::2012-11-05 16:59:52,276::vmChannels::152::vds::(unregister) Delete fileno 28 from listener.
Thread-1801::DEBUG::2012-11-05 16:59:52,276::utils::335::vm.Vm::(stop) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Stop statistics collection
VM Channels Listener::DEBUG::2012-11-05 16:59:52,277::vmChannels::88::vds::(_do_del_channels) fileno 28 was removed from listener.
Thread-1793::DEBUG::2012-11-05 16:59:52,277::utils::366::vm.Vm::(run) vmId=`299b1397-3b99-4477-8106-9b57ecf36b16`::Stats thread finished
Thread-1813::DEBUG::2012-11-05 16:59:53,171::BindingXMLRPC::894::vds::(wrapper) client [10.35.97.65]::call vmGetStats with ('299b1397-3b99-4477-8106-9b57ecf36b16',) {}
Thread-1813::DEBUG::2012-11-05 16:59:53,172::BindingXMLRPC::900::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'hash': '-7368088289532812409', 'exitMessage': 'Migration succeeded', 'vmId': '299b1397-3b99-4477-8106-9b57ecf36b16', 'timeOffset': '-43200', 'exitCode': 0}]}
Thread-1814::DEBUG::2012-11-05 16:59:53,192::BindingXMLRPC::894::vds::(wrapper) client [10.35.97.65]::call vmDestroy with
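
The vdsm log above shows why the synchronous return code of vmMigrationCancel cannot be trusted on its own: the verb returns 'Migration process stopped' (code 0) at 16:59:42, yet at 16:59:52 the VM goes Down with 'Migration succeeded'. Below is a minimal sketch of how a client could verify the outcome by polling vmGetStats instead of relying on the immediate reply; the connection details are simplified assumptions (a real vdsm deployment requires TLS client certificates on port 54321):

```python
# Sketch only: assumes an XML-RPC connection to vdsm can be established
# (real deployments use TLS client certificates on port 54321).
import time
import xmlrpc.client


def cancel_and_verify(host: str, vm_id: str, timeout: float = 30.0) -> bool:
    server = xmlrpc.client.ServerProxy("https://%s:54321" % host)
    reply = server.vmMigrationCancel(vm_id)
    print("cancel returned:", reply["status"])  # 'Migration process stopped', code 0 in the log above
    deadline = time.time() + timeout
    while time.time() < deadline:
        stats = server.vmGetStats(vm_id)["statsList"][0]
        if stats["status"] == "Up":    # the cancel really took effect on the source
            return True
        if stats["status"] == "Down":  # e.g. exitMessage 'Migration succeeded', as in the log
            return False
        time.sleep(2)
    return False
```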
Michal from libvirt looked at it and it is a libvirt bug: vdsm issued 'migrate_cancel' before libvirt got to 'migrate'; that is, libvirt had set up a job but was still doing some preparation before instructing qemu to migrate.

Bug opened for libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=873792

Per discussion with Kaul, I'm moving this BZ back to ON_QA for verification that the engine is not re-migrating. The other libvirt issue needs to be handled ASAP as it is a rhevm GA blocker.

Verified on si24.4.
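
To make the race concrete: vdsm asked libvirt to start the migration at 16:59:41 and the cancel arrived about a second later, while libvirt had set up the job but had not yet told qemu to migrate, so aborting the job did not actually stop anything. That window sits inside libvirt and is what bug 873792 tracks; the sketch below only illustrates the ordering on the vdsm side, using the libvirt Python bindings, and is neither the real vdsm code nor the eventual fix:

```python
# Illustration of the race described above, not the real vdsm implementation
# and not the eventual fix; uses the libvirt Python bindings.
import threading


class MigrationSource:
    def __init__(self, dom):
        self._dom = dom                      # libvirt.virDomain of the migrating VM
        self._cancelled = threading.Event()

    def cancel(self):
        # What vmMigrationCancel ultimately does: abort the current libvirt job.
        # In the bug above, libvirt had set up the job but qemu had not started
        # migrating yet, so the abort succeeded without stopping the migration.
        self._cancelled.set()
        self._dom.abortJob()

    def run(self, dest_uri, flags):
        # ... connection setup, downtime and monitor threads, etc. ...
        if self._cancelled.is_set():
            # If the cancel arrived before the migration was handed to libvirt,
            # never start it at all.
            raise RuntimeError("migration cancelled by user before start")
        self._dom.migrateToURI2(dest_uri, None, None, flags, None, 0)
```

A flag check like this only closes the vdsm-side part of the window (a cancel that arrives before libvirt is invoked); a cancel that lands inside libvirt's preparation phase still needs the libvirt fix tracked above.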
Created attachment 628677 [details]
logs

Description of problem:
We automatically re-migrate VMs whose migration was cancelled by the user, which means that even though the user cancelled the migration, it will succeed.

Version-Release number of selected component (if applicable):
si20

How reproducible:
100%

Steps to Reproduce:
1. start a vm on a two-host cluster
2. start migration
3. cancel the migration

Actual results:
cancel migration succeeds, but the engine re-migrates the vm and completes the migration

Expected results:
we should not re-migrate if the user cancelled the migration.

Additional info:
logs
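
A rough sketch of how the reproduction steps above could be turned into an automated regression check once the engine no longer re-migrates; the `api` object and its method names are hypothetical placeholders for whatever client the test framework provides:

```python
# Hypothetical regression check for the steps above; 'api' is a placeholder
# management-API client, and its method names are assumptions.
import time


def test_cancel_migration_is_not_rerun(api, vm_id: str, source_host: str) -> None:
    assert api.get_vm_host(vm_id) == source_host  # 1. VM running on a two-host cluster
    api.migrate_vm(vm_id)                         # 2. start migration
    api.cancel_migration(vm_id)                   # 3. cancel it right away
    # Expected result: the VM settles back on the source host and stays there;
    # the engine must not schedule a second (automatic) migration.
    time.sleep(60)
    assert api.get_vm_state(vm_id) == "up"
    assert api.get_vm_host(vm_id) == source_host
```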