Description of problem:

Testing tape passthrough in a VM through the scsi capability, one can still try to live migrate a VM from one host to another in the same cluster. The operation will eventually fail at the vdsm level and the VM will remain running on the source host.

The idea of this BZ is to block/limit the migration feature if a VM has a hostdev passed through / attached to it, instead of allowing the migration and failing later.

Version-Release number of selected component (if applicable):

rhevm-4.0.6.3-0.1.el7ev.noarch
qemu-kvm-rhev-2.6.0-27.el7.x86_64
vdsm-4.18.21-1.el7ev.x86_64

How reproducible:

100%

Steps to Reproduce:

1. Assign a FC tape device (or an emulated one) from the host to the VM in Virtual Machines -> Host Devices -> Add device
2. Start the VM.
3. Migrate it to another host in the cluster

Actual results:

Migration will eventually fail in vdsm with:

~~~
Thread-54::ERROR::2017-01-19 17:28:18,500::migration::383::virt.vm::(run) vmId=`5f532b0a-0702-456a-a83f-b1b682bf2fea`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 365, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/virt/migration.py", line 438, in _startUnderlyingMigration
    self._perform_with_conv_schedule(duri, muri)
  File "/usr/share/vdsm/virt/migration.py", line 498, in _perform_with_conv_schedule
    self._perform_migration(duri, muri)
  File "/usr/share/vdsm/virt/migration.py", line 478, in _perform_migration
    self._vm._dom.migrateToURI3(duri, params, flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 917, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: Requested operation is not valid: domain has assigned non-USB host devices
~~~

Expected results:

Block/limit/warn the migration function from the UI before trying the migration

Additional info:
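For illustration only, the kind of early check requested here can be sketched in Python against a libvirt domain XML. This is not vdsm or engine code: the function name `non_usb_hostdevs` and the sample XML are hypothetical, and the only rule assumed is the one stated by the libvirt error above (non-USB host devices block migration).

```python
import xml.etree.ElementTree as ET


def non_usb_hostdevs(domain_xml):
    """Return the 'type' of every non-USB <hostdev> found in a domain XML string.

    Hypothetical helper: a non-empty result means libvirt would reject the
    migration with "domain has assigned non-USB host devices".
    """
    root = ET.fromstring(domain_xml)
    return [
        dev.get("type")
        for dev in root.findall("./devices/hostdev")
        if dev.get("type") != "usb"
    ]


# Made-up domain XML with one SCSI (tape-like) and one USB host device.
SAMPLE_XML = """
<domain type='kvm'>
  <name>tape-vm</name>
  <devices>
    <hostdev mode='subsystem' type='scsi'>
      <source>
        <adapter name='scsi_host4'/>
        <address bus='0' target='0' unit='0'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='usb'/>
  </devices>
</domain>
"""

blocking = non_usb_hostdevs(SAMPLE_XML)
if blocking:
    # Refuse up front instead of failing later inside migrateToURI3().
    print("migration should be refused, host devices:", blocking)
```

A check like this would let the caller warn the user before the migration is started, rather than surfacing the libvirt error after the fact.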
(In reply to Javier Coscia from comment #0)
> Description of problem:
>
> Testing Tape passthrough in a VM through scsi capability, one can still try
> to live migrate a VM from one host to another in the same cluster.
> The operation will eventually fail at vdsm level and the VM will remain
> running in the source host.

How could it have migration enabled? The same rule as for PCI passthrough should be applied: the guest needs to be pinned to the host.
Yes, normally the scheduling filter policy unit should filter out the other hosts and thus prevent the migration, but from the logs it seems the operation was allowed. This may indicate a bug in the procedure that checks the availability of free host devices on the host. I will need to investigate the logs further.
Well the HostDevice FILTER scheduling policy unit should - in conjunction with PinToHost FILTER scheduling policy unit - prevent such behavior. One thing that could cause this behavior is that the PinToHost policy unit was disabled. @Javier can you please confirm that the "PinToHost" policy unit was active at that time?
Ok, it turns out that this was a bug in the Host Device Filter Policy Unit that, under certain conditions, allowed VMs to be run on or migrated to hosts completely different from the one they are "pinned to". Fix posted upstream.
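As a rough sketch of the intended behavior (not the actual engine code, which is Java), a host device filter can be thought of as dropping every candidate host except the one the VM is pinned to whenever the VM has host devices attached. The `Vm` class, its fields, and `host_device_filter` below are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vm:
    name: str
    host_devices: List[str]      # names of passed-through host devices
    pinned_host: Optional[str]   # host the VM is pinned to, if any


def host_device_filter(vm, candidate_hosts):
    """Keep only hosts that can serve the VM's host devices.

    Assumption for this sketch: host devices are only usable on the host
    the VM is pinned to, so every other candidate is filtered out.
    """
    if not vm.host_devices:
        return list(candidate_hosts)
    return [host for host in candidate_hosts if host == vm.pinned_host]


vm = Vm(name="tape-vm", host_devices=["scsi_4_0_0_0"], pinned_host="host1")

# Running the VM: the pinned host is among the candidates and survives.
print(host_device_filter(vm, ["host1", "host2"]))

# Migrating the VM: the source host is not a candidate, so nothing is left
# and the migration is rejected during validation.
print(host_device_filter(vm, ["host2", "host3"]))
```

With a filter of this shape, a migration request ends with all hosts filtered out at scheduling time, which is the outcome the fix is meant to guarantee.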
Verified with:
Version 4.2.0-0.0.master.20171122115834.git3549ed1.el7.centos

Steps:
1. Add host device to VM
2. Start VM
3. Migrate VM

Results:
Migration Failed

Engine log:
2017-11-23 17:27:13,825+02 INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-6) [01559e8b-6209-4b35-816d-4ff891eb03d0] Lock Acquired to object 'EngineLock:{exclusiveLocks='[36457a49-d68e-490f-b3b1-3a3a53312986=VM]', sharedLocks=''}'
2017-11-23 17:27:13,970+02 INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-6) [] Candidate host 'intel_vGPU' ('cc775bb5-3858-49bb-bae5-0983494da620') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HostDevice' (correlation id: null)
2017-11-23 17:27:13,981+02 INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-6) [] Candidate host 'monique-vds01.tlv.redhat.com' ('621bf753-b453-40ec-8673-7ff95f57fa71') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: null)
2017-11-23 17:27:13,982+02 WARN [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-6) [] Validation of action 'MigrateVm' failed for user admin@internal-authz. Reasons: VAR__ACTION__MIGRATE,VAR__TYPE__VM,SCHEDULING_ALL_HOSTS_FILTERED_OUT,VAR__FILTERTYPE__INTERNAL,$hostName intel_vGPU,$filterName HostDevice,VAR__DETAIL__WRONG_HOST_FOR_REQUESTED_HOST_DEVICES,SCHEDULING_HOST_FILTERED_REASON_WITH_DETAIL,VAR__FILTERTYPE__INTERNAL,$hostName monique-vds01.tlv.redhat.com,$filterName Memory,$availableMem 3118,VAR__DETAIL__NOT_ENOUGH_MEMORY,SCHEDULING_HOST_FILTERED_REASON_WITH_DETAIL
2017-11-23 17:27:13,983+02 INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-6) [] Lock freed to object 'EngineLock:{exclusiveLocks='[36457a49-d68e-490f-b3b1-3a3a53312986=VM]', sharedLocks=''}'
2017-11-23 17:27:24,520+02 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to monique-vds01.tlv.redhat.com/10.35.4.1
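When checking a verification run like this one, it can help to pull the filtered-out hosts and the responsible filters straight from the engine log. The following is a small sketch using only the Python standard library; the regular expression is inferred from the two SchedulingManager lines above and may not match other engine versions, and `filtered_hosts` is a hypothetical helper name.

```python
import re

# Pattern inferred from the SchedulingManager lines in the engine log above.
FILTERED_RE = re.compile(
    r"Candidate host '([^']+)' \('[^']+'\) was filtered out by "
    r"'[^']+' filter '([^']+)'"
)


def filtered_hosts(log_lines):
    """Return (host name, filter name) pairs for every filtered-out host."""
    pairs = []
    for line in log_lines:
        match = FILTERED_RE.search(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


# The two relevant lines from the engine log, shortened to the message part.
SAMPLE_LOG = [
    "Candidate host 'intel_vGPU' ('cc775bb5-3858-49bb-bae5-0983494da620') "
    "was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HostDevice' "
    "(correlation id: null)",
    "Candidate host 'monique-vds01.tlv.redhat.com' "
    "('621bf753-b453-40ec-8673-7ff95f57fa71') was filtered out by "
    "'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: null)",
]

for host, filter_name in filtered_hosts(SAMPLE_LOG):
    print(host, "->", filter_name)
```

In this run, the 'HostDevice' entry is the one that confirms the fix; the 'Memory' entry is unrelated to host devices.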
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1488