Bug 1816951 - [CNV&RHV] CNV VM migration failure is not handled correctly by the engine
Summary: [CNV&RHV] CNV VM migration failure is not handled correctly by the engine
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.2
Target Release: 4.4.2
Assignee: Arik
QA Contact: Pavol Brilla
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-25 08:59 UTC by Pavol Brilla
Modified: 2022-07-09 12:55 UTC

Fixed In Version: ovirt-engine-4.4.2.1
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-23 16:11:04 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-37462 0 None None None 2022-07-09 12:55:02 UTC
Red Hat Product Errata RHSA-2020:3807 0 None None None 2020-09-23 16:11:24 UTC
oVirt gerrit 108077 0 master MERGED kubevirt: propagate migration error messages 2020-11-03 10:22:41 UTC
oVirt gerrit 110375 0 master MERGED core: omit irrelevant audit log on failure to migrate kubevirt vm 2020-11-03 10:22:41 UTC

Description Pavol Brilla 2020-03-25 08:59:50 UTC
Description of problem:
CNV VM with evictionStrategy: LiveMigrate is unable to migrate to any other host 

Version-Release number of selected component (if applicable):
4.4.0-26

How reproducible:
100%

Steps to Reproduce:
1. Have a CNV cluster with a VM (with evictionStrategy: LiveMigrate defined) connected to RHV
2. Try to migrate that VM

Actual results:
Migration failed (VM: cirros-vm-notfromtemplate, Source: talayan-pytest-2g7mj-worker-rgnwj).

Expected results:
The VM migration should succeed.

Additional info:
2020-03-25 09:41:30,218+01 INFO  [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-14) [] User admin@internal successfully logged in with scopes: ovirt-app-api ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access
2020-03-25 09:41:30,401+01 INFO  [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-34) [62390a6e] Lock freed to object 'EngineLock:{exclusiveLocks='[61dfa229-72c2-48d8-90ae-ecfaf51de1e1=PROVIDER]', sharedLocks=''}'
2020-03-25 09:41:49,601+01 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-7) [efba2ef7-9248-41d0-a7c2-406670e4730b] Lock Acquired to object 'EngineLock:{exclusiveLocks='[225d7f52-620c-4ab5-bf87-a99dd60ccae5=VM]', sharedLocks=''}'
2020-03-25 09:41:49,716+01 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-7) [efba2ef7-9248-41d0-a7c2-406670e4730b] Running command: MigrateVmCommand internal: false. Entities affected :  ID: 225d7f52-620c-4ab5-bf87-a99dd60ccae5 Type: VMAction group MIGRATE_VM with role type USER
2020-03-25 09:41:50,474+01 ERROR [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-7) [efba2ef7-9248-41d0-a7c2-406670e4730b] Command 'org.ovirt.engine.core.bll.MigrateVmCommand' failed: EngineException: failed to interact with kubevirt migrate endpoint (Failed with error unexpected and code 16)
2020-03-25 09:41:50,510+01 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-7) [efba2ef7-9248-41d0-a7c2-406670e4730b] EVENT_ID: VM_MIGRATION_NO_VDS_TO_MIGRATE_TO(166), No available host was found to migrate VM cirros-vm-notfromtemplate to.
2020-03-25 09:41:50,517+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-7) [efba2ef7-9248-41d0-a7c2-406670e4730b] EVENT_ID: VM_MIGRATION_FAILED(65), Migration failed  (VM: cirros-vm-notfromtemplate, Source: talayan-pytest-2g7mj-worker-rgnwj).
2020-03-25 09:41:50,522+01 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-7) [efba2ef7-9248-41d0-a7c2-406670e4730b] Lock freed to object 'EngineLock:{exclusiveLocks='[225d7f52-620c-4ab5-bf87-a99dd60ccae5=VM]', sharedLocks=''}'
2020-03-25 09:41:50,632+01 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-7) [] Operation Failed: [Unexpected exception]
2020-03-25 09:45:10,414+01 INFO  [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] (EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] Thread pool 'default' is using 0 threads out of 1, 5 threads waiting for tasks.
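
To follow a single migration attempt through engine.log, it can help to group the lines by the correlation ID in the second pair of square brackets. The following is a minimal sketch, assuming the line layout seen in this report (timestamp, level, logger, thread, correlation ID, message); the names LINE_RE and group_by_correlation are hypothetical helpers, not part of ovirt-engine.

```python
import re
from collections import defaultdict

# Layout observed in the engine.log lines of this report:
#   timestamp LEVEL [logger] (thread) [correlation-id] message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)


def group_by_correlation(lines):
    """Map each non-empty correlation ID to its (level, message) pairs, in order."""
    groups = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("corr"):
            groups[m.group("corr")].append((m.group("level"), m.group("msg")))
    return dict(groups)


# Four lines copied from the log in this report.
CORR = "efba2ef7-9248-41d0-a7c2-406670e4730b"
SAMPLE = [
    "2020-03-25 09:41:50,474+01 ERROR [org.ovirt.engine.core.bll.MigrateVmCommand] "
    f"(default task-7) [{CORR}] Command 'org.ovirt.engine.core.bll.MigrateVmCommand' "
    "failed: EngineException: failed to interact with kubevirt migrate endpoint "
    "(Failed with error unexpected and code 16)",
    "2020-03-25 09:41:50,510+01 WARN  "
    "[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] "
    f"(default task-7) [{CORR}] EVENT_ID: VM_MIGRATION_NO_VDS_TO_MIGRATE_TO(166), "
    "No available host was found to migrate VM cirros-vm-notfromtemplate to.",
    "2020-03-25 09:41:50,517+01 ERROR "
    "[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] "
    f"(default task-7) [{CORR}] EVENT_ID: VM_MIGRATION_FAILED(65), Migration failed  "
    "(VM: cirros-vm-notfromtemplate, Source: talayan-pytest-2g7mj-worker-rgnwj).",
    "2020-03-25 09:41:50,632+01 ERROR "
    "[org.ovirt.engine.api.restapi.resource.AbstractBackendResource] "
    "(default task-7) [] Operation Failed: [Unexpected exception]",
]

if __name__ == "__main__":
    for corr, events in group_by_correlation(SAMPLE).items():
        print(corr)
        for level, msg in events:
            print(f"  {level}: {msg}")
```

Grouped this way, the generic "failed to interact with kubevirt migrate endpoint" error and the unrelated "No available host was found" warning appear side by side under the same correlation ID. The last sample line has an empty correlation ID and is skipped by this sketch.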

Comment 1 Piotr Kliczewski 2020-03-30 11:21:29 UTC
The migration failed correctly but the error was not handled correctly in the engine. The error message in CNV was "cannot migrate VMI which does not use masquerade to connect to the pod network".
With masquerade configured migration works fine. We need to make sure that the user sees why the migration failed.

Comment 4 Pavol Brilla 2020-07-09 18:39:06 UTC
2020-07-09 19:53:22,973+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-7) [e1eea64f-cfef-4935-a64f-74ca9ce35f6a] EVENT_ID: VM_MIGRATION_NO_VDS_TO_MIGRATE_TO(166), No available host was found to migrate VM cirros-vm-notfromtemplate to.
2020-07-09 19:53:22,978+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-7) [e1eea64f-cfef-4935-a64f-74ca9ce35f6a] EVENT_ID: VM_MIGRATION_FAILED(65), Migration failed  (VM: cirros-vm-notfromtemplate, Source: talayan-pytest-8gw99-worker-wqnwd).
2020-07-09 19:53:22,984+02 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-7) [e1eea64f-cfef-4935-a64f-74ca9ce35f6a] Lock freed to object 'EngineLock:{exclusiveLocks='[08e73976-e63f-46f1-baae-0f84e9bae728=VM]', sharedLocks=''}'
2020-07-09 19:53:23,002+02 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-7) [] Operation Failed: [Fatal error during migration]
2020-07-09 19:53:44,290+02 INFO  [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-7) [6b49e45e-2581-4aea-93c8-57e76a68d93f] Lock Acquired to object 'EngineLock:{exclusiveLocks='[08e73976-e63f-46f1-baae-0f84e9bae728=VM]', sharedLocks=''}'
2020-07-09 19:53:44,391+02 INFO  [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-7) [6b49e45e-2581-4aea-93c8-57e76a68d93f] Running command: MigrateVmToServerCommand internal: false. Entities affected :  ID: 08e73976-e63f-46f1-baae-0f84e9bae728 Type: VMAction group MIGRATE_VM with role type USER
2020-07-09 19:53:45,053+02 ERROR [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-7) [6b49e45e-2581-4aea-93c8-57e76a68d93f] Command 'org.ovirt.engine.core.bll.MigrateVmToServerCommand' failed: EngineException: Internal error occurred: admission webhook "migration-create-validator.kubevirt.io" denied the request: Cannot migrate VMI, Reason: DisksNotLiveMigratable, Message: cannot migrate VMI with non-shared PVCs (Failed with error migrateErr and code 12)
2020-07-09 19:53:45,091+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-7) [6b49e45e-2581-4aea-93c8-57e76a68d93f] EVENT_ID: VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed  (VM: cirros-vm-notfromtemplate, Source: talayan-pytest-8gw99-worker-wqnwd, Destination: talayan-pytest-8gw99-worker-rgnr5).
2020-07-09 19:53:45,099+02 INFO  [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-7) [6b49e45e-2581-4aea-93c8-57e76a68d93f] Lock freed to object 'EngineLock:{exclusiveLocks='[08e73976-e63f-46f1-baae-0f84e9bae728=VM]', sharedLocks=''}'
2020-07-09 19:53:45,101+02 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-7) [] Operation Failed: [Fatal error during migration]


VM has evictionStrategy: LiveMigrate

If I need some additional settings to be able to live migrate, please specify them.

Comment 5 Arik 2020-07-20 07:41:06 UTC
(In reply to Pavol Brilla from comment #4)
> If I need some additional settings to be able to live migrate, please specify them.

There's an example for how to create a migratable VM in Kubevirt:
https://github.com/kubevirt/kubevirt/blob/master/examples/vmi-migratable.yaml
Note that this is a VMI and in order to have it shown in oVirt, we need to create a VM.

Comment 6 Arik 2020-07-20 20:48:15 UTC
A virtual machine that should be able to migrate:

---
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  labels:
    special: vm-migratable
  name: vm-migratable
spec:
  running: false
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-migratable
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: containerdisk
          interfaces:
          - masquerade: {}
            name: default
        machine:
          type: ""
        resources:
          requests:
            memory: 64M
      networks:
      - name: default
        pod: {}
      terminationGracePeriodSeconds: 0
      volumes:
      - containerDisk:
          image: registry:5000/kubevirt/alpine-container-disk-demo:devel
        name: containerdisk
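
Before attempting a migration, the networking part of such a manifest can be checked against the rule quoted in comment 1 ("cannot migrate VMI which does not use masquerade to connect to the pod network"). This is a rough sketch only: it assumes the manifest has already been loaded into a Python dict, the function name is hypothetical, and it mirrors just that one rule. It is not KubeVirt's validator, which also checks other conditions such as disks.

```python
def non_masquerade_pod_interfaces(vm):
    """Return names of interfaces attached to the pod network without masquerade.

    `vm` is a VirtualMachine manifest as a dict (hypothetical helper; it only
    reflects the single rule quoted in this bug report).
    """
    spec = vm["spec"]["template"]["spec"]
    # Networks backed by the pod network carry a `pod` key.
    pod_networks = {n["name"] for n in spec.get("networks", []) if "pod" in n}
    interfaces = spec.get("domain", {}).get("devices", {}).get("interfaces", [])
    return [
        i["name"]
        for i in interfaces
        if i["name"] in pod_networks and "masquerade" not in i
    ]


# Networking part of the manifest above, as a dict.
migratable = {
    "spec": {"template": {"spec": {
        "domain": {"devices": {"interfaces": [{"masquerade": {}, "name": "default"}]}},
        "networks": [{"name": "default", "pod": {}}],
    }}}
}

# Same VM, but with a bridge binding instead of masquerade (illustrative).
bridged = {
    "spec": {"template": {"spec": {
        "domain": {"devices": {"interfaces": [{"bridge": {}, "name": "default"}]}},
        "networks": [{"name": "default", "pod": {}}],
    }}}
}

if __name__ == "__main__":
    print(non_masquerade_pod_interfaces(migratable))
    print(non_masquerade_pod_interfaces(bridged))
```

An empty result means no interface violates this particular rule; a non-empty list names the interfaces that would likely trigger the InterfaceNotLiveMigratable rejection.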

Comment 11 Pavol Brilla 2020-08-26 17:32:22 UTC
The reason for the migration failure is now visible in engine.log:

2020-08-26 19:31:04,565+02 ERROR [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-56) [6cb613fc-b87e-475b-9d4e-9e4786200c93] Command 'org.ovirt.engine.core.bll.MigrateVmCommand' failed: EngineException: Internal error occurred: admission webhook "migration-create-validator.kubevirt.io" denied the request: Cannot migrate VMI, Reason: InterfaceNotLiveMigratable, Message: cannot migrate VMI which does not use masquerade to connect to the pod network (Failed with error migrateErr and code 12)
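
The propagated message has a regular shape, so the reason and message can be pulled out of it when scanning logs. The sketch below assumes the wording seen in this report (admission webhook name, "Reason:", "Message:", and the trailing "Failed with error ... and code ..." part); DENIAL_RE and parse_denial are hypothetical names, and other KubeVirt or engine versions may word the error differently.

```python
import re

# Pattern based only on the two webhook denials quoted in this report.
DENIAL_RE = re.compile(
    r'admission webhook "(?P<webhook>[^"]+)" denied the request: '
    r"(?P<summary>[^,]+), Reason: (?P<reason>\w+), Message: (?P<message>.*?)"
    r"(?: \(Failed with error (?P<error>\w+) and code (?P<code>\d+)\))?$"
)


def parse_denial(text):
    """Return the denial fields as a dict, or None if the text does not match."""
    m = DENIAL_RE.search(text)
    return m.groupdict() if m else None


# Exception text copied from the engine.log line above.
SAMPLE = (
    "EngineException: Internal error occurred: admission webhook "
    '"migration-create-validator.kubevirt.io" denied the request: '
    "Cannot migrate VMI, Reason: InterfaceNotLiveMigratable, Message: "
    "cannot migrate VMI which does not use masquerade to connect to the "
    "pod network (Failed with error migrateErr and code 12)"
)

if __name__ == "__main__":
    info = parse_denial(SAMPLE)
    print(info["reason"], "-", info["message"])
```

The same pattern also matches the DisksNotLiveMigratable denial from comment 4, since it follows the same layout.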

Comment 12 Pavol Brilla 2020-08-27 12:06:41 UTC
Verified the point of this bug: a failing migration of an OCP Virtualization VM now produces a better error message.

Comment 16 errata-xmlrpc 2020-09-23 16:11:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Virtualization security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3807

