Bug 889154
| Summary: | [ovirt-engine] host stuck in 'unassigned' forever in case activate is performed during 'preparing for maintenance' state (deadlock!) | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Haim <hateya> |
| Component: | ovirt-engine | Assignee: | Roy Golan <rgolan> |
| Status: | CLOSED DUPLICATE | QA Contact: | Pavel Stehlik <pstehlik> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.1.0 | CC: | acathrow, bazulay, iheim, jkt, lpeer, michal.skrivanek, pstehlik, Rhev-m-bugs, yeylon, yzaslavs |
| Target Milestone: | --- | | |
| Target Release: | 3.2.3 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | virt | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-08-12 13:39:32 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Haim
2012-12-20 10:59:11 UTC
I have a theory: I guess it happens when some VMs went into a paused state during the migrate VM command.

2012-12-19 14:33:48,694 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (pool-3-thread-8) [2671efd9] ResourceManager::vdsMaintenance - Failed migrating desktop indigo-vdc
2012-12-19 14:33:48,704 ERROR [org.ovirt.engine.core.engineencryptutils.EncryptionUtils] (QuartzScheduler_Worker-65) Failed to decrypt Data must start with zero
2012-12-19 14:33:48,733 WARN [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (pool-3-thread-8) [777cfdd] CanDoAction of action InternalMigrateVm failed. Reasons:MIGRATE_PAUSED_VM_IS_UNSUPPORTED,VAR__ACTION__MIGRATE,VAR__TYPE__VM
2012-12-19 14:33:48,733 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (pool-3-thread-8) [777cfdd] ResourceManager::vdsMaintenance - Failed migrating desktop Lime-VDC
2012-12-19 14:33:48,759 WARN [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (pool-3-thread-8) [6251a4ab] CanDoAction of action InternalMigrateVm failed. Reasons:MIGRATE_PAUSED_VM_IS_UNSUPPORTED,VAR__ACTION__MIGRATE,VAR__TYPE__VM
2012-12-19 14:33:48,759 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (pool-3-thread-8) [6251a4ab] ResourceManager::vdsMaintenance - Failed migrating desktop Selenium
2012-12-19 14:33:48,775 WARN [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (pool-3-thread-8) [2d134059] CanDoAction of action InternalMigrateVm failed. Reasons:MIGRATE_PAUSED_VM_IS_UNSUPPORTED,VAR__ACTION__MIGRATE,VAR__TYPE__VM
2012-12-19 14:33:48,775 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (pool-3-thread-8) [2d134059] ResourceManager::vdsMaintenance - Failed migrating desktop gena-31
2012-12-19 14:33:48,792 WARN [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (pool-3-thread-8) [698bbdd6] CanDoAction of action InternalMigrateVm failed. Reasons:MIGRATE_PAUSED_VM_IS_UNSUPPORTED,VAR__ACTION__MIGRATE,VAR__TYPE__VM
2012-12-19 14:33:48,792 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (pool-3-thread-8) [698bbdd6] ResourceManager::vdsMaintenance - Failed migrating desktop paikov-rhevm-gluster3
2012-12-19 14:33:48,808 WARN [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (pool-3-thread-8) [38e9afaf] CanDoAction of action InternalMigrateVm failed. Reasons:MIGRATE_PAUSED_VM_IS_UNSUPPORTED,VAR__ACTION__MIGRATE,VAR__TYPE__VM
2012-12-19 14:33:48,808 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (pool-3-thread-8) [38e9afaf] ResourceManager::vdsMaintenance - Failed migrating desktop ART-mlnx-setup
2012-12-19 14:33:48,826 WARN [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (pool-3-thread-8) [2dfb0161] CanDoAction of action InternalMigrateVm failed. Reasons:MIGRATE_PAUSED_VM_IS_UNSUPPORTED,VAR__ACTION__MIGRATE,VAR__TYPE__VM
2012-12-19 14:33:48,827 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (pool-3-thread-8) [2dfb0161] ResourceManager::vdsMaintenance - Failed migrating desktop wheat-vdc
2012-12-19 14:33:48,850 WARN [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (pool-3-thread-8) [7b1ff695] CanDoAction of action InternalMigrateVm failed. Reasons:MIGRATE_PAUSED_VM_IS_UNSUPPORTED,VAR__ACTION__MIGRATE,VAR__TYPE__VM
2012-12-19 14:33:48,850 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (pool-3-thread-8) [7b1ff695] ResourceManager::vdsMaintenance - Failed migrating desktop monique-vdc0

Created attachment 666612 [details]
engine, server, console logs.
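To check the theory against the attached engine log, one could list the VMs whose migration failed during the maintenance attempt. The following is only a rough sketch: the class and method names are made up for illustration, the pattern is inferred from the "vdsMaintenance - Failed migrating desktop" lines quoted in this report, and the sample VM name in the usage note is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative log scan; not part of ovirt-engine. */
final class FailedMigrationScan {
    // Assumption: the VM name is everything after this marker up to end of line,
    // as in the engine.log excerpts quoted in this bug.
    private static final Pattern FAILED =
            Pattern.compile("vdsMaintenance - Failed migrating desktop (.+)$");

    private FailedMigrationScan() {}

    /** Returns the VM names from every line that reports a failed migration. */
    static List<String> failedVms(List<String> lines) {
        List<String> vms = new ArrayList<>();
        for (String line : lines) {
            Matcher m = FAILED.matcher(line);
            if (m.find()) {
                vms.add(m.group(1).trim());
            }
        }
        return vms;
    }
}
```

Passing the lines of engine.log to `FailedMigrationScan.failedVms` would give the list of affected VMs, which can then be compared with the VMs reported as paused. Lines that only carry the CanDoAction warning are skipped.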
Please add an Expected results section to the bug. Should we block the activate in the migration phase? The problem is clear, but the BZ description does not specify how it should work.

Simon, please answer the questions in comment #8.

(In reply to comment #8)
> Should we block the activate in the migration phase?
> The problem is clear, but the BZ description does not specify how it should work.

Well, the proper solution is to allow cancel = 'Stop all current migrations and set back to up'. But since this is not trivial until we have a good infra for task management (I guess), the easy solution for 3.2 will be to block the activate button until maintenance either fails or ends successfully.

Haim, please open an RFE to allow cancelling preparation for maintenance by hitting the activate button.

Thanks,
Simon

Removing need-info; opened an RFE, still expect a fix here.

After scrub mtg: removing Regression.

There are 2 different issues in this bug:
1. Activation of a host that is in "preparing for maintenance" and has active migrations going on ... stuck in unassigned.
2. VMs get stuck in "Migrating from" status although their migration had probably failed, as they had moved to PAUSED (the only reason I can think of is EIO).

I'd be surprised if 2. is still valid.

Still in progress, for now moving to 3.2.3.

Will review at the next mtg (30th July).
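The interim approach suggested in the comments (block activation until maintenance either fails or ends successfully) could look roughly like the status guard below. This is a minimal sketch, not the ovirt-engine implementation: `HostStatus`, `ActivationGuard`, and `canActivate` are hypothetical names, and the set of states from which activation is allowed is an assumption made for the example.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical host states, named after the statuses mentioned in this bug. */
enum HostStatus {
    UP, PREPARING_FOR_MAINTENANCE, MAINTENANCE, UNASSIGNED, NON_OPERATIONAL
}

/** Illustrative guard only; not taken from the engine code. */
final class ActivationGuard {
    // Assumption: activation is allowed only from these settled states.
    private static final Set<HostStatus> ACTIVATABLE =
            EnumSet.of(HostStatus.MAINTENANCE, HostStatus.NON_OPERATIONAL);

    private ActivationGuard() {}

    /** Returns true if an activate request may proceed for a host in the given status. */
    static boolean canActivate(HostStatus status) {
        return ACTIVATABLE.contains(status);
    }

    public static void main(String[] args) {
        // Print the decision for every status, including the problematic transitional one.
        for (HostStatus s : HostStatus.values()) {
            System.out.println(s + " -> canActivate=" + canActivate(s));
        }
    }
}
```

With a check of this kind, an activate request arriving while the host is still in PREPARING_FOR_MAINTENANCE would be rejected up front instead of racing with the in-flight migrations. Cancelling the preparation itself is left to the separate RFE mentioned above.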