| Summary: | Ghost VMs created with prefix "external-" by "null@N/A" | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Robert McSwain <rmcswain> |
| Component: | ovirt-engine | Assignee: | Francesco Romani <fromani> |
| Status: | CLOSED ERRATA | QA Contact: | sefi litmanovich <slitmano> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 3.5.3 | CC: | amureini, bcholler, dmoessne, fromani, fsun, gklein, inetkach, lpeer, lsurette, luisfa36, melewis, mgoldboi, michal.skrivanek, mkalinin, pspacek, rbalakri, Rhev-m-bugs, rmcswain, sbonazzo, simon.barrett, tjelinek, yeylon, ykaul |
| Target Milestone: | ovirt-3.6.3 | | |
| Target Release: | 3.6.3 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: |
Previously, incorrect handling of the virtual machine recovery file when a virtual machine was shut down from inside the guest caused the recovery file to be leaked. This could lead to incorrect detection of virtual machines, especially in the case of network failures, causing the Red Hat Enterprise Virtualization Manager to find unknown external virtual machines. Now, VDSM has been fixed to no longer leak recovery files. Recovery files are handled correctly, which prevents the Red Hat Enterprise Virtualization Manager from incorrectly detecting virtual machines.
|
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-03-09 21:11:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 1213937, 1121089 | | |
Description: Robert McSwain, 2015-08-12 19:38:50 UTC
Usually this is a result of an unclean removal of a host: the VM is still running, or still reported by VDSM. You should stop it and then remove it again. Please let me know if it helped.

I've pinged the customer again and will let you know if we get any more info from them.

Any news?

I have seen the same behaviour, where a number of previously removed VMs have come back as external. I see the output below in the engine log, but unfortunately I don't have the log from when the VMs were originally removed, so I can't see whether anything went wrong during the removal. I was also upgrading my nodes from 3.4.2 to 3.5.4 just prior to these re-appearing. Not sure if that helps in any way, but I thought I'd share it anyway. I am running 3.5.4.

2015-11-10 05:33:21,685 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-012.demo as external-ny2-laa-012.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,685 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-011-rhel.demo as external-ny2-laa-011-rhel.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,686 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-013-rhel.demo as external-ny2-laa-013-rhel.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,686 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-012-rhel.demo as external-ny2-laa-012-rhel.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,687 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-lda-004a.demo as external-ny2-lda-004a.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,687 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-010.demo as external-ny2-laa-010.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,688 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-014-rhel.demo as external-ny2-laa-014-rhel.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,688 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-014.demo as external-ny2-laa-014.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,689 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-lda-004b.demo as external-ny2-lda-004b.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,689 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-013.demo as external-ny2-laa-013.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,690 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-011.demo as external-ny2-laa-011.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,690 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-009.demo as external-ny2-laa-009.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,690 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-010-rhel.demo as external-ny2-laa-010-rhel.demo, as it is running on the on Host, but does not exist in the engine.
2015-11-10 05:33:21,691 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-32) [518940e8] Importing VM ny2-laa-009-rhel.demo as external-ny2-laa-009-rhel.demo, as it is running on the on Host, but does not exist in the engine.

root> grep "was created by null" engine.log-20151111

2015-11-10 05:33:23,034 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [593a4795] Correlation ID: 593a4795, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-012.demo was created by null@N/A.
2015-11-10 05:33:23,180 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [2795b718] Correlation ID: 2795b718, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-011-rhel.demo was created by null@N/A.
2015-11-10 05:33:23,367 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [2c7eccae] Correlation ID: 2c7eccae, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-013-rhel.demo was created by null@N/A.
2015-11-10 05:33:23,488 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [6d050906] Correlation ID: 6d050906, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-012-rhel.demo was created by null@N/A.
2015-11-10 05:33:23,643 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [77327bf0] Correlation ID: 77327bf0, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-lda-004a.demo was created by null@N/A.
2015-11-10 05:33:23,809 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [7d8ece3f] Correlation ID: 7d8ece3f, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-010.demo was created by null@N/A.
2015-11-10 05:33:23,897 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [6d939916] Correlation ID: 6d939916, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-014-rhel.demo was created by null@N/A.
2015-11-10 05:33:23,981 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [5c6ca5bc] Correlation ID: 5c6ca5bc, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-014.demo was created by null@N/A.
2015-11-10 05:33:24,146 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [cbe9bfb] Correlation ID: cbe9bfb, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-lda-004b.demo was created by null@N/A.
2015-11-10 05:33:24,238 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [5487ee3b] Correlation ID: 5487ee3b, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-013.demo was created by null@N/A.
2015-11-10 05:33:24,328 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [280caed6] Correlation ID: 280caed6, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-011.demo was created by null@N/A.
2015-11-10 05:33:24,417 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [6087c1f1] Correlation ID: 6087c1f1, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-009.demo was created by null@N/A.
2015-11-10 05:33:24,505 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [18e4ef1d] Correlation ID: 18e4ef1d, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-010-rhel.demo was created by null@N/A.
2015-11-10 05:33:24,603 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-32) [92aba53] Correlation ID: 92aba53, Call Stack: null, Custom Event ID: -1, Message: VM external-ny2-laa-009-rhel.demo was created by null@N/A.

Related to bug 1121089 (the same issue, apparently).

Investigation of a locally reproduced problem showed a possible reason: network connectivity/instability is triggering fencing actions, restarting VDSM multiple times. When the VDSM VM recovery is interrupted before it finishes its job, the recovery files are left behind and reused on the following restarts.

Current 3.5 code makes troubleshooting quite difficult, as it doesn't provide sufficient information to know for sure what's going on, so this is still more of a theory than a proof. There are improvements to the logging, as well as to the logic within the recovery thread, on master; they are worth backporting to 3.6 at least.

Targeting 3.6 at the moment, though we may need to wait for more occurrences there.

(In reply to Michal Skrivanek from comment #11)
> Investigation on locally reproduced problem showed a possible reason -
> network connectivity/instability is triggering fencing actions - restart
> vdsm multiple times. When the vdsm VM recovery is interrupted before it
> finishes its job, the recovery files are left behind and reused on following
> restarts.
>
> Current 3.5 code makes troubleshooting quite difficult as it doesn't provide
> sufficient information to know what's going on for sure, so this is still
> more of a theory than a proof.
> There are improvements to logging as well as logic within the recovery
> thread on master, worth backporting to 3.6 at least
>
> Targeting 3.6 at the moment, though we may need to wait for more occurrences
> there

After more investigation, we think the most probable cause is VDSM misdetecting a shutdown from within the guest. In that case VDSM correctly deletes the recovery file, but recreates it after the deletion with 'Down' status. The reason is that VDSM calls 'saveState' after releaseVm in the onQemuDeath callback. saveState has a safety guard to avoid this recreation, but it triggers only in the 'destroy' path.

While we have pretty solid evidence that this is the root cause, I haven't reproduced it yet. Will do ASAP, and post a patch if this is indeed the case.

(In reply to Francesco Romani from comment #12)
> While we have pretty solid evidence this is the root cause, I haven't
> reproduced it yet. Will do ASAP, and post a patch if this is indeed the case.

Forgot to update (sorry about that): confirmed/reproduced.

Patches merged on the 3.6 branch -> MODIFIED

Verified with rhevm-3.6.3-0.1.el6.noarch:
1. Created a VM with RHEL 7.2 installed.
2. Started the VM - a recovery file was created.
3. Shut the VM down from within the guest.
4. After shutdown, with the engine reporting the VM in Down status, no recovery file was re-created on the host.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html

*** Bug 1345942 has been marked as a duplicate of this bug. ***
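The race described in comment #12 can be illustrated with a minimal sketch. This is not actual VDSM code; the class, method, and file names below (`Vm`, `save_state`, `release_vm`, `on_qemu_death`, the `.recovery` suffix) are simplified stand-ins for the real implementation, assumed here only to show why calling saveState after releaseVm leaks a stale recovery file unless the guard also covers the released path:

```python
import os

class Vm:
    """Simplified stand-in for a VDSM VM object (illustrative only)."""

    def __init__(self, vm_id, run_dir):
        self.vm_id = vm_id
        self._recovery_path = os.path.join(run_dir, vm_id + ".recovery")
        self._released = False

    def save_state(self):
        # The fix: skip writing once the VM has been released. Without
        # this guard, the call from on_qemu_death() below would recreate
        # the recovery file with 'Down' status right after it was deleted,
        # and the leftover file would later be recovered as a ghost
        # "external-" VM by the engine.
        if self._released:
            return
        with open(self._recovery_path, "w") as f:
            f.write('{"status": "Down"}')

    def release_vm(self):
        # Clean up the recovery file and remember the VM is gone.
        self._released = True
        if os.path.exists(self._recovery_path):
            os.remove(self._recovery_path)

    def on_qemu_death(self):
        # Order matters: release_vm() runs first, then save_state().
        # The guard above is what keeps this sequence from leaking.
        self.release_vm()
        self.save_state()
```

With the guard in place, a guest-initiated shutdown leaves no recovery file behind, matching the verification steps above (step 4: no recovery file re-created after the VM goes Down).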