Bug 508278

Summary: A running vm will disappear after forced off
Product: Red Hat Enterprise Linux 5
Reporter: Qunfang Zhang <qzhang>
Component: libvirt
Assignee: Cole Robinson <crobinso>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Priority: high
Version: 5.4
CC: berrange, clalance, lihuang, llim, nzhang, qzhang, sprabhu, tao, veillard, virt-maint, xen-maint
Target Milestone: rc
Keywords: Regression
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-09-02 09:19:48 UTC
Attachments (Description, Flags):
Python script reproducing the issue (flags: none)
Don't overwrite domain ID in 'define' for xen XM driver (flags: none)
Refresh /etc/xen cache even if inotify is enabled (flags: none)
Refresh /etc/xen even if inotify enabled (take 2) (flags: none)

Description Qunfang Zhang 2009-06-26 11:50:44 UTC
Description of problem:
A running VM disappears from the virt-manager window after it is forced off. This issue only occurs when the VM has just been created.

Version-Release number of selected component (if applicable):
virt-manager-0.6.1-3.el5

How reproducible:
Sometimes

Steps to Reproduce:
1. Create a virtual machine.
2. Then right-click -> Shutdown -> Force Off.

  
Actual results:
The virtual machine disappears from the window. It reappears after virt-manager is restarted.

Expected results:
The virtual machine is forced off but is still displayed in the window.

Additional info:
This issue only occurs when the VM has just been created.

Comment 2 Qunfang Zhang 2009-06-29 09:02:10 UTC
I created a VM and, while it was running, forced it off at once. It then disappears from the Virtual Machine Manager window instead of remaining there marked "force off". After I restart virt-manager, it appears again and its status is "force off".
Please note:
(1) This issue occurs in both 5u3 and 5u4.
    In 5u4 it occurs sometimes.
(2) This issue only occurs when the VM has just been created.
    For example, if I create two VMs, A and B, and then force off B at once, B disappears. But if I force off A, it behaves normally.

Comment 3 Cole Robinson 2009-06-29 16:17:39 UTC
Created attachment 349809 [details]
Python script reproducing the issue

I'm not sure if this is the exact issue that is confusing virt-manager, but the attached script details some funkiness in this area, either at the libvirt or xen level.

The script will basically do 'list; create; define; destroy; list'. From virsh, this sequence works fine and produces expected results. From the script, this sequence fails on the 'destroy' step. Reproduce this by running the script with a single argument 'fail':

# python test-lose-destroy.py fail
cleanup error: Unknown failure
active IDs     : [0]
inactive Names : ['rhel5.4-pv', 'livecd', 'destroy-test', 'aaa', 'rhel5.4-fv']


libvir: Xen Daemon error : invalid argument in Domain destroy-test isn't running.
Traceback (most recent call last):
  File "test-lose-destroy.py", line 79, in ?
    main()
  File "test-lose-destroy.py", line 73, in main
    vm.destroy()
  File "/usr/lib64/python2.4/site-packages/libvirt.py", line 297, in destroy
    if ret == -1: raise libvirtError ('virDomainDestroy() failed', dom=self)
libvirt.libvirtError: invalid argument in Domain destroy-test isn't running.


However, if the sequence is changed to be 'list; create; define; list; destroy; list', there is _no_ error. This can be reproduced by running the script with no arguments:

# python test-lose-destroy.py
libvir: Xen error : Domain not found: xenUnifiedDomainLookupByUUID
cleanup error: Domain not found: xenUnifiedDomainLookupByUUID
active IDs     : [0]
inactive Names : ['rhel5.4-pv', 'livecd', 'aaa', 'rhel5.4-fv']


active IDs     : [0, 38]
inactive Names : ['rhel5.4-pv', 'livecd', 'aaa', 'rhel5.4-fv']


active IDs     : [0]
inactive Names : ['rhel5.4-pv', 'livecd', 'destroy-test', 'aaa', 'rhel5.4-fv']


I'll reassign this to libvirt.
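
The attached script is not reproduced in this report. As a rough sketch only, the two sequences described above could be driven through the libvirt Python bindings roughly as follows. It assumes that the handle returned by the define call is the one that gets destroyed, which is what the traceback and the description suggest. The names show_lists, run_sequence and open_and_run, the default URI and the XML path argument are hypothetical and are not taken from the attachment.

```python
def show_lists(conn):
    """Print running domain IDs and defined-but-inactive domain names."""
    print("active IDs     : %s" % conn.listDomainsID())
    print("inactive Names : %s" % conn.listDefinedDomains())


def run_sequence(conn, xml, list_before_destroy):
    """Run 'list; create; define; [list;] destroy; list' on an open connection."""
    show_lists(conn)
    conn.createXML(xml, 0)        # start the guest from the XML description
    vm = conn.defineXML(xml)      # persist the same config; keep this handle
    if list_before_destroy:
        show_lists(conn)          # the extra 'list' of the non-failing variant
    vm.destroy()                  # the step that fails in the 'fail' variant
    show_lists(conn)


def open_and_run(xml_path, fail, uri="xen:///"):
    """Hypothetical entry point: needs the libvirt bindings and a Xen host."""
    import libvirt                # imported lazily so the helpers above stay testable
    with open(xml_path) as f:
        xml = f.read()
    conn = libvirt.open(uri)
    run_sequence(conn, xml, list_before_destroy=not fail)
```

Calling open_and_run("guest.xml", fail=True) would correspond to the 'fail' invocation; guest.xml is a placeholder for any domain XML that is valid on the host.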

Comment 4 Daniel Berrangé 2009-06-29 16:31:52 UTC
I expect what is happening here is that the virDomainPtr identifiers are getting mixed up.

e.g., step 1

  dom = conn.create(xml)

dom has ID=6, name=foo, uuid=XXXX

now step 2

  dom = conn.define(xml)
  

dom is probably ending up with ID=-1, name=foo, uuid=XXXX, even though it's already running. So when getting to step 3, the 'destroy' op sees ID=-1 and gives up.

I suspect the DefineXML operation in libvirt's Xen driver simply shouldn't override the existing ID field.
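
To make the suspected mix-up concrete, here is a toy model in Python. It is not libvirt code: Handle, ToyXenDriver and their methods are invented purely for illustration. It only shows how a handle that carries ID=-1 for a running guest makes a destroy call give up, and how keeping the existing ID avoids that.

```python
class Handle(object):
    """Stand-in for a virDomainPtr: just name, uuid and ID."""
    def __init__(self, name, uuid, dom_id):
        self.name = name
        self.uuid = uuid
        self.dom_id = dom_id


class ToyXenDriver(object):
    """Hypothetical driver that tracks running guests by uuid."""
    def __init__(self):
        self.running = {}         # uuid -> ID of running guests
        self.next_id = 6

    def create(self, name, uuid):
        dom_id = self.next_id
        self.next_id += 1
        self.running[uuid] = dom_id
        return Handle(name, uuid, dom_id)

    def define(self, name, uuid, keep_existing_id):
        # keep_existing_id=False models the suspected bug: always ID=-1.
        # keep_existing_id=True reuses the ID of a running guest, if any.
        dom_id = self.running.get(uuid, -1) if keep_existing_id else -1
        return Handle(name, uuid, dom_id)

    def destroy(self, handle):
        if handle.dom_id < 0:
            raise ValueError("Domain %s isn't running." % handle.name)
        del self.running[handle.uuid]
```

With keep_existing_id=False, create followed by define yields a handle whose destroy raises; with keep_existing_id=True the same sequence stops the guest.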

Comment 6 Cole Robinson 2009-07-31 15:00:45 UTC
*** Bug 513958 has been marked as a duplicate of this bug. ***

Comment 8 Cole Robinson 2009-07-31 15:25:27 UTC
Created attachment 355827 [details]
Don't overwrite domain ID in 'define' for xen XM driver

This patch is exactly the fix Dan suggested, and makes my test script work.

Comment 9 Cole Robinson 2009-07-31 16:42:28 UTC
Posted for ACKs upstream:

https://www.redhat.com/archives/libvir-list/2009-July/msg01114.html

Comment 13 Daniel Veillard 2009-08-04 15:05:12 UTC
Patch looks right to me, applying!

Daniel

Comment 14 Cole Robinson 2009-08-04 15:09:58 UTC
Okay, it looks like the above patch doesn't fix the issue triggered via virt-manager,
and my script wasn't exposing the same bug (though the patch still fixes a valid bug). This virt-manager issue has something to do with the xen inotify support for watching files in /etc/xen.

To reproduce:

In terminal 1, run 'virsh --connect xen:///'.
In terminal 2, copy a new xm config into /etc/xen.
In terminal 1, run 'virsh list --all'. The new config isn't listed.

virt-manager hits this, because we fork off a separate connection to do domain
creation. This would still be an issue if a user provisioned a guest using
virt-install while a virsh or virt-manager session was already running: once
the new domain is stopped, it disappears from the list.
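
As a loose illustration of why a second, long-lived connection misses the new guest, the following Python sketch models a per-connection cache of config file names. It is not the libvirt Xen driver and not the attached patch: ConfigDirCache and its methods are hypothetical, and the real driver's caching and inotify handling are more involved.

```python
import os


class ConfigDirCache(object):
    """Toy cache of the config file names found in one directory."""
    def __init__(self, config_dir):
        self.config_dir = config_dir
        self.names = set(os.listdir(config_dir))   # filled once at connect time

    def list_cached(self):
        # Never rescans: files added later by another process stay invisible.
        return sorted(self.names)

    def list_with_refresh(self):
        # Rescan before answering so that new config files are picked up.
        self.names = set(os.listdir(self.config_dir))
        return sorted(self.names)
```

A cache built before a file is copied into the directory keeps returning the old list from list_cached, while list_with_refresh sees the new file.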

If virsh is run with debugging, it seems to indicate that no inotify event is
even being dispatched (since the debug message at the top of
xen_inotify.c:xenInotifyEvent isn't firing).

Moving back to assigned.

Comment 15 Cole Robinson 2009-08-04 15:50:16 UTC
Created attachment 356201 [details]
Refresh /etc/xen cache even if inotify is enabled

This fixes the above test, as well as the original issue with VMs disappearing from virt-manager.

Comment 16 Cole Robinson 2009-08-04 16:11:04 UTC
Created attachment 356206 [details]
Refresh /etc/xen even if inotify enabled (take 2)

Similar to the above, but don't use the old cache updating method if inotify was successfully initialized.

Comment 17 Daniel Veillard 2009-08-04 20:27:25 UTC
Okay, that last patch looks fine, actually cleaner overall!
libvirt-0.6.3-20.el5 has been built in dist-5E-qu-candidate with the 2 fixes

Daniel

Comment 20 Nan Zhang 2009-08-05 05:51:48 UTC
Move to VERIFIED, because the new xm config can be listed with virsh. This bug has been verified with libvirt 0.6.3-20.el5 on RHEL-5.4.

[root@dhcp-66-70-85 ~]# virsh list --all
 Id Name                 State
----------------------------------
  0 Domain-0             running
  - demo                 shut off
  - test                 shut off
  - winxp                shut off
  - xentest              shut off

[root@dhcp-66-70-85 ~]# mv foo /etc/xen/
[root@dhcp-66-70-85 ~]# virsh list --all
 Id Name                 State
----------------------------------
  0 Domain-0             running
  - demo                 shut off
  - foo                  shut off
  - test                 shut off
  - winxp                shut off
  - xentest              shut off

Comment 22 errata-xmlrpc 2009-09-02 09:19:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1269.html

Comment 26 Cole Robinson 2009-10-19 16:03:46 UTC
*** Bug 513958 has been marked as a duplicate of this bug. ***