Bug 693758

Summary: libvirt-guests init script saves but doesn't restore non-persistent domains
Product: Red Hat Enterprise Linux 6
Reporter: Eelco Dolstra <e.dolstra>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: medium
Version: 6.2
CC: crobinso, dallan, dyuan, eblake, e.dolstra, mzhan, rwu, tzheng, xen-maint, yupzhang
Target Milestone: alpha
Target Release: 6.2
Hardware: All
OS: Linux
Fixed In Version: libvirt-0.9.10-4.el6
Doc Type: Bug Fix
Doc Text: No Documentation needed
Last Closed: 2012-06-20 06:26:54 UTC

Description Eelco Dolstra 2011-04-05 14:13:28 UTC
Description of problem:

The libvirt-guests init script saves VMs that are not persistent
(i.e. haven't been defined with "virsh define"), but fails to restore
them successfully.  This is because the script restores the VM by
doing "virsh start $UUID".  This works for persistent domains because
the UUID is known even when the VM isn't running, but for
non-persistent domains, it fails.  Manually restoring the domain by
doing "virsh restore /var/lib/libvirt/qemu/save/$NAME" does work.
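
The difference between the two paths can be sketched as a small shell helper. This is only an illustration, not code from the init script: resume_guest and SAVE_DIR are hypothetical names, the state-file layout ($SAVE_DIR/$NAME.save) is assumed from the listing in the reproduction steps, and the script would also have to record the guest name, since it currently stores only the UUID.

```shell
#!/bin/sh
# Hypothetical helper, not part of the upstream libvirt-guests script.
# Assumption: state files are stored as $SAVE_DIR/<name>.save.
SAVE_DIR=${SAVE_DIR:-/var/lib/libvirt/qemu/save}

resume_guest() {
    uuid=$1
    name=$2
    # "virsh start" only works while libvirt still knows the domain,
    # which is the case for persistent domains.
    if virsh dominfo "$uuid" >/dev/null 2>&1; then
        virsh start "$uuid"
    elif [ -f "$SAVE_DIR/$name.save" ]; then
        # A transient domain is forgotten once it stops, so fall back
        # to restoring it directly from its state file.
        virsh restore "$SAVE_DIR/$name.save"
    else
        echo "no definition or state file for $name" >&2
        return 1
    fi
}
```

A call such as `resume_guest 000681bd-94bd-6352-75aa-5892cb995970 Test1` would then take the restore branch for the transient guest from this report.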


Version-Release number of selected component (if applicable): 0.8.8 / 0.9.0


Steps to Reproduce:

- Create a non-persistent VM:

$ virsh create ./Test1.xml
$ virsh dominfo Test1
...
UUID:           000681bd-94bd-6352-75aa-5892cb995970
...
Persistent:     no

- Save running VMs:

$ /etc/rc.d/init.d/libvirt-guests stop
Running guests on default URI: Test1
Suspending guests on default URI...
Suspending Test1: done

$ cat /var/lib/libvirt/libvirt-guests
default 000681bd-94bd-6352-75aa-5892cb995970

$ ls -l /var/lib/libvirt/qemu/save/
-rw------- 1 root root 251743569 Apr  5 16:02 Test1.save

- Restore the VMs:

$ /etc/rc.d/init.d/libvirt-guests start
Resuming guests on default URI...
Resuming guest 000681bd-94bd-6352-75aa-5892cb995970: error: failed to get domain '000681bd-94bd-6352-75aa-5892cb995970'
error: Domain not found: no domain with matching name '000681bd-94bd-6352-75aa-5892cb995970'


Actual results:

The VM is not restored.


Expected results:

The VM should be restored.  Alternatively, it shouldn't be saved to
begin with.  But personally I would prefer to have all VMs
saved/restored across reboots.


Additional info:

This is libvirt running on NixOS (http://nixos.org/), but using the
upstream libvirt-guests init script.

Comment 4 tingting zheng 2011-11-28 06:01:02 UTC
Hi, I can reproduce the bug with:
libvirt-0.8.7-18.el6.x86_64

Steps to reproduce:
# virsh create test.xml
Domain test created from test.xml

# virsh dominfo test
Id:             2
Name:           test
UUID:           9231984a-2598-7f17-6e5a-91c251b390da
OS Type:        hvm
State:          running
CPU(s):         4
CPU time:       4.8s
Max memory:     1048576 kB
Used memory:    1048576 kB
Persistent:     no
Autostart:      disable
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c225,c558 (enforcing)

# /etc/rc.d/init.d/libvirt-guests stop
Running guests on default URI: test
Suspending guests on default URI...
Suspending test: done    
     
# cat /var/lib/libvirt/libvirt-guests
default 9231984a-2598-7f17-6e5a-91c251b390da

# ls -l /var/lib/libvirt/qemu/save
total 158136
-rw-------. 1 root root 161927372 Nov 28 13:42 test.save

# /etc/rc.d/init.d/libvirt-guests start
Resuming guests on default URI...
Resuming guest 9231984a-2598-7f17-6e5a-91c251b390da: error: failed to get domain '9231984a-2598-7f17-6e5a-91c251b390da'
error: Domain not found: no domain with matching name '9231984a-2598-7f17-6e5a-91c251b390da'

# virsh restore /var/lib/libvirt/qemu/save/test.save
Domain restored from /var/lib/libvirt/qemu/save/test.save

[root@tzheng-linux virsh]# virsh list
 Id Name                 State
----------------------------------
  3 test                 running



I also tested with the newest libvirt build. There, the managed save of the transient domain fails; see the error messages below.
# rpm -qa|grep libvirt
libvirt-0.9.4-23.el6.x86_64

Steps:
# virsh create test.xml
Domain test created from test.xml

# virsh dominfo test
Id:             5
Name:           test
UUID:           9231984a-2598-7f17-6e5a-91c251b390da
OS Type:        hvm
State:          running
CPU(s):         4
CPU time:       1.3s
Max memory:     1048576 kB
Used memory:    1048576 kB
Persistent:     no
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c441,c682 (enforcing)

# /etc/rc.d/init.d/libvirt-guests stop
Running guests on default URI: test
Suspending guests on default URI...
Suspending test: error: Failed to save domain 9231984a-2598-7f17-6e5a-91c251b390da state
error: Requested operation is not valid: cannot do managed save for transient domain

# virsh list
 Id Name                 State
----------------------------------
  5 test                 running

Comment 5 Peter Krempa 2012-01-05 13:46:40 UTC
commit 0de75e855b0c37a7ef25370b19cabad34e679ce6
Author: Eric Blake <eblake>
Date:   Wed Aug 10 08:51:36 2011 -0600

    managedsave: prohibit use on transient domains
    
    Transient domains reject attempts to set autostart, and using
    virDomainCreate to restart a domain only works on persistent
    domains.  Therefore, managed save makes no sense on transient
    domains, and should be rejected up front rather than creating
    an otherwise unrecoverable managed save file.
    
    Besides, transient domains imply that a lot more management is
    being done by the upper layer; this includes the assumption
    that the upper layer is okay managing the saved state file
    created by virDomainSave, and does not need to use managed save.
    
    * src/libvirt.c: Document that transient domains are incompatible
    with managed save.
    * src/qemu/qemu_driver.c (qemuDomainManagedSave): Enforce it.
    * src/libxl/libxl_driver.c (libxlDomainManagedSave): Likewise.

This commit forbids managed save on non-persistent domains. Currently, the libvirt-guests script prints an error if a transient domain is running while it saves the other (persistent) domains. We should fix this so that the script saves only persistent domains.

For transient domains, the script would need to use virDomainSave (virsh save) to save the domains under the control of the script, and restart them if necessary.
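
A rough sketch of what script-controlled saving of transient domains could look like follows. Nothing here is taken from the libvirt-guests script: STATE_DIR, TRANSIENT_LIST, save_transient, and restore_transients are made-up names, and only the "virsh save" and "virsh restore" commands themselves come from this report.

```shell
#!/bin/sh
# Hypothetical sketch; STATE_DIR and TRANSIENT_LIST are made-up names.
STATE_DIR=${STATE_DIR:-/var/lib/libvirt/libvirt-guests.d}
TRANSIENT_LIST=${TRANSIENT_LIST:-$STATE_DIR/transient.list}

# Save a transient guest with plain "virsh save" (not managed save)
# and record the state file so that it can be restored later.
save_transient() {
    name=$1
    file="$STATE_DIR/$name.save"
    mkdir -p "$STATE_DIR" || return 1
    if virsh save "$name" "$file"; then
        echo "$file" >> "$TRANSIENT_LIST"
    else
        return 1
    fi
}

# Restore every recorded state file, then drop the list. A state file
# is removed only after its restore succeeded.
restore_transients() {
    [ -f "$TRANSIENT_LIST" ] || return 0
    while IFS= read -r file; do
        # Keep virsh away from the loop's stdin (the list file).
        virsh restore "$file" </dev/null && rm -f "$file"
    done < "$TRANSIENT_LIST"
    rm -f "$TRANSIENT_LIST"
}
```

In this sketch the script, rather than libvirt, owns the state files, which matches the commit message's point that the caller manages files created by virDomainSave.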

Comment 6 Dave Allan 2012-01-09 13:36:39 UTC
I don't think it makes sense for the libvirt-guests script to attempt to save and restore transient guests. I agree with Eric's comment in the commit that transient domains imply that something is taking an active role in managing them. I think what has been implemented so far, rejecting attempts to save them, is correct and all that's required. If you disagree, please open a new BZ for saving and restoring transient guests. Moving to POST.

Comment 9 dyuan 2012-02-15 06:48:55 UTC
Tested with libvirt-0.9.10-1.el6. There is a minor issue remaining: should I verify this bug and track the issue in a new bug, or should it be fixed in this one?

I prepared a persistent domain and a transient domain. libvirt-guests saves only the persistent domain and skips the transient one.

However, the transient domain's UUID is still recorded in /var/lib/libvirt/libvirt-guests, so an error message still appears when libvirt-guests starts.

# virsh dominfo rhel6
Id:             3
Name:           rhel6
UUID:           4f2e1779-7040-702c-efd0-380e87f73a5d
OS Type:        hvm
State:          running
CPU(s):         1
CPU time:       18.2s
Max memory:     1048576 kB
Used memory:    1048576 kB
Persistent:     no
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c197,c946 (enforcing)

# virsh dominfo rhel62-1
Id:             4
Name:           rhel62-1
UUID:           05d9a9f8-3def-491c-e649-87718ea2d98a
OS Type:        hvm
State:          running
CPU(s):         1
CPU time:       18.5s
Max memory:     1048576 kB
Used memory:    1048576 kB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c229,c852 (enforcing)

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 3     rhel6                          running
 4     rhel62-1                       running

# service libvirt-guests stop
Running guests on default URI: rhel6, rhel62-1
Suspending guests on default URI...
Suspending rhel6: error: Failed to save domain 4f2e1779-7040-702c-efd0-380e87f73a5d state
error: Requested operation is not valid: cannot do managed save for transient domain
Suspending rhel62-1: done         

# cat /var/lib/libvirt/libvirt-guests 
default 05d9a9f8-3def-491c-e649-87718ea2d98a 4f2e1779-7040-702c-efd0-380e87f73a5d

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 3     rhel6                          running
 -     rhel62-1                       shut off

# ll /var/lib/libvirt/qemu/save/
total 289116
-rw-------. 1 root root 295756729 Feb 15 14:26 rhel62-1.save


# virsh save rhel6 /tmp/transient.save

Domain rhel6 saved to /tmp/transient.save

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     rhel62-1                       shut off

# service libvirt-guests start
Resuming guests on default URI...
Resuming guest 4f2e1779-7040-702c-efd0-380e87f73a5d: error: failed to get domain '4f2e1779-7040-702c-efd0-380e87f73a5d'
error: Domain not found: no domain with matching name '4f2e1779-7040-702c-efd0-380e87f73a5d'
Resuming guest rhel62-1: done

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 5     rhel62-1                       running

Comment 10 Peter Krempa 2012-02-15 09:38:30 UTC
I'll use this bug to track the issue. I'll modify the script to ignore transient domains completely, which gets rid of both error messages: the one on stop (complaining that transient domains cannot be suspended) and the one on start (complaining about non-existent domains).
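
One way to ignore transient domains is to split the running guests by the "Persistent:" line that "virsh dominfo" prints, as seen in the transcripts above. The following is a sketch under that assumption and not the actual patch: is_persistent and split_guests are hypothetical names, and the parsing assumes the untranslated dominfo output shown in this report.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual change made to libvirt-guests.

# Succeeds only if "virsh dominfo" reports "Persistent: yes".
# Guests that cannot be queried are treated as not persistent.
is_persistent() {
    virsh dominfo "$1" 2>/dev/null |
        awk '$1 == "Persistent:" { found = ($2 == "yes") } END { exit !found }'
}

# Split the given UUIDs into the space-separated variables
# $persistent and $transient.
split_guests() {
    persistent=
    transient=
    for uuid in "$@"; do
        if is_persistent "$uuid"; then
            persistent="${persistent:+$persistent }$uuid"
        else
            transient="${transient:+$transient }$uuid"
        fi
    done
}
```

With such a split, only $persistent would be suspended and written to /var/lib/libvirt/libvirt-guests, so neither the stop path nor the start path would touch a transient guest.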

Comment 15 dyuan 2012-03-07 09:35:34 UTC
Verified PASS with libvirt-0.9.10-4.el6.

# service libvirt-guests stop

Running guests on default URI: vr-guest_managedsave, rhel6
Not suspending transient guests on URI: default: rhel6
Suspending guests on default URI...
Suspending vr-guest_managedsave: done         

# more /var/lib/libvirt/libvirt-guests 
default 3862afa0-3ff8-80d1-51f2-cff6ec3880a6

# service libvirt-guests start

Resuming guests on default URI...
Resuming guest vr-guest_managedsave: done

Comment 16 Peter Krempa 2012-05-02 13:07:59 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No Documentation needed

Comment 18 errata-xmlrpc 2012-06-20 06:26:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0748.html