Bug 841629

Summary: Save fails with error "An undefined error has ocurred"
Product: Red Hat Enterprise Linux 6
Reporter: Eduardo Elias Ferreira <edusf>
Component: qemu-kvm
Assignee: Gerd Hoffmann <kraxel>
Status: CLOSED DUPLICATE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Priority: unspecified
Version: 6.3
CC: acathrow, areis, bsarathy, dallan, dyasny, dyuan, eblake, jwu, mkenneth, mzhan, rwu, virt-maint, yupzhang
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-09-24 06:35:28 UTC
Type: Bug
Attachments:
Libvirt output (flags: none)

Description Eduardo Elias Ferreira 2012-07-19 15:56:50 UTC
Description of problem: Trying to save the machine state fails.

Version-Release number of selected component (if applicable): RHEL 6.3

How reproducible: Always

Steps to Reproduce:
1. Start a VM
2. Go to "Shutdown" -> "Save"
  
Actual results:

Error saving domain: internal error unable to execute QEMU command 'migrate': An undefined error has ocurred

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 44, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/engine.py", line 955, in _save_callback
    newvm.save(file_to_save, meter=meter)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1123, in save
    self._backend.managedSave(0)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 789, in managedSave
    if ret == -1: raise libvirtError ('virDomainManagedSave() failed', dom=self)
libvirtError: internal error unable to execute QEMU command 'migrate': An undefined error has ocurred


Expected results:

Save the machine state.

Comment 2 Dave Allan 2012-07-19 18:48:52 UTC
Eduardo, can you provide the package versions for virt-manager and libvirt?

Comment 3 Eduardo Elias Ferreira 2012-07-19 19:43:55 UTC
libvirt-0.9.10-21.el6_3.1.x86_64
virt-manager-0.9.0-14.el6.x86_64

Comment 4 Eric Blake 2012-07-19 19:48:40 UTC
And which qemu version?

Comment 5 Eduardo Elias Ferreira 2012-07-19 19:52:19 UTC
qemu-kvm-0.12.1.2-2.295.el6.x86_64

Comment 6 Daisy Wu 2012-07-23 09:31:34 UTC
I cannot reproduce this bug.

Version-Release number of selected component:
qemu-kvm-0.12.1.2-2.295.el6.x86_64
libvirt-0.9.10-21.el6_3.1.x86_64
virt-manager-0.9.0-14.el6.x86_64
python-virtinst-0.600.0-8.el6.noarch

Steps:
1. Launch virt-manager
# virt-manager --debug

2. Start a VM (rhel6.3).
3. Open this VM -> click "Virtual Machine" in menu -> go to "Shutdown" -> click "Save" 
4. Check that the save succeeds and the VM shuts off.
5. Click "Virtual Machine" in menu -> click "Restore"
6. The VM restores successfully and works normally.

debug info:
2012-07-23 04:51:25,026 (engine:1021): Starting vm 'rhel6.1-qcow2'.
2012-07-23 04:51:48,034 (engine:471): window counter incremented to 2
2012-07-23 04:51:48,039 (console:1078): Starting connect process for proto=vnc trans=None connhost=localhost connuser=None connport=None gaddr=127.0.0.1 gport=5900 gsocket=None
2012-07-23 04:51:48,042 (console:374): VNC connecting to localhost:5900
2012-07-23 04:51:48,751 (console:989): Viewer connected
2012-07-23 04:54:17,164 (console:961): Viewer disconnected
2012-07-23 04:54:17,737 (domain:110): Error calling jobinfo
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/domain.py", line 94, in jobinfo_cb
    jobinfo = vm.job_info()
  File "/usr/share/virt-manager/virtManager/domain.py", line 794, in job_info
    return self._backend.jobInfo()
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1740, in jobInfo
    if ret is None: raise libvirtError ('virDomainGetJobInfo() failed', dom=self)
libvirtError: Requested operation is not valid: domain is not running
2012-07-23 04:55:37,959 (engine:1021): Starting vm 'rhel6.1-qcow2'.
2012-07-23 04:55:43,471 (console:1078): Starting connect process for proto=vnc trans=None connhost=localhost connuser=None connport=None gaddr=127.0.0.1 gport=5900 gsocket=None
2012-07-23 04:55:43,473 (console:374): VNC connecting to localhost:5900
2012-07-23 04:55:43,870 (console:989): Viewer connected

Comment 7 Eduardo Elias Ferreira 2012-07-24 14:22:28 UTC
I can also reproduce with virsh:

$ sudo virsh managedsave Fedora
error: Failed to save domain Fedora state
error: internal error unable to execute QEMU command 'migrate': An undefined error has ocurred
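For anyone who wants to reproduce this outside both virt-manager and virsh, here is a minimal sketch using the libvirt Python bindings. It makes the same managedSave(0) call that appears in the traceback from the description. The helper name try_managed_save and the default domain name and URI in main() are only illustrative assumptions.

```python
def try_managed_save(domain):
    """Call managedSave(0) on a libvirt domain-like object.

    Returns (ok, message) instead of raising, so the qemu error text
    can be logged or compared easily.
    """
    try:
        domain.managedSave(0)
    except Exception as exc:  # libvirt.libvirtError with the real bindings
        return False, str(exc)
    return True, "saved"


def main(name="Fedora", uri="qemu:///system"):
    # Imported here so the helper above can be exercised without libvirt.
    import libvirt

    conn = libvirt.open(uri)
    try:
        ok, message = try_managed_save(conn.lookupByName(name))
        print("%s: %s" % (name, message))
    finally:
        conn.close()
```

Calling main() as root on the affected host should print the same 'migrate' error text if the problem is independent of the client tool.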

Comment 8 Eduardo Elias Ferreira 2012-07-24 14:33:33 UTC
(In reply to comment #7)
> I can also reproduce with virsh:
> 
> $ sudo virsh managedsave Fedora
> error: Failed to save domain Fedora state
> error: internal error unable to execute QEMU command 'migrate': An undefined
> error has ocurred

This error occurs with these versions:
libvirt-python-0.9.10-21.el6_3.3.x86_64
libvirt-client-0.9.10-21.el6_3.3.x86_64
libvirt-0.9.10-21.el6_3.3.x86_64

Comment 9 yuping zhang 2012-08-08 10:42:44 UTC
Hi Eduardo,

I still cannot reproduce this issue with:
libvirt-python-0.9.10-21.el6_3.3.x86_64
libvirt-0.9.10-21.el6_3.3.x86_64
libvirt-client-0.9.10-21.el6_3.3.x86_64

I installed RHEL and Fedora 16 guests; save and restore work well.
What's your kvm version?

Comment 10 Eduardo Elias Ferreira 2012-08-08 12:11:48 UTC
qemu-kvm-0.12.1.2-2.295.el6_3.1.x86_64
qemu-kvm-tools-0.12.1.2-2.295.el6_3.1.x86_64

Just to double-check:

libvirt-python-0.9.10-21.el6_3.3.x86_64
libvirt-client-0.9.10-21.el6_3.3.x86_64
libvirt-0.9.10-21.el6_3.3.x86_64

Comment 11 Eric Blake 2012-08-08 20:59:25 UTC
The typo in the error message ("An undefined error has ocurred") is only in qemu, not libvirt; libvirt is faithfully reporting the message from qemu.  I'm not sure if it is something libvirt is doing wrong, or something that could be reproduced using just raw qemu.

Comment 12 Martin Kletzander 2012-08-09 06:32:30 UTC
I confirm that this error comes from qemu. A more detailed error message is available only when qemu is compiled with debugging output.

Looking at its code, there are two reasons why qemu reports this error in this particular situation: either it doesn't get the file descriptor to migrate to, or the file operations cannot be set to non-blocking.

Could you stop libvirtd and start it like this:
LIBVIRT_LOG_FILTERS=1:qemu_monitor LIBVIRT_LOG_OUTPUTS=1:file:<filename> libvirtd

with '<filename>' being an output file, and then attach the file to this BZ?
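If it is easier to script, the same invocation can be sketched in Python. The two environment variables and their values are taken from the command above. The function names and whatever log path is passed in are only illustrative, and the system libvirtd still has to be stopped first and the script run as root.

```python
import os
import subprocess


def libvirtd_debug_env(log_path, base_env=None):
    """Return an environment with qemu monitor debug logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIBVIRT_LOG_FILTERS"] = "1:qemu_monitor"
    env["LIBVIRT_LOG_OUTPUTS"] = "1:file:%s" % log_path
    return env


def run_libvirtd_with_debug(log_path):
    """Start libvirtd in the foreground with the debug environment."""
    return subprocess.Popen(["libvirtd"], env=libvirtd_debug_env(log_path))
```

The log file named by log_path is the one to attach after reproducing the failure.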

Thanks, Martin.

Comment 13 Martin Kletzander 2012-08-09 07:00:26 UTC
Sorry, I forgot to mention: after starting the libvirt daemon with the specified command, please reproduce the bug. Thanks.

Comment 14 Eduardo Elias Ferreira 2012-08-09 15:46:19 UTC
Created attachment 603292 [details]
Libvirt output

I started libvirtd with those flags.

The steps used to create it:
- Start Windows VM
- Try to save it using virt-manager (got the error)
- force it off

- Start Fedora VM
- Try to save it:
virsh -c qemu:///system managedsave Fedora
error: Failed to save domain Fedora state
error: internal error unable to execute QEMU command 'migrate': An undefined error has ocurred


I could not see anything in the log file that helps. I hope it does, though.

Comment 15 Martin Kletzander 2012-08-13 07:47:53 UTC
Hi again,

Yes, I've found what I wanted; however, I need a few more things. Could you send me the output of the following commands?

ls -alZ /var/lib/libvirt/qemu/save # to see the permissions
df /var/lib/libvirt/qemu/save      # to see what's the mountpoint

I was looking for one thing in the logs, so a few other things are filtered out. What user is qemu running as? And do you have dynamic_ownership set in /etc/libvirt/qemu.conf?

Thanks, Martin.

Comment 16 Eduardo Elias Ferreira 2012-09-03 15:19:53 UTC
$ ls -alZ /var/lib/libvirt/qemu/save
ls: cannot access /var/lib/libvirt/qemu/save: Permission denied

$ sudo ls -alZ /var/lib/libvirt/qemu/save
drwxr-xr-x. qemu qemu system_u:object_r:qemu_var_run_t:s0 .
drwxr-x---. qemu qemu unconfined_u:object_r:qemu_var_run_t:s0 ..

$ sudo  df /var/lib/libvirt/qemu/save
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_oc5152480668-lv_root
                     303062120 124062608 175921212  42% /


Qemu is running as root.

Regarding /etc/libvirt/qemu.conf, the line with dynamic_ownership is commented out.

Comment 17 Dave Allan 2012-09-12 15:34:08 UTC
Eduardo, I'm not sure what more we can do here.  The steps to reproduce you have provided do not allow either libvirt developers or QE to reproduce the behavior you're seeing, so I'm going to close this BZ as WORKSFORME.  If you find additional data that shows what's failing on your system, please feel free to open a new BZ.  Thanks, Dave

Comment 18 Eduardo Elias Ferreira 2012-09-19 20:48:06 UTC
I removed a patch from the package and that seems to fix the problem. I was able to save again.  The patch is: kvm-ehci-doesn-t-support-migration.patch

I have not yet found out why, or whether a later patch is related.

The patch is marked as fixing bz#723870.

Reopening the bug.

Comment 19 Dave Allan 2012-09-19 20:57:50 UTC
(In reply to comment #18)
> I remove a patch from the package and seems to fix the problem. I was able
> to save again.  The patch is: kvm-ehci-doesn-t-support-migration.patch
> 
> I have not found out why yet or if there is a later patch that is related.
> 
> The patch is marked to fix the bz#723870
> 
> Reopening the bug.

What package are you referring to? It sounds like that's a qemu patch.

Comment 20 Martin Kletzander 2012-09-20 07:23:12 UTC
I see that's a qemu patch, but I don't see any functional change there. Most probably some subsequent patch uses the structure member introduced in this one.

Thanks very much for finding this out, but since this is a qemu patch, I'm reassigning to qemu-kvm. I'll stay and monitor this bug, however, so in case any help or information is needed, feel free to ask.

Comment 21 Eduardo Elias Ferreira 2012-09-20 12:09:41 UTC
I apologize for the lack of information.

It is a qemu-kvm patch. 

It was removed from this package version: qemu-kvm-0.12.1.2-2.295.el6_3.2.rpm

Comment 22 Ademar Reis 2012-09-22 02:16:40 UTC
(In reply to comment #18)
> I remove a patch from the package and seems to fix the problem. I was able
> to save again.  The patch is: kvm-ehci-doesn-t-support-migration.patch
> 
> I have not found out why yet or if there is a later patch that is related.
> 
> The patch is marked to fix the bz#723870
> 
> Reopening the bug.

Gerd, this patch was added by you to fix Bug 723870. Please investigate.

Comment 23 Gerd Hoffmann 2012-09-24 06:35:28 UTC
Works as intended; ehci simply doesn't support live migration in RHEL 6.3.

kvm-ehci-doesn-t-support-migration.patch makes sure qemu doesn't allow live migration in case ehci is present in the virtual machine.  You can make things apparently work by removing the patch, but the ehci controller will fail to work properly after restoring the machine.

Your options:
  (1) remove the ehci controller from the virtual machine.
  (2) wait for rhel 6.4 which will remove the restriction.
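For option (1), a rough way to check whether a guest has an ehci controller is to scan its libvirt domain XML (for example the output of virsh dumpxml). The sketch below assumes the controller is declared as a <controller type='usb'> element whose model contains "ehci". The function name and the sample XML are made up for illustration, and a controller added through raw qemu command-line passthrough would not be detected.

```python
import xml.etree.ElementTree as ET


def ehci_controllers(domain_xml):
    """Return the model names of USB controllers that look like ehci."""
    root = ET.fromstring(domain_xml)
    found = []
    for ctrl in root.findall("./devices/controller"):
        model = ctrl.get("model", "")
        if ctrl.get("type") == "usb" and "ehci" in model:
            found.append(model)
    return found


# Hypothetical domain XML, trimmed to the relevant elements.
SAMPLE = """<domain type='kvm'>
  <name>Fedora</name>
  <devices>
    <controller type='usb' index='0' model='ich9-ehci1'/>
    <controller type='usb' index='0' model='ich9-uhci1'/>
    <controller type='ide' index='0'/>
  </devices>
</domain>"""

print(ehci_controllers(SAMPLE))  # prints ['ich9-ehci1']
```

A non-empty result would suggest the guest is affected by the restriction; the controller can then be removed with virsh edit while the guest is shut off.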

*** This bug has been marked as a duplicate of bug 805172 ***