Bug 807386 - Obscure error message starting KVM guest - error while loading state section id 3
Keywords:
Status: CLOSED DUPLICATE of bug 818172
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 6.4
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-03-27 16:10 UTC by Karen Noel
Modified: 2013-01-23 10:35 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-29 16:02:08 UTC
Target Upstream Version:



Description Karen Noel 2012-03-27 16:10:53 UTC
Description of problem:

Trying to start my guest, I got the following error message.

# virsh start knoel1
error: Failed to start domain knoel1
error: Unable to read from monitor: Connection reset by peer

Error in /var/log/libvirt/qemu/knoel1.log:

2012-03-27 15:39:28.900+0000: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -S -M rhel6.2.0 -enable-kvm -m 2048 -smp 4,sockets=4,cores=1,threads=1 -name knoel1 -uuid 3e1d5d47-4963-dcd8-a3f1-311d1b3c677f -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/knoel1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/knoel1.img,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/libvirt/images/knoel1-1.img,if=none,id=drive-virtio-disk1,format=raw -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a6:c4:4e,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0,password -vga cirrus -incoming fd:20 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
char device redirected to /dev/pts/1
qemu: warning: error while loading state section id 3
load of migration failed
2012-03-27 15:39:31.108+0000: shutting down

"Section id 3" doesn't give me a clue what's wrong. "Load of migration" is also misleading. I never migrated the guest.

To work around the problem, I moved /var/lib/libvirt/qemu/save/knoel1.save away to another directory. I saved it for debugging.

Avi debugged and found that "section id 3" is "mem". The guess is that the .save file was truncated.

Version-Release number of selected component (if applicable):

How reproducible:

Can reproduce the error with my guest's .save file.

Cannot reproduce the bad .save file.

Steps to Reproduce:

"virsh start knoel1" with bad /var/lib/libvirt/qemu/save/knoel1.save on virtlab204.

Not sure how to reproduce the bad .save file. Maybe a developer can reproduce a truncated .save file?

I have the above .save file on my system - virtlab204. Contact me for access.
  
Actual results:
  
Obscure error message.

Expected results:

Clear error message so the user knows how to fix the problem.

Additional info:

Comment 2 juzhang 2012-03-28 07:49:48 UTC
Hi, Karen

Would you please provide the qemu-kvm version? Thanks.

Comment 3 Karen Noel 2012-03-28 15:55:14 UTC
This happened with 6.2 as well as the latest 6.3 qemu-kvm. I was using qemu-kvm-0.12.1.2-2.265.el6.

Comment 4 Osier Yang 2012-05-09 02:58:45 UTC
I am moving this to libvirt, as libvirt is expected to recognize the corrupted domain state file and ignore it when starting the domain. See BZ #730750.

Comment 5 Eric Blake 2012-05-09 19:26:39 UTC
Can you please attach the first 4k of the corrupted save file to this BZ (should just be some binary data followed by the guest XML, limiting to 4k will avoid sending any actual saved guest state, so you don't have to worry about that security aspect)?  Libvirt detects corrupted state files by writing a temporary header, then doing migrate to file, then finally rewriting the header to the proper value.  Also, what version of libvirt was running at the time the guest was previously subject to a managedsave operation?  I assume you are using the libvirt-guests service, which does a guest managedsave on host shutdown?
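
The detection scheme described in the comment above (temporary header, then the state data, then the final header) can be sketched roughly as follows. This is a minimal illustration, not libvirt's code: the marker values, the function names write_save_file and is_complete, and the flat file layout are all assumptions made up for the example.

```python
import os

# Hypothetical 16-byte markers, chosen for illustration only.
# They are not claimed to match libvirt's on-disk save format.
MAGIC_PARTIAL = b"ExampleSavePart!"
MAGIC_COMPLETE = b"ExampleSaveDone!"


def write_save_file(path, payload):
    """Write payload behind a header that is finalized only at the end."""
    with open(path, "wb") as f:
        f.write(MAGIC_PARTIAL)       # 1. temporary header
        f.write(payload)             # 2. stand-in for "migrate to file"
        f.flush()
        os.fsync(f.fileno())         # push the payload to disk first
        f.seek(0)
        f.write(MAGIC_COMPLETE)      # 3. rewrite the header to its final value
        f.flush()
        os.fsync(f.fileno())


def is_complete(path):
    """Return True only if the header was rewritten to the final marker."""
    with open(path, "rb") as f:
        return f.read(len(MAGIC_COMPLETE)) == MAGIC_COMPLETE
```

A save that is interrupted before step 3 leaves the temporary marker in place, so is_complete returns False and the file can be ignored at start-up. A check like this only looks at the header, so it would not by itself notice a file whose header was finalized but whose body was damaged afterwards.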

Comment 6 Michal Privoznik 2012-05-25 14:28:39 UTC
Karen, can you please provide the first 4K of the corrupted save file as requested in comment #5? Thanks!
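
One possible way to produce the requested 4K excerpt is sketched below. The helper name copy_head and the output file name are hypothetical; the source path in the usage note is the one given in the description and would need adjusting if the file has since been moved.

```python
def copy_head(src, dst, limit=4096):
    """Copy at most `limit` bytes from the start of src into dst.

    Returns the number of bytes actually copied.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        data = fin.read(limit)   # header and guest XML, no further guest state
        fout.write(data)
    return len(data)
```

For example, copy_head("/var/lib/libvirt/qemu/save/knoel1.save", "knoel1.save.head") would write the first 4096 bytes to a new file that can then be attached to this bug.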

Comment 7 RHEL Program Management 2012-07-10 08:17:54 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 8 RHEL Program Management 2012-07-10 23:51:14 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 9 Dave Allan 2012-07-31 21:06:20 UTC
Jiri, this is a dup of the fs corruption on shutdown that you worked on, right?

Comment 10 Jiri Denemark 2012-08-06 14:17:37 UTC
It looks like it could be the bug. Eric Sandeen identified and fixed an issue in writeback code, which he believes was the reason for fs corruption. See bug 818172 (the original fs corruption bug 749527 is likely a duplicate of that bug). The bug is supposed to be fixed in kernel-2.6.32-280.el6.

Comment 11 Michal Privoznik 2012-08-29 16:02:08 UTC
I also think this is a dup.

*** This bug has been marked as a duplicate of bug 818172 ***

