Bug 1017194
Summary: | Libvirtd crash when destroying a Linux guest after a series of S3 and save/restore operations | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Chris Pelland <cpelland> |
Component: | libvirt | Assignee: | Eric Blake <eblake> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 6.4 | CC: | acathrow, berrange, bili, cwei, dallan, dyuan, eblake, jdenemar, jiahu, jsvarova, mjenner, mzhan, pm-eus, shyu, zhwang |
Target Milestone: | rc | Keywords: | Regression, ZStream |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | libvirt-0.10.2-18.el6_4.15 | Doc Type: | Bug Fix |
Doc Text: |
Code refactoring done to fix another bug left a case in which locks were cleaned up incorrectly. As a consequence, the libvirtd daemon could terminate unexpectedly in certain migration-to-file scenarios. With this update, the lock cleanup paths have been fixed and libvirtd no longer crashes when saving a domain to a file.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2013-11-13 10:28:55 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 928661 | ||
Bug Blocks: |
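The Doc Text above attributes the crash to an unbalanced lock-cleanup path left behind by refactoring. As a rough illustration only (this is not libvirt's actual code; all names below are hypothetical), the failure mode is the classic double release on an error path:

```python
import threading

class DomainJob:
    """Hypothetical stand-in for a per-domain job lock (illustrative only)."""

    def __init__(self):
        self.lock = threading.Lock()

    def save_buggy(self):
        # Refactored error path: the lock is released in the error
        # handler AND again in the common cleanup, so the second
        # release operates on an unheld lock. In C this is undefined
        # behavior and can crash the daemon; Python raises RuntimeError.
        self.lock.acquire()
        try:
            raise IOError("save to file failed")
        except IOError:
            self.lock.release()  # cleanup inside the error handler
        self.lock.release()      # common cleanup -> double release

    def save_fixed(self):
        # Fixed pattern: exactly one balanced cleanup path.
        with self.lock:
            pass  # perform the save; lock is released exactly once
```

The fix in libvirt-0.10.2-18.el6_4.15 corresponds to the second pattern: every path out of the save operation releases the lock exactly once.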
Description
Chris Pelland
2013-10-09 12:22:03 UTC
Verified this bug on libvirt-0.10.2-18.el6_4.15.x86_64. The verification steps were as follows.

Package info:
qemu-kvm-rhev-0.12.1.2-2.355.el6_4.9.x86_64
libvirt-0.10.2-18.el6_4.15.x86_64
kernel-2.6.32-358.26.1.el6.x86_64
qemu-guest-agent-0.12.1.2-2.355.el6_4.9.x86_64.rpm

Steps:

1. Prepare a running guest with qemu-ga installed.

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 33    rhel64m                        running

In the guest:
## service qemu-ga start
# ps aux | grep qemu-ga
root      1523  0.0  0.0   7280   544 ?  Ss  01:03  0:00 /usr/bin/qemu-ga --daemonize --method virtio-serial --path /dev/virtio-ports/org.qemu.guest_agent.0 --logfile /var/log/qemu-ga.log --pidfile /var/run/qemu-ga.pid --blacklist guest-file-open guest-file-close guest-file-read guest-file-write guest-file-seek guest-file-flush

2. Do S3/S4 on the guest. Currently, on rhel6.4.z the second S3 operation reports an error; qemu bug 881585 tracks this issue and will not be fixed in rhel6.4, but development offered two workarounds, so we can verify this bug using them.

Without the workarounds, we usually get the following error:

# virsh dompmsuspend rhel64m --target mem
Domain rhel64m successfully suspended
# virsh dompmwakeup rhel64m
Domain rhel64m successfully woken up
# virsh dompmsuspend rhel64m --target mem
error: Domain rhel64m could not be suspended
error: Guest agent is not responding: Guest agent not available for now
# virsh dompmsuspend rhel64m --target mem
error: Domain rhel64m could not be suspended
error: Guest agent is not responding: Guest agent not available for now
# virsh dompmsuspend rhel64m --target mem
error: Domain rhel64m could not be suspended
error: Guest agent is not responding: QEMU guest agent is not available due to an error

3. To work around the issue in step 2, remove pm-utils or set SELinux to permissive mode in the guest, then re-test S3/S4. Both S3 and S4 work well:

# virsh dompmsuspend rhel64m --target mem
Domain rhel64m successfully suspended
# virsh dompmwakeup rhel64m
Domain rhel64m successfully woken up
# virsh dompmsuspend rhel64m --target disk
Domain rhel64m successfully suspended
# virsh start rhel64m
Domain rhel64m started

4. Execute a series of operations involving S3 and save/restore:

# virsh dompmsuspend rhel64m --target mem
Domain rhel64m successfully suspended
# virsh dompmwakeup rhel64m
Domain rhel64m successfully woken up
# virsh save rhel64m /tmp/rhel4m
Domain rhel64m saved to /tmp/rhel4m
# virsh restore /tmp/rhel4m
Domain restored from /tmp/rhel4m

5. After restoring from the save file, re-do S3 on the guest. The command hangs; existing bug 890648 tracks this issue, which is not yet fixed but does not block this verification.

# virsh dompmsuspend rhel64m --target mem
^C
# virsh save rhel64m /tmp/rhel64m
error: Failed to save domain rhel64m to /tmp/rhel64m
error: Timed out during operation: cannot acquire state change lock

6. Destroy the guest. The guest is destroyed successfully, and the libvirtd service does not crash:

# virsh destroy rhel64m
Domain rhel64m destroyed
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     rhel64m                        shut off
# service libvirtd status
libvirtd (pid 9290) is running...
# ps aux | grep libvirtd
root      9290  1.4  0.2 1060492 18804 ?  Sl  Nov06  3:06 libvirtd --daemon

7. Start the guest and re-do the above steps; all steps give the same results as above, so this bug can be marked verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1517.html
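For regression runs, the save/restore/destroy portion of the verification above can be scripted. A minimal sketch, assuming the domain name `rhel64m` and save path `/tmp/rhel64m` from the transcript (the post-restore S3 hang from step 5 is intentionally omitted, and the commands are only executed on a host that actually has virsh):

```python
import shutil
import subprocess

def verification_commands(domain="rhel64m", save_path="/tmp/rhel64m"):
    """The S3 / save / restore / destroy sequence, as argv lists."""
    return [
        ["virsh", "dompmsuspend", domain, "--target", "mem"],
        ["virsh", "dompmwakeup", domain],
        ["virsh", "save", domain, save_path],
        ["virsh", "restore", save_path],
        ["virsh", "destroy", domain],
    ]

def run_verification(domain="rhel64m", save_path="/tmp/rhel64m"):
    # Skip gracefully on hosts without libvirt client tools installed.
    if shutil.which("virsh") is None:
        return False
    for argv in verification_commands(domain, save_path):
        # check=True aborts on the first failing command, which is the
        # signal to investigate (e.g. a state-change-lock timeout).
        subprocess.run(argv, check=True)
    return True
```

After the sequence completes, `service libvirtd status` (or `ps aux | grep libvirtd`) should still show the daemon running, matching step 6 above.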