Bug 928672
Summary: | Libvirtd crash when destroying linux guest which executed a series of operations about S3 and save/restore | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | zhenfeng wang <zhwang>
Component: | libvirt | Assignee: | Pavel Hrdina <phrdina>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 7.0 | CC: | acathrow, ajia, bili, cwei, dallan, dyuan, eblake, fjin, mzhan, phrdina, pkrempa, shyu
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | libvirt-1.1.1-1.el7 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | 928661 | Environment: |
Last Closed: | 2014-06-13 10:13:29 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 928661 | |
Bug Blocks: | | |
Attachments: | | |
Description
zhenfeng wang
2013-03-28 08:21:28 UTC
Created attachment 717507 [details]
the guest's xml
Created attachment 717918 [details]
The gdb info about libvirtd crash
libvirtd shouldn't crash, so this needs to be fixed. That said, it would be nice if RHEL 7 qemu would first be fixed to allow migration without losing S3 state.

Hi Eric,
Since this bug is rated high and has been on the Development Freeze milestone list (https://url.corp.redhat.com/41ae300) for a long time, do we need to raise it to a blocker, or should we just downgrade its level? Thanks.

This is still on my list of things to investigate - we don't want to have crashers. However, it is taking a back seat to some implementation work that I have to complete before the libvirt 1.1.1 upstream release, and we can fix crashes after dev freeze (although I agree that getting the fix in before freeze is better).

Fixed upstream:

    commit 29c2208c045e16f55bbfd25db266c30e90fa3535
    Author: Peter Krempa <pkrempa>
    Date:   Tue Jul 23 15:35:02 2013 +0200

        qemu: Take error path if acquiring of job fails in qemuDomainSaveInternal

        Due to a goto statement missed when refactoring in 2771f8b74c1bf50d1fa
        when acquiring of a domain job failed the error path was not taken. This
        resulted into a crash afterwards as an extra reference was removed from a
        domain object leading to it being freed. An attempt to list the domains
        leaded to a crash of the daemon afterwards.

Hi Peter,
When I tried to verify this bug with the latest libvirt package, I found something wrong with S3/S4, and this blocks my verification. Please have a look; is it a new issue? Thanks.

Version-Release number of selected component (if applicable):
    kernel-3.10.0-4.el7.kpq2.x86_64
    qemu-kvm-1.5.2-2.el7.x86_64
    libvirt-1.1.1-1.el7.x86_64

Steps:
1. Check the SELinux mode
    # getenforce
    Enforcing

2. Prepare a guest with qemu-ga ENV
    # virsh list --all
     Id    Name                           State
    ----------------------------------------------------
     7     rhel7                          running

    # virsh dumpxml rhel7
    ......
    <pm>
      <suspend-to-mem enabled='yes'/>
      <suspend-to-disk enabled='yes'/>
    </pm>
    ......
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/rhel7.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    ......

3. Start the qemu-guest-agent service in the guest
    # systemctl start qemu-guest-agent.service
    # service qemu-guest-agent status
    Redirecting to /bin/systemctl status qemu-guest-agent.service
    qemu-guest-agent.service - QEMU Guest Agent
       Loaded: loaded (/usr/lib/systemd/system/qemu-guest-agent.service; static)
       Active: active (running) since Fri 2013-08-02 05:59:56 EDT; 2min 23s ago
     Main PID: 394 (qemu-ga)
       CGroup: name=systemd:/system/qemu-guest-agent.service
               `-394 /usr/bin/qemu-ga

4. Do S3 with the guest; after wakeup, it reports an error
    # virsh dompmsuspend rhel7 --target mem
    error: Domain rhel7 could not be suspended
    error: internal error unable to execute QEMU agent command 'guest-suspend-ram': child process has failed to suspend

Verified this bug with libvirt-1.1.1-12.el7.x86_64. The issue in comment 8 went away once I updated all the packages to the latest versions. The verification steps follow.

pkg info:
    kernel-3.10.0-47.el7.x86_64
    qemu-kvm-rhev-1.5.3-14.el7.x86_64
    libvirt-1.1.1-12.el7.x86_64

Steps:
1. Check the SELinux mode
    # getenforce
    Enforcing

2. Prepare a guest with qemu-ga ENV
    # virsh list --all
     Id    Name                           State
    ----------------------------------------------------
     7     rhel7                          running

    # virsh dumpxml rhel7
    ......
    <pm>
      <suspend-to-mem enabled='yes'/>
      <suspend-to-disk enabled='yes'/>
    </pm>
    ......
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/rhel7.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    ......

3. Start the qemu-guest-agent service in the guest
    # systemctl start qemu-guest-agent.service
    # service qemu-guest-agent status
    Redirecting to /bin/systemctl status qemu-guest-agent.service
    qemu-guest-agent.service - QEMU Guest Agent
       Loaded: loaded (/usr/lib/systemd/system/qemu-guest-agent.service; static)
       Active: active (running) since Fri 2013-08-02 05:59:56 EDT; 2min 23s ago
     Main PID: 394 (qemu-ga)
       CGroup: name=systemd:/system/qemu-guest-agent.service
               `-394 /usr/bin/qemu-ga

4. Do S3 with the guest, then wake it up; however, the guest can't get back to the state it was in before pmsuspend
    # virsh dompmsuspend rhel7 --target mem
    # virsh dompmwakeup rhel7

5. Save and restore the guest
    # virsh save rhel7 /tmp/rhel7.save
    # virsh restore /tmp/rhel7.save

6. Do S3 with the guest again; the virsh command hangs here. There is an existing bug 890648 about this issue in RHEL 6.4 that is not fixed yet, so I cloned it to RHEL 7 as bug 1028927. Bug 1028927 does not block verification of this bug.
    # virsh dompmsuspend rhel7 --target mem
    ^C

7. Save the guest again; it reports an error, which is caused by bug 1028927
    # virsh save rhel7 /tmp/rhel7.save
    error: Failed to save domain rhel7 to /tmp/rhel7.save
    error: Timed out during operation: cannot acquire state change lock

8. Destroy the guest (the step where libvirtd crashed in the original report); libvirtd is still running afterwards
    # virsh destroy rhel7
    Domain rhel7 destroyed

    # virsh list --all
     Id    Name                           State
    ----------------------------------------------------
     -     rhel7                          shut off

    # ps aux|grep libvirtd
    root 19089 1.8 0.2 1126080 21920 ? Ssl 16:26 2:00 /usr/sbin/libvirtd
    # service libvirtd status
    Redirecting to /bin/systemctl status libvirtd.service
    libvirtd.service - Virtualization daemon
       Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
       Active: active (running) since Mon 2013-11-11 16:26:10 CST; 1h 46min ago
     Main PID: 19089 (libvirtd)
       CGroup: /system.slice/libvirtd.service
               ├─ 2418 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default...
               └─19089 /usr/sbin/libvirtd

9. Repeated the steps above several times and got the same result each time, so marking this bug verified.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
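
For readers who want to see why a single missed goto can end in a freed domain object, here is a toy Python model of the failure mode described in the upstream commit message quoted in this report. It is not libvirt code. The `Domain` class, `begin_job`, `end_job`, `save_buggy`, `save_fixed` and `run` are all hypothetical names, and the model assumes that acquiring a job takes a reference on the domain and ending the job drops it, which is only my reading of the commit message.

```python
class Domain:
    """Toy stand-in for a reference-counted domain object (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.refs = 1            # reference held by the driver's domain list
        self.freed = False
        self.job_active = False

    def ref(self):
        self.refs += 1

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True    # real C code would free the memory here


def begin_job(dom):
    """Try to acquire the job; on success, hold an extra reference."""
    if dom.job_active:
        return False             # models "cannot acquire state change lock"
    dom.job_active = True
    dom.ref()
    return True


def end_job(dom):
    """Release the job and drop the reference assumed to be taken by begin_job."""
    dom.job_active = False
    dom.unref()


def save_buggy(dom):
    """Falls through to end_job even when begin_job failed."""
    ok = begin_job(dom)
    # ... the save itself would run here only if ok ...
    end_job(dom)                 # BUG: unbalanced unref when ok is False
    return ok


def save_fixed(dom):
    """Takes the error path as soon as job acquisition fails."""
    if not begin_job(dom):
        return False             # stands in for the missed `goto` in the C code
    # ... the save itself would run here ...
    end_job(dom)
    return True


def run(save):
    """Replay steps 6 to 8: a hung suspend owns the job while a save is attempted."""
    dom = Domain("rhel7")
    begin_job(dom)               # hung dompmsuspend still owns the job
    save(dom)                    # save fails to acquire the job
    end_job(dom)                 # the hung job is finally torn down
    return dom
```

With `run(save_buggy)` the domain ends up marked as freed even though the domain list still holds its reference, which matches the "attempt to list the domains" crash in the commit message. With `run(save_fixed)` the reference count returns to 1 and the object survives.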