Bug 1129207
Summary: | libvirtd will crash after do managedsave the same guest in the same time | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Luyao Huang <lhuang>
Component: | libvirt | Assignee: | Peter Krempa <pkrempa>
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 7.2 | CC: | dyuan, jiahu, jmiao, lhuang, mzhan, pkrempa, rbalakri, vivianzhang, zhwang
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | All | |
Whiteboard: | | |
Fixed In Version: | libvirt-1.2.8-1.el7 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2015-03-05 07:42:25 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
We don't check that the VM is still alive after entering the async job, and we try to save it again.

Fixed upstream:

commit 1b7c2c549e3d104da469cf1fd03fb62a41cc99af
Author: Peter Krempa <pkrempa>
Date:   Tue Aug 12 15:21:56 2014 +0200

    qemu: migration: Check domain live state after exitting the monitor

    In qemuMigrationToFile we enter the monitor multiple times and don't
    check if the VM is still alive after returning form the monitor. Add
    the checks to skip pieces of code in case the VM crashes while saving
    it's state.

commit 3fe9f61d549106eabc3e1682a3d0795ddba4e5bd
Author: Peter Krempa <pkrempa>
Date:   Tue Aug 12 14:31:26 2014 +0200

    qemu: managedsave: Check that VM is alive after entering async job

    Saving a shutoff VM doesn't make sense and libvirtd crashes while
    attempting to do that. Check that the domain is alive after entering
    the save async job.

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1129207

v1.2.7-48-g1b7c2c5

Verified this issue with build libvirt-1.2.8-1.el7.x86_64.

Verify steps:

1. Prepare a rhel6 guest:

    # virsh list --all
     Id    Name                           State
    ----------------------------------------------------
     -     test6                          shut off

2. Start the guest:

    # virsh start test6
    Domain test6 started

3. Open two more terminals and run the same command in all three terminals (do this very fast):

    # virsh managedsave test6

Actual results:

1st terminal:

    Domain test6 state saved by libvirt

2nd terminal:

    error: Failed to save domain test6 state
    error: internal error: guest unexpectedly quit

3rd terminal:

    error: Failed to save domain test6 state
    error: internal error: guest unexpectedly quit

Checked the libvirtd service; it does not crash.

I can reproduce this with build libvirt-1.2.7-1.el7.x86_64.
Verified with build libvirt-1.2.8-7.el7.x86_64.

Verify steps:

1. Prepare a running rhel7 guest:

    # virsh list --all
     Id    Name                           State
    ----------------------------------------------------
     151   rhel7                          running

2. Check the libvirtd process first:

    # ps aux |grep libvirtd
    root 1620 0.0 0.0 112640 960 pts/3 S+ 13:57 0:00 grep --color=auto libvirtd
    root 23432 0.0 2.0 1876808 165748 ? Ssl 08:59 0:04 /usr/sbin/libvirtd --listen

3. Open two more terminals and run the same command in all three terminals (do this very fast):

    # virsh managedsave rhel7

Actual results:

1st terminal:

    # virsh managedsave rhel7
    Domain rhel7 state saved by libvirt

2nd terminal:

    # virsh managedsave rhel7
    error: Failed to save domain rhel7 state

3rd terminal:

    # virsh managedsave rhel7
    error: Failed to save domain rhel7 state
    error: Requested operation is not valid: domain is not running

Checked the libvirtd service; it does not crash, and the guest managedsave succeeds. The libvirtd PID is unchanged:

    # ps aux |grep libvirtd
    root 1923 0.0 0.0 112640 960 pts/8 S+ 14:00 0:00 grep --color=auto libvirtd
    root 23432 0.0 2.0 1950540 165828 ? Ssl 08:59 0:05 /usr/sbin/libvirtd --listen

Tested the same steps with the virsh save command:

1st terminal:

    # virsh save rhel7 /tmp/rhel71.save
    Domain rhel7 saved to /tmp/rhel71.save

2nd terminal:

    # virsh save rhel7 /tmp/rhel72.save
    error: Failed to save domain rhel7 to /tmp/rhel72.save
    error: internal error: guest unexpectedly quit

3rd terminal:

    # virsh save rhel7 /tmp/rhel73.save
    error: Failed to save domain rhel7 to /tmp/rhel73.save
    error: Requested operation is not valid: domain is not running

libvirtd does not crash. Moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html
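The check that the upstream commits add (confirming that the domain is still alive once the save async job has been entered) can be illustrated with a toy model. This is only a sketch in Python, not libvirt's C code: `ToyDomain`, `managed_save`, `save_from_terminals` and `OperationInvalid` are hypothetical names, and a plain lock stands in for the per-domain async job.

```python
import threading


class OperationInvalid(Exception):
    """Raised when a save is requested for a domain that is not running."""


class ToyDomain:
    """Hypothetical stand-in for a domain object; not libvirt's data structure."""

    def __init__(self, name):
        self.name = name
        self.active = True            # True while the guest is running
        self.job = threading.Lock()   # models the per-domain async job


def managed_save(dom):
    # Check made before waiting for the job; on its own it is not enough,
    # because another save may finish while this caller waits.
    if not dom.active:
        raise OperationInvalid("domain is not running")
    with dom.job:  # may block while a concurrent save is in progress
        # The kind of check the fix adds: look at the state again now
        # that the job is held, since the guest may have been saved
        # (and stopped) in the meantime.
        if not dom.active:
            raise OperationInvalid("domain is not running")
        dom.active = False  # a successful save leaves the guest shut off
        return f"Domain {dom.name} state saved"


def save_from_terminals(dom, terminals=3):
    """Call managed_save from several threads at once; collect the outcomes."""
    results = [None] * terminals

    def worker(i):
        try:
            results[i] = managed_save(dom)
        except OperationInvalid as exc:
            results[i] = f"error: {exc}"

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(terminals)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `save_from_terminals(ToyDomain("test6"))` yields one success and two errors regardless of thread timing, because the second check runs while the job is held. That matches the shape of the verified results above: one terminal saves the guest and the others get an error instead of crashing the daemon.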
Created attachment 925999
back trace

Description of problem:
libvirtd will crash after doing managedsave on the same guest at the same time.

Version-Release number of selected component (if applicable):
libvirt-1.2.7-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Prepare a rhel 6 guest:

    # virsh list --all
     Id    Name                           State
    ----------------------------------------------------
     -     test6                          shut off

2. Start the guest:

    # virsh start test6
    Domain test6 started

3. Note the libvirtd PID:

    # ps aux|grep libvirtd
    root 1604 0.1 0.2 943952 18976 ? Ssl 16:31 0:00 /usr/sbin/libvirtd
    root 10160 0.0 0.0 112644 972 pts/3 S+ 16:34 0:00 grep --color=auto libvirtd

4. Open two more terminals and run the same command in all three terminals (please do this very fast):

    # virsh managedsave test6

Actual results:

1st terminal:

    Domain test6 state saved by libvirt

2nd terminal:

    error: Failed to save domain test6 state
    error: Failed to reconnect to the hypervisor

3rd terminal:

    error: failed to get domain 'test6'
    error: End of file while reading data: Input/output error
    error: Failed to reconnect to the hypervisor

The libvirtd PID has changed:

    # ps aux|grep libvirtd
    root 10388 2.5 0.2 482940 18216 ? Ssl 16:36 0:00 /usr/sbin/libvirtd
    root 11555 0.0 0.0 112640 972 pts/3 S+ 16:36 0:00 grep --color=auto libvirtd

Expected results:
libvirtd does not crash.

Additional info:
From zhwang: I can reproduce this issue, but it does not always happen. The following raises the reproduction rate:

1. Do some operation inside the guest.
2. Execute the virsh managedsave operation as quickly as you can.
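Because the reproduction depends on issuing the commands very quickly, the three-terminal step can be approximated with a small script. The following is a minimal sketch, assuming `virsh` and `pgrep` are on the PATH and the script runs with enough privileges to manage the guest. The function names are hypothetical, and the guest name `test6` is taken from the steps above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(argv, copies=3):
    """Start `copies` instances of argv at nearly the same time.

    Returns one (returncode, stderr) tuple per instance.
    """
    def run_one(_):
        proc = subprocess.run(argv, capture_output=True, text=True)
        return proc.returncode, proc.stderr.strip()

    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(run_one, range(copies)))


def libvirtd_pids():
    """Return the set of libvirtd PIDs reported by pgrep (empty if none)."""
    proc = subprocess.run(["pgrep", "-x", "libvirtd"],
                          capture_output=True, text=True)
    return set(proc.stdout.split())


def reproduce(domain="test6", copies=3):
    """Run managedsave `copies` times at once; True if the PID set is unchanged."""
    before = libvirtd_pids()
    for returncode, stderr in run_concurrently(
            ["virsh", "managedsave", domain], copies):
        print(returncode, stderr)
    after = libvirtd_pids()
    # A changed PID set suggests the daemon crashed and was restarted,
    # which is what the ps output in this report shows.
    return before == after
```

Calling `reproduce("test6")` on an affected build would be expected to return `False` when the crash is hit. Since the race does not trigger every time, the guest may need to be restarted and the call repeated.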