Bug 1129207
| Summary: | libvirtd will crash after doing managedsave on the same guest at the same time | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Luyao Huang <lhuang> |
| Component: | libvirt | Assignee: | Peter Krempa <pkrempa> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Priority: | high |
| Version: | 7.2 | CC: | dyuan, jiahu, jmiao, lhuang, mzhan, pkrempa, rbalakri, vivianzhang, zhwang |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | x86_64 | OS: | All |
| Fixed In Version: | libvirt-1.2.8-1.el7 | Doc Type: | Bug Fix |
| Last Closed: | 2015-03-05 07:42:25 UTC | Type: | Bug |
We don't check that the VM is still alive after entering the async job before trying to save it again. Fixed upstream:
commit 1b7c2c549e3d104da469cf1fd03fb62a41cc99af
Author: Peter Krempa <pkrempa>
Date: Tue Aug 12 15:21:56 2014 +0200

    qemu: migration: Check domain live state after exiting the monitor

    In qemuMigrationToFile we enter the monitor multiple times and don't
    check if the VM is still alive after returning from the monitor. Add
    the checks to skip pieces of code in case the VM crashes while saving
    its state.
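The pattern this first commit adds can be sketched as follows. This is a minimal, self-contained illustration of the idea (re-check liveness after every monitor round-trip), not libvirt's actual code: the `vm_t` struct and function names here are invented for the example.

```c
#include <stdbool.h>

/* Illustrative stand-in for libvirt's domain object. */
typedef struct {
    bool active;  /* cleared if the guest dies, e.g. on monitor EOF */
} vm_t;

/* Simulate one monitor round-trip during a qemuMigrationToFile-style
 * save; the guest may die while the monitor is held.  Returns whether
 * the domain is still alive afterwards. */
static bool enter_and_exit_monitor(vm_t *vm, bool guest_dies)
{
    if (guest_dies)
        vm->active = false;  /* the EOF handler marked the domain dead */
    return vm->active;
}

/* Save loop: after *every* monitor exit, re-check liveness before
 * touching guest state again, instead of assuming it survived. */
static int save_to_file(vm_t *vm, bool dies_at_step2)
{
    if (!enter_and_exit_monitor(vm, false))
        return -1;
    /* ... write the save-image header ... */
    if (!enter_and_exit_monitor(vm, dies_at_step2))
        return -1;  /* skip the remaining steps instead of crashing */
    /* ... stream guest memory ... */
    return 0;
}
```

Without the second check, the code after the monitor exit would operate on a domain whose backing state was already torn down, which is the kind of use-after-death this bug exercised.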
commit 3fe9f61d549106eabc3e1682a3d0795ddba4e5bd
Author: Peter Krempa <pkrempa>
Date: Tue Aug 12 14:31:26 2014 +0200

    qemu: managedsave: Check that VM is alive after entering async job

    Saving a shutoff VM doesn't make sense and libvirtd crashes while
    attempting to do that. Check that the domain is alive after entering
    the save async job.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1129207
v1.2.7-48-g1b7c2c5
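The second commit closes the race between concurrent managedsave callers: a caller can block waiting for the async job while another save shuts the guest down, so liveness must be re-checked once the job is acquired. A simplified sketch, again with invented names (`dom_t`, `begin_async_job`) rather than libvirt's real internals:

```c
#include <stdbool.h>

typedef struct {
    bool active;    /* guest is running */
    bool job_held;  /* this caller owns the async job */
} dom_t;

/* Hypothetical stand-in for acquiring the save async job.  The caller
 * may have blocked here while a concurrent managedsave finished and
 * stopped the guest. */
static void begin_async_job(dom_t *dom, bool concurrent_save_finished)
{
    dom->job_held = true;
    if (concurrent_save_finished)
        dom->active = false;  /* the first save already stopped the guest */
}

/* managedsave-style entry point: after entering the job, fail cleanly
 * on a dead domain instead of dereferencing state that is gone. */
static int managed_save(dom_t *dom, bool raced)
{
    begin_async_job(dom, raced);
    if (!dom->active)
        return -1;  /* "domain is not running", as in the verified output */
    /* ... actually perform the save ... */
    return 0;
}
```

This matches the verified behavior below: the first caller wins, and the losers get a clean "domain is not running" error instead of crashing libvirtd.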
Verified this issue with build libvirt-1.2.8-1.el7.x86_64.

Verify steps:
1. Prepare a rhel6 guest:
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     test6                          shut off
2. Start the guest:
# virsh start test6
Domain test6 started
3. Open two more terminals and run the same command in all three terminals as quickly as possible:
# virsh managedsave test6

Actual results:
1st terminal:
Domain test6 state saved by libvirt
2nd terminal:
error: Failed to save domain test6 state
error: internal error: guest unexpectedly quit
3rd terminal:
error: Failed to save domain test6 state
error: internal error: guest unexpectedly quit

Check the libvirtd service; it does not crash. I can reproduce the crash with build libvirt-1.2.7-1.el7.x86_64.
Verified with build libvirt-1.2.8-7.el7.x86_64.

Verify steps:
1. Prepare a running rhel7 guest:
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 151   rhel7                          running
2. Check the libvirtd process first:
# ps aux | grep libvirtd
root      1620  0.0  0.0  112640    960 pts/3  S+   13:57   0:00 grep --color=auto libvirtd
root     23432  0.0  2.0 1876808 165748 ?      Ssl  08:59   0:04 /usr/sbin/libvirtd --listen
3. Open two more terminals and run the same command in all three terminals as quickly as possible:
# virsh managedsave rhel7

Actual results:
1st terminal:
# virsh managedsave rhel7
Domain rhel7 state saved by libvirt
2nd terminal:
# virsh managedsave rhel7
error: Failed to save domain rhel7 state
3rd terminal:
# virsh managedsave rhel7
error: Failed to save domain rhel7 state
error: Requested operation is not valid: domain is not running

Check the libvirtd service; it does not crash, and the guest managedsave succeeds:
# ps aux | grep libvirtd
root      1923  0.0  0.0  112640    960 pts/8  S+   14:00   0:00 grep --color=auto libvirtd
root     23432  0.0  2.0 1950540 165828 ?      Ssl  08:59   0:05 /usr/sbin/libvirtd --listen
Tested the same steps with the virsh save command:
1st terminal:
# virsh save rhel7 /tmp/rhel71.save
Domain rhel7 saved to /tmp/rhel71.save
2nd terminal:
# virsh save rhel7 /tmp/rhel72.save
error: Failed to save domain rhel7 to /tmp/rhel72.save
error: internal error: guest unexpectedly quit
3rd terminal:
# virsh save rhel7 /tmp/rhel73.save
error: Failed to save domain rhel7 to /tmp/rhel73.save
error: Requested operation is not valid: domain is not running

libvirtd does not crash.
Moving to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0323.html
Created attachment 925999 [details]
back trace

Description of problem:
libvirtd will crash after doing managedsave on the same guest at the same time.

Version-Release number of selected component (if applicable):
libvirt-1.2.7-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a rhel6 guest:
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     test6                          shut off
2. Start the guest:
# virsh start test6
Domain test6 started
3. Check the libvirtd process:
# ps aux | grep libvirtd
root      1604  0.1  0.2 943952 18976 ?        Ssl  16:31   0:00 /usr/sbin/libvirtd
root     10160  0.0  0.0 112644   972 pts/3    S+   16:34   0:00 grep --color=auto libvirtd
4. Open two more terminals and run the same command in all three terminals (as quickly as possible):
# virsh managedsave test6

Actual results:
1st terminal:
Domain test6 state saved by libvirt
2nd terminal:
error: Failed to save domain test6 state
error: Failed to reconnect to the hypervisor
3rd terminal:
error: failed to get domain 'test6'
error: End of file while reading data: Input/output error
error: Failed to reconnect to the hypervisor

The libvirtd pid changed:
# ps aux | grep libvirtd
root     10388  2.5  0.2 482940 18216 ?        Ssl  16:36   0:00 /usr/sbin/libvirtd
root     11555  0.0  0.0 112640   972 pts/3    S+   16:36   0:00 grep --color=auto libvirtd

Expected results:
libvirtd does not crash.

Additional info:
From zhwang: I can reproduce this issue, but it does not always happen. The following raises the reproduction rate:
1. Run some activity inside the guest.
2. Execute the virsh managedsave commands as quickly as possible.