Bug 1009886 - CVE-2013-7336 libvirtd crashes during established spice session migration. [rhel-6.5]
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.5
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Martin Kletzander
QA Contact: Virtualization Bugs
Keywords: Security
Depends On:
Blocks: 1010861 CVE-2013-7336
Reported: 2013-09-19 08:25 EDT by Marian Krcmarik
Modified: 2014-03-18 10:20 EDT

See Also:
Fixed In Version: libvirt-0.10.2-27.el6
Doc Type: Bug Fix
Doc Text:
Cause: Due to code movement, an invalid job was used when querying the spice migration status. Consequence: When a domain was being migrated with spice seamless migration and domjobinfo was requested on that same domain in the meantime, the daemon sometimes crashed. Fix: The job is now set properly. Result: The daemon no longer crashes due to this issue.
Story Points: ---
Clone Of:
: 1010861 (view as bug list)
Environment:
Last Closed: 2013-11-21 04:10:58 EST
Type: Bug


Attachments
source libvirtd log (711.29 KB, text/plain)
2013-09-19 08:27 EDT, Marian Krcmarik
core dump of crashed source libvirtd (758.46 KB, application/octet-stream)
2013-09-19 08:28 EDT, Marian Krcmarik
Destination libvirtd log (101.87 KB, text/plain)
2013-09-19 08:58 EDT, Marian Krcmarik

Description Marian Krcmarik 2013-09-19 08:25:58 EDT
Description of problem:
Source libvirtd crashes when migrating a VM from a 6.5 host to a remote 6.5 host with an established spice session (spice client connected). The host is managed by RHEVM 3.3. The crash does not happen when migrating a VM without an established spice session. I am not able to reproduce it on a different setup, but my original setup is more complex (especially in networking): it uses a separate display network for Spice traffic and a separate network for the VMs' network interfaces, all on bonded NICs. However, I could reproduce it even without the display network.

I am attaching a snippet from the source libvirtd log where the crash is caught, as well as a core dump of the libvirtd process.

Version-Release number of selected component (if applicable):
rpm -qa | egrep  "libvirt|qemu-kvm|vdsm"
libvirt-client-0.10.2-24.el6.x86_64
vdsm-xmlrpc-4.12.0-138.gitab256be.el6ev.noarch
qemu-kvm-rhev-0.12.1.2-2.404.el6.x86_64
libvirt-lock-sanlock-0.10.2-24.el6.x86_64
vdsm-cli-4.12.0-138.gitab256be.el6ev.noarch
qemu-kvm-rhev-tools-0.12.1.2-2.404.el6.x86_64
libvirt-python-0.10.2-24.el6.x86_64
vdsm-python-4.12.0-138.gitab256be.el6ev.x86_64
vdsm-4.12.0-138.gitab256be.el6ev.x86_64
vdsm-python-cpopen-4.12.0-138.gitab256be.el6ev.x86_64
libvirt-0.10.2-24.el6.x86_64

How reproducible:
Always on my setup.

Steps to Reproduce:
1. In a RHEV 3.3 environment, migrate a VM with an established Spice session.

Actual results:
Source libvirtd crashes.

Expected results:
No crash on source.

Additional info:
I can keep the setup for a short time.
Comment 1 Marian Krcmarik 2013-09-19 08:27:13 EDT
Created attachment 799909
source libvirtd log

Snippet from the libvirtd log that contains the crash info.
Comment 2 Marian Krcmarik 2013-09-19 08:28:17 EDT
Created attachment 799910
core dump of crashed source libvirtd
Comment 4 Marian Krcmarik 2013-09-19 08:58:53 EDT
Created attachment 799920
Destination libvirtd log
Comment 6 Martin Kletzander 2013-09-20 11:19:33 EDT
Patch proposed upstream:

http://www.redhat.com/archives/libvir-list/2013-September/msg01208.html
Comment 8 Shanzhi Yu 2013-09-22 06:18:25 EDT
Hi Marian,
I can't reproduce this error with the packages below in a rhevm 3.2 environment.

vdsm-python-4.10.2-25.0.el6ev.x86_64
qemu-kvm-rhev-debuginfo-0.12.1.2-2.404.el6.x86_64
libvirt-python-0.10.2-26.el6.x86_64
libvirt-0.10.2-26.el6.x86_64
libvirt-debuginfo-0.10.2-26.el6.x86_64
vdsm-cli-4.10.2-25.0.el6ev.noarch
qemu-kvm-rhev-0.12.1.2-2.404.el6.x86_64
libvirt-lock-sanlock-0.10.2-26.el6.x86_64
vdsm-4.10.2-25.0.el6ev.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.404.el6.x86_64
libvirt-devel-0.10.2-26.el6.x86_64
libvirt-client-0.10.2-26.el6.x86_64
vdsm-xmlrpc-4.10.2-25.0.el6ev.noarch

Do I have to set up rhevm 3.3 to reproduce this bug? AFAIK, rhevm 3.3 is not released; how can I get it?
Comment 9 Marian Krcmarik 2013-09-22 12:00:53 EDT
(In reply to Shanzhi Yu from comment #8)
> Hi Marian,
> I can't reproduce this error with below package in rhevm 3.2 environment.
> 
> vdsm-python-4.10.2-25.0.el6ev.x86_64
> qemu-kvm-rhev-debuginfo-0.12.1.2-2.404.el6.x86_64
> libvirt-python-0.10.2-26.el6.x86_64
> libvirt-0.10.2-26.el6.x86_64
> libvirt-debuginfo-0.10.2-26.el6.x86_64
> vdsm-cli-4.10.2-25.0.el6ev.noarch
> qemu-kvm-rhev-0.12.1.2-2.404.el6.x86_64
> libvirt-lock-sanlock-0.10.2-26.el6.x86_64
> vdsm-4.10.2-25.0.el6ev.x86_64
> qemu-kvm-rhev-tools-0.12.1.2-2.404.el6.x86_64
> libvirt-devel-0.10.2-26.el6.x86_64
> libvirt-client-0.10.2-26.el6.x86_64
> vdsm-xmlrpc-4.10.2-25.0.el6ev.noarch
> 
> Do i must setup rhevm 3.3 to repoduce this bug? AFAIK,rhevm 3.3 is not
> released, how can i get it?

Try to slow down the Spice migration: limit the bandwidth between your client machine and the hosts, install the spice guest agent on the VM, open more monitors, and redirect some USB devices through the native USB redirection.

I am not sure; maybe the 3.3 vdsm has some effect on that (http://bob.eng.lab.tlv.redhat.com/builds/is15/).

I still have the setup, and I assume a new build with the fix will be available soon, so in the worst case I can verify it myself.
Comment 10 Martin Kletzander 2013-09-23 04:59:42 EDT
A colleague of mine managed to reproduce this without vdsm/rhev etc.  The point is that there must be enough domblkstat requests for one of them to arrive while libvirtd is waiting for the spice migration to finish.  So running virsh domblkstat in a loop and slowing down the migration should be the way to reproduce this properly.  As Marian pointed out, try slowing down the migration (through a slow network, for example) and make sure vdagent is installed in the guest; the more spice features are in use, the more likely you are to hit this issue.
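
The polling half of that reproducer could look roughly like the helper below. This is only a sketch: the function name poll_blkstat, the VIRSH override variable, and the domain/device names are made up for illustration; only the "virsh domblkstat" subcommand itself comes from this report.

```shell
#!/bin/sh
# Sketch only: poll_blkstat, VIRSH, and the domain/device names are hypothetical.
VIRSH="${VIRSH:-virsh}"

# Issue COUNT back-to-back block-stat queries against DOMAIN/DEVICE and
# print how many were attempted. Failures are ignored on purpose: the goal
# is just to keep requests arriving while the migration is in progress.
poll_blkstat() {
    dom="$1"
    dev="$2"
    count="$3"
    i=0
    while [ "$i" -lt "$count" ]; do
        "$VIRSH" domblkstat "$dom" "$dev" >/dev/null 2>&1 || true
        i=$((i + 1))
    done
    echo "$i"
}
```

One way to use it would be to run "poll_blkstat guest vda 100000 &" in one terminal and start the slowed-down migration from another.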
Comment 12 Shanzhi Yu 2013-09-24 06:15:06 EDT
(In reply to Martin Kletzander from comment #10)
> Colleague of mine managed to reproduse this without vdsm/rhev etc.  The
> point is that there must be enough domblkstat requests to get one while
> waiting for the spice migration to finish.  So virsh domblkstat in a cycle
> and slowing down the migration should be the way how to reproduce this
> properly.  As Martin pointed out, try slowing down the migration (through a
> slow network for example), make sure there is vdagent installed in the
> guest, and the more spice stuff is used, the more probable it is to hit this
> issue.

Hi Martin,
I tried to reproduce it without vdsm/rhev, but did not succeed. My steps are below; please help correct them. Thanks in advance.

1. Have a guest with vdagent installed.
2. Log in to the guest and do some disk R/W operations in a loop.
3. Run "virsh domblkstat guest" in a loop.
4. Open one spice session.
5. Set the migration speed very low with the "virsh migrate-setspeed" command.
6. Do the migration.
Comment 13 Shanzhi Yu 2013-09-24 06:17:47 EDT
The error I got is as below:

virsh migrate --live migrate qemu+ssh://10.66.106.20/system
2013-09-24 10:12:48.736+0000: 24726: info : libvirt version: 0.10.2, package: 24.el6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2013-09-06-15:56:52, x86-022.build.eng.bos.redhat.com)
2013-09-24 10:12:48.736+0000: 24726: warning : virDomainMigrateVersion3:4922 : Guest migrate probably left in 'paused' state on source

error: One or more references were leaked after disconnect from the hypervisor
Comment 14 Martin Kletzander 2013-09-25 01:49:06 EDT
(In reply to Shanzhi Yu from comment #12)
Doing R/W in the guest slows down the storage part of the migration, and virsh migrate-setspeed accepts its parameter in MiB/s.  What you need to do is slow down the spice part of the migration, which contains only a little data.  So you need to slow it down to far below 1 MiB/s (1) and generate more data for spice to transfer (2).

 1) The easiest way is to migrate over a dedicated network that is slowed down to, for example, 20 KiB/s
 2) Open more displays (monitors) and redirect some USB devices into the guest (make sure it uses spicevmc redirection)

Basically everything from comment #9.
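
To put a number on (1), assuming migrate-setspeed only takes a whole number of MiB/s: the lowest nonzero setting is then 1 MiB/s = 1024 KiB/s, which is still far above the 20 KiB/s example. A small arithmetic sketch (the helper name is hypothetical):

```shell
# Sketch: how many times faster than a target rate (given in KiB/s) is the
# smallest whole-MiB/s setting, i.e. 1 MiB/s = 1024 KiB/s?
# Uses integer division, so the result is rounded down.
# speedup_vs_1mib is a hypothetical helper name, not a virsh feature.
speedup_vs_1mib() {
    target_kib="$1"
    echo $(( 1024 / target_kib ))
}
```

Calling "speedup_vs_1mib 20" gives the factor for the 20 KiB/s example, which is why the throttling has to happen at the network level rather than through migrate-setspeed.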
Comment 16 Shanzhi Yu 2013-10-12 08:03:46 EDT
Hi Martin,
Please help review my steps below. Thanks.

Prepare a guest on NFS storage mounted on both the source and target servers.
1. Define a guest with spicevmc redirection on the source server

# virsh dumpxml winxp|grep redirdev
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
2. Plug in two USB disks on the source server. Open a display (using virt-viewer) and redirect the disks to the guest.

3. Configure the network card speed on the source server

# ethtool --change eth0 autoneg off speed 10 duplex full
# iptables -A OUTPUT -s $(source server IP) -m limit --limit 10/s -j ACCEPT
# iptables -A OUTPUT -s $(source server IP) -j ACCEPT

4. Do the migration from the source server to the target server

# time  virsh migrate --live  winxp qemu+ssh://$(Target server IP)/system

With the steps above I can't hit the libvirtd crash; instead, the migration of the guest fails. The error info found in libvirtd.log is:

2013-10-12 11:33:34.045+0000: 5406: error : virNetSocketReadWire:1184 : End of file while reading data: Input/output error
2013-10-12 11:33:42.365+0000: 5406: error : qemuMonitorIORead:513 : Unable to read from monitor: Connection reset by peer
Comment 17 Martin Kletzander 2013-10-16 08:39:34 EDT
The iptables rules you are using mean that 10 packets per second are accepted by the first rule, but the rest are accepted by the second one.  And even if you don't add the second one, the remaining packets will probably be matched by a different rule.
In any case, the problem is that if you drop/reject the non-matching packets, it will not just slow down the traffic; you'll get a huge amount of packet loss (because all other packets will get dropped).

You need to limit the speed properly, try doing this on the destination:

 # tc qdisc add dev eth0 ingress
 # tc filter add dev eth0 parent ffff: protocol ip u32 match ip src 0.0.0.0/0 police rate 64kbit burst 64kbit mtu 64kb drop flowid :1

Beware! This will limit all incoming data on eth0 of this machine to 64 kbit/s.  To remove this limitation, do this:

 # tc qdisc del dev eth0 ingress

Before starting the migration, check (for example using 'nc' and 'dd' or 'pv') how fast the communication really is.  If it is faster than 64 kbit/s, the setting is not correct and you will fail to reproduce the crash.
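
Once a test transfer has been timed, the check itself is plain arithmetic. The sketch below assumes you already have the number of bytes transferred and the elapsed seconds from whatever tool you used; the helper names are hypothetical, and tc's "kbit" unit means 1000 bits per second.

```shell
# Sketch: convert a measured transfer (bytes, whole seconds) to kbit/s and
# compare it with the policing limit. rate_kbit and within_limit are
# hypothetical helper names. Integer arithmetic, rounded down.
rate_kbit() {
    bytes="$1"
    secs="$2"
    echo $(( bytes * 8 / 1000 / secs ))
}

# Exit status 0 when the measured rate is at or below LIMIT (in kbit/s).
within_limit() {
    [ "$(rate_kbit "$1" "$2")" -le "$3" ]
}
```

For example, "within_limit BYTES SECONDS 64 || echo 'limit not applied'" with the figures from the test transfer. A very short transfer may look faster than the limit because of the burst allowance, so it is probably worth measuring over at least several seconds.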
Comment 18 Shanzhi Yu 2013-10-28 06:28:55 EDT
Version-Release number of selected component (if applicable):

qemu-kvm-rhev-0.12.1.2-2.411.el6.x86_64
libvirt-0.10.2-29.el6.x86_64

Preparations:

1. Mount the NFS share on both the source and target servers. Both servers have two NICs (eth0 and eth1); one is used for spice migration and is limited to a low speed. Define net1 on both servers based on eth0.

2. Define a guest with usbredir
# virsh dumpxml rhel6

    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <graphics type='spice' port='5900' autoport='no'>
      <listen type='network' network='net1'/>
    </graphics>

3. Install vdagent in the guest.
# rpm -qa|grep vdagent
spice-vdagent-0.12.0-4.el6.x86_64

4. Limit network throughput on the source server (limit eth0 to 64 kbit/s).
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 parent ffff: protocol ip u32 match ip src 0.0.0.0/0 police rate 64kbit burst 64kbit mtu 64kb drop flowid :1
(Tested with "dd" and "nc"; the two commands above work fine.)

Steps:
1. Start the guest
# virsh start rhel6

2. Open a display with remote-viewer on the client machine using net1, plug in two USB disks on the source server, and do USB redirection.

3. Do the migration from the source server to the target server using eth1's IP
# time  virsh migrate --live  rhel6 qemu+ssh://10.66.106.23/system
root@10.66.106.23's password: 

Results: the guest was migrated successfully.
So setting it to verified.
Comment 20 errata-xmlrpc 2013-11-21 04:10:58 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1581.html
