Bug 1009886
| Summary: | CVE-2013-7336 libvirtd crashes during established spice session migration. [rhel-6.5] |
|---|---|
| Product: | Red Hat Enterprise Linux 6 |
| Component: | libvirt |
| Version: | 6.5 |
| Status: | CLOSED ERRATA |
| Severity: | unspecified |
| Priority: | unspecified |
| Reporter: | Marian Krcmarik <mkrcmari> |
| Assignee: | Martin Kletzander <mkletzan> |
| QA Contact: | Virtualization Bugs <virt-bugs> |
| CC: | acathrow, bili, cwei, dallan, dyuan, eblake, lsu, mkletzan, mkrcmari, pmatouse, shyu, tlavigne, zpeng |
| Target Milestone: | rc |
| Keywords: | Security |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Fixed In Version: | libvirt-0.10.2-27.el6 |
| Doc Type: | Bug Fix |
| Clones: | 1010861 (view as bug list) |
| Bug Blocks: | 1010861, 1077620 |
| Type: | Bug |
| Last Closed: | 2013-11-21 09:10:58 UTC |

Doc Text:
Cause: Due to code movement, an invalid job was used for querying the SPICE migration status.
Consequence: Sometimes, when a domain was migrated with SPICE seamless migration and domjobinfo was requested on that same domain in the meantime, the daemon crashed.
Fix: The correct job is now set.
Result: The daemon no longer crashes due to this issue.
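The Doc Text above describes the trigger: query job information while a SPICE seamless migration is in flight. A minimal way to exercise that window from a shell, with "rhel6" standing in as a hypothetical domain name:

# while true; do virsh domjobinfo rhel6; sleep 0.2; done

Run on the source host during the migration; on an unfixed libvirtd, enough of these queries during the SPICE hand-off can hit the invalid job and crash the daemon.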
Description (Marian Krcmarik, 2013-09-19 12:25:58 UTC)
Created attachment 799909 [details]: source libvirtd log (a snippet of the libvirtd log containing the crash info)
Created attachment 799910 [details]: core dump of the crashed source libvirtd
Created attachment 799920 [details]: destination libvirtd log
Patch proposed upstream: http://www.redhat.com/archives/libvir-list/2013-September/msg01208.html

Comment #8 (Shanzhi Yu):

Hi Marian,
I can't reproduce this error with the packages below in a RHEV-M 3.2 environment:

vdsm-python-4.10.2-25.0.el6ev.x86_64
qemu-kvm-rhev-debuginfo-0.12.1.2-2.404.el6.x86_64
libvirt-python-0.10.2-26.el6.x86_64
libvirt-0.10.2-26.el6.x86_64
libvirt-debuginfo-0.10.2-26.el6.x86_64
vdsm-cli-4.10.2-25.0.el6ev.noarch
qemu-kvm-rhev-0.12.1.2-2.404.el6.x86_64
libvirt-lock-sanlock-0.10.2-26.el6.x86_64
vdsm-4.10.2-25.0.el6ev.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.404.el6.x86_64
libvirt-devel-0.10.2-26.el6.x86_64
libvirt-client-0.10.2-26.el6.x86_64
vdsm-xmlrpc-4.10.2-25.0.el6ev.noarch

Do I have to set up RHEV-M 3.3 to reproduce this bug? AFAIK, RHEV-M 3.3 is not released yet; how can I get it?

Comment #9 (Marian Krcmarik):

(In reply to Shanzhi Yu from comment #8)
Try to slow down the SPICE migration: limit the bandwidth between your client machine and the hosts, install the SPICE component (the SPICE guest agent) in the VM, open more monitors, and redirect some USB devices through native USB redirection. I am not sure, maybe the 3.3 vdsm has some effect on that (http://bob.eng.lab.tlv.redhat.com/builds/is15/). I still have the setup, and I assume a new build with the fix will be available soon, so in the worst case I can verify it myself.

Comment #10 (Martin Kletzander):

A colleague of mine managed to reproduce this without vdsm/RHEV. The point is that there must be enough domblkstat requests for one to arrive while libvirtd is waiting for the SPICE migration to finish. So running virsh domblkstat in a loop while slowing down the migration should be the way to reproduce this properly. As Marian pointed out, try slowing down the migration (through a slow network, for example) and make sure vdagent is installed in the guest; the more SPICE features are in use, the more likely you are to hit this issue.
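A polling loop of the kind Martin describes is a one-liner; a minimal sketch, where the domain name "rhel6" and disk target "vda" are placeholders:

# while true; do virsh domblkstat rhel6 vda || break; done

The || break ends the loop once virsh starts failing, i.e. when the domain has left the host or libvirtd has gone away.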
Comment #12 (Shanzhi Yu):

(In reply to Martin Kletzander from comment #10)
Hi Martin,
I tried to reproduce it without vdsm/RHEV but did not succeed. My steps are below; please help correct them. Thanks in advance.
1. Have a guest with vdagent installed.
2. Log in to the guest and do some disk R/W operations in a loop.
3. Run "virsh domblkstat guest" in a loop.
4. Open one SPICE session.
5. Set the migration speed very low with the "virsh migrate-setspeed" command.
6. Do the migration.

The error I met is below:

virsh migrate --live migrate qemu+ssh://10.66.106.20/system
2013-09-24 10:12:48.736+0000: 24726: info : libvirt version: 0.10.2, package: 24.el6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2013-09-06-15:56:52, x86-022.build.eng.bos.redhat.com)
2013-09-24 10:12:48.736+0000: 24726: warning : virDomainMigrateVersion3:4922 : Guest migrate probably left in 'paused' state on source
error: One or more references were leaked after disconnect from the hypervisor

Martin Kletzander's reply:

(In reply to Shanzhi Yu from comment #12)
Doing R/W in the guest slows down only the storage part of the migration, and virsh migrate-setspeed takes its parameter in MiB/s. What you need to slow down is the SPICE part of the migration, which carries only a small amount of data. So you need to slow it down to well below 1 MiB/s (1) and generate more data for SPICE to transfer (2).
1) The easiest way is to migrate over a dedicated network slowed down to, for example, 20 KiB/s.
2) Open more displays (monitors) and redirect some USB devices into the guest (make sure it uses spicevmc redirection).
Basically everything from comment #9.
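The MiB/s floor is easy to see from virsh itself; a quick sketch, again with "rhel6" as a placeholder domain:

# virsh migrate-setspeed rhel6 1
# virsh migrate-getspeed rhel6
1

The bandwidth argument is a whole number of MiB/s, so 1 MiB/s is the lowest limit migrate-setspeed can impose, still far too fast to stretch out the small SPICE transfer; hence the network-level throttling suggested below.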
Shanzhi Yu:

Hi Martin,
Please help review my steps below. Thanks.

Preparation: an NFS-backed guest, with its storage mounted on both the source and the target server.
1. Define a guest with spicevmc redirection on the source server:
# virsh dumpxml winxp|grep redirdev
<redirdev bus='usb' type='spicevmc'>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
</redirdev>
2. Plug two USB disks into the source server. Open a display (using virt-viewer) and redirect the disks to the guest.
3. Limit the network card speed on the source server:
# ethtool --change eth0 autoneg off speed 10 duplex full
# iptables -A OUTPUT -s $(source server IP) -m limit --limit 10/s -j ACCEPT
# iptables -A OUTPUT -s $(source server IP) -j ACCEPT
4. Migrate from the source server to the target server:
# time virsh migrate --live winxp qemu+ssh://$(Target server IP)/system
With these steps I could not make libvirtd crash; instead, the migration itself failed. The error info found in libvirtd.log is:
2013-10-12 11:33:34.045+0000: 5406: error : virNetSocketReadWire:1184 : End of file while reading data: Input/output error
2013-10-12 11:33:42.365+0000: 5406: error : qemuMonitorIORead:513 : Unable to read from monitor: Connection reset by peer
Martin Kletzander:

The iptables rules you are using mean that 10 packets per second are allowed by the first rule, and the rest are allowed by the second one; even without the second rule, they would probably be matched by some other one. More importantly, if you drop or reject the non-matching packets, you do not merely slow the traffic down, you introduce a huge amount of packet loss, because every packet beyond the limit is dropped. You need to limit the speed properly; try this on the destination:

# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 parent ffff: protocol ip u32 match ip src 0.0.0.0/0 police rate 64kbit burst 64kbit mtu 64kb drop flowid :1

Beware! This limits all incoming data on eth0 of that machine to 64 kbit/s. To remove the limitation:

# tc qdisc del dev eth0 ingress

Before starting the migration, test (for example with 'nc' and 'dd' or 'pv') how fast the communication really is. If it is more than 64 kbit/s, the setting is not correct and you will fail to reproduce the crash.
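Such a throughput check could look like the following, with 10.66.106.23 and port 12345 as placeholder address and port:

# nc -l 12345 > /dev/null                                        (on the destination; some netcat builds need "nc -l -p 12345")
# dd if=/dev/zero bs=1K count=640 | pv | nc 10.66.106.23 12345   (on the source)

With the 64kbit policing filter in effect, pv should report roughly 8 KiB/s; a much higher rate means the limit is not being applied.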
Verification (Shanzhi Yu):

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-0.12.1.2-2.411.el6.x86_64
libvirt-0.10.2-29.el6.x86_64
Preparations:
1. Mount the NFS server on both the source and the target server. Both servers have two NICs (eth0 and eth1); eth0 carries the SPICE traffic and is limited to a low speed. Define the network net1 on both servers on top of eth0 (a sketch of one possible definition follows these preparations).
2. Define a guest with usbredir devices:
# virsh dumpxml rhel6
<redirdev bus='usb' type='spicevmc'>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
</redirdev>
<graphics type='spice' port='5900' autoport='no'>
<listen type='network' network='net1'/>
</graphics>
3. Install vdagent in the guest:
# rpm -qa|grep vdagent
spice-vdagent-0.12.0-4.el6.x86_64
4. Limit network throughput on the source server's eth0 to 64 kbit/s:
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 parent ffff: protocol ip u32 match ip src 0.0.0.0/0 police rate 64kbit burst 64kbit mtu 64kb drop flowid :1
(Tested with "dd" and "nc"; the two commands above work as expected.)
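The net1 network referenced in step 1 and in the <listen> element is not spelled out in the comments; one plausible definition, assuming a host bridge br0 that enslaves eth0, would be:

# cat net1.xml
<network>
  <name>net1</name>
  <forward mode='bridge'/>
  <bridge name='br0'/>
</network>
# virsh net-define net1.xml
# virsh net-start net1

This lets <listen type='network' network='net1'/> resolve to an address on the eth0 side of the host.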
Steps:
1. Start the guest:
# virsh start rhel6
2. Open a display with remote-viewer on the client machine over net1, plug two USB disks into the source server, and redirect them to the guest.
3. Migrate from the source server to the target server using eth1's IP:
# time virsh migrate --live rhel6 qemu+ssh://10.66.106.23/system
root@10.66.106.23's password:
Results: the guest migrated successfully and no libvirtd crash was observed.
Setting the bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1581.html