Bug 1145919

Summary:	qemu-kvm segmentation fault, when boot a RHEL7.1 guest with "-chardev spicevmc" and reboot inside guest
Product:	Red Hat Enterprise Linux 7	Reporter:	huiqingding <huding>
Component:	spice	Assignee:	Marc-Andre Lureau <marcandre.lureau>
Status:	CLOSED ERRATA	QA Contact:	Desktop QE <desktop-qa-list>
Severity:	high	Docs Contact:
Priority:	medium
Version:	7.1	CC:	cfergeau, djasa, fidencio, hhuang, huding, juzhang, kraxel, marcandre.lureau, tpelka, virt-maint, xfu
Target Milestone:	rc
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	spice-0.12.4-8.el7	Doc Type:	Bug Fix
Doc Text:	Cause: With older clients, spice-server resets the spicevmc device instead of destroying it for compatibility reasons. Consequence: Accessing a guest using spice-vdagent with an old version of the SPICE gtk client would cause qemu to crash when rebooting the guest. Fix: Add some NULL checks in spice-server code in order to handle the situation on reboot when the spicevmc device was already destroyed. Result: Spice server no longer crashes in this scenario.	Story Points:	---
Clone Of:
Clones:	1168509		Environment:
Last Closed:	2015-03-05 07:56:30 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1168509

Description huiqingding 2014-09-24 06:35:01 UTC

Description of problem:
Boot a RHEL7.1 guest with "-chardev spicevmc", log in to the guest, and reboot from inside the guest. qemu-kvm crashes with a segmentation fault.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.0-4.el7.x86_64
kernel-3.10.0-167.el7.x86_64
spice-server-0.12.4-6.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a RHEL7.1 guest with "-chardev spicevmc":
# /usr/libexec/qemu-kvm \
-name rhel7 \
-M pc \
-cpu SandyBridge \
-m 4096 \
-smp 4,sockets=4,cores=1,threads=1 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \
-chardev spicevmc,id=charchannel0,name=vdagent \
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
-drive file=/home/rhel7_1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=writethrough \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-net none \
-spice port=5900,disable-ticketing,seamless-migration=on \
-vga qxl \
-monitor stdio
2. Log in to the guest.
3. Inside the guest, reboot:
# reboot

Actual results:
qemu-kvm crashes with a segmentation fault:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff3343ed1 in spice_char_device_write_to_device.part.7 () from /usr/lib64/libspice-server.so.1
(gdb) bt
#0  0x00007ffff3343ed1 in spice_char_device_write_to_device.part.7 () from /usr/lib64/libspice-server.so.1
#1  0x00007ffff3344b67 in spice_char_device_start () from /usr/lib64/libspice-server.so.1
#2  0x00007ffff33874d1 in spice_server_vm_start () from /usr/lib64/libspice-server.so.1
#3  0x0000555555781189 in qdev_reset_one ()
#4  0x0000555555780b18 in qbus_walk_children ()
#5  0x0000555555780a48 in qdev_walk_children ()
#6  0x0000555555780b18 in qbus_walk_children ()
#7  0x0000555555722a0d in qemu_system_reset ()
#8  0x000055555561e61f in main ()


Expected results:
The guest can reboot normally.

Additional info:

Comment 1 huiqingding 2014-09-24 06:39:53 UTC

Using the same command line as in comment 0, I tested qemu-kvm-rhev-1.5.3-60.el7_0.9.x86_64 and did not hit this bug.

Comment 3 Fabiano Fidêncio 2014-10-01 16:51:57 UTC

Hmm. Not able to reproduce using F20 as a host.

I'd like to confirm that you're using RHEL7.1 as *host* and RHEL7.1 as *guest*.

Comment 4 Fabiano Fidêncio 2014-10-02 10:50:30 UTC

huiqingding, as I cannot reproduce it here, may I ask you to install the debuginfo packages for spice/qemu and provide a new backtrace? Hopefully that will have a bit more info.

Comment 5 huiqingding 2014-10-08 05:31:30 UTC

(In reply to Fabiano Fidêncio from comment #3)
> Hmm. Not able to reproduce using F20 as a host.
> 
> I'd like to  confirm that you're using RHEL7.1 as *host* and RHEL7.1 as
> *guest*.

The host running qemu-kvm is RHEL7.1, the version of kernel and spice-server are:
kernel-3.10.0-175.el7.x86_64
spice-server-0.12.4-7.el7.x86_64
qemu-kvm-rhev-2.1.2-1.el7.x86_64 

The host running remote-viewer is F17, the version of kernel and virt-viewer are:
kernel-3.3.4-5.fc17.x86_64
virt-viewer-0.5.3-1.fc17.x86_64


I also ran "remote-viewer spice://host_ip:5900" on the RHEL7.1 host, connected to the guest started with "-chardev spicevmc", rebooted the guest, and did not hit this bug.

Comment 6 huiqingding 2014-10-08 05:44:04 UTC

(In reply to Fabiano Fidêncio from comment #4)
> huiqingding, as I cannot reproduce it here, may I ask you to install the
> debuginfo packages for spice/qemu and provide a new backtrace? Hopefully
> that will have a bit more info.

I installed the debuginfo packages for spice and qemu-kvm and retested using the steps from comment 0. The backtrace is as follows:

(gdb) bt
#0  0x00007ffff3118ed1 in spice_char_device_write_to_device (dev=dev@entry=0x55555639c730) at char_device.c:443
#1  0x00007ffff3119b67 in spice_char_device_write_to_device (dev=0x55555639c730) at char_device.c:436
#2  spice_char_device_start (dev=0x55555639c730) at char_device.c:798
#3  0x00007ffff315c581 in spice_server_vm_start (s=<optimized out>) at reds.c:4542
#4  0x0000555555782a89 in qdev_reset_one (dev=<optimized out>, opaque=<optimized out>) at hw/core/qdev.c:241
#5  0x0000555555782418 in qbus_walk_children (bus=0x555556336070, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x555555782a80 <qdev_reset_one>, post_busfn=0x555555780e50 <qbus_reset_one>, opaque=0x0)
    at hw/core/qdev.c:422
#6  0x0000555555782348 in qdev_walk_children (dev=0x555556345560, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x555555782a80 <qdev_reset_one>, post_busfn=0x555555780e50 <qbus_reset_one>, opaque=0x0)
    at hw/core/qdev.c:456
#7  0x0000555555782418 in qbus_walk_children (bus=0x555556323e90, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x555555782a80 <qdev_reset_one>, post_busfn=0x555555780e50 <qbus_reset_one>, opaque=0x0)
    at hw/core/qdev.c:422
#8  0x00005555557242fd in qemu_devices_reset () at vl.c:1830
#9  qemu_system_reset (report=report@entry=true) at vl.c:1843
#10 0x000055555561f47f in main_loop_should_exit () at vl.c:1974
#11 main_loop () at vl.c:2014
#12 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4552
(gdb)

Comment 7 Marc-Andre Lureau 2014-10-08 09:43:08 UTC

(In reply to huiqingding from comment #5)
> The host running qemu-kvm is RHEL7.1, the version of kernel and spice-server
> are:
> kernel-3.10.0-175.el7.x86_64
> spice-server-0.12.4-7.el7.x86_64
> qemu-kvm-rhev-2.1.2-1.el7.x86_64 

Hi huiqingding, which yum repos are you using? I can't find qemu-kvm-rhev in any of my RHEL7 repos. Thanks.

Comment 12 Marc-Andre Lureau 2014-10-09 13:06:53 UTC

(In reply to huiqingding from comment #5)
> The host running remote-viewer is F17, the version of kernel and virt-viewer
> are:
> kernel-3.3.4-5.fc17.x86_64
> virt-viewer-0.5.3-1.fc17.x86_64
> 
> I also run "remote-viewer spice://host_ip:5900" on the RHEL7.1 host, connect
> to the guest with "-chardev spicevmc", reboot the guest and not hit this bug.

Interesting bug. I managed to reproduce it using spice-gtk 0.12, and I am now investigating further. Thanks.

Comment 13 Marc-Andre Lureau 2014-10-09 13:53:09 UTC

When the VM is restarted, spice_char_device_write_to_device() is called, but dev->sin is NULL.

It works with newer versions of spice-gtk and fails with 0.12 because of this code in spice-server: with 0.12, it resets the vdagent char device instead of destroying it, which results in dev->sin = NULL.

    /* reseting and not destroying the state as a workaround for a bad
     * tokens management in the vdagent protocol:
     *  The client tokens' are set only once, when the main channel is initialized.
     *  Instead, it would have been more appropriate to reset them upon AGEN_CONNECT.
     *  The client tokens are tracked as part of the SpiceCharDeviceClientState. Thus,
     *  in order to be backward compatible with the client, we need to track the tokens
     *  even if the agent is detached. We don't destroy the char_device state, and
     *  instead we just reset it.
     *  In addition, there used to be a misshandling of AGENT_TOKENS message in spice-gtk: it
     *  overrides the amount of tokens, instead of adding the given amount.
     */
    if (red_channel_test_remote_cap(&reds->main_channel->base,
                                    SPICE_MAIN_CAP_AGENT_CONNECTED_TOKENS)) {
        spice_char_device_state_destroy(state->base);
        state->base = NULL;
    } else {
        spice_char_device_reset(state->base);
    }
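
To illustrate the failure mode, here is a minimal sketch of the reset path and of a NULL guard on the write path. It is not the actual spice-server code or the submitted patch: the Demo* types and demo_* functions are hypothetical stand-ins, and the only things taken from the analysis above are that a reset leaves dev->sin NULL and that the start path then calls the write function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real spice-server types. */
typedef struct DemoCharInstance {
    int bytes_written;        /* counts bytes "sent" to the device */
} DemoCharInstance;

typedef struct DemoCharDevice {
    DemoCharInstance *sin;    /* NULL once the device has been reset */
    bool running;
    int pending;              /* bytes queued for the device */
} DemoCharDevice;

/* Models the reset taken for old clients: the state object is kept,
 * but its link to the device instance is dropped. */
void demo_char_device_reset(DemoCharDevice *dev)
{
    dev->running = false;
    dev->sin = NULL;
}

/* Returns the number of bytes flushed, or 0 if nothing can be written. */
int demo_char_device_write_to_device(DemoCharDevice *dev)
{
    /* The guard: without it, dev->sin would be dereferenced below
     * even after a reset has set it to NULL. */
    if (dev == NULL || dev->sin == NULL || !dev->running) {
        return 0;
    }
    int n = dev->pending;
    dev->sin->bytes_written += n;
    dev->pending = 0;
    return n;
}

/* Models the call made when the VM is started again after a reboot. */
int demo_char_device_start(DemoCharDevice *dev)
{
    if (dev == NULL) {
        return 0;
    }
    dev->running = true;
    return demo_char_device_write_to_device(dev);
}
```

With the guard in place, calling demo_char_device_start() on a device that was reset returns 0 and leaves the queued data untouched instead of dereferencing a NULL pointer.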

Comment 14 Marc-Andre Lureau 2014-10-09 16:01:00 UTC

Sent a fix to the mailing list:
http://lists.freedesktop.org/archives/spice-devel/2014-October/017579.html

I guess this is not a CVE, since the crash happens during reboot, though we should apply the fix to other releases (RHEL6 at least).

Comment 18 David Jaša 2014-10-23 14:30:57 UTC

Reproduced locally using spice-gtk 0.12 (Windows client from RHEV 3.1)

Comment 21 errata-xmlrpc 2015-03-05 07:56:30 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0335.html