Bug 1029646

Summary: Qemu instance received segfault after multiple resolution change
Product: Red Hat Enterprise Linux 7
Reporter: Marian Krcmarik <mkrcmari>
Component: spice
Assignee: Marc-Andre Lureau <marcandre.lureau>
Status: CLOSED ERRATA
QA Contact: Desktop QE <desktop-qa-list>
Severity: high
Docs Contact:
Priority: high
Version: 7.0
CC: dblechte, djasa, marcandre.lureau, mkrcmari, uril
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: spice-0.12.4-6.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1127342
Environment:
Last Closed: 2015-03-05 07:55:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1127342

Description Marian Krcmarik 2013-11-12 20:05:09 UTC
Description of problem:
The QEMU guest crashes when the resolution is changed repeatedly on the guest using xrandr. Three changes within 3 seconds are enough to crash qemu inside spice-server.

Version-Release number of selected component (if applicable):
All (client/host/guest) RHEL7 PreBeta
# rpm -qa | egrep "qemu|kernel|spice"
libvirt-daemon-driver-qemu-1.1.1-11.el7.x86_64
qemu-img-1.5.3-19.el7.x86_64
kernel-3.10.0-42.el7.x86_64
qemu-guest-agent-1.5.3-17.el7.x86_64
qemu-kvm-common-1.5.3-19.el7.x86_64
spice-server-0.12.4-3.el7.x86_64
spice-glib-0.20-6.el7.x86_64
qemu-kvm-debuginfo-1.5.3-19.el7.x86_64
spice-debuginfo-0.12.4-3.el7.x86_64
qemu-kvm-1.5.3-19.el7.x86_64
spice-vdagent-0.14.0-5.el7.x86_64
kernel-3.10.0-44.el7.x86_64
spice-gtk3-0.20-6.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Start RHEL7 guest on RHEL7 host and connect to it with remote-viewer.
2. Change resolution in loop, i.e.:
for ((i=0;$i<2000;i++)); do echo ${i}; if [ "x$((i%3))" == "x0" ]; then MODE="1280x800"; elif [ "x$((i%3))" == "x1" ]; then MODE="1440x900"; else MODE="1680x1050"; fi ; xrandr --output Virtual-0 --mode "${MODE}" 2> /dev/null; sleep 1; done
3.
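
For readability, the loop in step 2 can be unrolled into a small script. This is only a sketch under stated assumptions: the function names (mode_for_iteration, change_resolution_loop) and the OUTPUT and RUN_LOOP variables are illustrative and not part of the original report. The modes, the output name Virtual-0, the one-second sleep, and the default of 2000 iterations are taken from the one-liner above.

```shell
#!/bin/sh
# Sketch of the reproduction loop from step 2.
# Function names, OUTPUT and RUN_LOOP are illustrative assumptions.

OUTPUT="${OUTPUT:-Virtual-0}"   # xrandr output name used in the report

# Map an iteration counter to one of the three modes cycled in the report.
mode_for_iteration() {
    case $(( $1 % 3 )) in
        0) echo "1280x800" ;;
        1) echo "1440x900" ;;
        *) echo "1680x1050" ;;
    esac
}

# Switch the guest resolution once per second, like the original one-liner.
change_resolution_loop() {
    count="${1:-2000}"
    i=0
    while [ "$i" -lt "$count" ]; do
        mode="$(mode_for_iteration "$i")"
        echo "${i}: ${mode}"
        xrandr --output "${OUTPUT}" --mode "${mode}" 2> /dev/null
        sleep 1
        i=$(( i + 1 ))
    done
}

# Only touch the display when explicitly requested.
if [ "${RUN_LOOP:-0}" = "1" ]; then
    change_resolution_loop "${1:-2000}"
fi
```

Run inside the guest's X session, for example as `RUN_LOOP=1 sh repro.sh 200` for a shorter run (repro.sh is a placeholder file name). Without RUN_LOOP=1 the script only defines the helpers and does not call xrandr.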

Actual results:
Segmentation fault received in qemu-kvm process

Expected results:


Additional info:
Backtrace from the qemu-kvm process:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7faf08dff700 (LWP 6820)]
0x00007faf1369dfcb in __memset_sse2 () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 4 (Thread 0x7faf0b457700 (LWP 6925)):
#0  0x00007faf165caab3 in pread64 () from /lib64/libpthread.so.0
#1  0x00007faf1864e47b in pread (__offset=<optimized out>, __nbytes=<optimized out>, 
    __buf=0x7faedf34a000, __fd=<optimized out>) at /usr/include/bits/unistd.h:99
#2  handle_aiocb_rw_linear (aiocb=0x7faf19c1e400, buf=0x7faedf34a000 "GVariant")
    at block/raw-posix.c:588
#3  0x00007faf1864e6ba in handle_aiocb_rw (aiocb=<optimized out>) at block/raw-posix.c:618
#4  0x00007faf1864f09d in aio_worker (arg=0x7faf19c1e400) at block/raw-posix.c:751
#5  0x00007faf1875ab6b in worker_thread (opaque=0x7faf19a84c00) at thread-pool.c:109
#6  0x00007faf165c3de3 in start_thread () from /lib64/libpthread.so.0
#7  0x00007faf1370a1ad in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7faf0aa55700 (LWP 6819)):
#0  0x00007faf13701297 in ioctl () from /lib64/libc.so.6
#1  0x00007faf187d8aa5 in kvm_vcpu_ioctl (cpu=cpu@entry=0x7faf19becbe0, 
    type=type@entry=44672) at /usr/src/debug/qemu-1.5.3/kvm-all.c:1744
#2  0x00007faf187d8bdc in kvm_cpu_exec (env=env@entry=0x7faf19beccf0)
    at /usr/src/debug/qemu-1.5.3/kvm-all.c:1629
#3  0x00007faf18784225 in qemu_kvm_cpu_thread_fn (arg=0x7faf19beccf0)
    at /usr/src/debug/qemu-1.5.3/cpus.c:793
#4  0x00007faf165c3de3 in start_thread () from /lib64/libpthread.so.0
#5  0x00007faf1370a1ad in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7faf08dff700 (LWP 6820)):
#0  0x00007faf1369dfcb in __memset_sse2 () from /lib64/libc.so.6
#1  0x00007faf143d91e5 in memset (__len=5184000, __ch=0, __dest=<optimized out>)
    at /usr/include/bits/string3.h:84
#2  red_create_surface (send_client=1, data_is_valid=0, line_0=<optimized out>, format=32, 
    stride=5760, height=900, width=1440, surface_id=5, worker=0x7faeb00008c0)
    at red_worker.c:9659
#3  red_process_surface (loadvm=0, group_id=1, surface=0x7faeb02421e0, 
    worker=0x7faeb00008c0) at red_worker.c:4288
#4  red_process_commands (worker=worker@entry=0x7faeb00008c0, 
    ring_is_empty=ring_is_empty@entry=0x7faf08dfebf0, max_pipe_size=50)
    at red_worker.c:5243
#5  0x00007faf143de30a in red_worker_main (arg=<optimized out>) at red_worker.c:12292
#6  0x00007faf165c3de3 in start_thread () from /lib64/libpthread.so.0
#7  0x00007faf1370a1ad in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7faf18532a00 (LWP 6816)):
#0  0x00007faf136ffbdd in poll () from /lib64/libc.so.6
#1  0x00007faf17a66d57 in poll (__timeout=<optimized out>, __nfds=<optimized out>, 
    __fds=<optimized out>) at /usr/include/bits/poll2.h:46
#2  g_poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>)
    at gpoll.c:132
#3  0x00007faf1870ef86 in os_host_main_loop_wait (timeout=<optimized out>)
    at main-loop.c:226
#4  main_loop_wait (nonblocking=<optimized out>) at main-loop.c:464
#5  0x00007faf18618488 in main_loop () at vl.c:1986
#6  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4343
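
As a sanity check on the crashing thread (Thread 2), the memset length in frame #1 appears to be simply stride * height from frame #2, and the stride matches the width at 4 bytes per pixel. The sketch below redoes that arithmetic. The helper names are hypothetical, and the 4 bytes per pixel figure is an assumption inferred from format=32 and stride=5760 for width=1440. It says nothing about why the destination pointer was invalid.

```shell
#!/bin/sh
# Sketch: relate the memset length in frame #1 to the surface geometry in
# frame #2. stride_for_width and surface_bytes are hypothetical helper names.

# Bytes per row, assuming 4 bytes per pixel for the 32-bit format shown.
stride_for_width() {
    echo $(( $1 * 4 ))
}

# Total bytes cleared for a surface: stride * height.
surface_bytes() {
    echo $(( $1 * $2 ))
}

stride="$(stride_for_width 1440)"
echo "stride=${stride}"
echo "bytes=$(surface_bytes "${stride}" 900)"
```

With width=1440 and height=900 this gives stride=5760 and bytes=5184000, which agree with stride=5760 and __len=5184000 in the backtrace.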

Comment 3 Uri Lublin 2014-02-05 13:57:03 UTC
I cannot reproduce the crash when running a RHEL-7 guest on a RHEL-7 host.

Packages on Host:
spice-glib-0.20-8.el7.x86_64
spice-gtk3-0.20-8.el7.x86_64
spice-server-0.12.4-5.el7.x86_64
spice-vdagent-0.14.0-7.el7.x86_64
qemu-kvm-1.5.3-44.el7.x86_64
qemu-kvm-common-1.5.3-44.el7.x86_64
kernel-3.10.0-80.el7.x86_64
virt-viewer-0.5.7-5.el7.x86_64

Packages on Guest:
kernel-3.10.0-79.el7.x86_64
gnome-shell-3.8.4-21.el7.x86_64
xorg-x11-server-Xorg-1.15.0-2.el7.x86_64


What I do see is that gnome-shell fails on the guest after a while.
I think this is due to a gnome-shell memory leak.
I filed bug 1059343 for that.


I also tried with a RHEL-6 guest, where both guest and host seem to behave fine.
(The loop was limited to 200 iterations, not 2000.)

Comment 6 Marc-Andre Lureau 2014-08-05 16:08:21 UTC
I tried on F20 with spice-server-0.12.5-2.fc20.x86_64 and qemu-kvm-2.1.0-0.3.rc1.fc20.x86_64, and I can't reproduce it.

Tentatively closing as CURRENTRELEASE.
Marian, feel free to reopen with further details if you can reproduce it. Thanks.

Comment 7 Marc-Andre Lureau 2014-08-06 10:05:21 UTC
I got a few crashes after ~50 tries:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f0e8d1ff700 (LWP 11731)]
memset () at ../sysdeps/x86_64/memset.S:97
97		movdqa	%xmm8, (%rcx)
(gdb) bt
#0  memset () at ../sysdeps/x86_64/memset.S:97
#1  0x00007f0f3d97e0da in red_create_surface (worker=0x7f0e880008c0, surface_id=4, width=1680, height=1050, 
    stride=6720, format=32, line_0=0x7f0e90a6d000, data_is_valid=0, send_client=1) at red_worker.c:9473
#2  0x00007f0f3d96e4a2 in red_process_surface (worker=0x7f0e880008c0, surface=0x7f0e882e4000, group_id=1, 
    loadvm=0) at red_worker.c:4252
#3  0x00007f0f3d970b34 in red_process_commands (worker=0x7f0e880008c0, max_pipe_size=50, 
    ring_is_empty=0x7f0e8d1febac) at red_worker.c:5068
#4  0x00007f0f3d9859bf in red_worker_main (arg=0x7fff41c76ec0) at red_worker.c:12039
#5  0x00007f0f46055f33 in start_thread (arg=0x7f0e8d1ff700) at pthread_create.c:309
#6  0x00007f0f3c77aded in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Comment 8 Marc-Andre Lureau 2014-08-06 16:59:21 UTC
Sent a fix: http://lists.freedesktop.org/archives/spice-devel/2014-August/017167.html

Comment 9 Marc-Andre Lureau 2014-08-06 17:00:00 UTC
The crash could happen on RHEL 6 too, so let's get the fix in there as well.

Comment 12 David Jaša 2014-12-11 16:19:41 UTC
*** Bug 1159425 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2015-03-05 07:55:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0335.html