Bug 1253276

Summary: [abrt] qemu-kvm: SLL_Next(): qemu-kvm killed by SIGSEGV
Product: Red Hat Enterprise Linux 7 Reporter: David Jaša <djasa>
Component: qemu-kvm Assignee: Gerd Hoffmann <kraxel>
Status: CLOSED DUPLICATE QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.2 CC: areis, armbru, djasa, fziglio, huding, juzhang, knoel, kraxel, mazhang, rbalakri, rh-spice-bugs, thuth, virt-maint, xfu
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: abrt_hash:e4fb695c975c5f0f5eb875ec6c00894a2efba238
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-12 15:01:02 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                            Flags
File: backtrace                        none
File: cgroup                           none
File: core_backtrace                   none
File: dso_list                         none
File: environ                          none
File: limits                           none
File: machineid                        none
File: maps                             none
File: open_fds                         none
File: proc_pid_status                  none
File: var_log_messages                 none
File: binary                           none
backtrace: crash during VM shutdown    none
valgrind log 1                         none
valgrind log 2                         none

Description David Jaša 2015-08-13 11:47:39 UTC
Description of problem:
               

Version-Release number of selected component:
qemu-kvm-1.5.3-100.el7

Additional info:
reporter:       libreport-2.1.11
backtrace_rating: 3
cmdline:        /usr/libexec/qemu-kvm -name win7u -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu Nehalem -m 1536 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 30db2a88-db5c-4af9-9ff2-1b75998218e8 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/home/djasa/.config/libvirt/qemu/lib/win7u.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -device usb-ccid,id=ccid0 -drive file=/home/djasa/.local/share/gnome-boxes/images/win7u.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,discard=unmap -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -drive file=/home/djasa/Downloads/cs_windows_7_ultimate_with_sp1_x64_dvd_u_677376.iso,if=none,id=drive-ide0-0-1,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=2 -drive file=/usr/share/virtio-win/virtio-win.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=26,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:82:18:e8,bus=pci.0,addr=0x3 -chardev spicevmc,id=charsmartcard0,name=smartcard -device ccid-card-passthru,chardev=charsmartcard0,id=smartcard0,bus=ccid0.0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 
-spice port=5902,tls-port=5903,addr=::,disable-ticketing,x509-dir=/etc/pki/libvirt-spice,seamless-migration=on -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864 -global qxl-vga.vgamem_mb=16 -device qxl,id=video1,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x8 -device qxl,id=video2,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x9 -device qxl,id=video3,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0xa -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -chardev spicevmc,id=charredir2,name=usbredir -device usb-redir,chardev=charredir2,id=redir2 -chardev spicevmc,id=charredir3,name=usbredir -device usb-redir,chardev=charredir3,id=redir3 -global qxl-vga.debug=5 -global qxl-vga.guestdebug=10 -msg timestamp=on
crash_function: SLL_Next
executable:     /usr/libexec/qemu-kvm
kernel:         3.10.0-304.el7.x86_64
runlevel:       N 5
type:           CCpp
uid:            16189

Truncated backtrace:
Thread no. 1 (10 frames)
 #0 SLL_Next at src/linked_list.h:45
 #1 SLL_Pop at src/linked_list.h:59
 #2 Pop at src/thread_cache.h:205
 #3 Allocate at src/thread_cache.h:357
 #4 do_malloc_small at src/tcmalloc.cc:1099
 #5 do_malloc_no_errno at src/tcmalloc.cc:1106
 #6 do_malloc at src/tcmalloc.cc:1115
 #7 do_malloc_or_cpp_alloc at src/tcmalloc.cc:1035
 #8 tc_malloc at src/tcmalloc.cc:1574
 #9 malloc_and_trace at vl.c:2743

Comment 1 David Jaša 2015-08-13 11:47:43 UTC
Created attachment 1062496 [details]
File: backtrace

Comment 2 David Jaša 2015-08-13 11:47:44 UTC
Created attachment 1062497 [details]
File: cgroup

Comment 3 David Jaša 2015-08-13 11:47:46 UTC
Created attachment 1062498 [details]
File: core_backtrace

Comment 4 David Jaša 2015-08-13 11:47:47 UTC
Created attachment 1062499 [details]
File: dso_list

Comment 5 David Jaša 2015-08-13 11:47:48 UTC
Created attachment 1062500 [details]
File: environ

Comment 6 David Jaša 2015-08-13 11:47:50 UTC
Created attachment 1062501 [details]
File: limits

Comment 7 David Jaša 2015-08-13 11:47:51 UTC
Created attachment 1062502 [details]
File: machineid

Comment 8 David Jaša 2015-08-13 11:47:53 UTC
Created attachment 1062503 [details]
File: maps

Comment 9 David Jaša 2015-08-13 11:47:54 UTC
Created attachment 1062504 [details]
File: open_fds

Comment 10 David Jaša 2015-08-13 11:47:56 UTC
Created attachment 1062505 [details]
File: proc_pid_status

Comment 11 David Jaša 2015-08-13 11:47:57 UTC
Created attachment 1062506 [details]
File: var_log_messages

Comment 12 David Jaša 2015-08-13 11:48:19 UTC
Created attachment 1062507 [details]
File: binary

Comment 14 David Jaša 2015-08-13 13:37:58 UTC
Abrt has eaten my comment, so here it goes: I hit the bug while trying to reproduce another bug. This involved:
* a Windows 7 64-bit VM
* with 4 large monitors (2560x1600) arranged in a matrix
* with Firefox playing a YouTube/Flash video at the center of the virtual screen (parts of the video were on every monitor)
* at some point, I resized the Firefox window. Then the VM crashed.

Comment 15 Gerd Hoffmann 2015-08-31 07:37:34 UTC
Hmm, it's g_malloc() crashing.  So most likely we have a bug (buffer overflow, use-after-free, ...) somewhere which corrupts malloc's data structures.  Given the circumstances I'd suspect it is somewhere in spice-server.

Any chance you can retry with a malloc debugger active, so we find the place where the corruption happens?  I'd suggest ElectricFence ...

Comment 16 David Jaša 2015-08-31 13:48:16 UTC
What are suitable arguments for valgrind?

Comment 17 David Jaša 2015-08-31 15:16:32 UTC
OK, I found ElectricFence and how to use it. So far I got one crash without a suitable dump; I'll try again tomorrow.

Comment 18 David Jaša 2015-08-31 15:22:10 UTC
Created attachment 1068672 [details]
backtrace: crash during VM shutdown

After several tries to reproduce, I got the crash while I was shutting down the VM.

Comment 19 Gerd Hoffmann 2015-09-01 10:51:40 UTC
Hmm, it crashes within the efence library, which is not very helpful.
It might also be a different bug.

Can you try playing with the efence settings to see whether you get better results?  I'd suggest EF_PROTECT_FREE=1 first (to catch use-after-free).  Eventually also EF_PROTECT_BELOW=1.

If that doesn't help you can try valgrind (start without special parameters), although valgrind is slooooow so it might not work that well either, especially with a performance-sensitive workload such as video playback.  But worth a try nevertheless.

Comment 20 Gerd Hoffmann 2015-09-01 10:52:38 UTC
Oh, and does it happen on qemu-kvm only or does it reproduce with qemu-kvm-rhev too?

Comment 21 Markus Armbruster 2015-10-14 15:21:39 UTC
I suggest trying valgrind --soname-synonyms='somalloc=*tcmalloc*'.  See also bug 1271754.

Comment 23 David Jaša 2015-11-07 13:15:54 UTC
I tried various tools, without success:
  * Electric Fence always stopped qemu because it couldn't allocate enough memory
  * valgrind/memcheck eventually ran qemu but couldn't see anything
  * injecting mcheck into qemu's main prevented the VM from starting (probably a threading issue; mcheck is marked as thread-unsafe)
  * Address Sanitizer looked most promising, but qemu ultimately couldn't finish its rebuild (it failed in the "make check" stage), most probably because of the lack of coroutine support in ASan

What's left: running qemu without any tool but with G_SLICE=always-malloc (as opposed to the debug-blocks setting of the original report), and trying a thread-safety tool.

Comment 24 Markus Armbruster 2015-11-09 06:38:32 UTC
Did you reproduce the crash under valgrind?

Comment 25 David Jaša 2015-11-09 12:32:05 UTC
(In reply to Markus Armbruster from comment #24)
> Did you reproduce the crash under valgrind?

No. I'll attach the valgrind logs nevertheless as they show some definite leaks.

Comment 26 David Jaša 2015-11-09 12:32:53 UTC
Created attachment 1091714 [details]
valgrind log 1

Comment 27 David Jaša 2015-11-09 12:33:33 UTC
Created attachment 1091715 [details]
valgrind log 2

Comment 28 Markus Armbruster 2015-11-09 12:47:10 UTC
The valgrind logs also show other things, which may or may not be false positives.

I asked whether you reproduced the crash because "valgrind couldn't see anything" is an inconclusive result unless you did.

Comment 29 Frediano Ziglio 2016-04-12 15:01:02 UTC

*** This bug has been marked as a duplicate of bug 1253375 ***