Bug 501131

Summary: qemu segfault when VNC client disconnects
Product: [Fedora] Fedora
Reporter: Enrico Scholz <rh-bugzilla>
Component: qemu
Assignee: Mark McLoughlin <markmc>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: medium
Version: 11
CC: berrange, dwmw2, gcosta, itamar, markmc, m.gruys, vdanen, virt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 0.10.6-5.fc11
Doc Type: Bug Fix
Last Closed: 2009-09-24 05:24:12 UTC
Type: ---
Bug Blocks: 480594    
Attachments:
Description                           Flags
gdb backtrace                         none
qemu-vnc-segfault.patch               none
gdb backtrace                         none
gdb backtraces (100% reproducible)    none
fix                                   none

Description Enrico Scholz 2009-05-16 20:55:31 UTC
Description of problem:

I got the attached segfault when a VNC client disconnected while the guest was producing output. In this case, I executed 'type long-textfile' in a Windoze guest (SMP, 2 CPUs, i686). I used the 'vinagre' VNC client, and the VNC connection was tunneled through slow SSH port forwarding.

The abort seems to be caused by a double free().


Version-Release number of selected component (if applicable):

qemu-system-x86-0.10.4-4.fc11.x86_64
vinagre-2.24.2-1.fc10.x86_64


How reproducible:

20-30%

Comment 1 Enrico Scholz 2009-05-16 20:56:58 UTC
Created attachment 344297
gdb backtrace

Comment 2 Mark McLoughlin 2009-05-22 15:22:07 UTC
Thanks for the backtrace Enrico

I can't reproduce this, but I think I see the issue

vnc_client_io_error() frees the VncState if there was an error.

This means that after e.g. vnc_flush() or vnc_write() is called, the VncState might have been freed. There is no error return from vnc_flush(), and there are lots of places where we continue to use the VncState even though it may have been freed.

So, in this stack trace, we may have hit an I/O error in vnc_update_client() and freed the VncState, yet vnc_copy() continues on and tries to do a vnc_flush().
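
The hazard described in this comment can be sketched in a few lines of C. This is only an illustration, not QEMU code: Client, client_io_error(), client_flush(), simulate_disconnect() and the frees_seen counter are hypothetical stand-ins for VncState, vnc_client_io_error(), vnc_flush() and a caller such as vnc_copy(). The I/O error is simulated by passing a negative result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for VncState; not the real QEMU structure. */
typedef struct Client {
    int pending;              /* bytes queued for the client */
} Client;

static int frees_seen;        /* illustration only: counts frees */

/* Stand-in for vnc_client_io_error(): frees the state on error. */
static bool client_io_error(Client *c, long ret)
{
    assert(c != NULL);
    if (ret < 0) {
        free(c);
        frees_seen++;
        return true;
    }
    return false;
}

/* Stand-in for vnc_flush(): returns void, so callers cannot see
 * that the client was freed inside this call. */
static void client_flush(Client *c, long io_result)
{
    if (client_io_error(c, io_result)) {
        return;
    }
    c->pending = 0;
}

/* Stand-in for an update path. Returns how many frees happened. */
static int simulate_disconnect(void)
{
    Client *c = calloc(1, sizeof(*c));
    if (c == NULL) {
        return -1;
    }
    c->pending = 16;
    frees_seen = 0;
    client_flush(c, -1);      /* simulated write error: c is freed in here */
    /*
     * Nothing tells this caller that c is gone. A second
     * client_flush(c, -1) at this point would read freed memory
     * and free it again, which is the double free() pattern.
     */
    return frees_seen;
}
```

The sketch stops before the dangerous second call on purpose; the point is that the void return of the flush function leaves the caller with no way to know it must stop.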

Comment 3 Mark McLoughlin 2009-05-22 15:25:28 UTC
Created attachment 345104
qemu-vnc-segfault.patch

So, this is a hacky and incomplete fix, but it will help us confirm whether we've identified the cause.

I don't think this solution is workable: it's just too difficult to audit the entire protocol handling to make sure that we correctly handle an I/O error everywhere. Instead, I think we'll probably add a ->deleted flag to VncState, set it on I/O error, and actually free the VncState in only a small, fixed number of places.
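
A minimal sketch of the ->deleted idea proposed here, assuming a simplified model rather than the actual QEMU change: Client, client_new(), client_io_error(), client_flush() and client_reap() are hypothetical names, and the I/O error is again simulated with a negative result. The error path only marks the client, every later call becomes a no-op, and a single reap function does the free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for VncState; not the real QEMU structure. */
typedef struct Client {
    bool deleted;             /* set on I/O error; memory stays valid */
    int pending;              /* bytes queued for the client */
} Client;

static Client *client_new(int pending)
{
    Client *c = calloc(1, sizeof(*c));
    if (c != NULL) {
        c->pending = pending;
    }
    return c;
}

/* On error, only mark the client. Nothing is freed here. */
static void client_io_error(Client *c, long ret)
{
    assert(c != NULL);
    if (ret < 0) {
        c->deleted = true;
    }
}

/* Safe to call any number of times after an error. */
static void client_flush(Client *c, long io_result)
{
    assert(c != NULL);
    if (c->deleted) {
        return;               /* already marked: do no further I/O */
    }
    client_io_error(c, io_result);
    if (!c->deleted) {
        c->pending = 0;       /* write succeeded, queue drained */
    }
}

/* One of the few places that frees. Returns NULL once reaped. */
static Client *client_reap(Client *c)
{
    assert(c != NULL);
    if (c->deleted) {
        free(c);
        return NULL;
    }
    return c;
}
```

With this shape, only the callers of the reap function need auditing, instead of every place in the protocol handling that writes to the client.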

Comment 4 Mark McLoughlin 2009-05-22 15:28:05 UTC
Enrico: once it has finished building, could you try out this scratch build:

https://koji.fedoraproject.org/koji/taskinfo?taskID=1370743

the RPMs should wind up here:

http://koji.fedoraproject.org/scratch/markmc/task_1370743

Comment 5 Enrico Scholz 2009-05-26 12:04:37 UTC
Created attachment 345451
gdb backtrace

still there :(

qemu-common-0.10.4-5.1.fc11.x86_64

Comment 6 Enrico Scholz 2009-05-26 12:08:01 UTC
Created attachment 345452
gdb backtraces (100% reproducible)

These are two backtraces which are 100% reproducible with

$ rdesktop localhost:7905

Comment 7 Mark McLoughlin 2009-06-04 13:32:40 UTC
Perhaps my test patch didn't catch all cases where this could happen - oh well, we need to fix this in a better way anyway

Comment 8 Mark McLoughlin 2009-06-04 13:36:54 UTC
Someone else reported this when running qemu on windows:

  http://marc.info/?l=qemu-devel&m=124324043812915

Comment 9 Bug Zapper 2009-06-09 15:57:18 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and the reason for this action are here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 Gerd Hoffmann 2009-06-16 12:17:04 UTC
Created attachment 348098
fix

Against latest git, will send this one upstream shortly.

Comment 11 Mark McLoughlin 2009-06-16 14:54:16 UTC
*** Bug 505640 has been marked as a duplicate of this bug. ***

Comment 12 Mark McLoughlin 2009-06-17 12:05:30 UTC
Thanks Gerd - any chance you could re-base it to F-11 ?

Comment 13 Mark McLoughlin 2009-07-03 11:14:07 UTC
*** Bug 508567 has been marked as a duplicate of this bug. ***

Comment 14 Mark McLoughlin 2009-08-07 14:44:51 UTC
The upstream commit we need re-based to stable-0.10 is:

  http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=198a0039c5

Comment 15 Mark McLoughlin 2009-09-07 17:33:28 UTC
0.10.7 should be released relatively soon with this backport:

  http://git.savannah.gnu.org/cgit/qemu.git/commit/?h=stable-0.10&id=c2723a9606

Comment 16 Mark McLoughlin 2009-09-11 11:17:20 UTC
Will push this to updates-testing soon:

* Fri Sep 11 2009 Mark McLoughlin <markmc> - 2:0.10.6-5
- Fix vnc segfault on disconnect (#501131)
- Fix vnc screen corruption with e.g. xterm (#503156)
- Rebase vnc sasl patches on top of these two vnc fixes

Comment 17 Fedora Update System 2009-09-14 07:27:22 UTC
qemu-0.10.6-5.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/qemu-0.10.6-5.fc11

Comment 18 Fedora Update System 2009-09-15 07:36:17 UTC
qemu-0.10.6-5.fc11 has been pushed to the Fedora 11 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with:

  su -c 'yum --enablerepo=updates-testing update qemu'

You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-9542

Comment 19 Fedora Update System 2009-09-24 05:24:06 UTC
qemu-0.10.6-5.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 20 Mark McLoughlin 2009-11-19 10:47:41 UTC
*** Bug 537903 has been marked as a duplicate of this bug. ***