Bug 994414

Summary: hot-unplug chardev with pty backend caused qemu Segmentation fault
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm
Version: 7.0
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Reporter: Min Deng <mdeng>
Assignee: Gerd Hoffmann <kraxel>
QA Contact: Virtualization Bugs <virt-bugs>
CC: acathrow, amit.shah, bcao, chayang, hhuang, juzhang, kraxel, mdeng, michen, mprivozn, qzhang, virt-bugs, virt-maint, xuzhang
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: qemu-kvm-1.5.3-11.el7
Doc Type: Bug Fix
Clones: 995341
Last Closed: 2014-06-13 12:10:18 UTC
Type: Bug
Bug Blocks: 887348, 995341

Description Min Deng 2013-08-07 08:59:38 UTC
Description of problem:
Hot-unplugging a chardev with a pty backend causes qemu to crash with a segmentation fault.

Version-Release number of selected component (if applicable):
qemu-kvm-1.5.1-2.el7.x86_64
kernel-3.10.0-1.el7.x86_64

How reproducible:
3 times

Steps to Reproduce:
1. Boot the guest with the following command line:
   /usr/libexec/qemu-kvm -m 4096 -smp 2,sockets=2,cores=1,threads=1 -no-kvm-pit-reinjection -name usb-device -uuid b03eea94-a502-4142-b541-96f86473a07a -rtc base=localtime,clock=host,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0 -chardev pty,id=channel1,server,nowait -device virtserialport,chardev=channel1,name=com.redhat.rhevm.vdsm1,bus=virtio-serial0.0,id=port1,nr=1 -chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait -device virtserialport,chardev=channel2,name=com.redhat.rhevm.vdsm2,bus=virtio-serial0.0,id=port2,nr=2 -drive file=rhel64-new.raw,if=none,id=drive-system-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop,serial=QEMU-DISK1 -device ide-hd,bus=ide.0,unit=0,drive=drive-system-disk,id=system-disk,bootindex=2 -netdev tap,sndbuf=0,id=hostnet0,script=/etc/qemu-ifup,downscript=no -device e1000,netdev=hostnet0,mac=00:15:65:01:3a:20 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -vnc :1 -monitor stdio -qmp tcp:0:4444,server,nowait
  
2. Remove virtserialport port1.

3. Remove chardev channel1 over QMP:
  {"execute": "qmp_capabilities"}
  {"return": {}}
  {"execute": "chardev-remove", "arguments": { "id" : "channel1" } }
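
A minimal sketch of steps 2 and 3 as a script, assuming the QMP server from the command line above (tcp port 4444) and assuming step 2 is done with device_del, as in the later verification run. The function names are hypothetical, and the script reads only one reply line per command, so asynchronous events may interleave with the command replies.

```python
import json
import socket


def build_repro_commands(port_id="port1", chardev_id="channel1"):
    """Return the QMP commands for steps 2 and 3, in order.

    Using device_del for step 2 is an assumption; the report does not
    show which command was used to remove the port.
    """
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "device_del", "arguments": {"id": port_id}},
        {"execute": "chardev-remove", "arguments": {"id": chardev_id}},
    ]


def run_repro(host="127.0.0.1", port=4444, timeout=5.0):
    """Send the commands to a QMP server and return the raw lines read."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        replies.append(stream.readline())  # QMP greeting banner
        for command in build_repro_commands():
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            # One line per command; this may be an event rather than
            # the command's own reply.
            replies.append(stream.readline())
    return replies


if __name__ == "__main__":
    for line in run_repro():
        print(line.rstrip())
```

On an affected build the qemu process is expected to crash shortly after the last command rather than while the script is running, since the backtrace shows the fault inside a timer callback.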


Actual results: Segmentation fault
-------------------------------------------------------------------------
(gdb) bt
#0  pty_chr_timer (opaque=0x5555564c5260) at qemu-char.c:991
#1  0x00007ffff76ee963 in g_timeout_dispatch () from /lib64/libglib-2.0.so.0
#2  0x00007ffff76ede06 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#3  0x00005555556bfeba in glib_pollfds_poll () at main-loop.c:187
#4  os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:232
#5  main_loop_wait (nonblocking=<optimized out>) at main-loop.c:464
#6  0x00005555555c0609 in main_loop () at vl.c:2029
#7  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4419
-----------------------------------------------------------------------------


Expected results: The chardev can be removed successfully.



Additional info:
The issue cannot be reproduced with the socket or file backends.

Comment 1 Amit Shah 2013-08-07 09:12:55 UTC
Might be needed in 6.5 too.

Comment 3 Gerd Hoffmann 2013-08-20 13:01:21 UTC
Hmm.  Stacktrace looks like a timer being called after chardev removal.  Can't see a bug in the code closing the pty, the timer is cleaned up properly.  Also can't reproduce the bug.

Does it happen on every attempt or only now and then?
Can you still reproduce it with the latest rhel7 builds?

Comment 7 Gerd Hoffmann 2013-08-22 10:00:19 UTC
Pinned it: http://patchwork.ozlabs.org/patch/269003/
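
The failure mode described in comment 3, a timer callback running after the chardev has been removed even though the close path cancels its timer, can be modeled with a toy event loop. The sketch below is not QEMU code: every class and name in it is hypothetical, and the mechanism it shows (a one-shot callback that re-arms itself and then overwrites the new timer tag) is only one way such a stale timer could arise. The linked patch is the authority on the actual cause and fix.

```python
class ToyLoop:
    """Tiny stand-in for a main loop that runs one-shot timers."""

    def __init__(self):
        self._next_tag = 1
        self._timers = {}

    def add_timer(self, callback):
        tag = self._next_tag
        self._next_tag += 1
        self._timers[tag] = callback
        return tag

    def remove_timer(self, tag):
        self._timers.pop(tag, None)

    def dispatch(self):
        # Run each timer that was pending when dispatch started, once.
        for tag, _ in list(self._timers.items()):
            callback = self._timers.pop(tag, None)
            if callback is not None:
                callback()


class ToyPty:
    """Hypothetical chardev that keeps a single one-shot timer armed."""

    def __init__(self, loop, clear_tag_first):
        self.loop = loop
        self.clear_tag_first = clear_tag_first
        self.timer_tag = 0
        self.closed = False
        self.stale_calls = 0
        self.rearm()

    def rearm(self):
        self.timer_tag = self.loop.add_timer(self.on_timer)

    def on_timer(self):
        if self.closed:
            # In C this would be a callback touching freed memory.
            self.stale_calls += 1
            return
        if self.clear_tag_first:
            self.timer_tag = 0  # forget the expired timer, then re-arm
            self.rearm()
        else:
            self.rearm()
            self.timer_tag = 0  # overwrites the tag of the new timer

    def close(self):
        if self.timer_tag:
            self.loop.remove_timer(self.timer_tag)
            self.timer_tag = 0
        self.closed = True


def stale_calls_after_close(clear_tag_first):
    loop = ToyLoop()
    pty = ToyPty(loop, clear_tag_first)
    loop.dispatch()  # timer fires once and re-arms itself
    pty.close()      # cancels only the timer it still knows about
    loop.dispatch()  # a surviving timer now runs on a closed object
    return pty.stale_calls
```

In this model, stale_calls_after_close(False) counts one callback on the closed object, while stale_calls_after_close(True) counts none, even though close() is identical in both cases.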

Comment 8 Miroslav Rezanina 2013-10-31 07:55:35 UTC
Fix included in qemu-kvm-1.5.3-11.el7

Comment 10 Chao Yang 2013-11-22 10:11:26 UTC
Tried to reproduce this issue with the same CLI and the same qemu-kvm version as in Comment #0.
After hot-removing the chardev I got the backtrace below, but I am not sure whether it is the same issue.


Program terminated with signal 11, Segmentation fault.
#0  0x00007f4fe8691e40 in ?? ()
Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.26-12.1.el7.x86_64 cyrus-sasl-md5-2.1.26-12.1.el7.x86_64 cyrus-sasl-plain-2.1.26-12.1.el7.x86_64 dbus-libs-1.6.12-5.el7.x86_64 krb5-libs-1.11.3-31.el7.x86_64 libiscsi-1.7.0-6.el7.x86_64 libuuid-2.23.2-6.el7.x86_64 nspr-4.10-3.el7.x86_64 nss-3.15.2-8.el7.x86_64 openssl-libs-1.0.1e-23.el7.x86_64
(gdb) bt
#0  0x00007f4fe8691e40 in ?? ()
#1  0x00007f4fe733f4b8 in qemu_chr_be_can_write (s=<optimized out>) at qemu-char.c:161
#2  pty_chr_read_poll (opaque=<optimized out>) at qemu-char.c:1042
#3  0x00007f4fe7340fb2 in io_watch_poll_prepare (source=0x7f4fe8464d70, timeout_=timeout_@entry=0x7fffb6a2d664) at qemu-char.c:593
#4  0x00007f4fe689d79d in g_main_context_prepare (context=context@entry=0x7f4fe8464a00, 
    priority=priority@entry=0x7f4fe7c9eb80 <max_priority>) at gmain.c:3328
#5  0x00007f4fe731adb6 in glib_pollfds_fill (cur_timeout=<synthetic pointer>) at main-loop.c:163
#6  os_host_main_loop_wait (timeout=1000) at main-loop.c:198
#7  main_loop_wait (nonblocking=<optimized out>) at main-loop.c:464
#8  0x00007f4fe721b609 in main_loop () at vl.c:2029
#9  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4419

Comment 11 Chao Yang 2013-11-22 10:12:57 UTC
Hi Gerd,
 Could you please check the backtrace above and confirm whether I reproduced this issue? Thanks very much.

Comment 12 Gerd Hoffmann 2013-11-29 09:14:08 UTC
Stack trace looks unrelated.  Doesn't reproduce here (qemu-kvm-1.5.3-19.el7.x86_64).  Can you retest with latest qemu-kvm please?
In case it still happens: please install the missing debuginfos, so we get symbol names instead of the question marks for stackframe #0?

Comment 13 Chao Yang 2013-12-02 10:20:36 UTC
Reproduced on qemu-kvm-1.5.1-2.el7.x86_64.rpm. qemu-kvm dumped core as soon as the pty backend was hot-removed.

(gdb) bt
#0  0x00007f0f049752b0 in g_io_channel_unix_get_fd () from /lib64/libglib-2.0.so.0
#1  0x00007f0f053d9594 in pty_chr_update_read_handler (chr=0x7f0f071e88a0) at qemu-char.c:1076
#2  0x00007f0f053d9625 in pty_chr_timer (opaque=<optimized out>) at qemu-char.c:996
#3  0x00007f0f04935963 in g_timeout_dispatch () from /lib64/libglib-2.0.so.0
#4  0x00007f0f04934e06 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#5  0x00007f0f053b1eba in glib_pollfds_poll () at main-loop.c:187
#6  os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:232
#7  main_loop_wait (nonblocking=<optimized out>) at main-loop.c:464
#8  0x00007f0f052b2609 in main_loop () at vl.c:2029
#9  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4419

Verified on qemu-kvm-1.5.3-20.el7.x86_64.rpm. qemu-kvm kept running normally after the pty backend was hot-removed.

CLI:
/usr/libexec/qemu-kvm -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/test.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0 -drive file=/var/lib/libvirt/images/test.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:93:80:0b,bus=pci.0 -chardev pty,id=channel1 -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=channel1,id=channel0,name=com.redhat.spice.0 -vga cirrus -vnc :1

Steps:
# nc -U /var/lib/libvirt/qemu/test.monitor 
# {"QMP": {"version": {"qemu": {"micro": 3, "minor": 5, "major": 1}, "package": " (qemu-kvm-1.5.3-20.el7)"}, "capabilities": []}}
# {"execute":"qmp_capabilities"}
# {"return": {}}
# {"timestamp": {"seconds": 1385979309, "microseconds": 163800}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "net0", "path": "/machine/peripheral/net0/virtio-backend"}}
# {"timestamp": {"seconds": 1385979314, "microseconds": 195430}, "event": "VNC_CONNECTED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5901", "host": "0.0.0.0"}, "client": {"family": "ipv4", "service": "36090", "host": "127.0.0.1"}}}
# {"timestamp": {"seconds": 1385979314, "microseconds": 196227}, "event": "VNC_INITIALIZED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5901", "host": "0.0.0.0"}, "client": {"family": "ipv4", "service": "36090", "host": "127.0.0.1"}}}
# {"execute":"device_del","arguments":{"id":"channel0"}}
# {"timestamp": {"seconds": 1385979333, "microseconds": 459081}, "event": "DEVICE_DELETED", "data": {"device": "channel0", "path": "/machine/peripheral/channel0"}}
# {"return": {}}
# {"execute":"chardev-remove","arguments":{"id":"channel1"}}
# {"return": {}}



As per above, this issue has been fixed.
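
In the QMP exchange above, the DEVICE_DELETED event for channel0 arrives before chardev-remove is sent. A script that automates the check might look for that event first; whether waiting for it is actually required is an assumption, not something this report states. The helper names below are hypothetical, and the sketch only parses lines that have already been read from the monitor.

```python
import json


def classify_qmp_message(line):
    """Return 'greeting', 'event', 'return', 'error', or 'other'."""
    message = json.loads(line)
    for key, kind in (("QMP", "greeting"), ("event", "event"),
                      ("return", "return"), ("error", "error")):
        if key in message:
            return kind
    return "other"


def device_deleted_seen(lines, device_id):
    """True if a DEVICE_DELETED event for device_id appears in lines."""
    for line in lines:
        message = json.loads(line)
        if (message.get("event") == "DEVICE_DELETED"
                and message.get("data", {}).get("device") == device_id):
            return True
    return False


if __name__ == "__main__":
    # Two lines taken from the exchange in this comment.
    log = [
        '{"timestamp": {"seconds": 1385979333, "microseconds": 459081}, '
        '"event": "DEVICE_DELETED", "data": {"device": "channel0", '
        '"path": "/machine/peripheral/channel0"}}',
        '{"return": {}}',
    ]
    print([classify_qmp_message(line) for line in log])
    print(device_deleted_seen(log, "channel0"))
```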

Comment 14 Chao Yang 2013-12-02 10:21:32 UTC
(In reply to Gerd Hoffmann from comment #12)
> Stack trace looks unrelated.  Doesn't reproduce here
> (qemu-kvm-1.5.3-19.el7.x86_64).  Can you retest with latest qemu-kvm please?
> In case it still happens: please install the missing debuginfos, so we get
> symbol names instead of the question marks for stackframe #0?

Cannot reproduce with qemu-kvm-1.5.3-20.el7.x86_64.rpm, either.

Comment 16 Ludek Smid 2014-06-13 12:10:18 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.