Bug 863753

Summary: virtio_serialport data loss when hot-unplugging and re-plugging the port (guest->host and host->guest)
Product: [Fedora] Fedora
Reporter: Lukas Doktor <ldoktor>
Component: virtio-win
Assignee: Gal Hammer <ghammer>
Status: CLOSED NOTABUG
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 17
CC: amit.shah, bcao, berrange, cfergeau, crobinso, dwmw2, ghammer, itamar, juzhang, knoel, pbonzini, rjones, scottt.tw, virt-maint, vrozenfe, yvugenfi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-03 11:06:08 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Description Flags
guest sender script (sends A, B, C, D..., resends when send fails)
host receiver script (reads the port and verifies A, B, C, D, ... is received correctly. Reopens the port when read fails) none

Description Lukas Doktor 2012-10-07 04:16:41 EDT
Description of problem:
Hi guys,

I'm developing an interrupted loopback test. I managed to work around the problem with hot-plugging of the incorrectly uninitialized port ( https://bugzilla.redhat.com/show_bug.cgi?id=796048 ), but some data is lost between port replugs even though the send/recv commands passed (I'm resending data in case send fails).

I created simple reproducers in Python; this one is for guest->host (data loss).
I'm sending data from guest to host, then I unplug the port, replug it, and continue sending. Some of the successfully sent data is missing on the other side.
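
For readers without access to the attachments, the resend-on-failure idea can be sketched roughly as below. This is not the attached sender script: the function name `send_sequence`, the `max_retries` limit, and the `write` callable are assumptions made for illustration. In a real guest, `write` would wrap a write to the virtio port device; here it is any callable that raises OSError when the port is gone.

```python
import itertools
import string

# Cyclic test pattern: A..Z followed by 0..9, then A again.
ALPHABET = string.ascii_uppercase + string.digits


def send_sequence(write, count, max_retries=100):
    """Send `count` symbols through write(bytes), resending on OSError.

    `write` is a hypothetical callable standing in for a write to the port.
    Returns the number of symbols whose write call returned without error.
    """
    sent = 0
    for symbol in itertools.islice(itertools.cycle(ALPHABET), count):
        for _ in range(max_retries):
            try:
                write(symbol.encode("ascii"))
            except OSError:
                # Port unplugged; a real sender would reopen it here.
                continue
            sent += 1
            break
        else:
            raise RuntimeError("gave up resending %r" % symbol)
    return sent
```

Note that a sender of this shape only resends when the write itself fails. A write that returns successfully but never reaches the host is invisible to it, which is the situation this report describes.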

Version-Release number of selected component (if applicable):

How reproducible:
10-20% (see the log for details)

Steps to Reproduce:
1) start sending data from guest (run sender.py on guest)
2) receive data on host (run listener.py on host)
3) unplug/replug the port (eg. MON=/tmp/monitor-hmp1-20121004-115412-sLj47KEF ; while :; do echo device_del vs1 | sudo socat $MON - ; sleep 5 ; echo 'device_add virtserialport,id=vs1,chardev=devvs1,nr=1,name=com.redhat.spice.0' | sudo socat $MON - ; sleep 5 ; done )
4) see the error messages

Actual results:
Error messages reporting how much data was successfully sent from the guest but not received on the host.

Expected results:
send/recv should report failure for all data that was not transferred, so no data loss should be visible when using the simple reproducer.
Comment 1 Lukas Doktor 2012-10-07 04:18:27 EDT
Created attachment 622915 [details]
guest sender script (sends A, B, C, D..., resends when send fails)
Comment 2 Lukas Doktor 2012-10-07 04:19:53 EDT
Created attachment 622916 [details]
host receiver script (reads the port and verifies A, B, C, D, ... is received correctly. Reopens the port when read fails)
Comment 3 Lukas Doktor 2012-10-07 04:22:34 EDT
[5s between replug]
skipped: 3456789ABCDEFGHIJK (waiting for L)
skipped: EFGHIJKLMNOPQRSTUV (waiting for W)
skipped: STUVWXYZ0123456789 (waiting for A)

[1s between replug]
skipped: LMNOPQRSTUVWXYZ0123 (waiting for 4)
skipped: UVWXYZ0123456789ABCDEFGHIJKLMNOPQR (waiting for S)
skipped: YZ0123456789ABCDEFG (waiting for H)
skipped: CDEFGHIJKLMNOPQRSTUV (waiting for W)
skipped: OPQRSTUVWX (waiting for Y)

[immediate replug]
skipped: JKLMNOPQRSTUVWXY (waiting for Z)
skipped: STUVWXYZ01234567 (waiting for 8)
skipped: UVWXYZ0123456789 (waiting for A)
skipped: WXYZ0123456789ABCDEF (waiting for G)
skipped: 3456789ABCDEFGHIJKLMNOPQRSTUVW (waiting for X)
skipped: QRSTUVWXYZ01234 (waiting for 5)
skipped: NOPQRSTUVWXYZ012 (waiting for 3)
skipped: YZ0123456789ABCD (waiting for E)
skipped: GHIJKLMNOPQRSTUVWXYZ0123456789A (waiting for B)
skipped: HIJKLMNOPQRSTUVWXYZ (waiting for 0)
skipped: 789ABCDEFGHIJKLMN (waiting for O)
skipped: KLMNOPQRSTUVWX (waiting for Y) 
skipped: JKLMNOPQRSTUVWXYZ (waiting for 0)

Not every reconnect fails:
5s sleep - 1 out of 10 failed
2s sleep - 3 out of 10 failed
1s sleep - 5 out of 10 failed

This output was generated using 1-character buffers. With longer buffers, fewer of them are lost. With buffers over 10 characters long, only one lost packet was observed (and not always).
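
The "skipped" lines above look consistent with a cyclic pattern of A-Z followed by 0-9. A minimal sketch of how such a gap could be computed on the receiving side follows. It is not the attached receiver script; `skipped_between` and `verify` are hypothetical names, and the cyclic A-Z, 0-9 alphabet is an assumption inferred from the log.

```python
import string

# Assumed test pattern: A..Z followed by 0..9, repeating.
ALPHABET = string.ascii_uppercase + string.digits


def skipped_between(expected, received):
    """Return the symbols missing between the expected and received symbol."""
    start = ALPHABET.index(expected)
    end = ALPHABET.index(received)
    length = (end - start) % len(ALPHABET)
    return "".join(ALPHABET[(start + i) % len(ALPHABET)] for i in range(length))


def verify(stream, first="A"):
    """Walk a received stream and list (skipped_symbols, received_symbol) gaps."""
    expected = first
    gaps = []
    for symbol in stream:
        if symbol != expected:
            gaps.append((skipped_between(expected, symbol), symbol))
        # Resynchronise on whatever actually arrived.
        expected = ALPHABET[(ALPHABET.index(symbol) + 1) % len(ALPHABET)]
    return gaps
```

For example, `skipped_between("3", "L")` yields the same run as the first line of the 5s log. A check of this kind cannot see a gap whose length is an exact multiple of the 36-symbol pattern, so the reported loss is a lower bound.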
Comment 4 Lukas Doktor 2012-10-07 04:43:19 EDT
Sorry, I forgot to add qemu-cmdline. It was generated by autotest:

/usr/bin/qemu-kvm -S -name 'vm1' -nodefaults -chardev socket,id=hmp_id_hmp1,path=/tmp/monitor-hmp1-20121004-115412-sLj47KEF,server,nowait -mon chardev=hmp_id_hmp1,mode=readline -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20121004-115412-sLj47KEF,server,nowait -device isa-serial,chardev=serial_id_serial1 -device virtio-serial-pci,id=virtio_serial_pci0 -chardev socket,id=devvs1,path=/tmp/virtio_port-vs1-20121004-115412-sLj47KEF,server,nowait -device virtserialport,chardev=devvs1,name=com.redhat.spice.0,id=vs1,bus=virtio_serial_pci0.0 -chardev socket,id=devvs2,path=/tmp/virtio_port-vs2-20121004-115412-sLj47KEF,server,nowait -device virtserialport,chardev=devvs2,name=com.redhat.spice.1,id=vs2,bus=virtio_serial_pci0.0 -chardev socket,id=devvs3,path=/tmp/virtio_port-vs3-20121004-115412-sLj47KEF,server,nowait -device virtserialport,chardev=devvs3,name=com.redhat.spice.2,id=vs3,bus=virtio_serial_pci0.0 -chardev socket,id=devvs4,path=/tmp/virtio_port-vs4-20121004-115412-sLj47KEF,server,nowait -device virtserialport,chardev=devvs4,name=com.redhat.spice.3,id=vs4,bus=virtio_serial_pci0.0 -chardev socket,id=seabioslog_id_20121004-115412-sLj47KEF,path=/tmp/seabios-20121004-115412-sLj47KEF,server,nowait -device isa-debugcon,chardev=seabioslog_id_20121004-115412-sLj47KEF,iobase=0x402 -device ich9-usb-uhci1,id=usb1 -drive file='/tmp/kvm_autotest_root/images/f17-64.qcow2',index=0,if=ide,cache=none,snapshot=on -device virtio-net-pci,netdev=idbjSa34,mac='9a:13:14:15:16:17',id='idCjnNs4' -netdev tap,id=idbjSa34,fd=21 -m 512 -smp 1,cores=1,threads=1,sockets=1 -cpu 'Penryn' -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -vnc :0 -vga std -rtc base=utc,clock=host,driftfix=none -boot order=cdn,once=c,menu=off -enable-kvm
Comment 5 Richard W.M. Jones 2012-10-07 06:51:13 EDT
A real serial port also loses data when you unplug it.
Comment 7 Cole Robinson 2013-07-02 11:09:39 EDT
Gal, please see Lukas' questions in Comment #6: basically, is this expected behavior or a bug?
Comment 8 Cole Robinson 2013-07-02 11:10:14 EDT
*** Bug 863754 has been marked as a duplicate of this bug. ***
Comment 9 Cole Robinson 2013-07-02 11:13:59 EDT
Bug #863754, which I've duped to this, detailed similar issues for host->guest communication, see that bug for more info.
Comment 11 Cole Robinson 2013-07-03 11:06:08 EDT
Summarizing private comment: This is expected behavior of a serial channel; if you need reliability, you should implement it at the application level.

Closing as NOTABUG, but anyone feel free to reopen if I'm mistaken.
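
One possible shape of application-level reliability is a stop-and-wait scheme with sequence numbers and acknowledgements, sketched below. This is only an illustration under stated assumptions, not something proposed in this bug: `LossyChannel` is a hypothetical in-memory stand-in for the port (its `send` always "succeeds", even when the frame is dropped), and `transfer` runs both endpoints in one loop for brevity.

```python
from collections import deque


class LossyChannel:
    """Hypothetical in-memory channel that silently drops selected frames."""

    def __init__(self, drop=()):
        self.queue = deque()
        self.drop = set(drop)  # indices of send() calls to lose
        self.sent = 0

    def send(self, frame):
        index = self.sent
        self.sent += 1
        if index not in self.drop:
            self.queue.append(frame)
        # Returns normally either way: the sender cannot tell a frame was lost.

    def recv(self):
        return self.queue.popleft() if self.queue else None


def transfer(payloads, data_ch, ack_ch, max_tries=10):
    """Deliver payloads in order, resending each until it is acknowledged."""
    delivered = []
    next_expected = 0
    for seq, payload in enumerate(payloads):
        for _ in range(max_tries):
            data_ch.send((seq, payload))
            # Receiver side: accept only the next in-order frame, ack what arrived.
            frame = data_ch.recv()
            if frame is not None:
                rseq, rpayload = frame
                if rseq == next_expected:
                    delivered.append(rpayload)
                    next_expected += 1
                ack_ch.send(rseq)
            # Sender side: move on only after seeing the matching ack.
            if ack_ch.recv() == seq:
                break
        else:
            raise RuntimeError("no acknowledgement for frame %d" % seq)
    return delivered
```

The key difference from a plain resend-on-error loop is that the sender trusts only the acknowledgement, not the return value of the write, and the receiver discards duplicates by sequence number.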
Comment 12 Fedora End Of Life 2013-07-03 18:32:41 EDT
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.