Bug 863753 - virtio_serialport data loss when hot-unplugging and re-plugging the port (guest->host and host->guest)
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: virtio-win
Version: 17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Gal Hammer
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 863754
Depends On:
Blocks:
Reported: 2012-10-07 08:16 UTC by Lukáš Doktor
Modified: 2013-07-03 22:32 UTC (History)
CC List: 16 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-03 15:06:08 UTC
Type: Bug
Embargoed:


Attachments
guest sender script (sends A, B, C, D..., resends when send fails) (648 bytes, text/x-python)
2012-10-07 08:18 UTC, Lukáš Doktor
host receiver script (reads the port and verifies A, B, C, D, ... is received correctly. Reopens the port when read fails) (734 bytes, text/x-python)
2012-10-07 08:19 UTC, Lukáš Doktor

Description Lukáš Doktor 2012-10-07 08:16:41 UTC
Description of problem:
Hi guys,

I'm developing an interrupted loopback test. I managed to work around the problem with hot-plugging of the incorrectly uninitialized port ( https://bugzilla.redhat.com/show_bug.cgi?id=796048 ), but some data is lost between port replugs even though the send/recv commands passed (I'm resending data in case a send fails).

I created simple reproducers in Python; this one is for guest->host (data loss).
I'm sending data from the guest to the host, then I unplug the port, replug it and continue sending. Some of the successfully sent data is missing on the other side.

Version-Release number of selected component (if applicable):
HOST:
kernel-3.5.3-1.fc17.x86_64
qemu-kvm-1.0.1-1.fc17.x86_64
GUEST:
kernel-3.5.4-2.fc17.x86_64

How reproducible:
10-20% (see the log for details)

Steps to Reproduce:
1) start sending data from guest (run sender.py on guest)
2) receive data on host (run listener.py on host)
3) unplug/replug the port (eg. MON=/tmp/monitor-hmp1-20121004-115412-sLj47KEF ; while :; do echo device_del vs1 | sudo socat $MON - ; sleep 5 ; echo 'device_add virtserialport,id=vs1,chardev=devvs1,nr=1,name=com.redhat.spice.0' | sudo socat $MON - ; sleep 5 ; done )
4) see the error messages
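
The unplug/replug loop from step 3 could also be driven from Python instead of socat. The following is only a sketch: `hmp_command` and `replug_cycle` are hypothetical helper names, the monitor path and device properties are copied from step 3, and it assumes the HMP monitor is reachable on that UNIX socket with sufficient permissions (the shell version uses sudo).

```python
import socket
import time

# Values copied from the shell loop in step 3.
MON = "/tmp/monitor-hmp1-20121004-115412-sLj47KEF"
DEVICE_DEL = "device_del vs1"
DEVICE_ADD = ("device_add virtserialport,id=vs1,chardev=devvs1,"
              "nr=1,name=com.redhat.spice.0")


def hmp_command(path, command, settle=0.2):
    """Send one HMP command line to the monitor socket (hypothetical helper)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(command.encode("ascii") + b"\n")
        time.sleep(settle)  # give the monitor a moment before disconnecting


def replug_cycle(path=MON, delay=5, send=hmp_command, sleep=time.sleep):
    """One unplug/replug iteration, mirroring the body of the shell loop."""
    send(path, DEVICE_DEL)
    sleep(delay)
    send(path, DEVICE_ADD)
    sleep(delay)
```

Calling `replug_cycle()` in a `while True:` loop would correspond to the 5 s variant; `delay=1` or `delay=0` would correspond to the faster replug runs.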

  
Actual results:
Error messages reporting how much data was successfully sent from the guest but not received on the host.

Expected results:
send/recv should report a failure for all data that was not transferred, so no data loss should be visible with the simple reproducer.

Comment 1 Lukáš Doktor 2012-10-07 08:18:27 UTC
Created attachment 622915
guest sender script (sends A, B, C, D..., resends when send fails)
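
The attachment itself is not reproduced in this report. As a rough illustration only, a sender of this kind might look like the sketch below. `send_sequence` and `open_port` are hypothetical names, and the A..Z, 0..9 cycle is inferred from the logs in Comment 3. Note that the sketch treats a successful `write()` as "sent", which is exactly the assumption this bug questions.

```python
import itertools
import os
import string
import time

# Cycle inferred from the logs: A..Z followed by 0..9, then A again.
ALPHABET = string.ascii_uppercase + string.digits


def send_sequence(open_port, count, retry_delay=0.1):
    """Write `count` characters of the cycle, resending whenever a write fails.

    `open_port` is a caller-supplied function returning a writable file
    descriptor; it may raise OSError while the port is unplugged.
    """
    fd = None
    try:
        for char in itertools.islice(itertools.cycle(ALPHABET), count):
            data = char.encode("ascii")
            while True:
                try:
                    if fd is None:
                        fd = open_port()
                    if os.write(fd, data) == len(data):
                        break  # the write was reported as successful
                except OSError:
                    # Port missing or unplugged: drop the descriptor and retry.
                    if fd is not None:
                        os.close(fd)
                        fd = None
                    time.sleep(retry_delay)
    finally:
        if fd is not None:
            os.close(fd)
```

On a Linux guest, `open_port` would typically open the port's device node, for example `lambda: os.open("/dev/virtio-ports/com.redhat.spice.0", os.O_WRONLY)`; that path is an assumption based on the port name used in this report.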

Comment 2 Lukáš Doktor 2012-10-07 08:19:53 UTC
Created attachment 622916
host receiver script (reads the port and verifies A, B, C, D, ... is received correctly. Reopens the port when read fails)
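
The attachment itself is not reproduced in this report. The verification half of such a receiver could be sketched as follows. `find_gaps` is a hypothetical name, the A..Z, 0..9 cycle is inferred from the logs in Comment 3, and reading from and reopening the port is left out.

```python
import string

# Cycle inferred from the logs: A..Z followed by 0..9, then A again.
ALPHABET = string.ascii_uppercase + string.digits


def find_gaps(received, expected="A"):
    """Return (gaps, next_expected) for a received character stream.

    Each gap is the run of characters that should have arrived before the
    character that actually did. A loss of an exact multiple of
    len(ALPHABET) characters cannot be detected this way.
    """
    gaps = []
    pos = ALPHABET.index(expected)
    for char in received:
        got = ALPHABET.index(char)
        if got != pos:
            # Everything from `pos` up to (not including) `got` never arrived.
            missing = (got - pos) % len(ALPHABET)
            gaps.append("".join(ALPHABET[(pos + i) % len(ALPHABET)]
                                for i in range(missing)))
        pos = (got + 1) % len(ALPHABET)
    return gaps, ALPHABET[pos]
```

For example, `find_gaps("ABCGH")` reports the single gap `"DEF"` and then expects `"I"` next.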

Comment 3 Lukáš Doktor 2012-10-07 08:22:34 UTC
[5s between replug]
skipped: 3456789ABCDEFGHIJK (waiting for L)
skipped: EFGHIJKLMNOPQRSTUV (waiting for W)
skipped: STUVWXYZ0123456789 (waiting for A)



[1s between replug]
skipped: LMNOPQRSTUVWXYZ0123 (waiting for 4)
skipped: UVWXYZ0123456789ABCDEFGHIJKLMNOPQR (waiting for S)
skipped: YZ0123456789ABCDEFG (waiting for H)
skipped: CDEFGHIJKLMNOPQRSTUV (waiting for W)
skipped: OPQRSTUVWX (waiting for Y)



[immediate replug]
skipped: JKLMNOPQRSTUVWXY (waiting for Z)
skipped: STUVWXYZ01234567 (waiting for 8)
skipped: UVWXYZ0123456789 (waiting for A)
skipped: WXYZ0123456789ABCDEF (waiting for G)
skipped: 3456789ABCDEFGHIJKLMNOPQRSTUVW (waiting for X)
skipped: QRSTUVWXYZ01234 (waiting for 5)
skipped: NOPQRSTUVWXYZ012 (waiting for 3)
skipped: YZ0123456789ABCD (waiting for E)
skipped: GHIJKLMNOPQRSTUVWXYZ0123456789A (waiting for B)
skipped: HIJKLMNOPQRSTUVWXYZ (waiting for 0)
skipped: 789ABCDEFGHIJKLMN (waiting for O)
skipped: KLMNOPQRSTUVWX (waiting for Y) 
skipped: JKLMNOPQRSTUVWXYZ (waiting for 0)


Not every reconnect fails:
5s sleep - 1 out of 10 failed
2s sleep - 3 out of 10 failed
1s sleep - 5 out of 10 failed

This output was generated using 1-character buffers. With longer buffers, fewer of them are lost. With buffers over 10 characters long, the loss of only 1 packet was observed (and not always).

Comment 4 Lukáš Doktor 2012-10-07 08:43:19 UTC
Sorry, I forgot to add qemu-cmdline. It was generated by autotest:

/usr/bin/qemu-kvm -S -name 'vm1' -nodefaults -chardev socket,id=hmp_id_hmp1,path=/tmp/monitor-hmp1-20121004-115412-sLj47KEF,server,nowait -mon chardev=hmp_id_hmp1,mode=readline -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20121004-115412-sLj47KEF,server,nowait -device isa-serial,chardev=serial_id_serial1 -device virtio-serial-pci,id=virtio_serial_pci0 -chardev socket,id=devvs1,path=/tmp/virtio_port-vs1-20121004-115412-sLj47KEF,server,nowait -device virtserialport,chardev=devvs1,name=com.redhat.spice.0,id=vs1,bus=virtio_serial_pci0.0 -chardev socket,id=devvs2,path=/tmp/virtio_port-vs2-20121004-115412-sLj47KEF,server,nowait -device virtserialport,chardev=devvs2,name=com.redhat.spice.1,id=vs2,bus=virtio_serial_pci0.0 -chardev socket,id=devvs3,path=/tmp/virtio_port-vs3-20121004-115412-sLj47KEF,server,nowait -device virtserialport,chardev=devvs3,name=com.redhat.spice.2,id=vs3,bus=virtio_serial_pci0.0 -chardev socket,id=devvs4,path=/tmp/virtio_port-vs4-20121004-115412-sLj47KEF,server,nowait -device virtserialport,chardev=devvs4,name=com.redhat.spice.3,id=vs4,bus=virtio_serial_pci0.0 -chardev socket,id=seabioslog_id_20121004-115412-sLj47KEF,path=/tmp/seabios-20121004-115412-sLj47KEF,server,nowait -device isa-debugcon,chardev=seabioslog_id_20121004-115412-sLj47KEF,iobase=0x402 -device ich9-usb-uhci1,id=usb1 -drive file='/tmp/kvm_autotest_root/images/f17-64.qcow2',index=0,if=ide,cache=none,snapshot=on -device virtio-net-pci,netdev=idbjSa34,mac='9a:13:14:15:16:17',id='idCjnNs4' -netdev tap,id=idbjSa34,fd=21 -m 512 -smp 1,cores=1,threads=1,sockets=1 -cpu 'Penryn' -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -vnc :0 -vga std -rtc base=utc,clock=host,driftfix=none -boot order=cdn,once=c,menu=off -enable-kvm

Comment 5 Richard W.M. Jones 2012-10-07 10:51:13 UTC
A real serial port also loses data when you unplug it.

Comment 7 Cole Robinson 2013-07-02 15:09:39 UTC
Gal, please see Lukas' questions in Comment #6. Basically, is this expected behavior or a bug?

Comment 8 Cole Robinson 2013-07-02 15:10:14 UTC
*** Bug 863754 has been marked as a duplicate of this bug. ***

Comment 9 Cole Robinson 2013-07-02 15:13:59 UTC
Bug #863754, which I've duped to this, detailed similar issues for host->guest communication; see that bug for more info.

Comment 11 Cole Robinson 2013-07-03 15:06:08 UTC
Summarizing private comment: This is expected behavior of a serial channel; if you need reliability, you should implement it at the application level.

Closing as NOTABUG, but anyone feel free to reopen if I'm mistaken.
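
To illustrate what application-level reliability could mean here, the sketch below uses sequence numbers and acknowledgements (a stop-and-wait scheme) over a toy channel that silently drops messages. Everything in it is hypothetical: `LossyChannel` stands in for a port that is being replugged, and a real implementation would need timeouts and a return path over the port instead of direct function calls.

```python
import random


class LossyChannel:
    """Toy one-way channel that silently drops some messages."""

    def __init__(self, drop_rate, seed=0):
        self.rng = random.Random(seed)
        self.drop_rate = drop_rate

    def deliver(self, message):
        # None models a message that was "sent successfully" but never arrived.
        return None if self.rng.random() < self.drop_rate else message


class Receiver:
    """Accepts frames in order and acknowledges the last in-order frame."""

    def __init__(self):
        self.expected = 0
        self.data = []

    def on_frame(self, frame):
        seq, payload = frame
        if seq == self.expected:
            self.data.append(payload)
            self.expected += 1
        # Duplicates are acknowledged again but not stored twice.
        return self.expected - 1


def send_reliably(payloads, forward, backward, receiver, max_tries=1000):
    """Resend each frame until the peer's acknowledgement for it comes back."""
    for seq, payload in enumerate(payloads):
        for _ in range(max_tries):
            frame = forward.deliver((seq, payload))
            if frame is None:
                continue  # frame lost in transit: resend
            ack = backward.deliver(receiver.on_frame(frame))
            if ack == seq:
                break  # only an acknowledgement counts as delivered
        else:
            raise RuntimeError("gave up on frame %d" % seq)
```

The design point is that the sender never trusts the return value of its own write; data counts as transferred only once the other side has confirmed it.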

Comment 12 Fedora End Of Life 2013-07-03 22:32:41 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

