Bug 858687 - virtio_serialport changes the order of the data guest->host using 32bit Windows XP guest
virtio_serialport changes the order of the data guest->host using 32bit Windo...
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: virtio-win (Show other bugs)
17
x86_64 Linux
unspecified Severity unspecified
: ---
: ---
Assigned To: Gal Hammer
Virtualization Bugs
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2012-09-19 08:27 EDT by Lukas Doktor
Modified: 2013-08-22 05:38 EDT (History)
18 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-08-01 13:04:09 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
Reproducer writen in C (1.04 KB, text/x-c++src)
2012-09-19 09:50 EDT, Lukas Doktor
no flags Details
Reproducer writen in C (8.00 KB, application/x-ms-dos-executable)
2012-09-19 09:51 EDT, Lukas Doktor
no flags Details
Qemu-cmd for reproducer writen in C (1.04 KB, text/x-sh)
2012-09-19 09:52 EDT, Lukas Doktor
no flags Details

  None (edit)
Description Lukas Doktor 2012-09-19 08:27:28 EDT
Description of problem:
When I send data from WinXP guest to Fedora 17 host the order of the messages is not kept. Also some messages are sent multiple times.

Version-Release number of selected component (if applicable):
Guest:
   Windows XP SP3
   virtio-win-0.1-30

Host:
   Fedora 17,
   kernel 3.5.3-1.fc17.x86_64
   qemu-kvm-1.0.1-1.fc17.x86_64


How reproducible:
lots of changed messages and 0.01-0.1% of messages is sent multiple times.

Steps to Reproduce:
1. Boot guest with virtio_serialport
2. Start sending 2 messages ('a', 'b') in a loop
3. Receive the data and watch the differences
  
Actual results:
The message order is not kept. (aaabbababbababbbab)
the number of received 'a' or 'b' is sometimes higher than number of sent 'a' and 'b'.

Expected results:
Receive what I sent in the right order. (abababababababab)
Comment 1 Lukas Doktor 2012-09-19 08:29:57 EDT
Simple reproducer:

[requirements]
* Install python and pywin32 in guest

[steps]
1) boot guest with virtio_serialport

/usr/bin/qemu-kvm -S -name vm1 -nodefaults -chardev
socket,id=hmp_id_hmp1,path=/tmp/monitor-hmp1-20120913-092500-jOJB65N
Y, se rve r,nowait -mon chardev=hmp_id_hmp1,mode=readline -chardev
socket,id=serial_id_20120913-092500-jOJB65NY,path=/tmp/serial-20120
913 -0 925 00-jOJB65NY,server,nowait -device
isa-serial,chardev=serial_id_20120913-092500-jOJB65NY -device
virtio-serial-pci,id=virtio_serial_pci0 -chardev
socket,id=devvc1,path=/tmp/virtio_port-vc1-20120913-092500-jOJB65NY,
se rv er, nowait -device
virtconsole,chardev=devvc1,name=com.redhat.spice.0,id=vc1,bus=virtio
_s er ial _pci0.0 -chardev
socket,id=devvc2,path=/tmp/virtio_port-vc2-20120913-092500-jOJB65NY,
se rv er, nowait -device
virtconsole,chardev=devvc2,name=com.redhat.spice.1,id=vc2,bus=virtio
_s er ial _pci0.0 -chardev
socket,id=seabioslog_id_20120913-092500-jOJB65NY,path=/tmp/seabios-2
01 20 913 -092500-jOJB65NY,server,nowait -device
isa-debugcon,chardev=seabioslog_id_20120913-092500-jOJB65NY,iobase=0
x4 02 -device ich9-usb-uhci1,id=usb1 -drive
file=/tmp/kvm_autotest_root/images/winXP-32.qcow2,index=0,if=ide,cac
he =n one -device
virtio-net-pci,netdev=id8yGdBy,mac=9a:a4:a5:a6:a7:a8,id=idPepC47
-netdev tap,id=id8yGdBy,fd=19 -m 512 -smp
1,cores=1,threads=1,sockets=1 -cpu Penryn -drive
file=/tmp/kvm_autotest_root/isos/windows/winutils.iso,media=cdrom,in
de x= 1 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -vnc :0
-vga std -rtc base=localtime,clock=host,driftfix=none -boot
order=cdn,once=c,menu=off -enable-kvm

2) open the port on host:
socat /tmp/virtio_port-vc1-20120913-092500-jOJB65NY -

3) run python:
C:\Python\python.exe

4) in python open the port and send data in loop:

from ctypes import *
from win32file import *
port = CreateFile("\\\\.\\com.redhat.spice.0", GENERIC_WRITE | GENERIC_READ, 0,
None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)

for _ in range(512):
    WriteFile(port, 'a')
    WriteFile(port, 'b')

5) verify the output on guest. (you might want to use python or hexdump to analyze the output.
Comment 2 Lukas Doktor 2012-09-19 08:35:19 EDT
Keep in mind that some of the data are corrupted - see bug: https://bugzilla.redhat.com/show_bug.cgi?id=858674 for details.

This issue is replicable using smp=1 and smp=2 (using smp=2 leads to more frequent corruptions).

I reproduced this using host chardev unix socket (path=/tmp/port0) and posix (host=0.0.0.0).

time.sleep() > 0.001 between WriteFile solves this issue. When I use time.sleep() < 0.001 (included 0 sleep) doesn't help.

for _ in range(512):
    WriteFile(port, 'a')
    time.sleep(sleep_time)
    WriteFile(port, 'b')
    time.sleep(sleep_time)

I checked the outputs and non of the WriteFile failed.
Comment 3 Lukas Doktor 2012-09-19 08:39:30 EDT
About the multiple messages, the script above (for _ in range(512)) should produce 512*'a' and 512*'b'. It results usually around 485-535*'a' and 485-535*'b' with total length over 1024 (1030-1050).
S ome of those characers are corrupted so the total number of received items might be significatnly over the number of the sent characters.
Comment 4 Lukas Doktor 2012-09-19 09:36:22 EDT
I forgot to add another logged message. Sometimes when I use CreateFile() on the vport (port = CreateFile("\\\\.\\com.redhat.spice.0", GENERIC_WRITE | GENERIC_READ, 0, None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)) I got error messages from qemu.

Log from autotest virtio_console test running 4 virtio_serialports only:


# This CreateFile() and DeleteHandle() on all of the listed ports.
09/19 15:25:24 DEBUG|kvm_virtio:0286| Executing 'virt.init([['com.redhat.spice.0', 'no'], ['com.redhat.spice.1', 'no'], ['com.redhat.spice.2', 'no'], ['com.redhat.spice.3', 'no']])' on virtio_console_guest.py, vm: vm1, timeout: 10

# And is followed with err messages on host. There are no failures on guest.
09/19 15:25:24 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0
09/19 15:25:24 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3510448129 for device virtio_serial_pci0.0
09/19 15:25:24 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3510448129 for device virtio_serial_pci0.0
09/19 15:25:24 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0
09/19 15:25:24 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4292869288 for device virtio_serial_pci0.0
09/19 15:25:24 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0
09/19 15:25:24 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4292869288 for device virtio_serial_pci0.0

# Once again, open the port.
09/19 15:25:24 DEBUG|kvm_virtio:0286| Executing 'virt.open('com.redhat.spice.0')' on virtio_console_guest.py, vm: vm1, timeout: 10

# Followed with error message on host only.
09/19 15:25:24 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0

# Test continues with cleanup. No errors occured in guest, port was opened and was ready for use.
09/19 15:25:24 DEBUG|     error:0082| Context: Executing test: test_open --> Cleaning virtio_ports.
09/19 15:25:24 DEBUG|virtio_con:0174| Cleaning virtio_ports
09/19 15:25:24 DEBUG|kvm_virtio:0286| Executing 'is_alive()' on virtio_console_guest.py, vm: vm1, timeout: 10
09/19 15:25:24 DEBUG|kvm_virtio:0286| Executing 'guest_exit()' on virtio_console_guest.py, vm: vm1, timeout: 10
09/19 15:25:24 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 0 for device virtio_serial_pci0.0
09/19 15:25:24 DEBUG|kvm_virtio:0112| Cleaning port Socket,com.redhat.spice.0,no,/tmp/virtio_port-vs1-20120919-151650-NKn1Htke,0
09/19 15:25:25 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.1,no,/tmp/virtio_port-vs2-20120919-151650-NKn1Htke,0
09/19 15:25:25 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.2,no,/tmp/virtio_port-vs3-20120919-151650-NKn1Htke,0
09/19 15:25:25 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.3,no,/tmp/virtio_port-vs4-20120919-151650-NKn1Htke,0


# Later in the testsuit there is a loopback test:
# Initialization followed with host err messages
09/19 15:26:14 DEBUG|kvm_virtio:0286| Executing 'virt.init([['com.redhat.spice.0', 'no'], ['com.redhat.spice.1', 'no'], ['com.redhat.spice.2', 'no'], ['com.redhat.spice.3', 'no']])' on 
virtio_console_guest.py, vm: vm1, timeout: 10
09/19 15:26:14 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3506438145 for device virtio_serial_pci0.0
09/19 15:26:14 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3506438145 for device virtio_serial_pci0.0
09/19 15:26:14 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3506438145 for device virtio_serial_pci0.0
09/19 15:26:14 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0
09/19 15:26:14 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4292869288 for device virtio_serial_pci0.0

# This creates initiates a loop from first to second port:
09/19 15:26:14 DEBUG|kvm_virtio:0286| Executing 'virt.loopback(['com.redhat.spice.0'], ['com.redhat.spice.1'], 1024, virt.LOOP_NONE)' on virtio_console_guest.py, vm: vm1, timeout: 10

# And is followed with err messages on host (guest works fine)
09/19 15:26:14 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0
09/19 15:26:14 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0

# Some data are send from host to first port and are received in host on the second port successfully

# Cleanup the test
09/19 15:26:14 DEBUG|kvm_virtio:0286| Executing 'virt.exit_threads()' on virtio_console_guest.py, vm: vm1, timeout: 3
09/19 15:26:17 WARNI|kvm_virtio:0322| Workaround the stuck thread on guest
09/19 15:26:17 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4170362024 for device virtio_serial_pci0.0
09/19 15:26:17 INFO |   aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4170362024 for device virtio_serial_pci0.0
09/19 15:26:17 DEBUG|kvm_virtio:0286| Executing 'print 'PASS: nothing'' on virtio_console_guest.py, vm: vm1, timeout: 10
09/19 15:26:18 DEBUG|     error:0082| Context: Executing test: test_basic_loopback --> Cleaning virtio_ports.
09/19 15:26:18 DEBUG|virtio_con:0174| Cleaning virtio_ports
09/19 15:26:18 DEBUG|kvm_virtio:0286| Executing 'is_alive()' on virtio_console_guest.py, vm: vm1, timeout: 10
09/19 15:26:18 DEBUG|kvm_virtio:0286| Executing 'guest_exit()' on virtio_console_guest.py, vm: vm1, timeout: 10
09/19 15:26:18 DEBUG|kvm_virtio:0112| Cleaning port Socket,com.redhat.spice.0,no,/tmp/virtio_port-vs1-20120919-151650-NKn1Htke,0
09/19 15:26:19 DEBUG|kvm_virtio:0112| Cleaning port Socket,com.redhat.spice.1,no,/tmp/virtio_port-vs2-20120919-151650-NKn1Htke,0
09/19 15:26:20 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.2,no,/tmp/virtio_port-vs3-20120919-151650-NKn1Htke,0
09/19 15:26:20 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.3,no,/tmp/virtio_port-vs4-20120919-151650-NKn1Htke,0
Comment 5 Lukas Doktor 2012-09-19 09:50:20 EDT
Created attachment 614397 [details]
Reproducer writen in C

Reproducer by Vadim Rozenfeld <vrozenfe@redhat.com>
Comment 6 Lukas Doktor 2012-09-19 09:51:07 EDT
Created attachment 614398 [details]
Reproducer writen in C

Reproducer by Vadim Rozenfeld <vrozenfe@redhat.com>
Comment 7 Lukas Doktor 2012-09-19 09:52:17 EDT
Created attachment 614400 [details]
Qemu-cmd for reproducer writen in C

Proposed qemu-cmd for reoriducer by Vadim Rozenfeld <vrozenfe@redhat.com>
Comment 8 Lukas Doktor 2012-09-19 09:54:57 EDT
I used the Vadim Rozenfeld's reproducer on my system using modified command (different network adapter, path to qemu and disk and snapshot on:

qemu-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name SERIAL -uuid d9388815-ddd3-c38e-33c2-a9d5fcc7a775 -monitor stdio -rtc base=localtime -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,multifunction=on,addr=0x5.0x0 -drive file='/tmp/kvm_autotest_root/images/winXP-32.qcow2',if=none,id=drive-ide0-0-0,format=qcow2 -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev user,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:3A:40:3F:2F:12,bus=pci.0,multifunction=on,addr=0x3.0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,server,nowait,host=0.0.0.0,port=12345,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -usb -device usb-tablet,id=input0 -enable-kvm -snapshot

And I still got the same amount of corrutions.
Comment 9 Vadim Rozenfeld 2012-09-23 08:33:34 EDT
Hi Lukas,
Could you please try the recent drivers from build-37, available at 
http://download.devel.redhat.com/brewroot/packages/virtio-win-prewhql/0.1/37/win/virtio-win-prewhql-0.1.zip ?

Thank you,
Vadim.
Comment 10 Lukas Doktor 2012-09-24 05:55:06 EDT
Hi Vadim,

very nice. I tried "ab"*4096 without any corruptions and mis-queued characters. Also with this patch I don't have problems with missing ports anymore.

I tried more tests and the queue seems fine, although some characters are corrupted (bz 858674) - https://bugzilla.redhat.com/show_bug.cgi?id=858674
Comment 11 Lukas Doktor 2012-09-24 05:55:26 EDT
1) host-> guest - no errors observed

on host:
I created a file using 'ab'*10240 (so basically lots of 'ababababab') and executed this command:
while :; do cat /tmp/huge-ab-file | sudo socat /tmp/virtio_port-vs1-20120924-093501-0yc6VTGu - ; done

on guest:
I executed python and run a loop which received 2 characters and verified, whether it's 'ab':

from ctypes import *
from win32file import *
import time
port = CreateFile("\\\\.\\com.redhat.spice.0", GENERIC_WRITE | GENERIC_READ, 0, None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)

i = 0
while True:
    if not i % 1024:
        print i
    try:
        if ReadFile(port, 2)[1] != "ab":
            print "ERROR: %s" % time.time()
    except error:
        continue
    i += 1

(first I tried it without 'i' to maximize the stress, then I added it to verify I'm not only getting receive errors. The try-catch is necessarily because when there are no data I get an read error...)

Anyway this worked brilliantly, so the host->guest part works fine.
Comment 12 Lukas Doktor 2012-09-24 05:56:40 EDT
2) guest->host - corruptions keep happening

I made very basic test which only sends 'ab' and use hexdump which ommits the output when the lines are matching:
on host:
I only execute hexdump:
sudo socat /tmp/virtio_port-vs1-20120924-093501-0yc6VTGu - | hexdump

on guest:
Send 'ab' over the first serialport.

from ctypes import *
from win32file import *
port = CreateFile("\\\\.\\com.redhat.spice.0", GENERIC_WRITE | GENERIC_READ, 0, None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)

while True:
    _ = WriteFile(port, 'ab')

sudo socat /tmp/virtio_port-vs1-20120924-093501-0yc6VTGu - | hexdump 
0000000 6261 6261 6261 6261 6261 6261 6261 6261
*
000dbb0 6261 6261 6261 d948 d948 6261 6261 6261
000dbc0 6261 6261 6261 6261 6261 6261 6261 6261
*
00baef0 6261 6261 6261 6261 6261 8618 6261 6261
00baf00 6261 6261 6261 6261 6261 6261 6261 6261
*
00bc7b0 6261 6261 6261 6261 6261 6261 6261 8618
00bc7c0 6261 6261 6261 6261 6261 6261 6261 6261
*
00c9a30 6261 6261 60a8 60a8 6261 6261 6261 6261
00c9a40 6261 6261 6261 6261 6261 6261 6261 6261
*
Comment 13 Lukas Doktor 2012-09-24 05:57:10 EDT
(In reply to comment #12)
> 2) guest->host - corruptions keep happening
> 
> I made very basic test which only sends 'ab' and use hexdump which ommits
> the output when the lines are matching:
> on host:
> I only execute hexdump:
> sudo socat /tmp/virtio_port-vs1-20120924-093501-0yc6VTGu - | hexdump
> 
> on guest:
> Send 'ab' over the first serialport.
> 
> from ctypes import *
> from win32file import *
> port = CreateFile("\\\\.\\com.redhat.spice.0", GENERIC_WRITE | GENERIC_READ,
> 0, None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)
> 
> while True:
>     _ = WriteFile(port, 'ab')
> 
> sudo socat /tmp/virtio_port-vs1-20120924-093501-0yc6VTGu - | hexdump 
> 0000000 6261 6261 6261 6261 6261 6261 6261 6261
> *
> 000dbb0 6261 6261 6261 d948 d948 6261 6261 6261
> 000dbc0 6261 6261 6261 6261 6261 6261 6261 6261
> *
> 00baef0 6261 6261 6261 6261 6261 8618 6261 6261
> 00baf00 6261 6261 6261 6261 6261 6261 6261 6261
> *
> 00bc7b0 6261 6261 6261 6261 6261 6261 6261 8618
> 00bc7c0 6261 6261 6261 6261 6261 6261 6261 6261
> *
> 00c9a30 6261 6261 60a8 60a8 6261 6261 6261 6261
> 00c9a40 6261 6261 6261 6261 6261 6261 6261 6261
> *

I tried stressing cpu without significant error-rate changes. When I started stressing disk using iozone the error rate increased significantly.
Comment 15 Cole Robinson 2013-07-02 11:01:59 EDT
Lukas, is this still a problem with latest fedora and virtio-win-prewhql sources? (it's 0.65 internally)
Comment 16 Fedora End Of Life 2013-07-04 01:49:26 EDT
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter:  Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 17 is end of life. If you 
would still like  to see this bug fixed and are able to reproduce it 
against a later version  of Fedora, you are encouraged  change the 
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 17 Lukas Doktor 2013-07-08 10:46:12 EDT
Hi Cole,

I'm sorry but currently I have no time to test this. Would you please take a look on it yourself? There is a simple reproducer in Comment 1.

Regards,
Lukáš
Comment 18 Fedora End Of Life 2013-08-01 13:04:23 EDT
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
Comment 19 Mike Cao 2013-08-22 05:26:23 EDT
Seems It has been fixed in virtio-win-prwehql-67
Comment 20 Lukas Doktor 2013-08-22 05:38:52 EDT
Yes, the situation in virtio-win-prewehql-67 improved a lot. I haven't encounter data corruption while running the test for 1 our. So I think this bug status should be NEXTRELEASE. (please correct me if I'm wrong)

Note You need to log in before you can comment on or make changes to this bug.