Bug 858674
Summary: virtio_serialport corrupts the data guest->host using 32bit Windows XP guest
Product: [Community] Virtualization Tools
Component: virtio-win
Status: CLOSED WORKSFORME
Reporter: Lukáš Doktor <ldoktor>
Assignee: Gal Hammer <ghammer>
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Priority: unspecified
Version: unspecified
CC: amit.shah, berrange, cfergeau, chayang, crobinso, dwmw2, ghammer, itamar, juzhang, knoel, ldoktor, mdeng, michen, pbonzini, qzhang, rjones, scottt.tw, virt-maint, vrozenfe, yvugenfi
Keywords: Reopened
Hardware: x86_64
OS: Linux
Fixed In Version: virtio-win-prewhql-0.1-67
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-07-20 11:53:43 UTC
Description
Lukáš Doktor
2012-09-19 11:58:10 UTC
Simple reproducer:

[requirements]
* Install python and pywin32 in guest

[steps]
1) boot guest with virtio_serialport

/usr/bin/qemu-kvm \
    -S \
    -name vm1 \
    -nodefaults \
    -chardev socket,id=hmp_id_hmp1,path=/tmp/monitor-hmp1-20120913-092500-jOJB65NY,server,nowait \
    -mon chardev=hmp_id_hmp1,mode=readline \
    -chardev socket,id=serial_id_20120913-092500-jOJB65NY,path=/tmp/serial-20120913-092500-jOJB65NY,server,nowait \
    -device isa-serial,chardev=serial_id_20120913-092500-jOJB65NY \
    -device virtio-serial-pci,id=virtio_serial_pci0 \
    -chardev socket,id=devvc1,path=/tmp/virtio_port-vc1-20120913-092500-jOJB65NY,server,nowait \
    -device virtconsole,chardev=devvc1,name=com.redhat.spice.0,id=vc1,bus=virtio_serial_pci0.0 \
    -chardev socket,id=devvc2,path=/tmp/virtio_port-vc2-20120913-092500-jOJB65NY,server,nowait \
    -device virtconsole,chardev=devvc2,name=com.redhat.spice.1,id=vc2,bus=virtio_serial_pci0.0 \
    -chardev socket,id=seabioslog_id_20120913-092500-jOJB65NY,path=/tmp/seabios-20120913-092500-jOJB65NY,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20120913-092500-jOJB65NY,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1 \
    -drive file=/tmp/kvm_autotest_root/images/winXP-32.qcow2,index=0,if=ide,cache=none \
    -device virtio-net-pci,netdev=id8yGdBy,mac=9a:a4:a5:a6:a7:a8,id=idPepC47 \
    -netdev tap,id=id8yGdBy,fd=19 \
    -m 512 \
    -smp 1,cores=1,threads=1,sockets=1 \
    -cpu Penryn \
    -drive file=/tmp/kvm_autotest_root/isos/windows/winutils.iso,media=cdrom,index=1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -vga std \
    -rtc base=localtime,clock=host,driftfix=none \
    -boot order=cdn,once=c,menu=off \
    -enable-kvm

2) open the port on host:

    socat /tmp/virtio_port-vc1-20120913-092500-jOJB65NY -

3) run python:

    C:\Python\python.exe

4) in python open the port and send data in a loop:

    from ctypes import *
    from win32file import *
    port = CreateFile("\\\\.\\com.redhat.spice.0", GENERIC_WRITE | GENERIC_READ, 0, None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)
    for _ in range(1024):
        WriteFile(port, 'a')

5) verify that the output on the host contains characters other than 'a'.

The corrupted value stays the same over time and changes only when the guest reconnects the port (close the port, open it again and send data). The issue is reproducible using smp=1 and smp=2 (smp=2 leads to more frequent corruptions).

The corrupted character doesn't have anything to do with the input value. I tried random input, constant input and pseudoconstant input, and the corruptions always kept the same value between port reconnections. I reproduced this using host chardev unix socket (path=/tmp/port0) and posix (host=0.0.0.0).

I forgot to add another logged message. Sometimes when I use CreateFile() on the vport (port = CreateFile("\\\\.\\com.redhat.spice.0", GENERIC_WRITE | GENERIC_READ, 0, None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)) I get error messages from qemu. Log from the autotest virtio_console test running 4 virtio_serialports only:

# This CreateFile() and DeleteHandle() on all of the listed ports.
09/19 15:25:24 DEBUG|kvm_virtio:0286| Executing 'virt.init([['com.redhat.spice.0', 'no'], ['com.redhat.spice.1', 'no'], ['com.redhat.spice.2', 'no'], ['com.redhat.spice.3', 'no']])' on virtio_console_guest.py, vm: vm1, timeout: 10
# And is followed with err messages on host. There are no failures on guest.
09/19 15:25:24 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0 09/19 15:25:24 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3510448129 for device virtio_serial_pci0.0 09/19 15:25:24 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3510448129 for device virtio_serial_pci0.0 09/19 15:25:24 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0 09/19 15:25:24 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4292869288 for device virtio_serial_pci0.0 09/19 15:25:24 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0 09/19 15:25:24 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4292869288 for device virtio_serial_pci0.0 # Once again, open the port. 09/19 15:25:24 DEBUG|kvm_virtio:0286| Executing 'virt.open('com.redhat.spice.0')' on virtio_console_guest.py, vm: vm1, timeout: 10 # Followed with error message on host only. 09/19 15:25:24 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0 # Test continues with cleanup. No errors occured in guest, port was opened and was ready for use. 09/19 15:25:24 DEBUG| error:0082| Context: Executing test: test_open --> Cleaning virtio_ports. 
09/19 15:25:24 DEBUG|virtio_con:0174| Cleaning virtio_ports 09/19 15:25:24 DEBUG|kvm_virtio:0286| Executing 'is_alive()' on virtio_console_guest.py, vm: vm1, timeout: 10 09/19 15:25:24 DEBUG|kvm_virtio:0286| Executing 'guest_exit()' on virtio_console_guest.py, vm: vm1, timeout: 10 09/19 15:25:24 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 0 for device virtio_serial_pci0.0 09/19 15:25:24 DEBUG|kvm_virtio:0112| Cleaning port Socket,com.redhat.spice.0,no,/tmp/virtio_port-vs1-20120919-151650-NKn1Htke,0 09/19 15:25:25 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.1,no,/tmp/virtio_port-vs2-20120919-151650-NKn1Htke,0 09/19 15:25:25 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.2,no,/tmp/virtio_port-vs3-20120919-151650-NKn1Htke,0 09/19 15:25:25 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.3,no,/tmp/virtio_port-vs4-20120919-151650-NKn1Htke,0 # Later in the testsuit there is a loopback test: # Initialization followed with host err messages 09/19 15:26:14 DEBUG|kvm_virtio:0286| Executing 'virt.init([['com.redhat.spice.0', 'no'], ['com.redhat.spice.1', 'no'], ['com.redhat.spice.2', 'no'], ['com.redhat.spice.3', 'no']])' on virtio_console_guest.py, vm: vm1, timeout: 10 09/19 15:26:14 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3506438145 for device virtio_serial_pci0.0 09/19 15:26:14 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3506438145 for device virtio_serial_pci0.0 09/19 15:26:14 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 3506438145 for device virtio_serial_pci0.0 09/19 15:26:14 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0 09/19 15:26:14 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4292869288 for device virtio_serial_pci0.0 # This creates 
initiates a loop from first to second port: 09/19 15:26:14 DEBUG|kvm_virtio:0286| Executing 'virt.loopback(['com.redhat.spice.0'], ['com.redhat.spice.1'], 1024, virt.LOOP_NONE)' on virtio_console_guest.py, vm: vm1, timeout: 10 # And is followed with err messages on host (guest works fine) 09/19 15:26:14 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0 09/19 15:26:14 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0 # Some data are send from host to first port and are received in host on the second port successfully # Cleanup the test 09/19 15:26:14 DEBUG|kvm_virtio:0286| Executing 'virt.exit_threads()' on virtio_console_guest.py, vm: vm1, timeout: 3 09/19 15:26:17 WARNI|kvm_virtio:0322| Workaround the stuck thread on guest 09/19 15:26:17 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4170362024 for device virtio_serial_pci0.0 09/19 15:26:17 INFO | aexpect:0786| [qemu output] qemu-kvm: virtio-serial-bus: Unexpected port id 4170362024 for device virtio_serial_pci0.0 09/19 15:26:17 DEBUG|kvm_virtio:0286| Executing 'print 'PASS: nothing'' on virtio_console_guest.py, vm: vm1, timeout: 10 09/19 15:26:18 DEBUG| error:0082| Context: Executing test: test_basic_loopback --> Cleaning virtio_ports. 
09/19 15:26:18 DEBUG|virtio_con:0174| Cleaning virtio_ports 09/19 15:26:18 DEBUG|kvm_virtio:0286| Executing 'is_alive()' on virtio_console_guest.py, vm: vm1, timeout: 10 09/19 15:26:18 DEBUG|kvm_virtio:0286| Executing 'guest_exit()' on virtio_console_guest.py, vm: vm1, timeout: 10 09/19 15:26:18 DEBUG|kvm_virtio:0112| Cleaning port Socket,com.redhat.spice.0,no,/tmp/virtio_port-vs1-20120919-151650-NKn1Htke,0 09/19 15:26:19 DEBUG|kvm_virtio:0112| Cleaning port Socket,com.redhat.spice.1,no,/tmp/virtio_port-vs2-20120919-151650-NKn1Htke,0 09/19 15:26:20 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.2,no,/tmp/virtio_port-vs3-20120919-151650-NKn1Htke,0 09/19 15:26:20 DEBUG|kvm_virtio:0110| No need to clean port Socket,com.redhat.spice.3,no,/tmp/virtio_port-vs4-20120919-151650-NKn1Htke,0 I tried this with new http://download.devel.redhat.com/brewroot/packages/virtio-win-prewhql/0.1/37/win/virtio-win-prewhql-0.1.zip drivers. The number of corruptions is lower, but the problem still persists: I made very basic test which only sends the same character and use hexdump which ommits the output when the lines are matching: on host: I only execute hexdump: sudo socat /tmp/virtio_port-vs1-20120924-093501-0yc6VTGu - | hexdump on guest: I write the same charater to the port: from ctypes import * from win32file import * port = CreateFile("\\\\.\\com.redhat.spice.0", GENERIC_WRITE | GENERIC_READ, 0, None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None) while True: _ = WriteFile(port, 'a') After few seconds there were corruptions. 
You can see it in the output:

sudo socat /tmp/virtio_port-vs1-20120924-093501-0yc6VTGu - | hexdump
0000000 6161 6161 6161 6161 6161 6161 6161 6161
*
00093a0 6161 6161 6161 6161 6161 6161 f061 6161
00093b0 6161 6161 6161 6161 6161 6161 6161 6161
*
00266b0 6161 6161 6161 6161 6161 6161 6161 f061
00266c0 6161 6161 6161 6161 6161 6161 6161 6161
*
006c930 6161 6161 6161 6161 6161 6161 6118 6161
006c940 6161 6161 6161 6161 6161 6161 6161 6161
*
006f0e0 6161 6118 6161 6161 6161 6161 6161 6161
006f0f0 6161 6161 6161 6161 6161 6161 6161 6161
*
00b8190 6161 6161 6161 6161 6161 6161 6161 61f0
00b81a0 6161 6161 6161 6161 6161 6161 6161 6161
*
00e0870 6161 f061 6161 6161 6161 6161 6161 6161
00e0880 6161 6161 6161 6161 6161 6161 6161 6161
*
^C

I tried stressing the CPU without any significant change in the error rate. When I started stressing the disk using iozone, the error rate increased significantly. A similar test from host to guest works without errors.

I tried the autotest loopback test and it failed. I used a simple variant with 2 serialports, vs1 and vs2. The test sends random data host->guest over vs1; the guest reads the data and resends it over vs2. The host then compares the received data against the queue of sent data. With this setup using 16-character buffers (for simplicity I set all send/recv buffers to the same value) it failed on the 7664th character using an SMP guest and on the 13104th character using a UP guest. With the same setup, a 1-character buffer and a UP guest, the 1007th character failed.

I tried the same setup with host->guest buffer 1 and guest->host buffer 1024 (this way the guest is not stressed that much and the host doesn't have time to send too much data). This way it's easier to analyze the failure.
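The host-side comparison described above (a queue of bytes already sent versus the bytes read back from the guest) can be sketched roughly as follows. This is only an illustration, not the autotest implementation: the class name `LoopbackChecker` and its methods are hypothetical, and the sketch assumes the guest echoes the data unchanged and in order.

```python
from collections import deque
from typing import Optional


class LoopbackChecker:
    """Hypothetical helper: tracks bytes sent to the guest and verifies the echo."""

    def __init__(self) -> None:
        self._pending = deque()  # bytes sent but not yet received back
        self.verified = 0        # number of bytes confirmed so far

    def sent(self, data: bytes) -> None:
        # Remember what was written to the first port (e.g. vs1).
        self._pending.extend(data)

    def received(self, data: bytes) -> Optional[int]:
        """Check data read from the second port (e.g. vs2).

        Returns the zero-based stream offset of the first bad byte,
        or None when every byte matched the queue.
        """
        for byte in data:
            if not self._pending:
                return self.verified  # guest returned more than was sent
            expected = self._pending.popleft()
            if byte != expected:
                return self.verified
            self.verified += 1
        return None


checker = LoopbackChecker()
checker.sent(b"abcdef")
print(checker.received(b"abc"))  # all three bytes match
print(checker.received(b"dXf"))  # 'X' arrives where 'e' was expected
```

Reporting the offset of the first mismatch corresponds to the "Nth character failed" figures quoted in the comment; after a mismatch the queue and the stream are no longer aligned, so a real test would stop there.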
Queue contains only not yet received character sent from host: Queue: '\x9b\xae\x8e\xcb\xcc\xbb\x1c\xe9\xa1\xef\xcc\xe68\xed\xc3\xba}\xc2\x047\xa9V\xb2\xfdj\x12b4\xe1\x8e\x00\xaa\r\xe7\xb1\x15,]/$R\xdcf\x02\xb5\x84\xc2\xd3\xdah\xbb\x10m\\*T\xe8Dk\xa3{n\xeb\x96R\x05\xea#\xcc0\xda)R\x86\'h\x00\x83\x16\xf4\xa5\xc9b\xe1\x1a\xa9\x14u(l;\xc9e\x1a+\x83\xbc\xee\xfa4~\x9ap\x85q5sr\xaa\xb7\x83T\xfc\xe8~\xdc\xf36\xb6\x8d)\xfaI\xe5sG\xa0\xc0\xab\xd8\xf9\xee\xfc\xee\xc2"[S.\x1687\xc8\xe9\x84\xadN\xa4|\xae\xe4\xa3V,Rm%mM$\x13\xa1\xf4i\xb1Z&\t\xec\xd6\x85\xf5\xb0\xa2\xed\x14i\x9c\x83\xc1(\x99[r\xe2Z\xa6\xa1{V;\x8b\xfa\x9c\x98\x84\xc4\x04\t)\xf5K\xe3\xcb\xf1\xd7W\xa1\x1a\xb8x8\xfdj\xfa}\xc6\x0bc\xf7[\x9eP\x07\xf5}KB\x10b\x05\x0eUf\x15\xf7A~\xc6\xae%<g\xe3\xc4\xcc\x0e#,W"\xa5\xa6\x1b|\xd4d\xa6v\x18\xd2[\xccC\xaef\xf4\x0b\xbe(\x8f\xdf=JQ\x8d\xcf\xc8\x8et\xea]\xb7\xa8\x9d\x88P\xde\xc3\xee\x0c\xdf\xc8\xd1\xdb\x8b\x17d\xd1\xdb-\xab\x037~)k\xf6\x96Icj\xda\x8dGXBP9\xc1\x93s#j\xf1K\x18t\xe6]x"Mw\x08!\x0c\xe2\xf6\n~>2GL\x0c\xd4\x8a]o\xf8\xcc\x90:\x96T\xdd\x14\x9b\xda\xc9\x12\xe7\x1fo;\xec\x9f/\xb1\xc1\x9d$\xef\xed+\x83#\xfa\x8e\xb6\x8a\x95~\x0c\x9a\x99\x82W\x9db\x03c\x9c?,\xc1\xa9\xacV\xfb\xb0m\xbf\x15\x03G5\x92\xd8i\xb4\xd1\x08;\x0e}[lj\xfb\xd1g\xa3\xa8.\x88A`&\x80M\xe34\xe5\x14\xe6}\x1fEW\xec\x19\x18M\x85\xabv\x03r\xc6\x11\x9c\x0f\xeb\xb0\x07&\x9c\x96\x9e\\\xd1pm\xb4Q\xac\x82d>9\x9b\xe0\xa2\x96\xfe|\x9f3Je\x91\x00\xe7\x06\x9eLI\x9b\xbb:\x9b\x08,\x02\xc7\xe4\xa9\x90\xab\xf6V\xe2\x19GQ\xca\x01\xe9{\xa3\xd3\x930m\x12\x05I-\x89jGi\xf6\x8b\xc3\x8d\x8f#\x13\xbdm5\xe2\x96\xd5#\x06\xbe\xee\x1e`\xe9\x06\xd0Xb.4\x05\xf1I\xc4\xf0{S\x9cR\xb7L\x83ALTOAN77\xf3\xea(\xc24\xabQ\xfb\xd5\xd4\x980\x100Q\x97\'+\x10\xbfH\xcf\xee\xfd\xe0a\xf4\xcey\xda1\x9e\x86\xc4gU\xce:82H\xd9$I\x8a\xb7S\x888\x17\xf5\xbc3-\xb2\xf0\xc0' Received from guest: 
'\xc0GD\x82\xcb\xcc\xbb\x1c\xe9\xa1\xea#\xcc\x00\x96R\x05\xea#\xcc\x00\x96R\x05\xea#\xcc\x00\x96R\xea#\xcc\x00\x96R\x05,\x88\xbdI\x82\xdcf\x02\xb5\x84\xc2\xd3\xdah\xbb\x10m\\*T\xe8Dk\xea#\xcc\x00\x96R\x05\xea#\xcc'

You can see that the first 3 characters are corrupted. The 10th character is corrupted too, and so on. The guest sometimes becomes unresponsive.

Hi Lukas,
The latest virtio-win RHEL (and Fedora) serial drivers include many bug fixes, including data corruption fixes. Can you please retest with them?
Thanks, Ronen.

Hi Ronen,
well, the situation is much better. I executed the loop overnight with these results:

host: RHEL6-nightly, 2 cores, 5G ram
guest: WinXP.sp3, vio drivers 58, 1GB mem, 4 serialports (vs1-vs4), 4 consoles (vc1-vc4)
Guests were destroyed and created between tests.

cases: send data from host via $port1 using $port1_buf send length to guest. Guest reads the data per $buf_len and resends via $portXs. Host receives the data on $portXs using $portX_buf recv length. Keep transferring data for 1 hour and then stop the sender, recv and loopback threads.

$case_name = $port1@$port1_buf => $port2@$port2_buf $port3@$port3_buf ... $buf_len
serialport_small = vs1@4 => vs2@2 vs3@4 vs4@8 8
serialport_big = vs1@16384 => vs2@2048 vs3@4096 vs4@8192 8192
console_small = vc1@4 => vc2@2 vc3@4 vc4@8 8
console_big = vc1@16384 => vc2@2048 vc3@4096 vc4@8192 8192
mixed_small = vs1@4 => vc1@2 vs2@4 vc2@8 8
mixed_big = vc1@16384 => vs1@2048 vc2@4096 vs2@8192 8192

VIO setup: spread$num = how many ports on a single virtio_pci (0=all ports on the first PCI)
spread0 = virtio_serial_pci0 -> vs1, vs2, vs3, vs4, vc1, vc2, vc3, vc4
spread1 = virtio_serial_pci0 -> vs1; virtio_serial_pci1 -> vs2, ... virtio_serial_pci7 -> vc4
spread2 = virtio_serial_pci0 -> vs1, vs2; virtio_serial_pci1 -> vs3, vs4; ...
...
RESULTS:
$case | $time | $transferred | $reason
spread0.console_big | 35 | 1810176 | qemu coredump
spread0.console_small | 3600 | 26059712 | unable to stop python, Win responsive
spread0.mixed_big | 9 | 1220352 | incorrect recv character in vs2
spread0.mixed_small | 3600 | 28137584 | unable to stop python, Win responsive
spread0.serialport_big | 38 | 5758720 | incorrect recv character in vs2
spread0.serialport_small | 3600 | 35790808 | unable to stop python, Win responsive
spread1.* | - | - | Couldn't login to VM
spread2.* | 0 | 0 | Fail to open vs3, vs4, vc3, vc4
spread3.* | - | - | Couldn't login to VM
spread4.* | - | - | Couldn't login to VM

This means that with smaller buffers VIO works fine (apart from cleanup; I will take a look at whether it's a VIO or a python issue). With bigger buffers there are still failures. I can send you the complete logs if you are interested.

NOTE: I tried spread2 on Fedora 17 with upstream qemu 1.3.91 and only vc3 and vc4 were missing.

OK, the problem with "Couldn't login to VM" is caused by the new hardware wizard, which doesn't install the virtio network drivers automatically. Still, when I install all the drivers and rerun the test it fails with:

SPREAD1: (always)
pci0: vc1 pci1: vc2 pci2: vc3 pci3: vc4 pci4: vs1 pci5: vs2 pci6: vs3 pci7: vs4
[qemu output] qemu-system-x86_64: virtio-serial-bus: Unexpected port id 1 for device virtio_serial_pci2.0
Fail to open port vs2 vc2
(sometimes the Unexpected port was on virtio_serial_pci0.0)

SPREAD2: (always)
pci0: vc1 vc2 pci1: vc3 vc4 pci2: vs1 vs2 pci3: vs3 vs4
[qemu output] qemu-system-x86_64: virtio-serial-bus: Guest failure in adding device virtio_serial_pci0.0
Fail to open port vc3 vc4

SPREAD3: (first time)
pci0: vc1 vc2 vc3 pci1: vc4 vs1 vs2 pci2: vs3 vs4
Fail to open port vs2

SPREAD3 (second+ time)
PASS

SPREAD4:
pci0: vc1 vc2 vc3 vc4 pci1: vs1 vs2 vs3 vs4
Fail to open port vs3 vs4

It seems that the failures in Comment 8 were due to the old (RHEL6 nightly) qemu.
I tried the loopback tests with upstream qemu-1.4.50 and the 60s virtserialport tests are not failing. I'll retest with 3600s long tests with all variants and let you know.

OK, I tried multiple settings with these results:

1) when all devices are on a single virtio-serial-pci it works just fine.
2) with spread 1 or 2 devices per virtio-serial-pci it fails (see details below)
3) with spread 3+ it works fine with a multithread bonus (it copies twice as much data with spread4 as with spread0 when using small buffers)

There are two usual failures:

* On Fedora 17, qemu-1.4.0-upstream, kernel-3.8.3-103.fc17.x86_64, smp4: not all ports are initialized in Windows XP with spread 1 or 2; other setups work fine.
* On RHEL6-nightly, qemu-1.4.0-upstream, kernel-2.6.32-358.2.1.el6, smp2: qemu core-dumps with spread 1 or 2, sometimes with other settings too.

I hope this was helpful. Should I close this bugzilla since the original problem disappeared and open a new one for multiple ports on multiple virtio-serial-pci?

(In reply to comment #12)
> OK, I tried multiple settings with these results:
>
> 1) when all devices are on a single virtio-serial-pci it works just fine.
> 2) with spread 1 or 2 devices per virtio-serial-pci it fails (see details below)
> 3) with spread 3+ it works fine with a multithread bonus (it copies twice as much data with spread4 as with spread0 when using small buffers)
>
> There are two usual failures:
>
> * On Fedora 17, qemu-1.4.0-upstream, kernel-3.8.3-103.fc17.x86_64, smp4: not all ports are initialized in Windows XP with spread 1 or 2; other setups work fine.
> * On RHEL6-nightly, qemu-1.4.0-upstream, kernel-2.6.32-358.2.1.el6, smp2: qemu core-dumps with spread 1 or 2, sometimes with other settings too.
>
> I hope this was helpful. Should I close this bugzilla since the original problem disappeared and open a new one for multiple ports on multiple virtio-serial-pci?

Lukas, Ronen,
Is this bug related to https://bugzilla.redhat.com/show_bug.cgi?id=702611 ?
(In reply to comment #13)
> (In reply to comment #12)
> > OK, I tried multiple settings with these results:
> >
> > 1) when all devices are on a single virtio-serial-pci it works just fine.
> > 2) with spread 1 or 2 devices per virtio-serial-pci it fails (see details below)
> > 3) with spread 3+ it works fine with a multithread bonus (it copies twice as much data with spread4 as with spread0 when using small buffers)
> >
> > There are two usual failures:
> >
> > * On Fedora 17, qemu-1.4.0-upstream, kernel-3.8.3-103.fc17.x86_64, smp4: not all ports are initialized in Windows XP with spread 1 or 2; other setups work fine.
> > * On RHEL6-nightly, qemu-1.4.0-upstream, kernel-2.6.32-358.2.1.el6, smp2: qemu core-dumps with spread 1 or 2, sometimes with other settings too.
> >
> > I hope this was helpful. Should I close this bugzilla since the original problem disappeared and open a new one for multiple ports on multiple virtio-serial-pci?
>
> Lukas, Ronen,
>
> Is this bug related to https://bugzilla.redhat.com/show_bug.cgi?id=702611 ?

Hi Mike,
this is a problem with a Windows guest, so it's somewhat different. The setup is different too: I'm using 4 virtserialports + 4 virtconsoles.

Ronen,
this related bug convinced me to simplify the test a bit. When I use only 4 virtserialports, the initialization with spread2 works fine. Also the whole test (transfer from vs1 -> vs2, vs3, vs4 with different buffer lengths) works smoothly. I'll try various settings and let you know. From what I have learned about virtconsoles so far, I'm guessing that the problem lies with virtconsoles rather than with virtserialports... (just guessing).

Regards, Lukáš

(In reply to comment #12)
> I hope this was helpful. Should I close this bugzilla since the original problem disappeared and open a new one for multiple ports on multiple virtio-serial-pci?

I think you should open a new bug. It is easier to track/find a bug if the title matches the actual error.

Thanks, Gal.
Lukáš,
Is it possible for you to run the test with virtio-win-prewhql-0.1-62? It includes a fix related to write request buffers.

Thanks, Gal.

Results for virtio-win-prewhql-0.1-62 (4 virtconsoles and 4 virtserialports):

spread1 (first boot):
Fail to open port vs1
Fail to open port vs2
Fail to open port vs3
Fail to open port vs4
Fail to open port vc3
Fail to open port vc4

spread1 (other boots):
(60s) PASS

spread2 (first boot):
Fail to open port vs3
Fail to open port vs4
Fail to open port vc3

spread2 (other boots):
Incorrect received character after ~3s

spread3 (always):
Incorrect received character after ~13s

spread4 (first boot):
Fail to open port vs3
Fail to open port vs4

spread4 (other boots):
Incorrect received character after ~3s

NOTES:
First boot means the boot where new drivers were installed.
SpreadX means how many ports per virtio-serial-pci.

Lukas, what qemu version were you using in Comment #18? This bug is against Fedora 17 which will be EOL in a month. Testing closest to qemu.git would be the most useful.

This message is a reminder that Fedora 17 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 17. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 17 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

(In reply to Cole Robinson from comment #19)
> Lukas, what qemu version were you using in Comment #18? This bug is against Fedora 17 which will be EOL in a month. Testing closest to qemu.git would be the most useful.

I can't tell for sure now, but probably qemu-1.4.0 (+- upstream git version). Currently I'm overloaded with work so I can't retest it with a newer version now.

Hi Cole,
I retested it on Fedora 19 with stock qemu-kvm-1.4.2-4.fc19.x86_64 and upstream qemu-1.5.50 (fbe2e26c15af35e4d157874dc80f6a19eebaa83b) and the latest 0.1.65 Windows virtio drivers (Win XP SP3). The results are similar:

qemu-kvm-1.4.2-4.fc19.x86_64:
* spread_1 version worked successfully 60s
* spread_0, spread_2, spread_3, spread_4, spread_5 failed with incorrect char after 2-3s

qemu-1.5.50:
* spread_1:
  - serialport_small, mixed_small - worked successfully 60s
  - serialport_big - incorrect char after 37s
  - console_small, console_big, mixed_big - incorrect char after 3s
* spread_0, spread_2, spread_3, spread_4, spread_5 failed with incorrect char after 1-3s

NOTE: spread_$num means how many virtio ports per virtio-serial-pci (the only working solution was each port on a separate virtio-serial-pci). I'll try the 3600s tests over the weekend and probably different SMP setups (this was executed on a 4xCPU host machine with smp=2).
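The spread_$num notion used in these comments (how many virtio ports share one virtio-serial-pci, with 0 meaning all ports on a single controller) can be illustrated with a small helper that groups ports and emits matching qemu options. This is a sketch under assumptions, not the autotest code: the function names are hypothetical, the socket paths are simplified placeholders, and the port type is guessed from the `vc`/`vs` name prefix. The option strings follow the `-device`/`-chardev` forms that appear in the command lines in this report.

```python
def spread_layout(ports, spread):
    """Group port names into per-controller lists.

    spread == 0 puts every port on one controller; otherwise each
    controller receives at most `spread` ports, in the given order.
    """
    if spread <= 0:
        return [list(ports)]
    return [list(ports[i:i + spread]) for i in range(0, len(ports), spread)]


def qemu_device_args(ports, spread, sock_dir="/tmp"):
    """Build qemu arguments for the layout (sock_dir is a placeholder)."""
    args = []
    for bus_idx, group in enumerate(spread_layout(ports, spread)):
        bus = f"virtio_serial_pci{bus_idx}"
        args += ["-device", f"virtio-serial-pci,id={bus}"]
        for name in group:
            # Assumption: names starting with "vc" are virtconsoles,
            # everything else is a virtserialport.
            kind = "virtconsole" if name.startswith("vc") else "virtserialport"
            args += [
                "-chardev",
                f"socket,id=dev{name},path={sock_dir}/virtio_port-{name},server,nowait",
                "-device",
                f"{kind},chardev=dev{name},name={name},id={name},bus={bus}.0",
            ]
    return args


ports = ["vc1", "vc2", "vc3", "vc4", "vs1", "vs2", "vs3", "vs4"]
for group in spread_layout(ports, 5):
    print(group)
print(" ".join(qemu_device_args(ports, 2)))
```

Generating the layout from one ordered port list keeps the controller index and the `bus=` value in sync, which matters when comparing failures across spread variants.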
Hi, I finished the 3600s tests with these results:

smp1:
test | spread0 | spread1 | spread2 | spread5
serialport small | 2s | PASS | 2s | PASS
serialport big | 70s | PASS | 2s | PASS
console small | 2s | PASS | 2s | 2s
console big | 71s | PASS | 2s | 2s
mixed small | 2s | PASS | 2s | 1s
mixed big | 76s | PASS | 2s | 2s

smp2:
test | spread0 | spread1 | spread2 | spread5
serialport small | 2s 2s 1s 2s | 2s 1311s 1664s PASS | 2s 1s 2s 2s | 4s 2s 2s 2s
serialport big | 2s 2s 2s 5s | PASS PASS PASS PASS | 2s 2s 3s 4s | 3s 2s 4s 10s
console small | 2s 2s 2s 2s | 630s 329s 2246s 1984s | 1s 2s 2s 1s | 2s 2s 2s 1s
console big | 2s 2s 2s 3s | PASS 708s PASS PASS | 2s 2s 2s 3s | 2s 2s 2s 2s
mixed small | 2s 2s 1s 1s | 471s 585s PASS PASS | 2s 1s 2s 2s | 1s 2s 2s 2s
mixed big | 2s 2s 2s 4s | PASS PASS PASS PASS | 63s 2s 2s 3s | 2s 2s 2s 2s

smp4:
test | spread0 | spread1 | spread2 | spread5
serialport small | 2s | 1031s | 2s | 2s
serialport big | 2s | PASS (cleanup failed: Can't open vs1 sock on host) | 2s | 2s
console small | 1s | PASS | 2s | 2s
console big | 2s | PASS | 2s | 2s
mixed small | 2s | 30s | 2s | 2s
mixed big | 64s | PASS | 2s | 2s

host: Fedora 19, qemu-kvm-1.4.2-4.fc19.x86_64, 4 CPUs, 12G RAM
guest: Windows XP SP3, vio drivers 0.1.65, 1G RAM
serialports: vs1..vs4
console: vc1..vc4
order: vc1..vc4, vs1..vs4

spread0 = {vc1..vs4} (all ports on a single virtio-serial-pci)
spread1 = {vc1}{vc2}{vc3}...{vs4} (each port on a separate virtio-serial-pci)
spread2 = {vc1,vc2}{vc3,vc4}{vs1,vs2}...
spread5 = {vc1,vc2,vc3,vc4,vs1}{vs2,vs3,vs4}

scenarios: the host sends data to the guest over $port1 using a buffer of length $buf1. The guest receives the data using a buffer of length $guestbuf and forwards it to the 3 other ports. The host reads the data on $port2, $port3 and $port4 using receive buffers of length $buf2, $buf3, $buf4 and verifies it. Test length is 3600s.

$test name: $port1@$buf1:$port2@$buf2:$port3@$buf3:$port4@$buf4:$guestbuf
serialport small: serialport@4:serialport@2:serialport@4:serialport@8:8
serialport big: serialport@16384:serialport@2048:serialport@4096:serialport@8192:8192
console small: console@4:console@2:console@4:console@8:8
console big: console@16384:console@2048:console@4096:console@8192:8192
mixed small: serialport@4:console@2:serialport@5:console@6:8
mixed big: console@16384:serialport@2048:console@4096:serialport@8192:8192

I forgot to add that in case of test failure the VM was rebooted. If the test passed and the next test used the same setup (spread, smp), the VM was reused.

Gal, any thoughts on Lukas' data above?

(In reply to Cole Robinson from comment #25)
> Gal, any thoughts on Lukas data above?

I'm still trying to understand. At first look it seems that the problem is that Lukas is using virtconsole rather than virtserialport.

I'm using multiple scenarios:
serialport - use only virtserialport to send data from/to host
console - use only virtconsole ...
mixed small - use virtserialport to send data from host and both types to send them back
mixed big - use virtconsole to send data from host and both types to send them back

Comment from bug 949972: When working with only virtserialports (i.e. no virtconsole), I was able to reproduce the problem with build-64. The problem was not reproduced with build-66. There is a submitted patch that fixed the virtconsole related problems. So I hope this bug will be closed on the next build.

Thanks Lukas and Gal for following up!
I'll close this bug when I upload a new virtio-win build to the Fedora site.

Hi, yes, build 66 seems to fix these issues. On the other hand, build 66 breaks the port initialization:
spread0: works fine (all variants, 3600s run)
spread1: always fails with missing ports vs2, vc2
spread2: always fails with missing ports vc3, vc4
spread5: works fine (all variants, 3600s run)

ad spread1: the layout is: {vc1}{vc2}{vc3}{vc4}{vs1}{vs2}{vs3}{vs4} - where vs2 and vc4 are missing

MALLOC_PERTURB_=1 /bin/qemu-kvm \
    -S \
    -name 'virt-tests-vm1' \
    -M pc \
    -nodefaults \
    -vga std \
    -chardev socket,id=hmp_id_hmp1,path=/tmp/monitor-hmp1-20130807-154331-kvucacTW,server,nowait \
    -mon chardev=hmp_id_hmp1,mode=readline \
    -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20130807-154331-kvucacTW,server,nowait \
    -device isa-serial,chardev=serial_id_serial1 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci0 \
    -chardev socket,id=devvc1,path=/tmp/virtio_port-vc1-20130807-154331-kvucacTW,server,nowait \
    -device virtconsole,chardev=devvc1,name=vc1,id=vc1,bus=virtio_serial_pci0.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci1 \
    -chardev socket,id=devvc2,path=/tmp/virtio_port-vc2-20130807-154331-kvucacTW,server,nowait \
    -device virtconsole,chardev=devvc2,name=vc2,id=vc2,bus=virtio_serial_pci1.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci2 \
    -chardev socket,id=devvc3,path=/tmp/virtio_port-vc3-20130807-154331-kvucacTW,server,nowait \
    -device virtconsole,chardev=devvc3,name=vc3,id=vc3,bus=virtio_serial_pci2.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci3 \
    -chardev socket,id=devvc4,path=/tmp/virtio_port-vc4-20130807-154331-kvucacTW,server,nowait \
    -device virtconsole,chardev=devvc4,name=vc4,id=vc4,bus=virtio_serial_pci3.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci4 \
    -chardev socket,id=devvs1,path=/tmp/virtio_port-vs1-20130807-154331-kvucacTW,server,nowait \
    -device virtserialport,chardev=devvs1,name=vs1,id=vs1,bus=virtio_serial_pci4.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci5 \
    -chardev socket,id=devvs2,path=/tmp/virtio_port-vs2-20130807-154331-kvucacTW,server,nowait \
    -device virtserialport,chardev=devvs2,name=vs2,id=vs2,bus=virtio_serial_pci5.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci6 \
    -chardev socket,id=devvs3,path=/tmp/virtio_port-vs3-20130807-154331-kvucacTW,server,nowait \
    -device virtserialport,chardev=devvs3,name=vs3,id=vs3,bus=virtio_serial_pci6.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci7 \
    -chardev socket,id=devvs4,path=/tmp/virtio_port-vs4-20130807-154331-kvucacTW,server,nowait \
    -device virtserialport,chardev=devvs4,name=vs4,id=vs4,bus=virtio_serial_pci7.0 \
    -chardev socket,id=seabioslog_id_20130807-154331-kvucacTW,path=/tmp/seabios-20130807-154331-kvucacTW,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20130807-154331-kvucacTW,iobase=0x402 \
    -device driver=ich9-usb-uhci1,id=usb1 \
    -drive file=/home/medic/Work/Projekty/autotest/autotest/client/tests/virt/shared/data/images/winXP-32.qcow2,if=none,id=drive-ide0-0-0,media=disk,format=qcow2,aio=native \
    -device driver=ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
    -device driver=virtio-net-pci,mac=9a:20:21:22:23:24,id=idV9vHy3,netdev=ideIT3jb \
    -netdev user,id=ideIT3jb,hostfwd=tcp::5001-:10022,hostfwd=tcp::5002-:10023 \
    -m 1024 \
    -smp 4,maxcpus=4,cores=1,threads=1,sockets=4 \
    -cpu 'SandyBridge' \
    -drive file=/home/medic/Work/Projekty/autotest/autotest/client/tests/virt/shared/data/isos/windows/winutils.iso,if=none,id=drive-ide0-0-1,media=cdrom,format=raw \
    -device driver=ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -rtc base=localtime,clock=host,driftfix=none \
    -boot order=cdn,once=c,menu=off \
    -enable-kvm

ad spread2: the layout is: {vc1,vc2}{vc3,vc4}{vs1,vs2}{vs3,vs4} - where vc3 and vc4 are missing

MALLOC_PERTURB_=1 /bin/qemu-kvm \
    -S \
    -name 'virt-tests-vm1' \
    -M pc \
    -nodefaults \
    -vga std \
    -chardev socket,id=hmp_id_hmp1,path=/tmp/monitor-hmp1-20130807-154331-kvucacTW,server,nowait \
    -mon chardev=hmp_id_hmp1,mode=readline \
    -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20130807-154331-kvucacTW,server,nowait \
    -device isa-serial,chardev=serial_id_serial1 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci0 \
    -chardev socket,id=devvc1,path=/tmp/virtio_port-vc1-20130807-154331-kvucacTW,server,nowait \
    -device virtconsole,chardev=devvc1,name=vc1,id=vc1,bus=virtio_serial_pci0.0 \
    -chardev socket,id=devvc2,path=/tmp/virtio_port-vc2-20130807-154331-kvucacTW,server,nowait \
    -device virtconsole,chardev=devvc2,name=vc2,id=vc2,bus=virtio_serial_pci0.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci1 \
    -chardev socket,id=devvc3,path=/tmp/virtio_port-vc3-20130807-154331-kvucacTW,server,nowait \
    -device virtconsole,chardev=devvc3,name=vc3,id=vc3,bus=virtio_serial_pci1.0 \
    -chardev socket,id=devvc4,path=/tmp/virtio_port-vc4-20130807-154331-kvucacTW,server,nowait \
    -device virtconsole,chardev=devvc4,name=vc4,id=vc4,bus=virtio_serial_pci1.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci2 \
    -chardev socket,id=devvs1,path=/tmp/virtio_port-vs1-20130807-154331-kvucacTW,server,nowait \
    -device virtserialport,chardev=devvs1,name=vs1,id=vs1,bus=virtio_serial_pci2.0 \
    -chardev socket,id=devvs2,path=/tmp/virtio_port-vs2-20130807-154331-kvucacTW,server,nowait \
    -device virtserialport,chardev=devvs2,name=vs2,id=vs2,bus=virtio_serial_pci2.0 \
    -device driver=virtio-serial-pci,id=virtio_serial_pci3 \
    -chardev socket,id=devvs3,path=/tmp/virtio_port-vs3-20130807-154331-kvucacTW,server,nowait \
    -device virtserialport,chardev=devvs3,name=vs3,id=vs3,bus=virtio_serial_pci3.0 \
    -chardev socket,id=devvs4,path=/tmp/virtio_port-vs4-20130807-154331-kvucacTW,server,nowait \
    -device virtserialport,chardev=devvs4,name=vs4,id=vs4,bus=virtio_serial_pci3.0 \
    -chardev socket,id=seabioslog_id_20130807-154331-kvucacTW,path=/tmp/seabios-20130807-154331-kvucacTW,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20130807-154331-kvucacTW,iobase=0x402 \
    -device driver=ich9-usb-uhci1,id=usb1 \
    -drive file=/home/medic/Work/Projekty/autotest/autotest/client/tests/virt/shared/data/images/winXP-32.qcow2,if=none,id=drive-ide0-0-0,media=disk,format=qcow2,aio=native \
    -device driver=ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
    -device driver=virtio-net-pci,mac=9a:3e:3f:40:41:42,id=idV9vHy3,netdev=ideIT3jb \
    -netdev user,id=ideIT3jb,hostfwd=tcp::5001-:10022,hostfwd=tcp::5002-:10023 \
    -m 1024 \
    -smp 4,maxcpus=4,cores=1,threads=1,sockets=4 \
    -cpu 'SandyBridge' \
    -drive file=/home/medic/Work/Projekty/autotest/autotest/client/tests/virt/shared/data/isos/windows/winutils.iso,if=none,id=drive-ide0-0-1,media=cdrom,format=raw \
    -device driver=ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -rtc base=localtime,clock=host,driftfix=none \
    -boot order=cdn,once=c,menu=off \
    -enable-kvm

I'll try more setups and let you know about the port initialization.

(In reply to Lukas Doktor from comment #30)
> Hi, yes, build 66 seems to fix these issues. On the other hand, build 66 breaks
> the port initialization:

It is good to know that the data corruption was fixed. The next build (probably build-67) should fix the initialization issue.

> I'll try more setups and let you know about the port initialization.

Thanks, but please wait for the next build. No need to check more setups if we know that the problem still exists in build 66.

Gal.

Build virtio-win-prewhql-0.1-67 is out :-).

Okay, this should be fixed with the latest published virtio-win drivers. If there are any other lingering issues, please reproduce with the latest drivers and file a new bug report.

Created attachment 846628 [details]
guest sender script
This script sends pseudo-random data to virtconsole vc1.
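The attachment itself is not reproduced in this report. As a rough sketch of what such a sender could look like, the following generates a reproducible pseudo-random stream and writes it to the port with the same pywin32 calls used in the reproducer from the description. The seed, the alphabet, the chunk size and the function names are assumptions made for illustration, as is the device path `\\.\vc1` (derived from `name=vc1` on the qemu command line).

```python
import itertools
import random
import string

ALPHABET = string.ascii_letters + string.digits  # hypothetical payload alphabet
SEED = 0  # hypothetical; a receiver would have to use the same seed


def data_stream(seed=SEED):
    """Yield an endless, reproducible sequence of byte values."""
    rng = random.Random(seed)
    while True:
        yield ord(rng.choice(ALPHABET))


def chunks(stream, size, count):
    """Yield `count` consecutive bytes objects of `size` bytes from `stream`."""
    for _ in range(count):
        yield bytes(itertools.islice(stream, size))


def send_to_port(name="vc1", size=1024, count=1024):
    """Write the stream to a virtio port from inside the Windows guest."""
    import win32file  # pywin32; only importable inside the Windows guest

    port = win32file.CreateFile(
        "\\\\.\\" + name,  # assumed device path for the port named `name`
        win32file.GENERIC_WRITE | win32file.GENERIC_READ,
        0, None, win32file.OPEN_EXISTING,
        win32file.FILE_ATTRIBUTE_NORMAL, None)
    try:
        for chunk in chunks(data_stream(), size, count):
            win32file.WriteFile(port, chunk)
    finally:
        port.Close()
```

In the guest this would be started with `send_to_port()`. Because the stream depends only on the seed, the host side can regenerate the same bytes and compare them without any extra channel.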
Created attachment 846631 [details]
host receiver script
This script reads data from the given socket (virtconsole port) and verifies its correctness.
On failure it drops expected characters and searches for the next valid string (length 10). It then resumes the normal verification loop and prints details about the failure. After 10 failures it exits.
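The verification and resynchronisation described above can be sketched roughly as follows. This is not the attached script: the stream generator, the seed and all function names are assumptions, and only the window length of 10 and the limit of 10 failures come from the description.

```python
import itertools
import random
import string
from collections import deque

ALPHABET = string.ascii_letters + string.digits  # hypothetical, must match the sender
SYNC_LEN = 10      # length of the string used to find the stream again
MAX_FAILURES = 10  # stop after this many failures


def expected_stream(seed=0):
    """Yield the byte values the sender is assumed to produce."""
    rng = random.Random(seed)
    while True:
        yield ord(rng.choice(ALPHABET))


def verify(received, expected, max_search=1 << 20):
    """Compare two byte-value iterables.

    Returns (correct, failures); each failure is a tuple
    (offset in the received data, number of expected bytes dropped),
    with None as the second item if no resynchronisation was possible.
    """
    received, expected = iter(received), iter(expected)
    correct = 0
    failures = []
    for byte in received:
        want = next(expected)
        if byte == want:
            correct += 1
            continue
        offset = correct
        # Mismatch: take the next SYNC_LEN received bytes as the search key.
        window = [byte] + list(itertools.islice(received, SYNC_LEN - 1))
        if len(window) < SYNC_LEN:
            failures.append((offset, None))
            break
        # Slide over the expected stream, dropping bytes until the key matches.
        candidate = deque(
            [want] + list(itertools.islice(expected, SYNC_LEN - 1)),
            maxlen=SYNC_LEN)
        dropped = 0
        while list(candidate) != window and dropped < max_search:
            candidate.append(next(expected))
            dropped += 1
        if list(candidate) != window:
            failures.append((offset, None))
            break
        failures.append((offset, dropped))
        correct += SYNC_LEN
        if len(failures) >= MAX_FAILURES:
            break
    return correct, failures
```

Calling `verify(data, expected_stream())` on the bytes read from the host socket returns the number of verified bytes and one entry per gap; the second item of each entry is the number of bytes that never arrived.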
Hi guys,

I can confirm that the newest drivers fixed this issue for virtserialport. The problem still persists when using virtconsole. I created a simple reproducer which sends data from guest to host over virtconsole. From time to time some packets are lost, even though no failure is reported in the guest.

The problem occurs with bigger send buffers (over 220 chars long) on the guest. The receive buffer on the host doesn't affect the failure (it only takes a bit longer to trigger the issue with a big receive buffer). Also, no matter how big the receive buffer is, the length of the lost data is a multiple of the send buffer.

Regards,
Lukáš

Created attachment 846636 [details]
guest sender script
Created attachment 846638 [details]
host receiver script
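Two observations from the comment above lend themselves to a small host-side check: the receive buffer size can be varied freely, and every gap should be a whole number of guest send buffers. The sketch below is an illustration under assumptions, not the attached script. The helper names are hypothetical, and the only fact taken from the report is that the port is exposed on the host as a UNIX socket (`-chardev socket,...,server,nowait`), so the host side connects to it as a client.

```python
import socket


def open_port(path):
    """Connect to the UNIX socket QEMU created for the port's chardev."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return sock


def read_chunks(sock, recv_size):
    """Yield data from a connected socket until the peer closes it."""
    while True:
        chunk = sock.recv(recv_size)
        if not chunk:
            return
        yield chunk


def lost_whole_buffers(lost_bytes, send_size):
    """True if a gap of `lost_bytes` equals one or more full send buffers."""
    return lost_bytes > 0 and lost_bytes % send_size == 0
```

Running the reader with several `recv_size` values while the guest keeps a fixed send size would show whether the gap lengths stay multiples of the send size, which points at whole guest writes being dropped rather than partial reads on the host.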
Well, I have more bad news: the problem persists even on virtserialport. I ran a 1-hour test and both the small and big buffer tests failed after ~1600s.

With which versions of qemu and the virtio-serial driver did you reproduce it? Thanks.

winXP + virtio-serial 74 and upstream tag qemu-1.7.0 using smp4. You can execute the autotest virtio_console tests; just extend virtio_console_test_time to 3600s and all of the loopback tests fail (the virtconsole variants sooner).

Hi Gal, sorry, the failure on virtserialport was a false alarm. It was caused by DHCP. Anyway, virtconsole still generates failures, and they are demonstrated by the simple reproducer posted here. Also, the failure occurs only when the guest (win) is sending data to the host. The opposite direction works fine.

Thanks for the additional information. It is a little bit odd because, as far as I know, the code path is the same for ports and consoles, but I'll double check it and will try to reproduce.

This message is a notice that Fedora 19 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 19. It is Fedora's policy to close all bug reports from releases that are no longer maintained. Approximately 4 (four) weeks from now this bug will be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 19 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

This bug has lingered for a while. Lukas, any idea if this was ever fixed? Are those reproducing tests in autotest these days?

I have no idea and I don't have any Windows system available to try it. There is a simple reproducer, though, which shouldn't be hard to use if you have a Windows guest.

I was unable to reproduce this bug using the attached script (QEMU 2.6, vioserial build 122, Windows 7):

...
correct 51000000, incorrect 0
correct 52000000, incorrect 0
correct 53000000, incorrect 0

Since Windows XP is no longer supported, I'm closing this bug (again). Please reopen if reproducible on Windows 7 or above.