Bug 909059
Summary: | Switch to upstream solution for chardev flow control | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Amit Shah <amit.shah> | ||||
Component: | qemu-kvm | Assignee: | Amit Shah <amit.shah> | ||||
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 6.5 | CC: | acathrow, armbru, bcao, bsarathy, flang, ghammer, hdegoede, italkohe, juzhang, kraxel, lnovich, mazhang, michal.skrivanek, mkenneth, pbonzini, qzhang, rhod, sradvan, virt-maint, xfu | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | qemu-kvm-0.12.1.2-2.368.el6 | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | |||||||
: | 949182 949183 | Environment: | |||||
Last Closed: | 2013-11-21 06:34:45 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | 953551 | ||||||
Bug Blocks: | 870447, 896690, 949182, 949183, 949312, 964195, 966306, 967899, 985334 | ||||||
Attachments: |
|
Description
Amit Shah
2013-02-08 06:33:05 UTC
(In reply to comment #0)
> In particular, see bug 729923, bug 702271, bug 808295, bug 822386, bug 822078.

Last should've been bug 882078.

I believe the non-upstream flow control patches we currently have in RHEL-6 are flawed, and cause at least some of the "other bugs" Amit listed. Fixing them looks difficult.

Amit is right in that replacing our flow control patches by a backport of upstream's patches carries risk. The alternative is attempting to fix the flaws in our non-upstream patches, which is a different risk, and one I like a whole lot less.

Let's attempt to switch to upstream flow control, and see how we do in testing, and how much it helps with the "other bugs".

Upstream status (along with current known bugs) is tracked in http://wiki.qemu.org/Features/ChardevFlowControl

(In reply to comment #1)
> (In reply to comment #0)
> > In particular, see bug 729923, bug 702271, bug 808295, bug 822386, bug 822078.
>
> Last should've been bug 882078.

I've put all these in the 'see also' field, so that it's easier to manage related bugs.

The new patches have now been merged upstream. I'll let them sit there for a while, so that they receive testing and any initial bugs get shaken out. I will then propose a backport to RHEL6.

Created attachment 716914
test steps of two issues
For the third issue, the official build (qemu-kvm-0.12.1.2-2.358.el6.x86_64) also hits the same issue, so I opened a new bug (928207) to track it.
(In reply to comment #10)
> > Found three new issues.
> > 1.hot unplug virtio-serial bus cause guest kernel panic during transferring
> > data from guest to host.

This sounds like a guest bug, and should not be due to this build. Please file a new bug.

> > 2. transfer data from guest to host,then host will get wrong md5sum values
> > (BTW, if transfer data from host to guest,then md5sum values is correct)

Please file a new bug for this, and also note that you got this with the scratch build I gave in the bug report (make that bug depend on this one). There are some older bug reports about similar issues as well, so it looks like it's not new in this build.

> > 3. transfer data with two ports at the same time from guest to host,guest
> > hang and call trace.
> > use the following script to transfer data
> > while true;do echo abc >/dev/vport0p1;done
> > while true;do echo edf >/dev/vport0p2;done

This is also a guest bug, details in bug 928207.

Couple more test cases. Please add these to the test plan as well.

1. Migrate guest while transferring data over spice. Ensure everything is fine.

2. Induce throttling by opening host-side chardev but not reading from it. This can be achieved by the simple steps mentioned in comment 0 using python. Send data from guest, but don't read on host. Now, migrate the guest. On the destination, use these two scenarios:

2.a) Do not connect host-side chardev on destination before migration completes. Ensure qemu works fine after migration. Then connect chardev and read data from port. Data sent by guest before migration should be obtained fine.

2.b) Connect host-side chardev on destination before migration is started. Ensure data sent from guest is read fine on host when migration completes.

(In reply to comment #16)
> Couple more test cases. Please add these to the test plan as well.
>
> 1. Migrate guest while transferring data over spice. Ensure everything is
> fine.
>
> 2. Induce throttling by opening host-side chardev but not reading from it.
> This can be achieved by the simple steps mentioned in comment 0 using
> python. Send data from guest, but don't read on host. Now, migrate the
> guest. On the destination, use these two scenarios:
>
> 2.a) Do not connect host-side chardev on destination before migration
> completes. Ensure qemu works fine after migration. Then connect chardev
> and read data from port. Data sent by guest before migration should be
> obtained fine.
>
> 2.b) Connect host-side chardev on destination before migration is started.
> Ensure data sent from guest is read fine on host when migration completes.

Thanks a lot for your suggestions first. You mean the above test scenarios should be added to both the rhel6.5 and rhel7 test plans, right?

Best Regards,
Junyi

(In reply to comment #18)
> Thank a lot for your suggestions first. You mean the above test scenarios
> should be added in rhel6.5 and rhel7 test plan both, right?

Right.

One more test case to be added, similar to the previous one:

Induce throttling by opening host-side chardev but not reading from it. This can be achieved by the simple steps mentioned in comment 0 using python. Send data from guest, but don't read on host. Now, hot-unplug the port. Also attempt migration after unplug.

Hi, Amit

I found an abort issue in your v4 build. Boot the guest with the following command line and check "info qtree" in the qemu monitor: qemu aborts. It may not reproduce the first time, but on entering "info qtree" again it aborts the second time.
CLI:
(gdb) r -cpu SandyBridge -M rhel6.4.0 -enable-kvm -m 4096 -smp 2,sockets=2,cores=1,threads=1 -name rhel6.4-64 -uuid 9a0e67ec-f286-d8e7-0548-0c1c9ec93009 -nodefconfig -nodefaults -monitor stdio -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive file=/home/RHEL-Server-6.4-64-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:d5:51:8a,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=channel1,path=/tmp/helloworld1,server,nowait -device virtserialport,chardev=channel1,name=port1,bus=virtio-serial0.0,id=port1 -chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait -device virtserialport,chardev=channel2,name=port2,bus=virtio-serial0.0,id=port2 -device usb-tablet,id=input0 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vga qxl -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/monitor.c:4334: handler_audit: Assertion `!monitor_has_error(mon)' failed.

Program received signal SIGABRT, Aborted.
(gdb)
#0 0x00007ffff57418a5 in raise () from /lib64/libc.so.6
#1 0x00007ffff5743085 in abort () from /lib64/libc.so.6
#2 0x00007ffff573aa1e in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007ffff573aae0 in __assert_fail () from /lib64/libc.so.6
#4 0x00007ffff7de65d5 in handler_audit (mon=0x7ffff88fe010, cmd=0x7ffff82bf730, params=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/monitor.c:4334
#5 monitor_call_handler (mon=0x7ffff88fe010, cmd=0x7ffff82bf730, params=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/monitor.c:4349
#6 0x00007ffff7deb98f in handle_user_command (mon=0x7ffff88fe010, cmdline=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/monitor.c:4385
#7 0x00007ffff7debaca in monitor_command_cb (mon=0x7ffff88fe010, cmdline=<value optimized out>, opaque=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/monitor.c:5020
#8 0x00007ffff7e4987d in readline_handle_byte (rs=0x7ffff9d35a20, ch=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/readline.c:369
#9 0x00007ffff7debcf0 in monitor_read (opaque=<value optimized out>, buf=0x7fffffffb6c0 "\r", size=1) at /usr/src/debug/qemu-kvm-0.12.1.2/monitor.c:5006
#10 0x00007ffff7e5fce6 in qemu_chr_be_write (chan=<value optimized out>, cond=<value optimized out>, opaque=0x7ffff86e17d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-char.c:164
#11 fd_chr_read (chan=<value optimized out>, cond=<value optimized out>, opaque=0x7ffff86e17d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-char.c:747
#12 0x00007ffff7484f0e in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#13 0x00007ffff7ddeb6a in glib_select_poll (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:3960
#14 main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4033
#15 0x00007ffff7e0121a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2244
#16 0x00007ffff7de1848 in main_loop (argc=70, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4227
#17 main (argc=70, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6565
(gdb)
======================

Hi, Amit

I re-tested with the official qemu-kvm-360 build and cannot reproduce this issue. I also tried the following scenarios; neither shows the problem.

(1) Remove the following chardev and serial port, leaving only 1 serial port in the command line. ==> Cannot reproduce.
-chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait -device virtserialport,chardev=channel2,name=port2,bus=virtio-serial0.0,id=port2

(2) Remove the sound device. ==> Cannot reproduce.

Please help check this, thanks!

Regards,
Qunfang

The current upstream patches cause a deadlock.

https://bugzilla.gnome.org/show_bug.cgi?id=626702

Comment 6 of the latter bug is exactly this scenario.

One more testcase to be added: If a VM is started with the rhel6.0.0 machine type, flow control code should be disabled. So if the host-side chardev is not reading while the guest continues to send data, the guest should freeze. The guest will unfreeze only when the host-side chardev is read from. This should be the desired behaviour with the older machine type.

One more testcase from upstream. From Jan Kiszka's message:

"It's trivial to reproduce: qemu-system-x86_64 -serial stdio -S, and then hit a key twice on that console."

This problem was introduced by some patches in this rework, and was resolved by some patches in v7+ of the builds. Please add this testcase to our tests so we don't regress.
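Several of the test cases above rely on a host-side chardev that is connected but not read from. The following is a minimal Python sketch of that condition, assuming the chardev is a UNIX socket in server mode such as /tmp/helloworld1 from the command line in this bug. It is not the script from comment 0, and the helper names hold_without_reading and drain are hypothetical.

```python
import socket
import time


def hold_without_reading(path, seconds):
    """Connect to a UNIX-socket chardev and deliberately read nothing.

    While the returned socket stays open and unread, data written by the
    peer piles up in the socket buffers until the writer is throttled.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    time.sleep(seconds)  # stay connected, but never call recv()
    return sock


def drain(sock, idle_timeout=0.5):
    """Read all pending data and return the number of bytes received.

    Stops when the peer closes the connection or when nothing arrives
    for idle_timeout seconds.
    """
    sock.settimeout(idle_timeout)
    total = 0
    while True:
        try:
            chunk = sock.recv(65536)
        except socket.timeout:
            break
        if not chunk:  # peer closed the connection
            break
        total += len(chunk)
    return total
```

One possible use: call hold_without_reading("/tmp/helloworld1", 60) while the guest writes to the port, perform the migration or hot-unplug step under test, then call drain() on the returned socket and compare the byte count with what the guest sent.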
Hi, Amit

I re-tested the bugs in the 'see also' list and summarize the results here:

(1) Regression check for closed bugs:
Bug 588916 - qemu char fixes for nonblocking writes, virtio-console flow control (passed)
Bug 621484 - Broken pipe when working with unix socket chardev (reproduced again with comment 25; need to file a new bug once the official build comes)
Bug 745758 - Segmentation fault occurs after hot unplug virtio-serial-pci while virtio-serial-port in use (passed the scenario in the bug, but the guest hangs after interrupting the write to the serial port; created new bug 956637)
Bug 839156 - Fedora 16 and 17 guests hang during boot (passed)

(2) Open issues:
Bug 797854 - host can't receive characters if disconnect to TCP socket then re-connect again (still reproducible; bug is moved to rhel7 now)
Bug 882078 - Restart libvirt during snapshot-create causes VM fails to resume (60%~80% reproduced in v9)
Bug 729923 - [virtio-serial] First message is not delivered after connected to virtio-port (should be the same issue as 702271; needinfo on reporter to retry and make sure the scenario passes)
Bug 702271 - Guest terminal returns directly in even number times w/o sending data out via virtio-serial-port (now has the same issue as 621484)
Bug 808295 - qemu-kvm segfaults under heavy QMP I/O (still fails with a different bt log in v9; details added to bug 808295)
Bug 822386 - qemu-kvm core dumps after virtio-blk hotplug-in/removed then stop/cont (fail; reproduces on both official qemu-kvm-361 and v9)
Bug 911571 - [Hitachi 6.5 FEAT] virtio-trace: Named-pipe non-blocking (Verified pass)
Bug 720535 (virtio serial) Guest aborted when transferring data from guest to host (using bug 880139 to track)
Bug 880139 - guest abort when transfer file with virtio-serial from guest to host (reproduced on both official build and v9)

Thanks,
Qunfang

(In reply to comment #34)
> Hi, Amit
> I re-tested the bugs in 'see also' list and summarize the results here:
>

Re-tested them with v9 build in comment 30.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1553.html