Bug 624767
Summary: | Replace virtio-net TX timer mitigation with bottom half handler | | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | Alex Williamson <alex.williamson> |
Component: | qemu-kvm | Assignee: | Alex Williamson <alex.williamson> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 6.0 | CC: | akong, bcao, chrisw, ddumas, ehabkost, Jes.Sorensen, jwest, lihuang, llim, michen, mjenner, mkenneth, mwagner, syeghiay, tburke, virt-maint |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-0.12.1.2-2.114.el6 | Doc Type: | Bug Fix |
Doc Text: |
Cause: virtio-net used a packet transmission algorithm that relied on a timer to delay transmission in an attempt to batch multiple packets together.
Consequence: higher virtio-net transmission latency.
Fix: the default algorithm has been changed to use an asynchronous bottom half transmitter.
Result: improved virtio-net transmission latency.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2011-05-19 11:22:12 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 633963 |
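The trade-off between the two transmit strategies named in the doc text (timer-delayed batching vs. an asynchronous bottom half) can be sketched as follows. This is a minimal illustrative model only, not QEMU's actual implementation; the class and method names are invented for the example.

```python
import collections

class TimerTx:
    """Timer mitigation: queue packets and flush only when a timer fires,
    so packets batch together at the cost of added latency."""
    def __init__(self):
        self.queue = collections.deque()
        self.timer_armed = False
        self.sent = []

    def xmit(self, pkt):
        self.queue.append(pkt)
        if not self.timer_armed:
            self.timer_armed = True  # nothing goes out until the timer fires

    def timer_fired(self):
        self.sent.extend(self.queue)  # the whole accumulated batch at once
        self.queue.clear()
        self.timer_armed = False

class BottomHalfTx:
    """Bottom half: schedule an asynchronous callback that the event loop
    runs as soon as it returns -- no fixed delay, hence lower latency."""
    def __init__(self):
        self.queue = collections.deque()
        self.bh_scheduled = False
        self.sent = []

    def xmit(self, pkt):
        self.queue.append(pkt)
        self.bh_scheduled = True  # event loop will call run_bottom_half()

    def run_bottom_half(self):
        self.sent.extend(self.queue)
        self.queue.clear()
        self.bh_scheduled = False
```

In the timer model, a lone packet sits in the queue until the next timer tick; in the bottom-half model it is transmitted on the very next pass through the event loop, which matches the "improved transmission latency" result above.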
Description
Alex Williamson
2010-08-17 17:15:05 UTC
Initial port brew build, seems to work but still needs more thorough code review:
https://brewweb.devel.redhat.com/taskinfo?taskID=2686097

Another brew, including the bz624769 fix, plus an exit fix. Matches posted patches:
https://brewweb.devel.redhat.com/taskinfo?taskID=2686547

Compare TCP throughput between the regular userspace virtio-net and the regular userspace virtio-net with the tx mitigation patch by using the following steps:

1. Start the guest:
/usr/libexec/qemu-kvm -M rhel6.0.0 -enable-kvm -m 2G -smp 2 -uuid `uuidgen` -monitor stdio -rtc base=localtime -usbdevice tablet -drive file=/root/michen/rhel6-64-0811.2.qcow2,if=none,format=qcow2,werror=stop,rerror=stop,id=drive-ide,cache=none -device ide-drive,drive=drive-ide -netdev tap,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=20:20:20:56:90:90 -vnc :10

2. Run netperf from the guest to the host and to an external host:
# for b in 32 64 128 256 512 1024 2048 4096 8192 16834 32768 65507; do ./netperf -t TCP_STREAM -f m -H 192.162.0.1 -P 0 -l 10 -- -m $b; done

3. Record the average throughput.

Actual result:

1. guest --> host

Recv Socket   Send Socket   Send Message   Elapsed    Throughput
Size (bytes)  Size (bytes)  Size (bytes)   Time (s)   (10^6 bits/sec)

for qemu-kvm-0.12.1.2-2.110.el6.bz624767.2:
87380         16384         32             10.00      220.06
87380         16384         64             10.00      262.74
87380         16384         128            10.00      440.72
87380         16384         256            10.00      1037.03
87380         16384         512            10.00      2167.89
87380         16384         1024           10.00      4918.59
87380         16384         2048           10.00      10538.86
87380         16384         4096           10.00      12017.91
87380         16384         8192           10.00      10681.91
87380         16384         16834          10.00      11317.24
87380         16384         32768          10.00      10014.85
87380         16384         65507          10.00      10844.33

for qemu-kvm-0.12.1.2-2.110.el6:
87380         16384         32             10.00      343.93
87380         16384         64             10.00      528.98
87380         16384         128            10.00      792.85
87380         16384         256            10.00      1042.84
87380         16384         512            10.00      2084.18
87380         16384         1024           10.00      3292.36
87380         16384         2048           10.00      3351.13
87380         16384         4096           10.00      4263.10
87380         16384         8192           10.00      4087.28
87380         16384         16834          10.00      3995.17
87380         16384         32768          10.00      3875.20
87380         16384         65507          10.00      4120.90

When the message size is larger than 128 bytes, throughput jumps significantly.

2. guest --> external host

Recv Socket   Send Socket   Send Message   Elapsed    Throughput
Size (bytes)  Size (bytes)  Size (bytes)   Time (s)   (10^6 bits/sec)

qemu-kvm-0.12.1.2-2.110.el6.bz624767.2:
87380         16384         32             10.00      226.60
87380         16384         64             10.00      320.50
87380         16384         128            10.02      868.02
87380         16384         256            10.02      860.57
87380         16384         512            10.02      794.40
87380         16384         1024           10.02      866.56
87380         16384         2048           10.02      824.94
87380         16384         4096           10.02      899.32
87380         16384         8192           10.02      871.77
87380         16384         16834          10.01      846.30
87380         16384         32768          10.02      875.60
87380         16384         65507          10.01      868.32

qemu-kvm-0.12.1.2-2.110.el6:
87380         16384         32             10.00      366.13
87380         16384         64             10.00      534.82
87380         16384         128            10.00      843.38
87380         16384         256            10.02      913.79
87380         16384         512            10.02      869.24
87380         16384         1024           10.02      851.47
87380         16384         2048           10.01      897.35
87380         16384         4096           10.02      854.27
87380         16384         8192           10.01      823.00
87380         16384         16834          10.02      895.84
87380         16384         32768          10.02      867.61
87380         16384         65507          10.02      861.71

No obvious improvement.

michen --> Alex: could you help to check the above test result? Thanks.
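Step 3 above ("got the average throughput") can be scripted. A minimal sketch, assuming the TCP_STREAM result rows have the five columns shown in the header (recv socket, send socket, message size, elapsed time, throughput); `average_throughput` is an illustrative helper, not part of netperf.

```python
def average_throughput(lines):
    """Average the throughput column (last field) over netperf result rows."""
    vals = []
    for line in lines:
        fields = line.split()
        if len(fields) == 5:  # recv, send, msg size, time, throughput
            vals.append(float(fields[-1]))
    return sum(vals) / len(vals) if vals else 0.0

rows = [
    "87380 16384 32 10.00 220.06",
    "87380 16384 64 10.00 262.74",
]
print(round(average_throughput(rows), 2))  # 241.4
```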
Based on what we saw in RHEL5.4, I would expect the biggest differences to come with UDP and also when using multi-guest to multi-external drivers. For simple TCP netperfs I would expect a slowdown at small message sizes. I'm running tests on 10Gbit but it will take another day to get things completed.

(In reply to comment #4)
> michen --> Alex, could you help to check the above test result? thanks.

They seem plausible. When doing guest<->host numbers it's important to use numactl to ensure the host and guest are running with optimal locality. Running on the wrong core, even within a single socket, can significantly skew numbers. A 1Gbps link also really isn't that hard to saturate, and the difference comes in how much CPU is required to push the data. I think the best division of testing would be for QE to do functionality testing, particularly migration and hotplug, while Mark's team tries to evaluate the performance.

I'm seeing an issue with the NIC under test when using the new userspace. I'm seeing this in the /var/log/messages file on the guest (and the netperf fails...). I did not see this with the other userspace. In fairness, it just showed up, so it didn't happen immediately either...

(*** guest ***)
[root@dhcp47-55 ~]# uname -a
Linux dhcp47-55.lab.bos.redhat.com 2.6.32-63.el6.x86_64 #1 SMP Wed Aug 11 00:38:12 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

(*** host ***)
[root@perf22 ~]# uname -a
Linux perf22.lab.bos.redhat.com 2.6.32-63.el6.x86_64 #1 SMP Wed Aug 11 00:38:12 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@perf22 ~]# rpm -qa | grep kvm
qemu-kvm-0.12.1.2-2.110.el6.bz624767.2.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.110.el6.bz624767.2.x86_64
qemu-kvm-tools-0.12.1.2-2.110.el6.bz624767.2.x86_64
[root@perf22 ~]# ps -ef | grep test1
qemu 2700 1 86 08:38 ?
09:41:24 /usr/libexec/qemu-kvm -S -M rhel6.0.0 -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -name test1 -uuid df4bee91-9961-801a-957d-d3da87bb68b7 -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/test1.monitor,server,nowait -mon chardev=monitor,mode=control -rtc base=utc -boot c -drive file=/var/lib/libvirt/images/test1.img,if=none,id=drive-virtio-disk0,boot=on,format=raw,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,fd=20,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:60:4b:64,bus=pci.0,addr=0x3 -netdev tap,fd=21,id=hostnet1 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:e7:11:59,bus=pci.0,addr=0x6 -netdev tap,fd=22,id=hostnet2 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:8c:a3:92,bus=pci.0,addr=0x7 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 root 32245 32186 0 19:52 pts/11 00:00:00 grep test1 bridge name bridge id STP enabled interfaces br0 8000.000000000000 no br2 8000.003048c5a686 no eth2 vnet11 vnet14 vnet17 vnet2 vnet20 vnet23 vnet5 vnet8 br3 8000.003048c5a687 no eth3 vnet0 vnet12 vnet15 vnet18 vnet21 vnet3 vnet6 vnet9 br4 8000.001b213cabfc no eth4 vnet1 vnet10 vnet13 vnet16 vnet19 vnet22 vnet4 vnet7 virbr0 8000.000000000000 yes [root@perf22 ~]# lsmod | grep vhost [root@perf22 ~]# (There are more messages as well) Aug 20 16:20:03 dhcp47-55 kernel: BUG: soft lockup - CPU#0 stuck for 171s! 
[ksoftirqd/0:4] Aug 20 16:20:03 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: microcode] Aug 20 16:20:03 dhcp47-55 kernel: CPU 0: Aug 20 16:20:03 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: microcode] Aug 20 16:20:03 dhcp47-55 kernel: Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.32-63.el6.x86_64 #1 KVM Aug 20 16:20:03 dhcp47-55 kernel: RIP: 0010:[<ffffffff814cb1e7>] [<ffffffff814cb1e7>] _spin_unlock_irqrestore+0x17/0x20 Aug 20 16:20:03 dhcp47-55 kernel: RSP: 0018:ffff880001e03df8 EFLAGS: 00000282 Aug 20 16:20:03 dhcp47-55 kernel: RAX: 0000000000000001 RBX: ffff880001e03df8 RCX: 000000000000c020 Aug 20 16:20:03 dhcp47-55 kernel: RDX: ffff880037626220 RSI: 0000000000000282 RDI: 0000000000000282 Aug 20 16:20:03 dhcp47-55 kernel: RBP: ffffffff81013c93 R08: 000000000000080a R09: 0000000000000001 Aug 20 16:20:03 dhcp47-55 kernel: R10: ffff8800376262b8 R11: 0000000000000002 R12: ffff880001e03d70 Aug 20 16:20:03 dhcp47-55 kernel: R13: ffff880037626000 R14: ffff880037626198 R15: ffffffff814d015c Aug 20 16:20:03 dhcp47-55 kernel: FS: 0000000000000000(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000 Aug 20 16:20:03 dhcp47-55 kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b Aug 20 16:20:03 dhcp47-55 kernel: CR2: 000000391f2a67f0 CR3: 00000000799dd000 CR4: 00000000000006f0 Aug 20 16:20:03 dhcp47-55 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Aug 20 16:20:03 dhcp47-55 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Aug 20 16:20:03 dhcp47-55 kernel: Call Trace: Aug 20 16:20:03 dhcp47-55 kernel: <IRQ> [<ffffffff813aa06e>] ? 
uhci_hub_status_data+0x6e/0x200 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81385400>] ? rh_timer_func+0x0/0x10 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81384b3f>] ? usb_hcd_poll_rh_status+0x5f/0x180 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81385400>] ? rh_timer_func+0x0/0x10 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff8138540e>] ? rh_timer_func+0xe/0x10 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff8107da37>] ? run_timer_softirq+0x197/0x340 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81073807>] ? __do_softirq+0xb7/0x1e0 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30 Aug 20 16:20:03 dhcp47-55 kernel: <EOI> [<ffffffff81015f35>] ? do_softirq+0x65/0xa0 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81073380>] ? ksoftirqd+0x80/0x110 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81073300>] ? ksoftirqd+0x0/0x110 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810915d6>] ? kthread+0x96/0xa0 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810141ca>] ? child_rip+0xa/0x20 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81091540>] ? kthread+0x0/0xa0 Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20 Aug 20 16:25:21 dhcp47-55 dhclient: DHCPREQUEST on eth0 to 10.16.46.224 port 67 Aug 20 16:25:21 dhcp47-55 dhclient: DHCPACK from 10.16.46.224 Aug 20 16:25:21 dhcp47-55 dhclient: bound to 10.16.47.55 -- renewal in 835 seconds. Aug 20 16:35:53 dhcp47-55 kernel: BUG: soft lockup - CPU#0 stuck for 83s! 
[events/0:9] Aug 20 16:35:53 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: microcode] Aug 20 16:35:53 dhcp47-55 kernel: CPU 0: Aug 20 16:35:53 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: microcode] Aug 20 16:35:53 dhcp47-55 kernel: Pid: 9, comm: events/0 Not tainted 2.6.32-63.el6.x86_64 #1 KVM Aug 20 16:35:53 dhcp47-55 kernel: RIP: 0010:[<ffffffff81059a02>] [<ffffffff81059a02>] finish_task_switch+0x52/0xd0 Aug 20 16:35:53 dhcp47-55 kernel: RSP: 0018:ffff88007cd07d40 EFLAGS: 00000246 Aug 20 16:35:53 dhcp47-55 kernel: RAX: ffff880001e16980 RBX: ffff88007cd07d70 RCX: ffff8800795c9800 Aug 20 16:35:53 dhcp47-55 kernel: RDX: 0000000000000000 RSI: ffff8800794b7520 RDI: ffff88007cd05560 Aug 20 16:35:53 dhcp47-55 kernel: RBP: ffffffff81013c8e R08: ffff88007cd06000 R09: 0000000000000000 Aug 20 16:35:53 dhcp47-55 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 Aug 20 16:35:53 dhcp47-55 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000800 Aug 20 16:35:53 dhcp47-55 kernel: FS: 0000000000000000(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000 Aug 20 16:35:53 dhcp47-55 kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b Aug 20 16:35:53 dhcp47-55 kernel: CR2: 000000391f2a67f0 CR3: 00000000798f9000 CR4: 00000000000006f0 Aug 20 16:35:53 dhcp47-55 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Aug 20 16:35:53 dhcp47-55 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Aug 20 16:35:53 dhcp47-55 kernel: Call Trace: Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff814c89e6>] ? 
thread_return+0x4e/0x778 Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff8112df50>] ? vmstat_update_periodic+0x0/0x50 Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff8108c33c>] ? worker_thread+0x1fc/0x2a0 Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff81091940>] ? autoremove_wake_function+0x0/0x40 Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff8108c140>] ? worker_thread+0x0/0x2a0 Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff810915d6>] ? kthread+0x96/0xa0 Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff810141ca>] ? child_rip+0xa/0x20 Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff81091540>] ? kthread+0x0/0xa0 Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20 Aug 20 16:37:23 dhcp47-55 kernel: BUG: soft lockup - CPU#0 stuck for 83s! [swapper:0] Aug 20 16:37:23 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: microcode] Aug 20 16:37:23 dhcp47-55 kernel: CPU 0: Aug 20 16:37:23 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: microcode] Aug 20 16:37:23 dhcp47-55 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-63.el6.x86_64 #1 KVM Aug 20 16:37:23 dhcp47-55 kernel: RIP: 0010:[<ffffffff8107da16>] [<ffffffff8107da16>] run_timer_softirq+0x176/0x340 Aug 20 16:37:23 dhcp47-55 kernel: RSP: 0018:ffff880001e03e50 EFLAGS: 00000246 Aug 20 16:37:23 dhcp47-55 kernel: RAX: ffff880001e03e90 RBX: ffff880001e03ed0 RCX: 00000000000000d0 Aug 20 16:37:23 dhcp47-55 kernel: RDX: ffff880001e03e90 RSI: dead000000200200 RDI: 0000000000000000 Aug 20 16:37:23 dhcp47-55 kernel: RBP: ffffffff81013c93 R08: ffffffff81ad9230 R09: 0000000000000000 Aug 20 16:37:23 dhcp47-55 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 
ffff880001e03dd0 Aug 20 16:37:23 dhcp47-55 kernel: R13: ffff8800792bee18 R14: ffffffff81ad9140 R15: ffffffff814d015c Aug 20 16:37:23 dhcp47-55 kernel: FS: 0000000000000000(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000 Aug 20 16:37:23 dhcp47-55 kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b Aug 20 16:37:23 dhcp47-55 kernel: CR2: 000000391f2a67f0 CR3: 0000000079f24000 CR4: 00000000000006f0 Aug 20 16:37:23 dhcp47-55 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Aug 20 16:37:23 dhcp47-55 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Aug 20 16:37:23 dhcp47-55 kernel: Call Trace: Aug 20 16:37:23 dhcp47-55 kernel: <IRQ> [<ffffffff8102f52d>] ? lapic_next_event+0x1d/0x30 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81073807>] ? __do_softirq+0xb7/0x1e0 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff810956f0>] ? hrtimer_interrupt+0x140/0x250 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81015f35>] ? do_softirq+0x65/0xa0 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81073605>] ? irq_exit+0x85/0x90 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff814d0161>] ? smp_apic_timer_interrupt+0x71/0x9c Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81013c93>] ? apic_timer_interrupt+0x13/0x20 Aug 20 16:37:23 dhcp47-55 kernel: <EOI> [<ffffffff8103baab>] ? native_safe_halt+0xb/0x10 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff8101bd3d>] ? default_idle+0x3d/0x90 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff814b118b>] ? rest_init+0x6b/0x80 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff818c1e9c>] ? start_kernel+0x403/0x40f Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff818c133a>] ? x86_64_start_reservations+0x125/0x129 Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff818c1438>] ? x86_64_start_kernel+0xfa/0x109 tried steps on comment #8 .Can not reproduce it. 
Tried the steps in comment #8; cannot reproduce it.

Host info:
# lspci | grep net
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)
# uname -a
Linux dhcp-91-71.nay.redhat.com 2.6.32-66.el6.x86_64 #1 SMP Wed Aug 18 00:24:41 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
# rpm -qa | grep qemu-kvm
qemu-kvm-tools-0.12.1.2-2.110.el6.bz624767.2.x86_64
qemu-kvm-0.12.1.2-2.110.el6.bz624767.2.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.110.el6.bz624767.2.x86_64

Guest info:
# uname -a
Linux dhcp-91-136.nay.redhat.com 2.6.32-66.el6.x86_64 #1 SMP Wed Aug 18 00:24:41 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

Steps:
1. Start a VM with virt_net by using virt-manager.
# ps -eaf | grep qemu-kvm
/usr/libexec/qemu-kvm -S -M rhel6.0.0 -enable-kvm -m 4096 -smp 2,sockets=2,cores=1,threads=1 -name rhel10 -uuid 30ec7b3f-a312-db85-6d0d-ce3e71452656 -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/rhel10.monitor,server,nowait -mon chardev=monitor,mode=control -rtc base=utc -boot c -drive file=/home/rhel10,if=none,id=drive-virtio-disk0,boot=on,format=raw,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,fd=20,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:b1:65:e8,bus=pci.0,addr=0x3 -netdev tap,fd=21,id=hostnet1 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:a5:da:e1,bus=pci.0,addr=0x7 -netdev tap,fd=22,id=hostnet2 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:9a:dd:be,bus=pci.0,addr=0x8 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -vnc 127.0.0.1:0 -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
# brctl show
bridge name     bridge id               STP enabled     interfaces
breth0          8000.00219bffb80c       no              eth0 vnet0 vnet1 vnet2
2. Check /var/log/messages.
3. Run netperf.

Actual Results:
After step 2, can NOT find any soft lockup in messages.
After step 3, netperf runs successfully.
Increasing priority as we probably want it on day 0.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
The virtio network device has two available algorithms for transmitting packets. The default is to use an asynchronous bottom half transmitter, which typically shows good performance for all workloads. The alternate implementation uses a timer to delay transmit in an attempt to batch multiple packets together. The timer approach typically results in higher latency, but may improve overall efficiency. If a user wishes to change from the default algorithm to the timer-based approach, it can be done by creating a wrapper script around qemu-kvm and specifying it as the emulator for guests requiring it. Instructions to do so follow:

1) Create the wrapper script
$ cat > /usr/libexec/qemu-kvm.txtimer << EOF
#!/bin/sh
exec /usr/libexec/qemu-kvm \`echo "\$@" | sed 's|virtio-net-pci|virtio-net-pci,tx=timer|g'\`
EOF

2) Make the script executable
$ chmod 755 /usr/libexec/qemu-kvm.txtimer

3) Set SELinux permissions
$ restorecon /usr/libexec/qemu-kvm.txtimer

4) Create the SELinux module
$ cat > qemutxtimer.te << EOF
policy_module(qemutxtimer, 1.0)

gen_require(\`
    attribute virt_domain;
    type qemu_exec_t;
')

can_exec(virt_domain, qemu_exec_t)
EOF

5) Build the SELinux module
$ make -f /usr/share/selinux/devel/Makefile

6) Install the SELinux module
$ semodule -i qemutxtimer.pp  # May later be uninstalled with -r

7) Update the guest XML to use the qemu-kvm wrapper
$ virsh edit $GUEST
Replace:
<emulator>/usr/libexec/qemu-kvm</emulator>
With:
<emulator>/usr/libexec/qemu-kvm.txtimer</emulator>

Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
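The argument rewriting the wrapper script performs can be illustrated outside the shell. This is a sketch of the same substitution the `sed` expression applies; `rewrite_args` is a hypothetical helper, not part of qemu-kvm or libvirt.

```python
def rewrite_args(argv):
    """Append ,tx=timer to every virtio-net-pci device argument, mirroring
    sed 's|virtio-net-pci|virtio-net-pci,tx=timer|g' from the wrapper."""
    return [a.replace("virtio-net-pci", "virtio-net-pci,tx=timer") for a in argv]

args = ["-device", "virtio-net-pci,netdev=hostnet0,id=net0", "-m", "2G"]
print(rewrite_args(args)[1])  # virtio-net-pci,tx=timer,netdev=hostnet0,id=net0
```

One design caveat: the real wrapper's `echo "$@" | sed ...` joins the arguments into a single string and the shell then re-splits it on whitespace, so any argument containing spaces would be mangled; the per-argument rewrite sketched here avoids that, but for libvirt-generated qemu-kvm command lines the two behave the same.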
Diffed Contents:
@@ -1,23 +1,16 @@
-The virtio network device has two available algorithms for transmitting packets. The default is to use an asynchronous bottom half transmitter which typically shows good performance for all workloads. The alternate implementation uses a timer to delay transmit in an attempt to batch multiple packets together. The timer approach typically results higher latency, but may improve overall efficiency. If a user wishes to change from the default algorithm to the timer based approach, it can be done by creating a wrapper script around qemu-kvm and specifying it as the emulator for guests requiring it. Instructions to do so follow:
-
-1) Create wrapper script
-
+virtio network device packet transmission algorithms
+The virtio network device has two available algorithms for transmitting packets. The default is to use an asynchronous bottom half transmitter which typically shows good performance for all workloads. The alternate implementation uses a timer to delay transmit in an attempt to batch multiple packets together. The timer approach typically results higher latency, but may improve overall efficiency. To change from the default algorithm to the timer based approach, use the following procedure to create a wrapper script around qemu-kvm and specify it as the emulator for guests that require it.
+create the wrapper script
 $ cat > /usr/libexec/qemu-kvm.txtimer << EOF
 #!/bin/sh
 exec /usr/libexec/qemu-kvm \`echo "\$@" | sed 's|virtio-net-pci|virtio-net-pci,tx=timer|g'\`
 EOF
-
-2) Make script executable
-
+Make script executable
 $ chmod 755 /usr/libexec/qemu-kvm.txtimer
-
-3) Set selinux permisions
-
+Set selinux permisions
 $ restorecon /usr/libexec/qemu-kvm.txtimer
-
-4) Create selinux module
-
+Create selinux module
 $ cat > qemutxtimer.te << EOF
 policy_module(qemutxtimer, 1.0)
@@ -28,23 +21,13 @@
 can_exec(virt_domain, qemu_exec_t)
 EOF
-
-5) Build selinux module
-
+Build selinux module
 $ make -f /usr/share/selinux/devel/Makefile
-
-6) Install selinux module
-
+Install selinux module
 $ semodule -i qemutxtimer.pp # May later be uninstalled with -r
-
-7) Update guest XML to use qemu-kvm wrapper
-
+Update guest XML to use qemu-kvm wrapper
 $ virsh edit $GUEST
-
 Replace:
-
 <emulator>/usr/libexec/qemu-kvm</emulator>
-
 With:
-
 <emulator>/usr/libexec/qemu-kvm.txtimer</emulator>

Compare UDP throughput between the regular userspace virtio-net under build qemu-kvm-0.12.1.2-2.113.el6 and build qemu-kvm-0.12.1.2-2.129.el6 by using the following steps:

1. Start the guest:
/usr/libexec/qemu-kvm -M rhel6.0.0 -enable-kvm -m 2G -smp 2,sockets=2,cores=1,threads=1 -name BugVerify -uuid b13c1ed4-0eca-3db9-b754-478768f50409 -monitor stdio -rtc base=utc -boot dc -drive file=/home/RHEL6.raw,if=none,id=drive-virtio-disk0,boot=on,format=raw,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01 -chardev socket,id=serial0,path=/tmp/ch,server,nowait -device isa-serial,chardev=serial0 -usb -vnc :4 -usb -vga cirrus -device virtio-balloon-pci,id=balloon0

2. 2 vcpus, pin each of them to a host cpu:
taskset -pc 0 $thread_id1
taskset -pc 1 $thread_id2

3. On the guest, pin the interrupt to the second vcpu:
echo $i > /proc/irq/$irqcur/smp_affinity

4.
run netperf from the guest to the external host:
# for b in 32 64 128 256 512 1024 2048 4096 8192 16834 32768 65507; do netperf -t UDP_STREAM -f m -H 192.162.0.3 -P 0 -l 10 -- -m $b; done

5. Record the average throughput.

guest ---> host

qemu-kvm-0.12.1.2-2.113.el6.x86_64:
Send Socket   Send Message   Elapsed   Throughput
Size (bytes)  Size (bytes)   Time (s)  (10^6 bits/sec)
124928        32             10        16.84
124928        64             10        32.99
124928        128            10        66.49
124928        256            10        130.84
124928        512            10        260.26
124928        1024           10        530.21
124928        2048           10        903.95
124928        4096           10        1643.02
124928        8192           10        2773.66
124928        16834          10        4168.64
124928        32768          10        5952.95
124928        65507          10        7471.95

qemu-kvm-0.12.1.2-2.113.el6_0.1.x86_64:
Send Socket   Send Message   Elapsed   Throughput
Size (bytes)  Size (bytes)   Time (s)  (10^6 bits/sec)
124928        32             10        53.35
124928        64             10        104.02
124928        128            10        201.34
124928        256            10        489.21
124928        512            10        719.36
124928        1024           10        1378.72
124928        2048           10        2599.38
124928        4096           10        5301.71
124928        8192           10        9361.36
124928        16834          10        11662.85
124928        32768          10        17007.65
124928        65507          10        20615.35

guest ---> external host

qemu-kvm-0.12.1.2-2.113.el6.x86_64:
Send Socket   Send Message   Elapsed   Throughput
Size (bytes)  Size (bytes)   Time (s)  (10^6 bits/sec)
124928        32             10        24.74
124928        64             10        49.76
124928        128            10        105.45
124928        256            10        171.90
124928        512            10        372.52
124928        1024           10        642.70
124928        2048           10        867.99
124928        4096           10        917.10
124928        8192           10        934.75
124928        16834          10        938.89
124928        32768          10        946.25
124928        65507          10        959.59

qemu-kvm-0.12.1.2-2.113.el6_0.1.x86_64:
Send Socket   Send Message   Elapsed   Throughput
Size (bytes)  Size (bytes)   Time (s)  (10^6 bits/sec)
124928        32             10        41.14
124928        64             10        99.13
124928        128            10        193.28
124928        256            10        344.12
124928        512            10        575.17
124928        1024           10        805.72
124928        2048           10        834.75
124928        4096           10        917.13
124928        8192           10        934.76
124928        16834          10        938.53
124928        32768          10        946.62
124928        65507          10        951.94

Summary: The bottom half transmitter improves the TX throughput in UDP_STREAM.
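Step 3 of the procedure above writes to /proc/irq/$irqcur/smp_affinity, which takes a hexadecimal CPU bitmask (bit N selects CPU N). A minimal sketch of computing that value; `affinity_mask` is an illustrative helper, not part of the test scripts.

```python
def affinity_mask(cpu_index):
    """Hex bitmask for /proc/irq/*/smp_affinity pinning to one CPU."""
    return format(1 << cpu_index, "x")

# Pin to the second vcpu (CPU 1), as in step 3 above:
print(affinity_mask(1))  # 2  -> echo 2 > /proc/irq/$irqcur/smp_affinity
```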
Testing on qemu-kvm-0.12.1.2-2.129.el6: when running netperf on the guest, the guest responds slowly; the CPU got stuck 2 times.

How reproducible: sometimes

Additional info: did not find this issue on qemu-kvm-0.12.1.2-2.113.el6

# /var/log/messages in the guest
Jan 12 01:13:17 localhost rtkit-daemon[1905]: The canary thread is apparently starving. Taking action.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoting known real-time threads.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoted 0 threads.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: The canary thread is apparently starving. Taking action.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoting known real-time threads.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoted 0 threads.
Jan 12 01:13:17 localhost kernel: BUG: soft lockup - CPU#1 stuck for 186s! [gnome-settings-:2009]
Jan 12 01:13:17 localhost kernel: Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Jan 12 01:13:17 localhost kernel: CPU 1:
Jan 12 01:13:17 localhost kernel: Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Jan 12 01:13:17 localhost kernel: Pid: 2009, comm: gnome-settings- Not tainted 2.6.32-71.el6.x86_64 #1 KVM
Jan 12 01:13:17 localhost kernel: RIP: 0010:[<ffffffff812679f3>] [<ffffffff812679f3>] iowrite16+0x23/0x50
Jan 12 01:13:17 localhost kernel: RSP: 0000:ffff880001f03700 EFLAGS: 00000206
Jan 12 01:13:17 localhost kernel: RAX: 0000000000000001 RBX: ffff880001f03700 RCX: 0000000000000001
Jan 12 01:13:17 localhost kernel: RDX: 000000000000c090 RSI: 000000000001c090 RDI: 0000000000000001
Jan 12 01:13:17 localhost kernel: RBP: ffffffff81013c93 R08: ffff88007be268c0 R09: ffff880037900430
Jan 12 01:13:17 localhost kernel: R10: ffff880001f034c0 R11: 0000000000000002 R12: ffff880001f03680
Jan 12 01:13:17 localhost kernel: R13: ffff88007aea9ec0 R14: ffff88007be268c0 R15: ffffffff814cf9fc
Jan 12 01:13:17 localhost kernel: FS: 00007ff3bba067a0(0000) GS:ffff880001f00000(0000) knlGS:0000000000000000
Jan 12 01:13:17 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 12 01:13:17 localhost kernel: CR2: 000000348a2662a0 CR3: 000000007b8b1000 CR4: 00000000000006e0
Jan 12 01:13:17 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 12 01:13:17 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 12 01:13:17 localhost kernel: Call Trace:
Jan 12 01:13:17 localhost kernel: <IRQ> [<ffffffffa0044115>] ? vp_notify+0x25/0x30 [virtio_pci]
Jan 12 01:13:17 localhost kernel: [<ffffffffa003d033>] ? virtqueue_kick+0x33/0x40 [virtio_ring]
Jan 12 01:13:17 localhost kernel: [<ffffffffa012461d>] ? start_xmit+0x5d/0x1d0 [virtio_net]
Jan 12 01:13:17 localhost kernel: [<ffffffff81410528>] ? dev_hard_start_xmit+0x2b8/0x370
Jan 12 01:13:17 localhost kernel: [<ffffffff814288aa>] ? sch_direct_xmit+0x15a/0x1c0
Jan 12 01:13:17 localhost kernel: [<ffffffff81413ab8>] ? dev_queue_xmit+0x378/0x4a0
Jan 12 01:13:17 localhost kernel: [<ffffffff8142798a>] ? eth_header+0x3a/0xe0
Jan 12 01:13:17 localhost kernel: [<ffffffff81418445>] ? neigh_resolve_output+0x105/0x370
Jan 12 01:13:17 localhost kernel: [<ffffffff81446b40>] ? ip_finish_output+0x0/0x310
Jan 12 01:13:17 localhost kernel: [<ffffffff81446c7c>] ? ip_finish_output+0x13c/0x310
Jan 12 01:13:17 localhost kernel: [<ffffffff81446f08>] ? ip_output+0xb8/0xc0
Jan 12 01:13:17 localhost kernel: [<ffffffff81445e5f>] ? __ip_local_out+0x9f/0xb0
Jan 12 01:13:17 localhost kernel: [<ffffffff81445e95>] ? ip_local_out+0x25/0x30
Jan 12 01:13:17 localhost kernel: [<ffffffff814466e0>] ? ip_queue_xmit+0x190/0x420
Jan 12 01:13:17 localhost kernel: [<ffffffff814064ca>] ? __alloc_skb+0x7a/0x180
Jan 12 01:13:17 localhost kernel: [<ffffffff8145b391>] ? tcp_transmit_skb+0x3f1/0x790
Jan 12 01:13:17 localhost kernel: [<ffffffff8105c490>] ? default_wake_function+0x0/0x20
Jan 12 01:13:17 localhost kernel: [<ffffffff8145cca9>] ? tcp_send_ack+0xd9/0x120
Jan 12 01:13:17 localhost kernel: [<ffffffff81455bf2>] ? tcp_fin+0xf2/0x1d0
Jan 12 01:13:17 localhost kernel: [<ffffffff81456c96>] ? tcp_data_queue+0x756/0xc80
Jan 12 01:13:17 localhost kernel: [<ffffffff8145a311>] ? tcp_rcv_state_process+0x6b1/0xa30
Jan 12 01:13:17 localhost kernel: [<ffffffff814616ae>] ? tcp_v4_do_rcv+0x1ce/0x430
Jan 12 01:13:17 localhost kernel: [<ffffffff81463050>] ? tcp_v4_rcv+0x5b0/0x7e0
Jan 12 01:13:17 localhost kernel: [<ffffffff8105fa72>] ? enqueue_entity+0x122/0x320
Jan 12 01:13:17 localhost kernel: [<ffffffff8144108d>] ? ip_local_deliver_finish+0xdd/0x2d0
Jan 12 01:13:17 localhost kernel: [<ffffffff81441318>] ? ip_local_deliver+0x98/0xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff814407dd>] ? ip_rcv_finish+0x12d/0x440
Jan 12 01:13:17 localhost kernel: [<ffffffff81440d65>] ? ip_rcv+0x275/0x350
Jan 12 01:13:17 localhost kernel: [<ffffffff8140f2cb>] ? netif_receive_skb+0x38b/0x670
Jan 12 01:13:17 localhost kernel: [<ffffffffa0125395>] ? virtnet_poll+0x5c5/0x8b0 [virtio_net]
Jan 12 01:13:17 localhost kernel: [<ffffffff8140fe53>] ? net_rx_action+0x103/0x210
Jan 12 01:13:17 localhost kernel: [<ffffffff81073bd7>] ? __do_softirq+0xb7/0x1e0
Jan 12 01:13:17 localhost kernel: [<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170
Jan 12 01:13:17 localhost kernel: [<ffffffff81073c34>] ? __do_softirq+0x114/0x1e0
Jan 12 01:13:17 localhost kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
Jan 12 01:13:17 localhost kernel: [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff810739d5>] ? irq_exit+0x85/0x90
Jan 12 01:13:17 localhost kernel: [<ffffffff814cf915>] ? do_IRQ+0x75/0xf0
Jan 12 01:13:17 localhost kernel: [<ffffffff81013ad3>] ? ret_from_intr+0x0/0x11
Jan 12 01:13:17 localhost kernel: <EOI>
Jan 12 01:13:17 localhost kernel: BUG: soft lockup - CPU#0 stuck for 186s! [scsi_eh_1:335]
Jan 12 01:13:17 localhost kernel: Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Jan 12 01:13:17 localhost kernel: CPU 0:
Jan 12 01:13:17 localhost kernel: Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Jan 12 01:13:17 localhost kernel: Pid: 335, comm: scsi_eh_1 Not tainted 2.6.32-71.el6.x86_64 #1 KVM
Jan 12 01:13:17 localhost kernel: RIP: 0010:[<ffffffff8107dde6>] [<ffffffff8107dde6>] run_timer_softirq+0x176/0x340
Jan 12 01:13:17 localhost kernel: RSP: 0018:ffff880001e03e50 EFLAGS: 00000246
Jan 12 01:13:17 localhost kernel: RAX: ffff880001e03e90 RBX: ffff880001e03ed0 RCX: 00000000000005c0
Jan 12 01:13:17 localhost kernel: RDX: ffff880001e03e90 RSI: dead000000200200 RDI: 0000000000000000
Jan 12 01:13:17 localhost kernel: RBP: ffffffff81013c93 R08: ffffffff81ada7a0 R09: 0000000000000000
Jan 12 01:13:17 localhost kernel: R10: 0000000000000000 R11: 0000000000000400 R12: ffff880001e03dd0
Jan 12 01:13:17 localhost kernel: R13: ffffffffa01f51e8 R14: ffffffff81ada1c0 R15: ffffffff814cf9fc
Jan 12 01:13:17 localhost kernel: FS: 0000000000000000(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000
Jan 12 01:13:17 localhost kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jan 12 01:13:17 localhost kernel: CR2: 000000348a339d50 CR3: 000000003772b000 CR4: 00000000000006f0
Jan 12 01:13:17 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 12 01:13:17 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 12 01:13:17 localhost kernel: Call Trace:
Jan 12 01:13:17 localhost kernel: <IRQ> [<ffffffff8102f52d>] ? lapic_next_event+0x1d/0x30
Jan 12 01:13:17 localhost kernel: [<ffffffff81073bd7>] ? __do_softirq+0xb7/0x1e0
Jan 12 01:13:17 localhost kernel: [<ffffffff81095a50>] ? hrtimer_interrupt+0x140/0x250
Jan 12 01:13:17 localhost kernel: [<ffffffff81366a80>] ? sata_sff_hardreset+0x0/0x80
Jan 12 01:13:17 localhost kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
Jan 12 01:13:17 localhost kernel: [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff810739d5>] ? irq_exit+0x85/0x90
Jan 12 01:13:17 localhost kernel: [<ffffffff814cfa01>] ? smp_apic_timer_interrupt+0x71/0x9c
Jan 12 01:13:17 localhost kernel: [<ffffffff813677e0>] ? ata_sff_softreset+0x0/0x180
Jan 12 01:13:17 localhost kernel: [<ffffffff81013c93>] ? apic_timer_interrupt+0x13/0x20
Jan 12 01:13:17 localhost kernel: <EOI> [<ffffffff814caa87>] ? _spin_unlock_irqrestore+0x17/0x20
Jan 12 01:13:17 localhost kernel: [<ffffffff81366700>] ? ata_sff_error_handler+0xd0/0x240
Jan 12 01:13:17 localhost kernel: [<ffffffff81363736>] ?
ata_scsi_error+0x656/0x910
Jan 12 01:13:17 localhost kernel: [<ffffffff81347846>] ? scsi_error_handler+0x126/0x630
Jan 12 01:13:17 localhost kernel: Clocksource tsc unstable (delta = -2728019381 ns)
Jan 12 01:13:17 localhost kernel: [<ffffffff8105c4a2>] ? default_wake_function+0x12/0x20
Jan 12 01:13:17 localhost kernel: [<ffffffff81347720>] ? scsi_error_handler+0x0/0x630
Jan 12 01:13:17 localhost kernel: [<ffffffff81091936>] ? kthread+0x96/0xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff810141ca>] ? child_rip+0xa/0x20
Jan 12 01:13:17 localhost kernel: [<ffffffff810918a0>] ? kthread+0x0/0xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20

bcao ---> alex: Does the above relate to this bug?

There are a couple of new options you can use to diagnose whether this is related to the new tx bottom half handler:

tx=timer|bh

"bh" is the new default and enables the bottom half handler; "timer" re-enables the old code. Example:

-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,tx=timer

If you find that the problem goes away with the timer, you can switch back to the "bh" handler and use this option:

x-txburst=

This controls how many packets can be sent on a single flush of the tx queue. The default is 256. Reducing this value ensures we don't hog too much time in the bottom half handler. Example:

-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,x-txburst=128

Which netperf test case was causing this? We really should not be able to hold off one of the guest CPUs for so long with the defaults. Does not pinning the interrupt in the guest make a difference?

(In reply to comment #29)
> There are a couple of new options you can use to diagnose whether this is
> related to the new tx bottom half handler:
>
> tx=timer|bh
>
> "bh" is the new default and enables the bottom half handler; "timer"
> re-enables the old code.
> Example:
>
> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,tx=timer

When running netperf and pinging www.redhat.com in the guest, a call trace occurred. (Tried many times; it only happened once, and the other times the guest worked fine with no stalls.)

# dmesg
BUG: soft lockup - CPU#0 stuck for 88s! [udisks-daemon:2106]
Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
CPU 0:
Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 2106, comm: udisks-daemon Not tainted 2.6.32-71.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff814caa87>] [<ffffffff814caa87>] _spin_unlock_irqrestore+0x17/0x20
RSP: 0018:ffff880001e03d00 EFLAGS: 00000282
RAX: 0000000000000000 RBX: ffff880001e03d00 RCX: 0000000000000000
RDX: ffffffff81c984e0 RSI: 0000000000000282 RDI: 0000000000000282
RBP: ffffffff81013c93 R08: ffff88007ba92000 R09: ffff880001e03ee0
R10: 0000000000000228 R11: 0000000000000282 R12: ffff880001e03c80
R13: 0000000000000000 R14: 0000000000000010 R15: ffffffff814cf9fc
FS: 00007f1b3ed067a0(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000348a277900 CR3: 000000007bb7e000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
<IRQ> [<ffffffff813ae6ee>] ?
i8042_interrupt+0x10e/0x3a0
[<ffffffff8103ccd8>] ? pvclock_clocksource_read+0x58/0xd0
[<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170
[<ffffffff810dae26>] ? handle_edge_irq+0xc6/0x160
[<ffffffff810670c1>] ? check_preempt_wakeup+0x41/0x3c0
[<ffffffff81015fb9>] ? handle_irq+0x49/0xa0
[<ffffffff814cf90c>] ? do_IRQ+0x6c/0xf0
[<ffffffff81013ad3>] ? ret_from_intr+0x0/0x11
[<ffffffff81073b8b>] ? __do_softirq+0x6b/0x1e0
[<ffffffff81095a50>] ? hrtimer_interrupt+0x140/0x250
[<ffffffff810142cc>] ? call_softirq+0x1c/0x30
[<ffffffff81015f35>] ? do_softirq+0x65/0xa0
[<ffffffff810739d5>] ? irq_exit+0x85/0x90
[<ffffffff814cfa01>] ? smp_apic_timer_interrupt+0x71/0x9c
[<ffffffff81013c93>] ? apic_timer_interrupt+0x13/0x20
<EOI> [<ffffffff814caa87>] ? _spin_unlock_irqrestore+0x17/0x20
[<ffffffff81341bcc>] ? scsi_dispatch_cmd+0x1bc/0x340
[<ffffffff81349535>] ? scsi_request_fn+0x425/0x5c0
[<ffffffff81239c2e>] ? elv_insert+0x4e/0x1f0
[<ffffffff8123fff2>] ? __generic_unplug_device+0x32/0x40
[<ffffffff8124640d>] ? blk_execute_rq_nowait+0x6d/0xc0
[<ffffffff812464e4>] ? blk_execute_rq+0x84/0xf0
[<ffffffff812423a5>] ? blk_get_request+0x75/0xa0
[<ffffffff8134ab5c>] ? scsi_execute+0xfc/0x160
[<ffffffff8134adb6>] ? scsi_execute_req+0xb6/0x190
[<ffffffffa005d2eb>] ? sr_test_unit_ready+0x6b/0xf0 [sr_mod]
[<ffffffff81156bb5>] ? kmem_cache_alloc_notrace+0x115/0x130
[<ffffffffa005e2bb>] ? sr_media_change+0xab/0x350 [sr_mod]
[<ffffffff8109205c>] ? remove_wait_queue+0x3c/0x50
[<ffffffff811810c6>] ? free_poll_entry+0x26/0x30
[<ffffffffa004c0b0>] ? media_changed+0x70/0xc0 [cdrom]
[<ffffffffa004c137>] ? cdrom_media_changed+0x37/0x40 [cdrom]
[<ffffffffa005d3d9>] ? sr_block_media_changed+0x19/0x20 [sr_mod]
[<ffffffff811a3553>] ? check_disk_change+0x33/0x80
[<ffffffffa004f308>] ? cdrom_open+0x248/0x600 [cdrom]
[<ffffffff8125afca>] ? kobject_get+0x1a/0x30
[<ffffffff81248a1d>] ? get_disk+0x7d/0xf0
[<ffffffff8125afca>] ? kobject_get+0x1a/0x30
[<ffffffffa005d599>] ? sr_block_open+0x79/0xd0 [sr_mod]
[<ffffffff8132a3c9>] ? get_device+0x19/0x20
[<ffffffff811a4393>] ? __blkdev_get+0xc3/0x3c0
[<ffffffff811a46b0>] ? blkdev_open+0x0/0xc0
[<ffffffff811a46a0>] ? blkdev_get+0x10/0x20
[<ffffffff811a4721>] ? blkdev_open+0x71/0xc0
[<ffffffff81169b80>] ? __dentry_open+0x110/0x370
[<ffffffff81208442>] ? selinux_inode_permission+0x72/0xb0
[<ffffffff8120051f>] ? security_inode_permission+0x1f/0x30
[<ffffffff81169ef7>] ? nameidata_to_filp+0x57/0x70
[<ffffffff8117d313>] ? do_filp_open+0x5f3/0xd40
[<ffffffff81186330>] ? iput+0x30/0x70
[<ffffffff8103ccd8>] ? pvclock_clocksource_read+0x58/0xd0
[<ffffffff8103ccd8>] ? pvclock_clocksource_read+0x58/0xd0
[<ffffffff811892f2>] ? alloc_fd+0x92/0x160
[<ffffffff81169929>] ? do_sys_open+0x69/0x140
[<ffffffff81014645>] ? math_state_restore+0x45/0x60
[<ffffffff81169a40>] ? sys_open+0x20/0x30
[<ffffffff81013172>] ? system_call_fastpath+0x16/0x1b
BUG: soft lockup - CPU#1 stuck for 109s! [swapper:0]
Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
CPU 1:
Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 0, comm: swapper Not tainted 2.6.32-71.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff8107dde6>] [<ffffffff8107dde6>] run_timer_softirq+0x176/0x340
RSP: 0018:ffff880001f03e50 EFLAGS: 00000246
RAX: ffff880001f03e90 RBX: ffff880001f03ed0 RCX: 0000000000000f00
RDX: ffff880001f03e90 RSI: dead000000200200 RDI: 0000000000000000
RBP: ffffffff81013c93 R08: ffff88007ccd8f20 R09: 0000000000000000
R10: 0000000000000000 R11: 000000000000000f R12: ffff880001f03dd0
R13: ffff880079491df0 R14: ffff88007ccd8000 R15: ffffffff814cf9fc
FS: 0000000000000000(0000) GS:ffff880001f00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000348a2da060 CR3: 000000007ba54000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
<IRQ> [<ffffffff81073bd7>] ? __do_softirq+0xb7/0x1e0
[<ffffffff81095a50>] ? hrtimer_interrupt+0x140/0x250
[<ffffffff810142cc>] ? call_softirq+0x1c/0x30
[<ffffffff81015f35>] ? do_softirq+0x65/0xa0
[<ffffffff810739d5>] ? irq_exit+0x85/0x90
[<ffffffff814cfa01>] ? smp_apic_timer_interrupt+0x71/0x9c
[<ffffffff81013c93>] ? apic_timer_interrupt+0x13/0x20
<EOI> [<ffffffff8103be8b>] ? native_safe_halt+0xb/0x10
[<ffffffff8101bd3d>] ? default_idle+0x3d/0x90
[<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
[<ffffffff814c17c8>] ? start_secondary+0x1fc/0x23f
ata2: lost interrupt (Status 0x58)
ata2: drained 6 bytes to clear DRQ.
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
sr 1:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
ata2.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
         res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
ata2.00: status: { DRDY }
ata2: soft resetting link
ata2.01: NODEV after polling detection
ata2.00: configured for MWDMA2
ata2: EH complete
ip_tables: (C) 2000-2006 Netfilter Core Team

> If you find that the problem goes away with the timer, you can switch back to
> the "bh" handler, and use this option:
>
> x-txburst=
>
> This controls how many packets can be sent on a single flush of the tx queue.
> The default is 256. Reducing this value ensures we don't hog too much time in
> the bottom half handler.
> Example:
>
> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,x-txburst=128

The guest always responds slowly when running netperf; sometimes the guest hangs, or even hits a CPU soft lockup.

> Which netperf test case was causing this? We really should not be able to hold
> off one of the guest CPUs for so long with the defaults. Does not pinning the
> interrupt in the guest make a difference?

netperf command:

for b in 32 64 128 256 512 1024 2048 4096 8192 16834 32768 65507; do ./netperf -t TCP_STREAM -f m -H 192.162.0.1 -P 0 -l 10 -- -m $b; done

I ran all of the tests in this comment without pinning the interrupt.

(In reply to comment #30)
> (In reply to comment #29)
> > There are a couple of new options you can use to diagnose whether this is
> > related to the new tx bottom half handler:
> >
> > tx=timer|bh
> >
> > "bh" is the new default and enables the bottom half handler; "timer"
> > re-enables the old code.

If the problem still occurs with tx=timer, then chances are good it's not related to this change. Perhaps you can bisect the revisions between 113 and 129 to figure out more precisely when the problem was introduced.

Also, can you verify the host CPU load when this softlockup happens? It looks like the vcpu is preempted and not run for quite some time.

(In reply to comment #32)
> Also, can you verify the host CPU load when this softlockup happens? It looks
> like the vcpu is preempted and not run for quite some time.

No softlockup on the host.

(In reply to comment #28)
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: The canary thread is apparently
> starving. Taking action.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoting known real-time threads.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoted 0 threads.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: The canary thread is apparently
> starving. Taking action.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoting known real-time threads.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoted 0 threads.
> Jan 12 01:13:17 localhost kernel: BUG: soft lockup - CPU#1 stuck for 186s!
...

What is rtkit doing here? Can you reproduce this without it? Letting rtkit set real-time priorities definitely seems like it could be contributing.

(In reply to comment #30)
...
> [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
> [<ffffffff814c17c8>] ? start_secondary+0x1fc/0x23f
> ata2: lost interrupt (Status 0x58)
> ata2: drained 6 bytes to clear DRQ.
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> sr 1:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
> ata2.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
>          res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
> ata2.00: status: { DRDY }
> ata2: soft resetting link
> ata2.01: NODEV after polling detection
> ata2.00: configured for MWDMA2
> ata2: EH complete
> ip_tables: (C) 2000-2006 Netfilter Core Team

This softlockup looks like it happened during boot, before even starting the netperf test. Can you confirm?

I've run well over 100GB of netperf tests through a guest on qemu-kvm 128. The guest definitely gets sluggish during small-packet UDP_STREAM guest->external tests, but I haven't seen any soft lockups. The txburst limiter we added during the development of this patch is specifically to prevent such things. I'm suspicious that something else is going on here.

> What is rtkit doing here? Can you reproduce this without it? Letting rtkit
> set real-time priorities definitely seems like it could be contributing.

I installed RHEL6 with the default GNOME desktop. Could you tell me how to stop it?

> This softlockup looks like it happened during boot, before even starting the
> netperf test. Can you confirm?
I am sure the softlockup happened after running netperf, because the report was collected by the abrtd service. But I cannot reproduce it with a newly installed RHEL6 image; I only find that the guest gets sluggish when netperf is running. Maybe something was wrong with the previous image.

(In reply to comment #37)
> > What is rtkit doing here? Can you reproduce this without it? Letting rtkit
> > set real-time priorities definitely seems like it could be contributing.
>
> I installed RHEL6 with the default GNOME desktop. Could you tell me how to stop it?

Looks like you can use 'rtkitctl -k' to have it exit. I don't know that it's related, but it's suspicious that it seems to be actively working in close proximity to the softlockups. It could also be triggered by the lack of progress that results from the softlockup.

> > This softlockup looks like it happened during boot, before even starting the
> > netperf test. Can you confirm?
>
> I am sure the softlockup happened after running netperf, because the report
> was collected by the abrtd service.

The part that draws my attention is that ip_tables is starting up after the softlockup. That module is typically loaded via the init scripts during bootup. Perhaps abrtd is reporting a softlockup that occurred earlier?

> But I cannot reproduce it with a newly installed RHEL6 image; I only find that
> the guest gets sluggish when netperf is running.
>
> Maybe something was wrong with the previous image.

That matches my experience. The guest gets sluggish, especially when netperf is using small packets, but there are no softlockups.

Based on comment #37 and comment #38, this issue has been fixed.

Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,33 +1,7 @@
-virtio network device packet transmission algorithms
-The virtio network device has two available algorithms for transmitting packets.
The default is to use an asynchronous bottom half transmitter, which typically shows good performance for all workloads. The alternate implementation uses a timer to delay transmit in an attempt to batch multiple packets together. The timer approach typically results in higher latency, but may improve overall efficiency. To change from the default algorithm to the timer-based approach, use the following procedure to create a wrapper script around qemu-kvm and specify it as the emulator for guests that require it.
-create the wrapper script
-$ cat > /usr/libexec/qemu-kvm.txtimer << EOF
-#!/bin/sh
-exec /usr/libexec/qemu-kvm \`echo "\$@" | sed 's|virtio-net-pci|virtio-net-pci,tx=timer|g'\`
-EOF
-Make script executable
-$ chmod 755 /usr/libexec/qemu-kvm.txtimer
-Set selinux permissions
-$ restorecon /usr/libexec/qemu-kvm.txtimer
-Create selinux module
-$ cat > qemutxtimer.te << EOF
-policy_module(qemutxtimer, 1.0)
+Cause: virtio-net used a packet transmission algorithm that relied on a timer to delay transmission in an attempt to batch multiple packets together.
-gen_require(\`
-  attribute virt_domain;
-  type qemu_exec_t;
-')
+Consequence: higher virtio-net transmission latency.
-can_exec(virt_domain, qemu_exec_t)
+Fix: the default algorithm has been changed to use an asynchronous bottom half transmitter.
-EOF
+
-Build selinux module
+Result: improved virtio-net transmission latency.
-$ make -f /usr/share/selinux/devel/Makefile
-Install selinux module
-$ semodule -i qemutxtimer.pp # May later be uninstalled with -r
-Update guest XML to use qemu-kvm wrapper
-$ virsh edit $GUEST
-Replace:
-  <emulator>/usr/libexec/qemu-kvm</emulator>
-With:
-  <emulator>/usr/libexec/qemu-kvm.txtimer</emulator>

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below.
You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html
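For reference, the two mechanisms discussed in this bug — selecting the transmit algorithm via the `-device virtio-net-pci` options, and the sed-based wrapper from the (since removed) technical note — can be sketched as a small shell exercise. The device strings and the sed rewrite follow the examples quoted above; the `build_device_arg` helper is a hypothetical name used only for this sketch, and it runs against sample argument strings rather than exec'ing a real qemu-kvm binary.

```shell
#!/bin/sh
# Sketch of the wrapper-script rewrite from the removed technical note:
# append ",tx=timer" to every virtio-net-pci device argument before the
# command line would reach qemu-kvm.  Exercised against a sample argument
# string instead of the real emulator.
rewrite_args() {
    echo "$@" | sed 's|virtio-net-pci|virtio-net-pci,tx=timer|g'
}

# Hypothetical helper that builds the -device argument for the two
# diagnostic modes discussed in this bug: the old timer algorithm, or the
# default bottom half handler with a reduced x-txburst cap.
build_device_arg() {
    case "$1" in
        timer) echo "virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,tx=timer" ;;
        bh)    echo "virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,x-txburst=$2" ;;
    esac
}

rewrite_args -device virtio-net-pci,netdev=hostnet0,id=net0
build_device_arg timer
build_device_arg bh 128
```

Because the sed pattern substitutes globally, a command line carrying several virtio-net-pci devices would get ",tx=timer" appended to each of them, which is what the technical note's wrapper relied on.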