Bug 624767

Summary: Replace virtio-net TX timer mitigation with bottom half handler
Product: Red Hat Enterprise Linux 6
Component: qemu-kvm
Version: 6.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Reporter: Alex Williamson <alex.williamson>
Assignee: Alex Williamson <alex.williamson>
QA Contact: Virtualization Bugs <virt-bugs>
CC: akong, bcao, chrisw, ddumas, ehabkost, Jes.Sorensen, jwest, lihuang, llim, michen, mjenner, mkenneth, mwagner, syeghiay, tburke, virt-maint
Target Milestone: rc
Keywords: ZStream
Fixed In Version: qemu-kvm-0.12.1.2-2.114.el6
Doc Type: Bug Fix
Doc Text:
Cause: Prior to this update, virtio-net used a packet transmission algorithm that relied on a timer to delay transmission in an attempt to batch multiple packets together.
Consequence: Higher virtio-net transmission latency.
Fix: The default algorithm has been changed to use an asynchronous bottom half transmitter.
Result: Improved virtio-net transmission latency.
Last Closed: 2011-05-19 11:22:12 UTC
Bug Blocks: 633963

Description Alex Williamson 2010-08-17 17:15:05 UTC
Description of problem:
Userspace virtio-net has a performance regression from RHEL5. As a fallback to vhost, we need to forward-port the RHEL5 TX mitigation support.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.110.el6.x86_64


Comment 1 Alex Williamson 2010-08-17 17:28:55 UTC
Initial port brew build; seems to work but still needs more thorough code review:

https://brewweb.devel.redhat.com/taskinfo?taskID=2686097

Comment 2 Alex Williamson 2010-08-17 19:46:39 UTC
Another brew, including bz624769 fix, plus an exit fix.  Matches posted patches:

https://brewweb.devel.redhat.com/taskinfo?taskID=2686547

Comment 4 Miya Chen 2010-08-19 09:21:59 UTC
Compared TCP throughput between regular userspace virtio-net and userspace virtio-net with the tx mitigation patch, using the following steps:
1. start guest by:
/usr/libexec/qemu-kvm -M rhel6.0.0 -enable-kvm -m 2G -smp 2 -uuid `uuidgen` -monitor stdio -rtc base=localtime -usbdevice tablet -drive file=/root/michen/rhel6-64-0811.2.qcow2,if=none,format=qcow2,werror=stop,rerror=stop,id=drive-ide,cache=none -device ide-drive,drive=drive-ide -netdev tap,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=20:20:20:56:90:90 -vnc :10
2. run netperf from guest to host and external host:
#  for b in 32 64 128 256 512 1024 2048 4096 8192 16834 32768 65507; do ./netperf -t TCP_STREAM -f m -H 192.162.0.1 -P 0 -l 10 -- -m $b; done
3. get the average throughput (one-liner below)
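
One way to get that average in a single pass (a sketch; with -P 0, the TCP_STREAM throughput is the fifth field of each result line):

#  for b in 32 64 128 256 512 1024 2048 4096 8192 16834 32768 65507; do ./netperf -t TCP_STREAM -f m -H 192.162.0.1 -P 0 -l 10 -- -m $b; done | awk '{ sum += $5; n++ } END { if (n) print "average throughput:", sum/n, "10^6bits/sec" }'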

Actual result:
1. guest --> host
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec
for qemu-kvm-0.12.1.2-2.110.el6.bz624767.2:
 87380  16384     32    10.00     220.06
 87380  16384     64    10.00     262.74
 87380  16384    128    10.00     440.72
 87380  16384    256    10.00     1037.03
 87380  16384    512    10.00     2167.89
 87380  16384   1024    10.00     4918.59
 87380  16384   2048    10.00     10538.86
 87380  16384   4096    10.00     12017.91
 87380  16384   8192    10.00     10681.91
 87380  16384  16834    10.00     11317.24
 87380  16384  32768    10.00     10014.85
 87380  16384  65507    10.00     10844.33
for qemu-kvm-0.12.1.2-2.110.el6:
 87380  16384     32    10.00     343.93   
 87380  16384     64    10.00     528.98   
 87380  16384    128    10.00     792.85   
 87380  16384    256    10.00    1042.84   
 87380  16384    512    10.00    2084.18   
 87380  16384   1024    10.00    3292.36   
 87380  16384   2048    10.00    3351.13   
 87380  16384   4096    10.00    4263.10   
 87380  16384   8192    10.00    4087.28   
 87380  16384  16834    10.00    3995.17   
 87380  16384  32768    10.00    3875.20   
 87380  16384  65507    10.00    4120.90  

When the message size is larger than 128 bytes, throughput shows a big jump.

2. guest --> external host
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec
qemu-kvm-0.12.1.2-2.110.el6.bz624767.2:
 87380  16384     32    10.00     226.60   
 87380  16384     64    10.00     320.50   
 87380  16384    128    10.02     868.02   
 87380  16384    256    10.02     860.57   
 87380  16384    512    10.02     794.40   
 87380  16384   1024    10.02     866.56   
 87380  16384   2048    10.02     824.94   
 87380  16384   4096    10.02     899.32   
 87380  16384   8192    10.02     871.77   
 87380  16384  16834    10.01     846.30   
 87380  16384  32768    10.02     875.60   
 87380  16384  65507    10.01     868.32   
qemu-kvm-0.12.1.2-2.110.el6:
 87380  16384     32    10.00     366.13   
 87380  16384     64    10.00     534.82   
 87380  16384    128    10.00     843.38   
 87380  16384    256    10.02     913.79   
 87380  16384    512    10.02     869.24   
 87380  16384   1024    10.02     851.47   
 87380  16384   2048    10.01     897.35   
 87380  16384   4096    10.02     854.27   
 87380  16384   8192    10.01     823.00   
 87380  16384  16834    10.02     895.84   
 87380  16384  32768    10.02     867.61   
 87380  16384  65507    10.02     861.71   

No obvious improvement.

michen-->Alex, could you help to check the above test result? thanks.

Comment 5 Mark Wagner 2010-08-19 11:23:31 UTC
Based on what we saw in RHEL5.4, I would expect the biggest differences to come with UDP and also when using multi-guest to multi-external drivers.  For simple TCP netperfs I would expect a slowdown at small message sizes.  I'm running tests on 10Gbit but it will take another day to get things completed.

Comment 6 Alex Williamson 2010-08-19 13:59:17 UTC
(In reply to comment #4)
> 
> michen-->Alex, could you help to check the above test result? thanks.

They seem plausible.  When doing guest<->host numbers it's important to use numactl to ensure host and guest are running with optimal locality.  Running on the wrong core, even within a single socket, can significantly skew numbers.  A 1Gbps link also really isn't that hard to saturate, and the difference comes with how much CPU is required to push the data.  I think probably the best division of testing would be for qe to do functionality testing, particularly migration and hotplug, while Mark's team tries to evaluate the performance.
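
For example, a minimal sketch of such pinning, assuming the host side of the test runs on node 0 and the vcpu thread IDs come from "info cpus" in the qemu monitor:

# start the guest bound to node 0 (qemu arguments elided)
numactl --cpunodebind=0 --membind=0 /usr/libexec/qemu-kvm ...
# or pin the vcpu threads of an already-running guest to cores in node 0
taskset -pc 0 $vcpu_thread1
taskset -pc 1 $vcpu_thread2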

Comment 8 Mark Wagner 2010-08-21 00:02:06 UTC
I'm seeing an issue with the NIC under test when using the new
userspace.  I'm seeing this in the /var/log/messages file on the
guest (and the netperf fails...)

I did not see this with the other userspace.  In fairness, it only just
showed up, so it didn't happen immediately either...

(*** guest ***)
[root@dhcp47-55 ~]# uname -a
Linux dhcp47-55.lab.bos.redhat.com 2.6.32-63.el6.x86_64 #1 SMP Wed Aug 11
00:38:12 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

Host
[root@perf22 ~]# uname -a
Linux perf22.lab.bos.redhat.com 2.6.32-63.el6.x86_64 #1 SMP Wed Aug 11 00:38:12
EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

[root@perf22 ~]# rpm -qa | grep kvm
qemu-kvm-0.12.1.2-2.110.el6.bz624767.2.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.110.el6.bz624767.2.x86_64
qemu-kvm-tools-0.12.1.2-2.110.el6.bz624767.2.x86_64

[root@perf22 ~]# ps -ef | grep test1
qemu      2700     1 86 08:38 ?        09:41:24 /usr/libexec/qemu-kvm -S -M
rhel6.0.0 -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -name test1
-uuid df4bee91-9961-801a-957d-d3da87bb68b7 -nodefconfig -nodefaults -chardev
socket,id=monitor,path=/var/lib/libvirt/qemu/test1.monitor,server,nowait -mon
chardev=monitor,mode=control -rtc base=utc -boot c -drive
file=/var/lib/libvirt/images/test1.img,if=none,id=drive-virtio-disk0,boot=on,format=raw,cache=none
-device
virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
-netdev tap,fd=20,id=hostnet0 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:60:4b:64,bus=pci.0,addr=0x3
-netdev tap,fd=21,id=hostnet1 -device
virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:e7:11:59,bus=pci.0,addr=0x6
-netdev tap,fd=22,id=hostnet2 -device
virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:8c:a3:92,bus=pci.0,addr=0x7
-chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -device
usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
root     32245 32186  0 19:52 pts/11   00:00:00 grep test1

[root@perf22 ~]# brctl show
bridge name bridge id  STP enabled interfaces
br0  8000.000000000000 no  
br2  8000.003048c5a686 no  eth2
       vnet11
       vnet14
       vnet17
       vnet2
       vnet20
       vnet23
       vnet5
       vnet8
br3  8000.003048c5a687 no  eth3
       vnet0
       vnet12
       vnet15
       vnet18
       vnet21
       vnet3
       vnet6
       vnet9
br4  8000.001b213cabfc no  eth4
       vnet1
       vnet10
       vnet13
       vnet16
       vnet19
       vnet22
       vnet4
       vnet7
virbr0  8000.000000000000 yes  

[root@perf22 ~]# lsmod | grep vhost
[root@perf22 ~]# 



(There are more messages as well)

Aug 20 16:20:03 dhcp47-55 kernel: BUG: soft lockup - CPU#0 stuck for 171s!
[ksoftirqd/0:4]
Aug 20 16:20:03 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror
dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache
jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix
dm_mod [last unloaded: microcode]
Aug 20 16:20:03 dhcp47-55 kernel: CPU 0:
Aug 20 16:20:03 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror
dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache
jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix
dm_mod [last unloaded: microcode]
Aug 20 16:20:03 dhcp47-55 kernel: Pid: 4, comm: ksoftirqd/0 Not tainted
2.6.32-63.el6.x86_64 #1 KVM
Aug 20 16:20:03 dhcp47-55 kernel: RIP: 0010:[<ffffffff814cb1e7>] 
[<ffffffff814cb1e7>] _spin_unlock_irqrestore+0x17/0x20
Aug 20 16:20:03 dhcp47-55 kernel: RSP: 0018:ffff880001e03df8  EFLAGS: 00000282
Aug 20 16:20:03 dhcp47-55 kernel: RAX: 0000000000000001 RBX: ffff880001e03df8
RCX: 000000000000c020
Aug 20 16:20:03 dhcp47-55 kernel: RDX: ffff880037626220 RSI: 0000000000000282
RDI: 0000000000000282
Aug 20 16:20:03 dhcp47-55 kernel: RBP: ffffffff81013c93 R08: 000000000000080a
R09: 0000000000000001
Aug 20 16:20:03 dhcp47-55 kernel: R10: ffff8800376262b8 R11: 0000000000000002
R12: ffff880001e03d70
Aug 20 16:20:03 dhcp47-55 kernel: R13: ffff880037626000 R14: ffff880037626198
R15: ffffffff814d015c
Aug 20 16:20:03 dhcp47-55 kernel: FS:  0000000000000000(0000)
GS:ffff880001e00000(0000) knlGS:0000000000000000
Aug 20 16:20:03 dhcp47-55 kernel: CS:  0010 DS: 0018 ES: 0018 CR0:
000000008005003b
Aug 20 16:20:03 dhcp47-55 kernel: CR2: 000000391f2a67f0 CR3: 00000000799dd000
CR4: 00000000000006f0
Aug 20 16:20:03 dhcp47-55 kernel: DR0: 0000000000000000 DR1: 0000000000000000
DR2: 0000000000000000
Aug 20 16:20:03 dhcp47-55 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0
DR7: 0000000000000400
Aug 20 16:20:03 dhcp47-55 kernel: Call Trace:
Aug 20 16:20:03 dhcp47-55 kernel: <IRQ>  [<ffffffff813aa06e>] ?
uhci_hub_status_data+0x6e/0x200
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81385400>] ? rh_timer_func+0x0/0x10
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81384b3f>] ?
usb_hcd_poll_rh_status+0x5f/0x180
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81385400>] ? rh_timer_func+0x0/0x10
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff8138540e>] ? rh_timer_func+0xe/0x10
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff8107da37>] ?
run_timer_softirq+0x197/0x340
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81073807>] ?
__do_softirq+0xb7/0x1e0
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
Aug 20 16:20:03 dhcp47-55 kernel: <EOI>  [<ffffffff81015f35>] ?
do_softirq+0x65/0xa0
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81073380>] ? ksoftirqd+0x80/0x110
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81073300>] ? ksoftirqd+0x0/0x110
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810915d6>] ? kthread+0x96/0xa0
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810141ca>] ? child_rip+0xa/0x20
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff81091540>] ? kthread+0x0/0xa0
Aug 20 16:20:03 dhcp47-55 kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20
Aug 20 16:25:21 dhcp47-55 dhclient: DHCPREQUEST on eth0 to 10.16.46.224 port 67
Aug 20 16:25:21 dhcp47-55 dhclient: DHCPACK from 10.16.46.224
Aug 20 16:25:21 dhcp47-55 dhclient: bound to 10.16.47.55 -- renewal in 835
seconds.
Aug 20 16:35:53 dhcp47-55 kernel: BUG: soft lockup - CPU#0 stuck for 83s!
[events/0:9]
Aug 20 16:35:53 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror
dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache
jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix
dm_mod [last unloaded: microcode]
Aug 20 16:35:53 dhcp47-55 kernel: CPU 0:
Aug 20 16:35:53 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror
dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache
jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix
dm_mod [last unloaded: microcode]
Aug 20 16:35:53 dhcp47-55 kernel: Pid: 9, comm: events/0 Not tainted
2.6.32-63.el6.x86_64 #1 KVM
Aug 20 16:35:53 dhcp47-55 kernel: RIP: 0010:[<ffffffff81059a02>] 
[<ffffffff81059a02>] finish_task_switch+0x52/0xd0
Aug 20 16:35:53 dhcp47-55 kernel: RSP: 0018:ffff88007cd07d40  EFLAGS: 00000246
Aug 20 16:35:53 dhcp47-55 kernel: RAX: ffff880001e16980 RBX: ffff88007cd07d70
RCX: ffff8800795c9800
Aug 20 16:35:53 dhcp47-55 kernel: RDX: 0000000000000000 RSI: ffff8800794b7520
RDI: ffff88007cd05560
Aug 20 16:35:53 dhcp47-55 kernel: RBP: ffffffff81013c8e R08: ffff88007cd06000
R09: 0000000000000000
Aug 20 16:35:53 dhcp47-55 kernel: R10: 0000000000000000 R11: 0000000000000001
R12: 0000000000000000
Aug 20 16:35:53 dhcp47-55 kernel: R13: 0000000000000000 R14: 0000000000000000
R15: 0000000000000800
Aug 20 16:35:53 dhcp47-55 kernel: FS:  0000000000000000(0000)
GS:ffff880001e00000(0000) knlGS:0000000000000000
Aug 20 16:35:53 dhcp47-55 kernel: CS:  0010 DS: 0018 ES: 0018 CR0:
000000008005003b
Aug 20 16:35:53 dhcp47-55 kernel: CR2: 000000391f2a67f0 CR3: 00000000798f9000
CR4: 00000000000006f0
Aug 20 16:35:53 dhcp47-55 kernel: DR0: 0000000000000000 DR1: 0000000000000000
DR2: 0000000000000000
Aug 20 16:35:53 dhcp47-55 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0
DR7: 0000000000000400
Aug 20 16:35:53 dhcp47-55 kernel: Call Trace:
Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff814c89e6>] ?
thread_return+0x4e/0x778
Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff8112df50>] ?
vmstat_update_periodic+0x0/0x50
Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff8108c33c>] ?
worker_thread+0x1fc/0x2a0
Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff81091940>] ?
autoremove_wake_function+0x0/0x40
Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff8108c140>] ?
worker_thread+0x0/0x2a0
Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff810915d6>] ? kthread+0x96/0xa0
Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff810141ca>] ? child_rip+0xa/0x20
Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff81091540>] ? kthread+0x0/0xa0
Aug 20 16:35:53 dhcp47-55 kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20
Aug 20 16:37:23 dhcp47-55 kernel: BUG: soft lockup - CPU#0 stuck for 83s!
[swapper:0]
Aug 20 16:37:23 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror
dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache
jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix
dm_mod [last unloaded: microcode]
Aug 20 16:37:23 dhcp47-55 kernel: CPU 0:
Aug 20 16:37:23 dhcp47-55 kernel: Modules linked in: ipv6 dm_mirror
dm_region_hash dm_log virtio_net virtio_balloon i2c_piix4 i2c_core ext4 mbcache
jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix
dm_mod [last unloaded: microcode]
Aug 20 16:37:23 dhcp47-55 kernel: Pid: 0, comm: swapper Not tainted
2.6.32-63.el6.x86_64 #1 KVM
Aug 20 16:37:23 dhcp47-55 kernel: RIP: 0010:[<ffffffff8107da16>] 
[<ffffffff8107da16>] run_timer_softirq+0x176/0x340
Aug 20 16:37:23 dhcp47-55 kernel: RSP: 0018:ffff880001e03e50  EFLAGS: 00000246
Aug 20 16:37:23 dhcp47-55 kernel: RAX: ffff880001e03e90 RBX: ffff880001e03ed0
RCX: 00000000000000d0
Aug 20 16:37:23 dhcp47-55 kernel: RDX: ffff880001e03e90 RSI: dead000000200200
RDI: 0000000000000000
Aug 20 16:37:23 dhcp47-55 kernel: RBP: ffffffff81013c93 R08: ffffffff81ad9230
R09: 0000000000000000
Aug 20 16:37:23 dhcp47-55 kernel: R10: 0000000000000000 R11: 0000000000000000
R12: ffff880001e03dd0
Aug 20 16:37:23 dhcp47-55 kernel: R13: ffff8800792bee18 R14: ffffffff81ad9140
R15: ffffffff814d015c
Aug 20 16:37:23 dhcp47-55 kernel: FS:  0000000000000000(0000)
GS:ffff880001e00000(0000) knlGS:0000000000000000
Aug 20 16:37:23 dhcp47-55 kernel: CS:  0010 DS: 0018 ES: 0018 CR0:
000000008005003b
Aug 20 16:37:23 dhcp47-55 kernel: CR2: 000000391f2a67f0 CR3: 0000000079f24000
CR4: 00000000000006f0
Aug 20 16:37:23 dhcp47-55 kernel: DR0: 0000000000000000 DR1: 0000000000000000
DR2: 0000000000000000
Aug 20 16:37:23 dhcp47-55 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0
DR7: 0000000000000400
Aug 20 16:37:23 dhcp47-55 kernel: Call Trace:
Aug 20 16:37:23 dhcp47-55 kernel: <IRQ>  [<ffffffff8102f52d>] ?
lapic_next_event+0x1d/0x30
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81073807>] ?
__do_softirq+0xb7/0x1e0
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff810956f0>] ?
hrtimer_interrupt+0x140/0x250
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81073605>] ? irq_exit+0x85/0x90
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff814d0161>] ?
smp_apic_timer_interrupt+0x71/0x9c
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81013c93>] ?
apic_timer_interrupt+0x13/0x20
Aug 20 16:37:23 dhcp47-55 kernel: <EOI>  [<ffffffff8103baab>] ?
native_safe_halt+0xb/0x10
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff8101bd3d>] ? default_idle+0x3d/0x90
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff814b118b>] ? rest_init+0x6b/0x80
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff818c1e9c>] ?
start_kernel+0x403/0x40f
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff818c133a>] ?
x86_64_start_reservations+0x125/0x129
Aug 20 16:37:23 dhcp47-55 kernel: [<ffffffff818c1438>] ?
x86_64_start_kernel+0xfa/0x109

Comment 11 Mike Cao 2010-08-25 07:21:47 UTC
Tried the steps in comment #8.  Cannot reproduce it.

host info:
# lspci |grep net
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)
# uname -a
Linux dhcp-91-71.nay.redhat.com 2.6.32-66.el6.x86_64 #1 SMP Wed Aug 18 00:24:41 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
# rpm -qa|grep qemu-kvm
qemu-kvm-tools-0.12.1.2-2.110.el6.bz624767.2.x86_64
qemu-kvm-0.12.1.2-2.110.el6.bz624767.2.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.110.el6.bz624767.2.x86_64

Guest info:
# uname -a
Linux dhcp-91-136.nay.redhat.com 2.6.32-66.el6.x86_64 #1 SMP Wed Aug 18 00:24:41 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

Steps:
1. start a VM with virtio-net by using virt-manager.
#ps -eaf|grep qemu-kvm
/usr/libexec/qemu-kvm -S -M rhel6.0.0 -enable-kvm -m 4096 -smp 2,sockets=2,cores=1,threads=1 -name rhel10 -uuid 30ec7b3f-a312-db85-6d0d-ce3e71452656 -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/rhel10.monitor,server,nowait -mon chardev=monitor,mode=control -rtc base=utc -boot c -drive file=/home/rhel10,if=none,id=drive-virtio-disk0,boot=on,format=raw,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,fd=20,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:b1:65:e8,bus=pci.0,addr=0x3 -netdev tap,fd=21,id=hostnet1 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:a5:da:e1,bus=pci.0,addr=0x7 -netdev tap,fd=22,id=hostnet2 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:9a:dd:be,bus=pci.0,addr=0x8 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -vnc 127.0.0.1:0 -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

#brctl show
bridge name     bridge id               STP enabled     interfaces
breth0          8000.00219bffb80c       no              eth0
                                                        vnet0
                                                        vnet1
                                                        vnet2

2. check /var/log/messages
3. run netperf


Actual Results:
after step 2, can NOT find any soft lockup in messages.
after step 3, netperf runs successfully.

Comment 13 Eduardo Habkost 2010-09-09 21:29:12 UTC
Increasing priority as we probably want it on day0.

Comment 18 Alex Williamson 2010-09-14 15:42:26 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
The virtio network device has two available algorithms for transmitting packets.  The default is to use an asynchronous bottom half transmitter which typically shows good performance for all workloads.  The alternate implementation uses a timer to delay transmit in an attempt to batch multiple packets together.  The timer approach typically results in higher latency, but may improve overall efficiency.  If a user wishes to change from the default algorithm to the timer-based approach, it can be done by creating a wrapper script around qemu-kvm and specifying it as the emulator for guests requiring it.  Instructions to do so follow:

1) Create wrapper script

$ cat > /usr/libexec/qemu-kvm.txtimer << EOF
#!/bin/sh
exec /usr/libexec/qemu-kvm \`echo "\$@" | sed 's|virtio-net-pci|virtio-net-pci,tx=timer|g'\`
EOF

2) Make script executable

$ chmod 755 /usr/libexec/qemu-kvm.txtimer

3) Set selinux permissions

$ restorecon /usr/libexec/qemu-kvm.txtimer

4) Create selinux module

$ cat > qemutxtimer.te << EOF
policy_module(qemutxtimer, 1.0)

gen_require(\`
 attribute virt_domain;
 type qemu_exec_t;
')

can_exec(virt_domain, qemu_exec_t)
EOF

5) Build selinux module

$ make -f /usr/share/selinux/devel/Makefile

6) Install selinux module

$ semodule -i qemutxtimer.pp # May later be uninstalled with -r

7) Update guest XML to use qemu-kvm wrapper

$ virsh edit $GUEST

Replace:

    <emulator>/usr/libexec/qemu-kvm</emulator>

With:

    <emulator>/usr/libexec/qemu-kvm.txtimer</emulator>
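
After restarting the guest, a quick sanity check (a sketch; it simply confirms the option appears on the running qemu-kvm command line):

$ ps -ef | grep qemu-kvm | grep tx=timer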

Comment 23 Ryan Lerch 2010-09-29 04:59:29 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,23 +1,16 @@
-The virtio network device has two available algorithms for transmitting packets.  The default is to use an asynchronous bottom half transmitter which typically shows good performance for all workloads.  The alternate implementation uses a timer to delay transmit in an attempt to batch multiple packets together.  The timer approach typically results higher latency, but may improve overall efficiency.  If a user wishes to change from the default algorithm to the timer based approach, it can be done by creating a wrapper script around qemu-kvm and specifying it as the emulator for guests requiring it.  Instructions to do so follow:
-
-1) Create wrapper script
-
+virtio network device packet transmission algorithms
+The virtio network device has two available algorithms for transmitting packets. The default is to use an asynchronous bottom half transmitter which typically shows good performance for all workloads. The alternate implementation uses a timer to delay transmit in an attempt to batch multiple packets together. The timer approach typically results higher latency, but may improve overall efficiency. To change from the default algorithm to the timer based approach, use the following procedure to create a wrapper script around qemu-kvm and specify it as the emulator for guests that require it.
+create the wrapper script
 $ cat > /usr/libexec/qemu-kvm.txtimer << EOF
 #!/bin/sh
 exec /usr/libexec/qemu-kvm \`echo "\$@" | sed
 's|virtio-net-pci|virtio-net-pci,tx=timer|g'\`
 EOF
-
-2) Make script executable
-
+Make script executable
 $ chmod 755 /usr/libexec/qemu-kvm.txtimer
-
-3) Set selinux permisions
-
+Set selinux permisions
 $ restorecon /usr/libexec/qemu-kvm.txtimer
-
-4) Create selinux module
-
+Create selinux module
 $ cat > qemutxtimer.te << EOF
 policy_module(qemutxtimer, 1.0)
 
@@ -28,23 +21,13 @@
 
 can_exec(virt_domain, qemu_exec_t)
 EOF
-
-5) Build selinux module
-
+Build selinux module
 $ make -f /usr/share/selinux/devel/Makefile
-
-6) Install selinux module
-
+Install selinux module
 $ semodule -i qemutxtimer.pp # May later be uninstalled with -r
-
-7) Update guest XML to use qemu-kvm wrapper
-
+Update guest XML to use qemu-kvm wrapper
 $ virsh edit $GUEST
-
 Replace:
-
     <emulator>/usr/libexec/qemu-kvm</emulator>
-
 With:
-
     <emulator>/usr/libexec/qemu-kvm.txtimer</emulator>

Comment 26 Mike Cao 2011-01-12 07:11:16 UTC
Compared UDP throughput of regular userspace virtio-net between build
qemu-kvm-0.12.1.2-2.113.el6 and build
qemu-kvm-0.12.1.2-2.129.el6, using the following steps:
1. start guest by:
/usr/libexec/qemu-kvm -M rhel6.0.0 -enable-kvm -m 2G -smp 2,sockets=2,cores=1,threads=1 -name BugVerify -uuid b13c1ed4-0eca-3db9-b754-478768f50409 -monitor stdio -rtc base=utc -boot dc -drive file=/home/RHEL6.raw,if=none,id=drive-virtio-disk0,boot=on,format=raw,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01 -chardev socket,id=serial0,path=/tmp/ch,server,nowait -device isa-serial,chardev=serial0 -usb -vnc :4 -usb -vga cirrus -device virtio-balloon-pci,id=balloon0
2. 2 vcpus, pin each of them to a host cpu (expanded in the sketch after these steps).
   taskset -pc 0 $thread_id1
   taskset -pc 1 $thread_id2
3. on guest, pin the interrupt to the second vcpu.
   echo $i > /proc/irq/$irqcur/smp_affinity 
4. run netperf from guest to external host:
#  for b in 32 64 128 256 512 1024 2048 4096 8192 16834 32768 65507; do
netperf -t UDP_STREAM -f m -H 192.162.0.3 -P 0 -l 10 -- -m $b; done
5. get the average throughput
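
For reference, steps 2 and 3 expand to roughly the following (a sketch; the vcpu thread IDs come from "info cpus" in the qemu monitor, and $irqcur is the virtio-net IRQ number from /proc/interrupts in the guest):

# on the host: pin the two vcpu threads to host cpus 0 and 1
taskset -pc 0 $thread_id1
taskset -pc 1 $thread_id2
# in the guest: route the virtio-net interrupt to the second vcpu (cpu1 -> mask 2)
grep virtio /proc/interrupts
echo 2 > /proc/irq/$irqcur/smp_affinity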

guest ---> host
qemu-kvm-0.12.1.2-2.113.el6.x86_64:
Send    Send    Elapsed  Throughput
Socket  Message Time
Size    Size
(bytes) (bytes) (secs) (10^6 bits/sec)
124928  32      10     16.84
124928  64      10     32.99
124928  128     10     66.49
124928  256     10     130.84
124928  512     10     260.26
124928  1024    10     530.21
124928  2048    10     903.95
124928  4096    10     1643.02
124928  8192    10     2773.66
124928  16834   10     4168.64
124928  32768   10     5952.95
124928  65507   10     7471.95

qemu-kvm-0.12.1.2-2.113.el6_0.1.x86_64:
Send    Send    Elapsed  Throughput  Speedup
Socket  Message Time              
Size    Size
(bytes) (bytes) (secs)(10^6bps) 
124928  32      10     53.35
124928  64      10     104.02
124928  128     10     201.34
124928  256     10     489.21
124928  512     10     719.36
124928  1024    10     1378.72
124928  2048    10     2599.38
124928  4096    10     5301.71
124928  8192    10     9361.36
124928  16834   10     11662.85
124928  32768   10     17007.65
124928  65507   10     20615.35

guest ---> external host
qemu-kvm-0.12.1.2-2.113.el6.x86_64:
Send    Send    Elapsed  Throughput
Socket  Message Time
Size    Size
(bytes) (bytes) (secs) (10^6 bits/sec)
124928  32      10     24.74
124928  64      10     49.76
124928  128     10     105.45
124928  256     10     171.90
124928  512     10     372.52
124928  1024    10     642.70
124928  2048    10     867.99
124928  4096    10     917.10
124928  8192    10     934.75
124928  16834   10     938.89
124928  32768   10     946.25
124928  65507   10     959.59

qemu-kvm-0.12.1.2-2.113.el6_0.1.x86_64:
Send    Send    Elapsed  Throughput  Speedup
Socket  Message Time              
Size    Size
(bytes) (bytes) (secs)(10^6bps) 
124928  32      10     41.14
124928  64      10     99.13
124928  128     10     193.28
124928  256     10     344.12
124928  512     10     575.17
124928  1024    10     805.72
124928  2048    10     834.75
124928  4096    10     917.13
124928  8192    10     934.76
124928  16834   10     938.53
124928  32768   10     946.62
124928  65507   10     951.94

Summary:
  The bottom half transmitter improves the tx throughput in UDP_STREAM.

Comment 28 Mike Cao 2011-01-12 07:31:22 UTC
Testing on qemu-kvm-0.12.1.2-2.129.el6:
When running netperf on the guest, the guest responds slowly; the cpu got stuck 2 times.

How reproducible:
sometimes

Additional info:
did not find this issue on qemu-kvm-0.12.1.2-2.113.el6

# /var/log/messages in the guest
Jan 12 01:13:17 localhost rtkit-daemon[1905]: The canary thread is apparently starving. Taking action.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoting known real-time threads.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoted 0 threads.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: The canary thread is apparently st
arving. Taking action.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoting known real-time threads.
Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoted 0 threads.
Jan 12 01:13:17 localhost kernel: BUG: soft lockup - CPU#1 stuck for 186s! [gnom
e-settings-:2009]
Jan 12 01:13:17 localhost kernel: Modules linked in: fuse autofs4 sunrpc ip6t_RE
JECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_
mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon v
irtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pc
i virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speed
step_lib]
Jan 12 01:13:17 localhost kernel: CPU 1:
Jan 12 01:13:17 localhost kernel: Modules linked in: fuse autofs4 sunrpc ip6t_RE
JECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_
mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon v
irtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pc
i virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speed
step_lib]
Jan 12 01:13:17 localhost kernel: Pid: 2009, comm: gnome-settings- Not tainted 2
.6.32-71.el6.x86_64 #1 KVM
Jan 12 01:13:17 localhost kernel: RIP: 0010:[<ffffffff812679f3>]  [<ffffffff8126
79f3>] iowrite16+0x23/0x50
Jan 12 01:13:17 localhost kernel: RSP: 0000:ffff880001f03700  EFLAGS: 00000206
Jan 12 01:13:17 localhost kernel: RAX: 0000000000000001 RBX: ffff880001f03700 RC
X: 0000000000000001
Jan 12 01:13:17 localhost kernel: RDX: 000000000000c090 RSI: 000000000001c090 RD
I: 0000000000000001
Jan 12 01:13:17 localhost kernel: RBP: ffffffff81013c93 R08: ffff88007be268c0 R0
9: ffff880037900430
Jan 12 01:13:17 localhost kernel: R10: ffff880001f034c0 R11: 0000000000000002 R1
2: ffff880001f03680
Jan 12 01:13:17 localhost kernel: R13: ffff88007aea9ec0 R14: ffff88007be268c0 R1
5: ffffffff814cf9fc
Jan 12 01:13:17 localhost kernel: FS:  00007ff3bba067a0(0000) GS:ffff880001f0000
0(0000) knlGS:0000000000000000
Jan 12 01:13:17 localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050
033
Jan 12 01:13:17 localhost kernel: CR2: 000000348a2662a0 CR3: 000000007b8b1000 CR
4: 00000000000006e0
Jan 12 01:13:17 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR
2: 0000000000000000
Jan 12 01:13:17 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR
7: 0000000000000400
Jan 12 01:13:17 localhost kernel: Call Trace:
Jan 12 01:13:17 localhost kernel: <IRQ>  [<ffffffffa0044115>] ? vp_notify+0x25/0
x30 [virtio_pci]
Jan 12 01:13:17 localhost kernel: [<ffffffffa003d033>] ? virtqueue_kick+0x33/0x4
0 [virtio_ring]
Jan 12 01:13:17 localhost kernel: [<ffffffffa012461d>] ? start_xmit+0x5d/0x1d0 [
virtio_net]
Jan 12 01:13:17 localhost kernel: [<ffffffff81410528>] ? dev_hard_start_xmit+0x2
b8/0x370
Jan 12 01:13:17 localhost kernel: [<ffffffff814288aa>] ? sch_direct_xmit+0x15a/0
x1c0
Jan 12 01:13:17 localhost kernel: [<ffffffff81413ab8>] ? dev_queue_xmit+0x378/0x
4a0
Jan 12 01:13:17 localhost kernel: [<ffffffff8142798a>] ? eth_header+0x3a/0xe0
Jan 12 01:13:17 localhost kernel: [<ffffffff81418445>] ? neigh_resolve_output+0x
105/0x370
Jan 12 01:13:17 localhost kernel: [<ffffffff81446b40>] ? ip_finish_output+0x0/0x
310
Jan 12 01:13:17 localhost kernel: [<ffffffff81446c7c>] ? ip_finish_output+0x13c/
0x310
Jan 12 01:13:17 localhost kernel: [<ffffffff81446f08>] ? ip_output+0xb8/0xc0
Jan 12 01:13:17 localhost kernel: [<ffffffff81445e5f>] ? __ip_local_out+0x9f/0xb
0
Jan 12 01:13:17 localhost kernel: [<ffffffff81445e95>] ? ip_local_out+0x25/0x30
Jan 12 01:13:17 localhost kernel: [<ffffffff814466e0>] ? ip_queue_xmit+0x190/0x4
20
Jan 12 01:13:17 localhost kernel: [<ffffffff814064ca>] ? __alloc_skb+0x7a/0x180
Jan 12 01:13:17 localhost kernel: [<ffffffff8145b391>] ? tcp_transmit_skb+0x3f1/
0x790
Jan 12 01:13:17 localhost kernel: [<ffffffff8105c490>] ? default_wake_function+0
x0/0x20
Jan 12 01:13:17 localhost kernel: [<ffffffff8145cca9>] ? tcp_send_ack+0xd9/0x120
Jan 12 01:13:17 localhost kernel: [<ffffffff81455bf2>] ? tcp_fin+0xf2/0x1d0
Jan 12 01:13:17 localhost kernel: [<ffffffff81456c96>] ? tcp_data_queue+0x756/0x
c80
Jan 12 01:13:17 localhost kernel: [<ffffffff8145a311>] ? tcp_rcv_state_process+0
x6b1/0xa30
Jan 12 01:13:17 localhost kernel: [<ffffffff814616ae>] ? tcp_v4_do_rcv+0x1ce/0x4
30
Jan 12 01:13:17 localhost kernel: [<ffffffff81463050>] ? tcp_v4_rcv+0x5b0/0x7e0
Jan 12 01:13:17 localhost kernel: [<ffffffff8105fa72>] ? enqueue_entity+0x122/0x
320
Jan 12 01:13:17 localhost kernel: [<ffffffff8144108d>] ? ip_local_deliver_finish
+0xdd/0x2d0
Jan 12 01:13:17 localhost kernel: [<ffffffff81441318>] ? ip_local_deliver+0x98/0
xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff814407dd>] ? ip_rcv_finish+0x12d/0x4
40
Jan 12 01:13:17 localhost kernel: [<ffffffff81440d65>] ? ip_rcv+0x275/0x350
Jan 12 01:13:17 localhost kernel: [<ffffffff8140f2cb>] ? netif_receive_skb+0x38b
/0x670
Jan 12 01:13:17 localhost kernel: [<ffffffffa0125395>] ? virtnet_poll+0x5c5/0x8b
0 [virtio_net]
Jan 12 01:13:17 localhost kernel: [<ffffffff8140fe53>] ? net_rx_action+0x103/0x2
10
Jan 12 01:13:17 localhost kernel: [<ffffffff81073bd7>] ? __do_softirq+0xb7/0x1e0
Jan 12 01:13:17 localhost kernel: [<ffffffff810d8740>] ? handle_IRQ_event+0x60/0
x170
Jan 12 01:13:17 localhost kernel: [<ffffffff81073c34>] ? __do_softirq+0x114/0x1e
0
Jan 12 01:13:17 localhost kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
Jan 12 01:13:17 localhost kernel: [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff810739d5>] ? irq_exit+0x85/0x90
Jan 12 01:13:17 localhost kernel: [<ffffffff814cf915>] ? do_IRQ+0x75/0xf0
Jan 12 01:13:17 localhost kernel: [<ffffffff81013ad3>] ? ret_from_intr+0x0/0x11
Jan 12 01:13:17 localhost kernel: <EOI> 
Jan 12 01:13:17 localhost kernel: BUG: soft lockup - CPU#0 stuck for 186s! [scsi
_eh_1:335]
Jan 12 01:13:17 localhost kernel: Modules linked in: fuse autofs4 sunrpc ip6t_RE
JECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_
mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon v
irtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pc
i virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speed
step_lib]
Jan 12 01:13:17 localhost kernel: CPU 0:
Jan 12 01:13:17 localhost kernel: Modules linked in: fuse autofs4 sunrpc ip6t_RE
JECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_
mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon v
irtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pc
i virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speed
step_lib]
Jan 12 01:13:17 localhost kernel: Pid: 335, comm: scsi_eh_1 Not tainted 2.6.32-7
1.el6.x86_64 #1 KVM
Jan 12 01:13:17 localhost kernel: RIP: 0010:[<ffffffff8107dde6>]  [<ffffffff8107
dde6>] run_timer_softirq+0x176/0x340
Jan 12 01:13:17 localhost kernel: RSP: 0018:ffff880001e03e50  EFLAGS: 00000246
Jan 12 01:13:17 localhost kernel: RAX: ffff880001e03e90 RBX: ffff880001e03ed0 RC
X: 00000000000005c0
Jan 12 01:13:17 localhost kernel: RDX: ffff880001e03e90 RSI: dead000000200200 RD
I: 0000000000000000
Jan 12 01:13:17 localhost kernel: RBP: ffffffff81013c93 R08: ffffffff81ada7a0 R0
9: 0000000000000000
Jan 12 01:13:17 localhost kernel: R10: 0000000000000000 R11: 0000000000000400 R1
2: ffff880001e03dd0
Jan 12 01:13:17 localhost kernel: R13: ffffffffa01f51e8 R14: ffffffff81ada1c0 R1
5: ffffffff814cf9fc
Jan 12 01:13:17 localhost kernel: FS:  0000000000000000(0000) GS:ffff880001e0000
0(0000) knlGS:0000000000000000
Jan 12 01:13:17 localhost kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 0000000080050
03b
Jan 12 01:13:17 localhost kernel: CR2: 000000348a339d50 CR3: 000000003772b000 CR
4: 00000000000006f0
Jan 12 01:13:17 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR
2: 0000000000000000
Jan 12 01:13:17 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR
7: 0000000000000400
Jan 12 01:13:17 localhost kernel: Call Trace:
Jan 12 01:13:17 localhost kernel: <IRQ>  [<ffffffff8102f52d>] ? lapic_next_event
+0x1d/0x30
Jan 12 01:13:17 localhost kernel: [<ffffffff81073bd7>] ? __do_softirq+0xb7/0x1e0
Jan 12 01:13:17 localhost kernel: [<ffffffff81095a50>] ? hrtimer_interrupt+0x140
/0x250
Jan 12 01:13:17 localhost kernel: [<ffffffff81366a80>] ? sata_sff_hardreset+0x0/
0x80
Jan 12 01:13:17 localhost kernel: [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
Jan 12 01:13:17 localhost kernel: [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff810739d5>] ? irq_exit+0x85/0x90
Jan 12 01:13:17 localhost kernel: [<ffffffff814cfa01>] ? smp_apic_timer_interrup
t+0x71/0x9c
Jan 12 01:13:17 localhost kernel: [<ffffffff813677e0>] ? ata_sff_softreset+0x0/0
x180
Jan 12 01:13:17 localhost kernel: [<ffffffff81013c93>] ? apic_timer_interrupt+0x
13/0x20
Jan 12 01:13:17 localhost kernel: <EOI>  [<ffffffff814caa87>] ? _spin_unlock_irq
restore+0x17/0x20
Jan 12 01:13:17 localhost kernel: [<ffffffff81366700>] ? ata_sff_error_handler+0
xd0/0x240
Jan 12 01:13:17 localhost kernel: [<ffffffff81363736>] ? ata_scsi_error+0x656/0x
910
Jan 12 01:13:17 localhost kernel: [<ffffffff81347846>] ? scsi_error_handler+0x12
6/0x630
Jan 12 01:13:17 localhost kernel: Clocksource tsc unstable (delta = -2728019381 
ns)
Jan 12 01:13:17 localhost kernel: [<ffffffff8105c4a2>] ? default_wake_function+0
x12/0x20
Jan 12 01:13:17 localhost kernel: [<ffffffff81347720>] ? scsi_error_handler+0x0/
0x630
Jan 12 01:13:17 localhost kernel: [<ffffffff81091936>] ? kthread+0x96/0xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff810141ca>] ? child_rip+0xa/0x20
Jan 12 01:13:17 localhost kernel: [<ffffffff810918a0>] ? kthread+0x0/0xa0
Jan 12 01:13:17 localhost kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20


bcao--->alex
Is the above related to this bug?

Comment 29 Alex Williamson 2011-01-12 16:14:11 UTC
There are a couple new options you can use to try to diagnose if this is related to the new tx bottom half handler:

tx=timer|bh

"bh" is the new default, and enables the bottom half handler, "timer" re-enables the old code.  example:

-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,tx=timer

If you find that the problem goes away with the timer, you can switch back to the "bh" handler, and use this option:

x-txburst=

This controls how many packets can be sent on a single flush of the tx queue.  The default is 256.  Reducing this value ensures we don't hog too much time in the bottom half handler.  example:

-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,x-txburst=128
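
For guests managed by libvirt, where these device options aren't exposed in the domain XML, the same wrapper-script approach from comment 18 can inject them, e.g. (a sketch; substitute whichever option is under test):

$ cat > /usr/libexec/qemu-kvm.txburst << EOF
#!/bin/sh
exec /usr/libexec/qemu-kvm \`echo "\$@" | sed 's|virtio-net-pci|virtio-net-pci,x-txburst=128|g'\`
EOF
$ chmod 755 /usr/libexec/qemu-kvm.txburst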

Which netperf test case was causing this?  We really should not be able to hold off one of the guest CPUs for so long with the defaults.  Does not pinning the interrupt in the guest make a difference?

Comment 30 Mike Cao 2011-01-13 07:45:00 UTC
(In reply to comment #29)
> There are a couple new options you can use to try to diagnose if this is
> related to the new tx bottom half handler:
> 
> tx=timer|bh
> 
> "bh" is the new default, and enables the bottom half handler, "timer"
> re-enables the old code.  example:
> 
> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,tx=timer

When running netperf and pinging www.redhat.com in the guest, a call trace occurred (tried lots of times; it only happened once, the other times the guest worked fine with no stalls).

# dmesg
BUG: soft lockup - CPU#0 stuck for 88s! [udisks-daemon:2106]
Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
CPU 0:
Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 2106, comm: udisks-daemon Not tainted 2.6.32-71.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff814caa87>]  [<ffffffff814caa87>] _spin_unlock_irqrestore+0x17/0x20
RSP: 0018:ffff880001e03d00  EFLAGS: 00000282
RAX: 0000000000000000 RBX: ffff880001e03d00 RCX: 0000000000000000
RDX: ffffffff81c984e0 RSI: 0000000000000282 RDI: 0000000000000282
RBP: ffffffff81013c93 R08: ffff88007ba92000 R09: ffff880001e03ee0
R10: 0000000000000228 R11: 0000000000000282 R12: ffff880001e03c80
R13: 0000000000000000 R14: 0000000000000010 R15: ffffffff814cf9fc
FS:  00007f1b3ed067a0(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000348a277900 CR3: 000000007bb7e000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>  [<ffffffff813ae6ee>] ? i8042_interrupt+0x10e/0x3a0
 [<ffffffff8103ccd8>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810dae26>] ? handle_edge_irq+0xc6/0x160
 [<ffffffff810670c1>] ? check_preempt_wakeup+0x41/0x3c0
 [<ffffffff81015fb9>] ? handle_irq+0x49/0xa0
 [<ffffffff814cf90c>] ? do_IRQ+0x6c/0xf0
 [<ffffffff81013ad3>] ? ret_from_intr+0x0/0x11
 [<ffffffff81073b8b>] ? __do_softirq+0x6b/0x1e0
 [<ffffffff81095a50>] ? hrtimer_interrupt+0x140/0x250
 [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
 [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
 [<ffffffff810739d5>] ? irq_exit+0x85/0x90
 [<ffffffff814cfa01>] ? smp_apic_timer_interrupt+0x71/0x9c
 [<ffffffff81013c93>] ? apic_timer_interrupt+0x13/0x20
 <EOI>  [<ffffffff814caa87>] ? _spin_unlock_irqrestore+0x17/0x20
 [<ffffffff81341bcc>] ? scsi_dispatch_cmd+0x1bc/0x340
 [<ffffffff81349535>] ? scsi_request_fn+0x425/0x5c0
 [<ffffffff81239c2e>] ? elv_insert+0x4e/0x1f0
 [<ffffffff8123fff2>] ? __generic_unplug_device+0x32/0x40
 [<ffffffff8124640d>] ? blk_execute_rq_nowait+0x6d/0xc0
 [<ffffffff812464e4>] ? blk_execute_rq+0x84/0xf0
 [<ffffffff812423a5>] ? blk_get_request+0x75/0xa0
 [<ffffffff8134ab5c>] ? scsi_execute+0xfc/0x160
 [<ffffffff8134adb6>] ? scsi_execute_req+0xb6/0x190
 [<ffffffffa005d2eb>] ? sr_test_unit_ready+0x6b/0xf0 [sr_mod]
 [<ffffffff81156bb5>] ? kmem_cache_alloc_notrace+0x115/0x130
 [<ffffffffa005e2bb>] ? sr_media_change+0xab/0x350 [sr_mod]
 [<ffffffff8109205c>] ? remove_wait_queue+0x3c/0x50
 [<ffffffff811810c6>] ? free_poll_entry+0x26/0x30
 [<ffffffffa004c0b0>] ? media_changed+0x70/0xc0 [cdrom]
 [<ffffffffa004c137>] ? cdrom_media_changed+0x37/0x40 [cdrom]
 [<ffffffffa005d3d9>] ? sr_block_media_changed+0x19/0x20 [sr_mod]
 [<ffffffff811a3553>] ? check_disk_change+0x33/0x80
 [<ffffffffa004f308>] ? cdrom_open+0x248/0x600 [cdrom]
 [<ffffffff8125afca>] ? kobject_get+0x1a/0x30
 [<ffffffff81248a1d>] ? get_disk+0x7d/0xf0
 [<ffffffff8125afca>] ? kobject_get+0x1a/0x30
 [<ffffffffa005d599>] ? sr_block_open+0x79/0xd0 [sr_mod]
 [<ffffffff8132a3c9>] ? get_device+0x19/0x20
 [<ffffffff811a4393>] ? __blkdev_get+0xc3/0x3c0
 [<ffffffff811a46b0>] ? blkdev_open+0x0/0xc0
 [<ffffffff811a46a0>] ? blkdev_get+0x10/0x20
 [<ffffffff811a4721>] ? blkdev_open+0x71/0xc0
 [<ffffffff81169b80>] ? __dentry_open+0x110/0x370
 [<ffffffff81208442>] ? selinux_inode_permission+0x72/0xb0
 [<ffffffff8120051f>] ? security_inode_permission+0x1f/0x30
 [<ffffffff81169ef7>] ? nameidata_to_filp+0x57/0x70
 [<ffffffff8117d313>] ? do_filp_open+0x5f3/0xd40
 [<ffffffff81186330>] ? iput+0x30/0x70
 [<ffffffff8103ccd8>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff8103ccd8>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff811892f2>] ? alloc_fd+0x92/0x160
 [<ffffffff81169929>] ? do_sys_open+0x69/0x140
 [<ffffffff81014645>] ? math_state_restore+0x45/0x60
 [<ffffffff81169a40>] ? sys_open+0x20/0x30
 [<ffffffff81013172>] ? system_call_fastpath+0x16/0x1b
BUG: soft lockup - CPU#1 stuck for 109s! [swapper:0]
Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
CPU 1:
Modules linked in: fuse autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 0, comm: swapper Not tainted 2.6.32-71.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff8107dde6>]  [<ffffffff8107dde6>] run_timer_softirq+0x176/0x340
RSP: 0018:ffff880001f03e50  EFLAGS: 00000246
RAX: ffff880001f03e90 RBX: ffff880001f03ed0 RCX: 0000000000000f00
RDX: ffff880001f03e90 RSI: dead000000200200 RDI: 0000000000000000
RBP: ffffffff81013c93 R08: ffff88007ccd8f20 R09: 0000000000000000
R10: 0000000000000000 R11: 000000000000000f R12: ffff880001f03dd0
R13: ffff880079491df0 R14: ffff88007ccd8000 R15: ffffffff814cf9fc
FS:  0000000000000000(0000) GS:ffff880001f00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000348a2da060 CR3: 000000007ba54000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>  [<ffffffff81073bd7>] ? __do_softirq+0xb7/0x1e0
 [<ffffffff81095a50>] ? hrtimer_interrupt+0x140/0x250
 [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
 [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
 [<ffffffff810739d5>] ? irq_exit+0x85/0x90
 [<ffffffff814cfa01>] ? smp_apic_timer_interrupt+0x71/0x9c
 [<ffffffff81013c93>] ? apic_timer_interrupt+0x13/0x20
 <EOI>  [<ffffffff8103be8b>] ? native_safe_halt+0xb/0x10
 [<ffffffff8101bd3d>] ? default_idle+0x3d/0x90
 [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
 [<ffffffff814c17c8>] ? start_secondary+0x1fc/0x23f
ata2: lost interrupt (Status 0x58)
ata2: drained 6 bytes to clear DRQ.
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
sr 1:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
ata2.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
         res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
ata2.00: status: { DRDY }
ata2: soft resetting link
ata2.01: NODEV after polling detection
ata2.00: configured for MWDMA2
ata2: EH complete
ip_tables: (C) 2000-2006 Netfilter Core Team


> 
> If you find that the problem goes away with the timer, you can switch back to
> the "bh" handler, and use this option:
> 
> x-txburst=
> 
> This controls how many packets can be sent on a single flush of the tx queue. 
> The default is 256.  Reducing this value ensures we don't hog too much time in
> the bottom half handler.  example:
> 
> -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,x-txburst=128
> 

The guest always responds slowly when running netperf; sometimes the guest hangs, and sometimes the cpu even gets soft-stuck.

> Which netperf test case was causing this?  We really should not be able to hold
> off one of the guest CPUs for so long with the defaults.  Does not pinning the
> interrupt in the guest make a difference?

netperf command:
for b in 32 64 128 256 512 1024 2048 4096 8192 16834 32768 65507; do
./netperf -t TCP_STREAM -f m -H 192.162.0.1 -P 0 -l 10 -- -m $b; done

I tried all the testing without pinning interrupt in this comment.

Comment 31 Alex Williamson 2011-01-13 16:13:58 UTC
(In reply to comment #30)
> (In reply to comment #29)
> > There are a couple new options you can use to try to diagnose if this is
> > related to the new tx bottom half handler:
> > 
> > tx=timer|bh
> > 
> > "bh" is the new default, and enables the bottom half handler, "timer"
> > re-enables the old code.  example:
> > 
> > -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:01:00:00:01,tx=timer
> 
> When running netperf and ping www.redhat.com in the guest call trace
> occured(Tried lots of times,only happened 1 times ,the other times guest works
> fine ,no stalls occurs)

If the problem still occurs with tx=timer, then chances are good it's not related to this change.  Perhaps you can bisect the revisions between 113 and 129 to figure out more precisely when the problem was introduced.

Comment 32 Chris Wright 2011-01-13 16:27:35 UTC
Also, can you verify the host cpu load when this softlockup happens?  It looks like the vcpu is preempted and not run for quite some time.

Comment 33 Mike Cao 2011-01-14 02:22:06 UTC
(In reply to comment #32)
> Also, can you verify the host cpu load when this softlockup happens?  It looks
> like the vcpu is preempted and not run for quite some time.

no softlockup on the host.

Comment 36 Alex Williamson 2011-01-14 18:07:53 UTC
(In reply to comment #28)
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: The canary thread is apparently
> starving. Taking action.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoting known real-time threads.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoted 0 threads.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: The canary thread is apparently
> st
> arving. Taking action.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoting known real-time threads.
> Jan 12 01:13:17 localhost rtkit-daemon[1905]: Demoted 0 threads.
> Jan 12 01:13:17 localhost kernel: BUG: soft lockup - CPU#1 stuck for 186s!
...

What is rtkit doing here?  Can you reproduce this without it?  Letting rtkit set real time priorities definitely seems like it could be contributing.

(In reply to comment #30)
...
>  [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
>  [<ffffffff814c17c8>] ? start_secondary+0x1fc/0x23f
> ata2: lost interrupt (Status 0x58)
> ata2: drained 6 bytes to clear DRQ.
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> sr 1:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
> ata2.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
>          res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
> ata2.00: status: { DRDY }
> ata2: soft resetting link
> ata2.01: NODEV after polling detection
> ata2.00: configured for MWDMA2
> ata2: EH complete
> ip_tables: (C) 2000-2006 Netfilter Core Team

This softlockup looks like it happened during boot, before even starting the netperf test.  Can you confirm?

I've run well over 100GB of netperf tests through a guest on qemu-kvm 128.  The guest definitely gets sluggish during small packet UDP_STREAM guest->external tests, but I haven't seen any soft lockups.  The txburst limiter we added during the development of this patch is specifically to prevent such things.  I'm suspicious that something else is going on here.

Comment 37 Mike Cao 2011-01-19 07:41:10 UTC
> What is rtkit doing here?  Can you reproduce this without it?  Letting rtkit
> set real time priorities definitely seems like it could be contributing.
> 

I installed RHEL6 default with gnome.  Could you tell me how to stop it?


> This softlockup looks like it happened during boot, before even starting the
> netperf test.  Can you confirm?

I am sure the softlockup happened after running netperf, because the softlockup was collected by the abrtd service.
But I cannot reproduce it with a newly installed rhel6 image; I only find that the guest gets sluggish when netperf is running.

Maybe something was wrong with the previous image.

Comment 38 Alex Williamson 2011-01-19 16:21:07 UTC
(In reply to comment #37)
> > What is rtkit doing here?  Can you reproduce this without it?  Letting rtkit
> > set real time priorities definitely seems like it could be contributing.
> > 
> 
> I install RHEL6 default with gnome ,Could you tell me how to stop it ?

Looks like you can use 'rtkitctl -k' to have it exit.  I don't know that it's related, but it's suspicious that it seems to be actively working in close proximity to the softlockups.  It could also be triggered by the lack of progress as a result of the softlockup.

> > This softlockup looks like it happened during boot, before even starting the
> > netperf test.  Can you confirm?
> 
> I am sure the softlockup was happened after running nerperf because the softup
> was collected by abrtd service .

The part that draws my attention is that ip_tables is starting up after the softlockup.  That module is typically loaded via the init scripts during bootup.  Perhaps abrtd is reporting a softlockup that occurred earlier?

> But I can not reproduce it with a newly installed rhel6 image.only find guest
> gets sluggish when netperf is running.
> 
> Maybe something wrong with the previous image .

That matches my experience.  The guest gets sluggish, especially when netperf is using small packets, but there are no softlockups.

Comment 39 Mike Cao 2011-01-30 15:12:23 UTC
Based on comment #37 & comment #38, this issue has been fixed

Comment 40 Eduardo Habkost 2011-05-05 16:30:03 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,33 +1,7 @@
-virtio network device packet transmission algorithms
-The virtio network device has two available algorithms for transmitting packets. The default is to use an asynchronous bottom half transmitter which typically shows good performance for all workloads. The alternate implementation uses a timer to delay transmit in an attempt to batch multiple packets together. The timer approach typically results higher latency, but may improve overall efficiency. To change from the default algorithm to the timer based approach, use the following procedure to create a wrapper script around qemu-kvm and specify it as the emulator for guests that require it.
-create the wrapper script
-$ cat > /usr/libexec/qemu-kvm.txtimer << EOF
-#!/bin/sh
-exec /usr/libexec/qemu-kvm \`echo "\$@" | sed
-'s|virtio-net-pci|virtio-net-pci,tx=timer|g'\`
-EOF
-Make script executable
-$ chmod 755 /usr/libexec/qemu-kvm.txtimer
-Set selinux permisions
-$ restorecon /usr/libexec/qemu-kvm.txtimer
-Create selinux module
-$ cat > qemutxtimer.te << EOF
-policy_module(qemutxtimer, 1.0)
+Cause: update, virtio-net used a packet transmission algorithm that was using a timer to delay a transmission in an attempt to batch multiple packets together
 
-gen_require(\`
- attribute virt_domain;
- type qemu_exec_t;
-')
+Consequence: higher virtio-net transmission latency.
 
-can_exec(virt_domain, qemu_exec_t)
+Fix: the default algorithm has been changed to use an asynchronous bottom half transmitter
-EOF
+
-Build selinux module
+Result: improved virtio-net transmission latency.-$ make -f /usr/share/selinux/devel/Makefile
-Install selinux module
-$ semodule -i qemutxtimer.pp # May later be uninstalled with -r
-Update guest XML to use qemu-kvm wrapper
-$ virsh edit $GUEST
-Replace:
-    <emulator>/usr/libexec/qemu-kvm</emulator>
-With:
-    <emulator>/usr/libexec/qemu-kvm.txtimer</emulator>

Comment 41 errata-xmlrpc 2011-05-19 11:22:12 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html

Comment 42 errata-xmlrpc 2011-05-19 12:47:27 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html