Summary: Lost the network in a KVM VM on top of 5.4
Product: Red Hat Enterprise Linux 5
Reporter: Herbert Xu <herbert.xu>
Component: kernel
Assignee: Herbert Xu <herbert.xu>
Status: CLOSED ERRATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Version: 5.4
CC: bruno.cornec, cward, david.jericho, herbert.xu, jean-marc.andre, khong, llim, markmc, mwagner, nsprei, orenault, riek, syeghiay, tburke, todayyang, virt-maint, ykaul
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
: 589766 589897 (view as bug list)
Environment:
Last Closed: 2010-03-30 07:15:57 UTC
Type: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 524651
Bug Blocks: 528898, 589766, 589897
Description Herbert Xu 2010-01-10 11:26:38 UTC
+++ This bug was initially created as a clone of Bug #524651 +++

Created an attachment (id=361965) SOSreport Hyperviseur

Description of problem:
HP is trying their application (OCMP) within a 5.4 VM using KVM. The application uses UDP as the network protocol. They are using virtio for the network driver. Unfortunately, after a while (e.g. 10 min) when the network is heavily stressed, the network drops. There are no logs available within the VM or the hypervisor. If you try to restart the network, it will fail. The only way to get the network back is to remove the module (virtio_net) and reload it.

Version-Release number of selected component (if applicable):
The hypervisor is RHEL 5.4, fully updated. The VM is RHEL 5.3 (we have tried both the 5.4 and 5.3 kernels and got the same behaviour).

How reproducible:
Always

Steps to Reproduce:
1. Start the VM
2. Stress test the VM and wait
3.

Actual results:
The network stops.

Expected results:
The network should be able to carry on.

Additional info:
I have grabbed an sosreport for the hypervisor and the VM. The hypervisor is tmobilehv1, the VM is tmobileocmp1.

--- Additional comment from firstname.lastname@example.org on 2009-09-21 11:49:17 EDT ---

Created an attachment (id=361966) SOSreport VM

--- Additional comment from email@example.com on 2009-09-21 12:19:21 EDT ---

Here's the similar report from upstream: http://firstname.lastname@example.org/msg06774.html

The closest I got to figuring it out was here: http://email@example.com/msg07006.html

Does doing this in the guest fix the issue too?

$> ip link set eth0 down
$> ip link set eth0 up

--- Additional comment from firstname.lastname@example.org on 2009-09-22 10:08:40 EDT ---

We tried /etc/init.d/network restart without effect. However, rmmod virtio_net and then restarting the network works. tcpdump in the guest shows nothing. We also have overruns on the interface. We will try your command to see what happens when the driver is hung.
It really seems the virtio_net driver is completely out of order at that moment. We are trying to reproduce the issue in a less application-dependent context. If you want us to try a debug kernel or a newer one, let us know.

--- Additional comment from email@example.com on 2009-09-22 13:18:03 EDT ---

We moved the virtual machines to another server and installed a 4-port Intel Gigabit network card (e1000e module). Still using the virtio network driver, global network performance seems really better and network bandwidth more stable. That was not the case with the previous setup using the bnx2x module (a burst of packets one second and almost nothing the next). But we still lose network connectivity with the VM, even though the load was far greater and the test lasted almost an hour this time.

On the failed VM, the behavior is also different: it is not possible to ping the VM, but multicast and broadcast packets are received. I also tried to ping an external server. I received "Destination Host unreachable" for some time and then a very strange message: "ping: sendmsg: No buffer space available"

The suggested commands did not restore network connectivity:

$> ip link set eth0 down
$> ip link set eth0 up

--- Additional comment from firstname.lastname@example.org on 2009-09-29 13:22:08 EDT ---

Created an attachment (id=363042) strace ping command

The result of an strace command when a VM's network hangs. If I increase the values in /proc/sys/net/core/wmem_*, the 'No buffer space available' message disappears for some time (the time it takes the buffer to fill up again, I guess) and then comes back.

--- Additional comment from email@example.com on 2009-10-05 09:26:24 EDT ---

Some questions:

- How is networking configured in the host? e.g. in the sosreport, I don't see anything bridging the guest to the 10.3.248.0/21 network
- How is the guest launched? e.g. 'virsh dumpxml $guest' and the contents of /var/log/libvirt/qemu/$guest.log
- Have you confirmed this only happens with virtio, e.g. have you tried model=e1000?
- The "No buffer space" message is from running ping in the guest? I *think* that can be ignored as merely a symptom of the virtio interface not sending any packets
- /proc/net/snmp in the host and guest might be interesting. As might 'tc -s qdisc' in the host

--- Additional comment from firstname.lastname@example.org on 2009-10-05 09:45:02 EDT ---

It would be useful to strace qemu to see if we're hitting the limit on the tun socket. Thanks!

--- Additional comment from email@example.com on 2009-10-05 11:42:11 EDT ---

(In reply to comment #6)
> - How is networking configured in the host? e.g. in the sosreport, I don't
> see anything bridging the guest to the 10.3.248.0/21 network

ocmp1 is a bridge. eth7 is connected to that bridge and to 10.3.248.0/21. The guest is also connected to that bridge.

> - How is the guest launched? e.g. 'virsh dumpxml $guest' and the contents
> of /var/log/libvirt/qemu/$guest.log

[root@tmobilehv ~]# virsh dumpxml OCMP1
<domain type='kvm'>
  <name>OCMP1</name>
  <uuid>85fda6b8-3f79-f403-471c-8c3c860da2ba</uuid>
  <memory>8388608</memory>
  <currentMemory>8388608</currentMemory>
  <vcpu>8</vcpu>
  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/OCMP1.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <interface type='bridge'>
      <mac address='54:52:00:22:60:30'/>
      <source bridge='ocmp1'/>
      <model type='virtio'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' keymap='en-us'/>
  </devices>
</domain>

I cannot provide /var/log/libvirt/qemu/$guest.log right now. The guest has been restarted since then. I'll provide it later when it hangs again.

> - Have you confirmed this only happens with virtio, e.g. have you tried
> model=e1000?

It does not happen with the e1000 driver. We ran the exact same test and the VM was still up after 2 days.

> - The "No buffer space" message is from running ping in the guest? I *think*
> that can be ignored as merely a symptom of the virtio interface not sending
> any packets
>
> - /proc/net/snmp in the host and guest might be interesting. As might
> 'tc -s qdisc' in the host

Same as for /var/log/libvirt/qemu/$guest.log. I'll provide them when it hangs again.

--- Additional comment from firstname.lastname@example.org on 2009-10-05 11:52:49 EDT ---

(In reply to comment #7)
> It would be useful to strace qemu to see if we're hitting the limit on the tun
> socket. Thanks!
Do you think we can limit strace to *network* system calls? The average network traffic is around 30 MB/s. The capture file will be huge.

--- Additional comment from email@example.com on 2009-10-05 20:02:41 EDT ---

Well, you can limit it to read/write/select. But since strace only shows the first few bytes of each call, 30 MB/s shouldn't be an issue.

--- Additional comment from firstname.lastname@example.org on 2009-10-30 09:34:33 EDT ---

It hung again. Here is the content of the guest /proc/net/snmp:

Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails FragOKs FragFails FragCreates
Ip: 2 64 2972397 0 1 0 0 0 2964429 2578209 1645 0 0 8912 4456 0 100 34 288
Icmp: InMsgs InErrors InDestUnreachs InTimeExcds InParmProbs InSrcQuenchs InRedirects InEchos InEchoReps InTimestamps InTimestampReps InAddrMasks InAddrMaskReps OutMsgs OutErrors OutDestUnreachs OutTimeExcds OutParmProbs OutSrcQuenchs OutRedirects OutEchos OutEchoReps OutTimestamps OutTimestampReps OutAddrMasks OutAddrMaskReps
Icmp: 387 140 387 0 0 0 0 0 0 0 0 0 0 703 0 694 0 0 0 0 9 0 0 0 0 0
IcmpMsg: InType3 OutType3 OutType8
IcmpMsg: 387 694 9
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 1 200 120000 -1 2109 1685 629 4 11 812836 384977 1254 0 644
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 2120817 25358 1229 2185641

The content of the host /proc/net/snmp:

Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails FragOKs FragFails FragCreates
Ip: 1 64 387348 0 16 0 0 0 379356 186067 0 0 1 10093 5002 1 5002 0 10092
Icmp: InMsgs InErrors InDestUnreachs InTimeExcds InParmProbs InSrcQuenchs InRedirects InEchos InEchoReps InTimestamps InTimestampReps InAddrMasks InAddrMaskReps OutMsgs OutErrors OutDestUnreachs OutTimeExcds OutParmProbs OutSrcQuenchs OutRedirects OutEchos OutEchoReps OutTimestamps OutTimestampReps OutAddrMasks OutAddrMaskReps
Icmp: 31 0 31 0 0 0 0 0 0 0 0 0 0 31 0 31 0 0 0 0 0 0 0 0 0 0
IcmpMsg: InType3 OutType3
IcmpMsg: 31 31
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 1 200 120000 -1 13 111 3 0 9 374440 182233 3314 0 3
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 457 21 0 495

And the output of 'tc -s qdisc' on the host:

[root@tmobilehv ~]# tc -s qdisc
qdisc pfifo_fast 0: dev eth0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 1015135694 bytes 167399 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 555090013 bytes 2330497 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev vnet0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 1483696670 bytes 2900326 pkt (dropped 354126, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

vnet0 is the tap interface assigned to the VM. vnet0 and eth1 are connected to the same bridge.

--- Additional comment from email@example.com on 2009-10-30 09:35:57 EDT ---

Created an attachment (id=366797) libvirt log

--- Additional comment from firstname.lastname@example.org on 2009-10-30 09:54:10 EDT ---

Created an attachment (id=366799) Output of strace on qemu

Here is the command I ran on the host:

strace -e read,write,select -p 10978 -o virtio_hangs

--- Additional comment from email@example.com on 2009-10-30 10:43:15 EDT ---

OK, the strace shows that there was no attempt to write to tuntap at all, so we can rule out the tun driver.
--- Additional comment from firstname.lastname@example.org on 2009-12-01 02:57:43 EDT ---

Please rebuild the virtio_net module with DEBUG defined. That way we may get some clue as to what state the guest is in when this happens. Also, if you can arrange remote access for me, it would really help in resolving this. Thanks!

--- Additional comment from email@example.com on 2009-12-01 04:48:07 EDT ---

Hi Herbert, I have requested access (via email). You should get a reply with your access info and how to connect. Could you tell me how to rebuild virtio_net with DEBUG?

Regards,
Olivier

--- Additional comment from firstname.lastname@example.org on 2009-12-01 07:44:54 EDT ---

Thanks Olivier! To build virtio_net with DEBUG, you need to get the srpm of the same version that's being used on the machine, apply the following patch, and then build the kernel. If you've already built the kernel, then "make SUBDIRS=drivers/net" would be sufficient, since we only need the virtio_net module.

--- Additional comment from email@example.com on 2009-12-01 07:45:49 EDT ---

Created an attachment (id=375048) Enable debugging in virtio_net

--- Additional comment from firstname.lastname@example.org on 2010-01-07 07:42:46 EDT ---

Created an attachment (id=382219) virtio_net: Fix tx wakeup race condition

virtio_net: Fix tx wakeup race condition

We free completed TX requests in xmit_tasklet but do not wake the queue. This creates a race condition whereupon the queue may be emptied by xmit_tasklet and yet remain in the stopped state. This patch fixes this by waking the queue after freeing packets in xmit_tasklet.

Signed-off-by: Herbert Xu <email@example.com>

--- Additional comment from firstname.lastname@example.org on 2010-01-07 09:04:55 EDT ---

Changing to the kernel component.
Comment 1 Herbert Xu 2010-01-10 11:34:51 UTC
This bug will be used to deal with the RX component of the problem while the original will be for TX only.
Comment 2 RHEL Program Management 2010-01-10 12:11:32 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Comment 3 Herbert Xu 2010-01-24 00:55:07 UTC
Created attachment 386393 [details] virtio: net refill on out-of-memory

This is a back-port of:

virtio: net refill on out-of-memory

If we run out of memory, use keventd to fill the buffer. There's a report of this happening: "Page allocation failures in guest", Message-ID: <email@example.com>

Signed-off-by: Rusty Russell <firstname.lastname@example.org>
Signed-off-by: David S. Miller <email@example.com>
Comment 4 Keqin Hong 2010-02-03 10:39:17 UTC
Created attachment 388497 [details] socket test programs (srv.c clt.c)

srv.c -- server end
clt.c -- client end
README -- read me file
srv, clt -- binary executables on x86_64
Comment 5 Keqin Hong 2010-02-03 11:08:28 UTC
Summary: Guest still lost the network with virtio net, while it worked fine with e1000.

Steps:
1. Boot a guest with the CLI listed below.
2. Check the guest network using ifconfig and make sure it works well (here we mark the guest IP as $guest_ip).
3. Run ./srv on the guest (comment 4 attachment).
4. Run multiple "./stress.sh $guest_ip" from elsewhere, i.e. on other hosts, until no more connections can be established. Note: ./stress.sh calls the clt program, trying to establish 500 connections to srv.
5. ping $guest_ip to see the network status.

Additionally:
6. Run ./clear_clt.sh on the client ends to kill all "./clt" processes.
7. ping $guest_ip again to see the result. (If possible, you may kill all ./srv processes inside the guest and ping again.)

Expected results:
After step 5 and step 7, the guest network stays alive.

CLI:
/usr/libexec/qemu-kvm -m 768M -smp 2 -drive file=RHEL5.4-64-4k.qcow2,if=virtio,cache=off,boot=on -net nic,model=virtio,vlan=1,macaddr=76:00:40:3F:20:10 -net tap,vlan=1,script=/etc/qemu-ifup -boot c -uuid 17644ecc-d3a1-4d3c-a386-12daf50015f1 -usbdevice tablet -no-hpet -rtc-td-hack -no-kvm-pit-reinjection -monitor stdio -notify all -cpu qemu64,+sse2 -balloon none -startdate now -vnc :1 -name 176-guest1

Actual results:
host: 2.6.18-164.10.1, kvm-83-105.el5_4.19
--------------------------------------------------------------------
guest                  | net model | connections | network status
2.6.18-164.11.1.el5PAE | virtio    | 2187        | lost
2.6.18-185.el5 x86_64  | virtio    | 1220        | lost
--------------------------------------------------------------------
Note: the lost network could be brought up again by ifdown, ifup.

host: 2.6.18-186 x86_64, kvm-83-155.el5
--------------------------------------------------------------------
guest                  | net model | connections | network status
2.6.18-185.el5 x86_64  | virtio    | 1178        | lost
2.6.18-185.el5 x86_64  | e1000     | 3574        | ok
--------------------------------------------------------------------

Note that with e1000, even when we could not make more connections to the "srv" program running inside the guest, we could still ping the guest.

[root@dhcp-91-175 ~]# ping 10.66.91.51
PING 10.66.91.51 (10.66.91.51) 56(84) bytes of data.
64 bytes from 10.66.91.51: icmp_seq=1 ttl=64 time=58.1 ms
64 bytes from 10.66.91.51: icmp_seq=2 ttl=64 time=33.4 ms
64 bytes from 10.66.91.51: icmp_seq=3 ttl=64 time=21.0 ms
64 bytes from 10.66.91.51: icmp_seq=4 ttl=64 time=1.40 ms
64 bytes from 10.66.91.51: icmp_seq=5 ttl=64 time=91.5 ms
64 bytes from 10.66.91.51: icmp_seq=6 ttl=64 time=0.341 ms
64 bytes from 10.66.91.51: icmp_seq=7 ttl=64 time=0.807 ms
...
Comment 6 Herbert Xu 2010-02-03 11:18:58 UTC
Thanks for testing. Please let me know whether this problem still exists after applying the patch in this bugzilla entry plus the patch in the bug from which this is cloned.
Comment 7 Dor Laor 2010-02-16 15:26:24 UTC
(In reply to comment #5)
> Summary:
> Guest still lost network with virtio net, while worked fine with e1000 net.

Keqin, were your tests with Herbert's fix? Do you need a new rpm for the guest kernel?
Comment 9 Lawrence Lim 2010-02-17 03:59:08 UTC
Adjusting the Needinfo flag, which had been set to the wrong person.

llim->Herbert: could you please provide us with a scratch build of the patch attached in Bugzilla?

llim->sly: the bug will be updated before Mon, 22 Feb, after the holiday in China, once the scratch build from Herbert is available.
Comment 10 Herbert Xu 2010-02-17 04:59:02 UTC
Sorry, but I have no time to produce a scratch build. Someone else will need to take care of this. Thanks!
Comment 11 Naphtali Sprei 2010-02-18 09:26:08 UTC
Here's a link to the brew build: https://brewweb.devel.redhat.com/taskinfo?taskID=2265105

Please let me know if there are any issues.
Comment 12 Keqin Hong 2010-02-23 05:43:56 UTC
Tested with guest kernel-2.6.18-189.el5.x86_64, but the virtio-net network was still lost. The test methods and results were similar to comment #5.

(In reply to comment #7)
> Keqin, were your tests with Herbert's fix? Do you need a new rpm for the guest kernel?

"Patch24891: linux-2.6-net-virtio_net-fix-tx-wakeup-race-condition.patch" is included as of kernel 2.6.18-184.el5, but I couldn't see the patch "virtio: net refill on out-of-memory" (see comment #3).

Keqin->Naphtali: has the patch "virtio: net refill on out-of-memory" (see comment #3) been applied?
Comment 17 Herbert Xu 2010-03-11 10:49:46 UTC
Created attachment 399313 [details] virtio: net refill on out-of-memory As fixing cancel_rearming_delayed_work in RHEL5 is non-trivial, and in order to maintain the ability to unload the virtio_net module, I'm switching the refill work to a timer.
Comment 18 Herbert Xu 2010-03-11 11:03:32 UTC
Created attachment 399317 [details] virtio: net refill on out-of-memory The last version was bogus as we can't sleep in timers. This one simply uses the normal poll path to do the refill.
Comment 26 Keqin Hong 2010-03-16 07:20:46 UTC
Tested on guest kernel 2.6.18-193.el5: a temporary OOM condition only brought the virtio network down briefly, and it was restored later. (Steps are similar to comment 5.)
Comment 27 Jarod Wilson 2010-03-17 15:53:09 UTC
in kernel-2.6.18-194.el5

You can download this test kernel from http://people.redhat.com/jwilson/el5

Please update the appropriate value in the Verified field (cf_verified) to indicate this fix has been successfully verified. Include a comment with verification details.
Comment 29 David Jericho 2010-03-19 01:20:13 UTC
I've been running 2.6.18-194.el5 on x86_64 for over 24 hours now with no repeat of the problems mentioned in this bug. Previously they'd appear within 10 minutes of the host starting service.

I'm not sure if it's related, as I can't see any obvious changes in the patch attached, but I'll report it anyway. Under 2.6.18-194.el5 on the guest, ethernet frames larger than 4096 bytes won't make it to the guest when using the e1000 interface type. Rebooting with the 2.6.18-164.el5 kernel, jumbo frames work correctly with the e1000. Jumbo frames under 2.6.18-194.el5 using the virtio interface for the guest work as expected. Watching traffic on the host bridge, the incoming packets are appearing, but the guest never sees the packet.

The 4096-byte frame limit was verified using ping: 4096 bytes - 20 for the IP header - 14 for the ethernet frame header - 8 for ICMP control gives a 4054-byte maximum payload. ping -M do -s 4055 <jumbo set router interface> fails with the e1000 interface type.

We were using e1000 and virtio interface types for dual-interfaced guests, as it seemed to help delay the onset of this bug.
Comment 31 errata-xmlrpc 2010-03-30 07:15:57 UTC
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html
Comment 32 todayyang 2012-12-11 13:57:05 UTC
Hi guys,

1. I cannot get into bug 528898, so I am just updating here.
2. Maybe this issue was solved by http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03587.html, but I am not sure.