Bug 588015
Summary: | x86_64 host on Nehalem-EX machines will panic when installing a 4.8 GA kvm guest | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Igor Zhang <yugzhang> |
Component: | kernel | Assignee: | Herbert Xu <herbert.xu> |
Status: | CLOSED ERRATA | QA Contact: | Network QE <network-qe> |
Severity: | high | Priority: | urgent |
Version: | 5.5 | CC: | benny.won, cvantuin, cww, dhoward, hjia, knoel, lihuang, mjenner, qcai, sandy.garza, tao, tburke, virt-maint |
Target Milestone: | rc | Keywords: | ZStream |
Hardware: | All | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2011-01-13 21:30:22 UTC |
: | 594561 | Bug Blocks: | 580949, 594561, 616845, 648938 |
Description
Igor Zhang
2010-05-02 07:29:07 UTC
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at drivers/net/tun.c:476
invalid opcode: 0000 [1] SMP
last sysfs file: /class/net/lo/ifindex
CPU 52
Modules linked in: tun nls_utf8 nfs fscache nfs_acl ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand acpi_cpufreq freq_table be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport ksm(U) kvm_intel(U) kvm(U) joydev sr_mod cdrom sg igb i2c_i801 8021q i2c_core pcspkr dca dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata shpchp megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 15689, comm: qemu-kvm Tainted: G 2.6.18-194.2.1.el5 #1
RIP: 0010:[<ffffffff887967d9>] [<ffffffff887967d9>] :tun:tun_chr_readv+0x2b1/0x3a6
RSP: 0018:ffff810c75fd7e48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff810c75fd7e98 RCX: 0000000010015101
RDX: ffff81046d478700 RSI: ffff810c75fd7e9e RDI: ffff810c75fd7e92
RBP: 0000000000010ff6 R08: 0000000000000000 R09: 0000000000000001
R10: ffff810c75fd7e94 R11: 00000000ffffffff R12: ffff81047ed87280
R13: ffff810472cd3d00 R14: 0000000000000000 R15: ffff810c75fd7ef8
FS: 00002acc01e25080(0000) GS:ffff81087ff95840(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002ab832aea490 CR3: 000000107beff000 CR4: 00000000000026e0
Process qemu-kvm (pid: 15689, threadinfo ffff810c75fd6000, task ffff810c7f9db860)
Stack: ffff81047d903ea0 ffff81047d1441c0 0000000000000000 ffff810c7f9db860
 ffffffff8008d087 ffff810472cd3d30 ffff810472cd3d30 ffff81047e60f3d8
 000005a805ea0000 0000000000000000 000043b6503e1600 0000000000000000
Call Trace:
 [<ffffffff8008d087>] default_wake_function+0x0/0xe
 [<ffffffff887968e8>] :tun:tun_chr_read+0x1a/0x1f
 [<ffffffff8000b681>] vfs_read+0xcb/0x171
 [<ffffffff80011bd2>] sys_read+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Code: 0f 0b 68 f0 74 79 88 c2 dc 01 f6 42 0a 08 74 0c 80 4c 24 41
RIP [<ffffffff887967d9>] :tun:tun_chr_readv+0x2b1/0x3a6
 RSP <ffff810c75fd7e48>
<0>Kernel panic - not syncing: Fatal exception
[-- MARK -- Fri Apr 30 00:05:00 2010]

What's the NIC? It's probably producing LRO packets, which are incompatible with bridging.

Created attachment 415556
gro: Fix bogus gso_size on the first fraglist entry
When GRO produces fraglist entries, and the resulting skb hits
an interface that is incapable of TSO but capable of FRAGLIST,
we end up producing a bogus packet with gso_size non-zero.
This was reported in the field with older versions of KVM that
did not set the TSO bits on tuntap.
This patch fixes that.
Reported-by: Igor Zhang <yugzhang>
Signed-off-by: Herbert Xu <herbert.org.au>
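
The condition described in this patch can be pictured with a small toy model. This is a sketch in Python, not kernel code: the `Skb` class, `chain_onto_new_head`, and `bogus_entries` are hypothetical names made up for illustration, and the model assumes that a FRAGLIST-capable, TSO-incapable device treats every frag_list entry as an ordinary packet, so an entry that still carries a non-zero `gso_size` is what the description calls bogus.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Skb:
    """Toy stand-in for a socket buffer; only the fields the model needs."""
    length: int
    gso_size: int = 0  # non-zero marks the skb as a GSO packet
    frag_list: List["Skb"] = field(default_factory=list)


def chain_onto_new_head(first: Skb, clear_first: bool) -> Skb:
    """Start a frag_list: a new head skb takes `first` as its first entry.

    `first` is assumed to be an already merged packet, so it arrives with a
    non-zero gso_size. With clear_first=True the entry's own gso_size is
    reset after the head has inherited it (the assumed effect of the fix).
    """
    head = Skb(length=first.length, gso_size=first.gso_size, frag_list=[first])
    if clear_first:
        first.gso_size = 0
    return head


def bogus_entries(head: Skb) -> List[Skb]:
    """Return frag_list entries that still claim to be GSO packets."""
    return [entry for entry in head.frag_list if entry.gso_size != 0]


if __name__ == "__main__":
    mss = 1448  # illustrative value only
    unpatched = chain_onto_new_head(Skb(4 * mss, gso_size=mss), clear_first=False)
    patched = chain_onto_new_head(Skb(4 * mss, gso_size=mss), clear_first=True)
    print("bogus entries without the reset:", len(bogus_entries(unpatched)))
    print("bogus entries with the reset:", len(bogus_entries(patched)))
```

In this model only the head keeps the segment size; whether the real fix is a single reset of that kind should be checked against the attached patch.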
Created attachment 415571
gro: Fix illegal merging of trailer trash
When we've merged skb's with page frags, and subsequently receive
a trailer skb (< MSS) that is not completely non-linear (this can
occur on Intel NICs if the packet size falls below the threshold),
GRO ends up producing an illegal GSO skb with a frag_list.
This is harmless unless the skb is then forwarded through an
interface that requires software GSO, whereupon the GSO code
will BUG.
This patch detects this case in GRO and avoids merging the
trailer skb.
Reported-by: Mark Wagner <mwagner>
Signed-off-by: Herbert Xu <herbert.org.au>
Signed-off-by: David S. Miller <davem>
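
The merge decision described in this patch can be sketched as a toy model. It is written in Python purely for illustration and does not mirror the kernel source: `Incoming`, `Held`, `gro_receive`, and `is_illegal_gso_shape` are hypothetical names, and the model assumes that the illegal shape is a single GSO skb that has both merged page frags and a frag_list.

```python
from dataclasses import dataclass


@dataclass
class Incoming:
    """Toy stand-in for a received skb."""
    payload_len: int
    linear_len: int  # payload bytes in the linear area; 0 = completely non-linear


@dataclass
class Held:
    """Toy stand-in for the packet GRO is currently accumulating."""
    total_len: int = 0
    page_frag_merges: int = 0
    has_frag_list: bool = False


def gro_receive(held: Held, pkt: Incoming, guard: bool) -> str:
    """Merge `pkt` into `held`, or refuse.

    Completely non-linear packets are appended as page frags. Any other
    packet would have to be chained through frag_list; with guard=True that
    is refused once page frags have already been merged.
    """
    if pkt.linear_len == 0 and not held.has_frag_list:
        held.page_frag_merges += 1
        held.total_len += pkt.payload_len
        return "merged as page frags"
    if guard and held.page_frag_merges > 0:
        return "not merged"
    held.has_frag_list = True
    held.total_len += pkt.payload_len
    return "merged via frag_list"


def is_illegal_gso_shape(held: Held) -> bool:
    """Assumed illegal shape: page-frag merges plus a frag_list on one skb."""
    return held.page_frag_merges > 0 and held.has_frag_list


if __name__ == "__main__":
    mss = 1448  # illustrative value only
    for guard in (False, True):
        held = Held()
        gro_receive(held, Incoming(mss, 0), guard)
        gro_receive(held, Incoming(mss, 0), guard)
        verdict = gro_receive(held, Incoming(200, 200), guard)  # short trailer
        print(guard, verdict, is_illegal_gso_shape(held))
```

With the guard enabled the trailer is left out of the merged packet and would be delivered on its own, which matches the "avoids merging the trailer skb" wording above.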
An x86_64 host on Nehalem (E5504) machines running 5.4 kernels will also reliably panic during a file transfer to a RHEL 4.8 kvm guest if the virtio driver is enabled in the guest OS. This occurs with the Intel 82576 NIC. If the Intel 82576 is replaced with a BCM5709 NIC, it works OK. Applying the patch to the RHEL 5.4 kernel source and rebuilding the kernel resolves the issue.

in kernel-2.6.18-206.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

*** Bug 619255 has been marked as a duplicate of this bug. ***

*** Bug 549743 has been marked as a duplicate of this bug. ***

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2011-0017.html

------- Comment From linuxram.com 2010-04-20 17:51 EDT-------
It is a rhel5 host. Oh well, I see the confusion. This problem is seen with virtio, not with vhost. The qemu command is:
/usr/libexec/qemu-kvm -name rhel5 -drive file=rhel5.img,boot=on,if=virtio -net nic,macaddr=54:52:00:46:26:80,model=virtio -net tap,script=/etc/qemu-if,ifname=vnet0 -m 512

------- Comment From 2010-08-09 04:06 EDT-------
Redhat,
Any updates on this bug? Is this going to be fixed in RHEL5.6?
Thanks
Muni

------- Comment From coschult.com 2011-02-15 19:50 EDT-------
I have verified that this issue is not present in rhel 5.6 RC1.