Bug 518057 - Host panic after scp file to a broken guest.
Summary: Host panic after scp file to a broken guest.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Neil Horman
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-08-18 16:01 UTC by lihuang
Modified: 2010-06-28 11:57 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-06-28 11:57:17 UTC


Attachments
dmidecode (59.79 KB, text/plain)
2009-08-18 16:01 UTC, lihuang

Description lihuang 2009-08-18 16:01:37 UTC
Created attachment 357823
dmidecode

Description of problem:
Four VMs were running on the host. After I copied an RPM file to the 3rd guest with scp, the host panicked.
I then checked the images. 'qemu-img info' shows that the 3rd VM's image is full, but after rebooting and logging in to the guest I found that it has 14G of free space (so the image is broken?).

If I boot only the broken guest, I cannot reproduce this issue.
If I run the same test with 4 normal guest images, I cannot reproduce this issue.

[root@localhost ~]# qemu-img info /data/image/image/rhel4/RHEL-4.8-32-virtio.raw 
image: /data/image/image/rhel4/RHEL-4.8-32-virtio.raw
file format: raw
virtual size: 20G (21474836480 bytes)
disk size: 20G
[root@localhost ~]# 

CLI:
 qemu-kvm RHEL-4.8-64-virtio.raw -smp 2 -m 1024 -name host1 -usbdevice tablet -net nic,vlan=0,macaddr=00:11:81:02:f6:fe,model=virtio -net tap,vlan=0,script=/data/ovirtkvm -vnc :2& 
 qemu-kvm RHEL-Server-5.4-64-virtio.raw -smp 2 -m 1024 -name host2 -usbdevice tablet -net nic,vlan=0,macaddr=00:11:81:03:f6:fe,model=virtio -net tap,vlan=0,script=/data/ovirtkvm1 -vnc :3& 
 qemu-kvm RHEL-4.8-32-virtio.raw -smp 2 -m 1024 -name host3 -usbdevice tablet -net nic,vlan=0,macaddr=00:11:81:04:f6:fe,model=virtio -net tap,vlan=0,script=/data/ovirtkvm2 -vnc :4& 
 qemu-kvm RHEL-Server-5.4-32-virtio.raw -smp 2 -m 1024 -name host4 -usbdevice tablet -net nic,vlan=0,macaddr=00:11:81:05:f6:fe,model=virtio -net tap,vlan=0,script=/data/ovirtkvm3 -vnc :5& 

Version-Release number of selected component (if applicable):
[root@localhost ~]# cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor release 5.4-2.0.99 (15)
[root@localhost ~]# rpm -q kvm
kvm-83-105.el5
[root@localhost ~]# rpm -q kernel
kernel-2.6.18-162.el5


How reproducible:
100%

Steps to Reproduce:
1.
2.
3.
  
Actual results:
 localhost.localdomain.englab.nay.redhat.com login: ----------- [cut here ] --------- [please bite here ] ---------
 Kernel BUG at drivers/net/tun.c:476
 invalid opcode: 0000 [1] SMP 
 last sysfs file: /devices/pci0000:7f/0000:7f:00.0/irq
 CPU 1 
 Modules linked in: nfs fscache nfs_acl bonding tun lockd sunrpc ipt_REJECT xt_state ip_conntrack nfnetlink xt_multiport iptable_filter ip_tables xt_physdev bridge ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_i
 scsi dm_multipath scsi_dh floppy ksm(U) kvm_intel(U) kvm(U) sg igb tg3 8021q shpchp squashfs dm_snapshot ext3 jbd dm_mod sd_mod ehci_hcd mptsas mptscsih ahci libata uhci_hcd mptbase scsi_transport_sas loop sr_mod scsi_mod cdrom
 Pid: 6269, comm: qemu-kvm Tainted: G      2.6.18-162.el5 #1
 RIP: 0010:[<ffffffff8859f7d9>]  [<ffffffff8859f7d9>] :tun:tun_chr_readv+0x2b1/0x3a6
 RSP: 0018:ffff810304be7e48  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff810304be7e98 RCX: 0000000011004510
 RDX: ffff8102302b1700 RSI: ffff810304be7e9e RDI: ffff810304be7e92
 RBP: 0000000000010ff6 R08: 0000000000000000 R09: 0000000000000001
 R10: ffff810304be7e94 R11: 0000000000000048 R12: ffff81032c1bf680
 R13: ffff810306269500 R14: 0000000000000000 R15: ffff810304be7ef8
 FS:  00002b8b87164080(0000) GS:ffff81010b38a7c0(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000002a9556c000 CR3: 0000000304429000 CR4: 00000000000026e0
 Process qemu-kvm (pid: 6269, threadinfo ffff810304be6000, task ffff810329c0e040)
 Stack:  ffff8103097e4b00 ffff810315501dc0 0000000000000000 ffff810329c0e040
  ffffffff8008be55 ffff810306269528 ffff810306269528 ffff81032d0dfd20
  000005a805ea0000 0000000000000000 0000fef604811100 0000000000000000
 Call Trace:
  [<ffffffff8008be55>] default_wake_function+0x0/0xe
  [<ffffffff8859f8e8>] :tun:tun_chr_read+0x1a/0x1f
  [<ffffffff8000b695>] vfs_read+0xcb/0x171
  [<ffffffff

Expected results:


Additional info:
[root@localhost ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
breth0          8000.001f29038603       no              tap0
                                                        eth0
breth1          8000.001f29038602       no              tap1
                                                        eth1
breth2          8000.001b21398b18       no              tap2
                                                        eth2
breth3          8000.001b21398b19       no              eth3


[root@localhost ~]# lspci
00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 13)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 13)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 13)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 13)
00:10.0 PIC: Intel Corporation 5520/5500/X58 Physical and Link Layer Registers Port 0 (rev 13)
00:10.1 PIC: Intel Corporation 5520/5500/X58 Routing and Protocol Layer Registers Port 0 (rev 13)
00:11.0 PIC: Intel Corporation 5520/5500 Physical and Link Layer Registers Port 1 (rev 13)
00:11.1 PIC: Intel Corporation 5520/5500 Routing & Protocol Layer Register Port 1 (rev 13)
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 13)
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 13)
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 13)
00:15.0 PIC: Intel Corporation 5520/5500/X58 Trusted Execution Technology Registers (rev 13)
00:1a.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 1
00:1c.4 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 5
00:1c.5 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 6
00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 RAID bus controller: Intel Corporation 82801 SATA RAID Controller
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5764M Gigabit Ethernet PCIe (rev 10)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5764M Gigabit Ethernet PCIe (rev 10)
0f:00.0 VGA compatible controller: nVidia Corporation G96 [Quadro FX 380] (rev a1)
28:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
28:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
37:09.0 FireWire (IEEE 1394): Agere Systems FW322/323 (rev 70)
40:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 13)
40:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 13)
40:09.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 9 (rev 13)
40:10.0 PIC: Intel Corporation 5520/5500/X58 Physical and Link Layer Registers Port 0 (rev 13)
40:10.1 PIC: Intel Corporation 5520/5500/X58 Routing and Protocol Layer Registers Port 0 (rev 13)
40:11.0 PIC: Intel Corporation 5520/5500 Physical and Link Layer Registers Port 1 (rev 13)
40:11.1 PIC: Intel Corporation 5520/5500 Routing & Protocol Layer Register Port 1 (rev 13)
40:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 13)
40:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 13)
40:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 13)
40:15.0 PIC: Intel Corporation 5520/5500/X58 Trusted Execution Technology Registers (rev 13)
41:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
7f:00.0 Host bridge: Intel Corporation Xeon 5500/Core i7 QuickPath Architecture Generic Non-Core Registers (rev 05)
7f:00.1 Host bridge: Intel Corporation Xeon 5500/Core i7 QuickPath Architecture System Address Decoder (rev 05)
7f:02.0 Host bridge: Intel Corporation Xeon 5500/Core i7 QPI Link 0 (rev 05)
7f:02.1 Host bridge: Intel Corporation Xeon 5500/Core i7 QPI Physical 0 (rev 05)
7f:02.4 Host bridge: Intel Corporation Xeon 5500/Core i7 QPI Link 1 (rev 05)
7f:02.5 Host bridge: Intel Corporation Xeon 5500/Core i7 QPI Physical 1 (rev 05)
7f:03.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller (rev 05)
7f:03.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Target Address Decoder (rev 05)
7f:03.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller RAS Registers (rev 05)
7f:03.4 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Test Registers (rev 05)
7f:04.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Control Registers (rev 05)
7f:04.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Address Registers (rev 05)
7f:04.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Rank Registers (rev 05)
7f:04.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 0 Thermal Control Registers (rev 05)
7f:05.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Control Registers (rev 05)
7f:05.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Address Registers (rev 05)
7f:05.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Rank Registers (rev 05)
7f:05.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 1 Thermal Control Registers (rev 05)
7f:06.0 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Control Registers (rev 05)
7f:06.1 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Address Registers (rev 05)
7f:06.2 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Rank Registers (rev 05)
7f:06.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Thermal Control Registers (rev 05)




ovirtkvm$num
#!/bin/sh
switch=breth$num
/sbin/ifconfig $1 0.0.0.0 up
/usr/sbin/brctl addif ${switch} $1

Comment 1 Neil Horman 2010-01-22 20:03:28 UTC
I'm sorry, I'm trying to clarify your summary above.  It says that you cannot reproduce the issue if you run with 4 normal guests (by which I assume you mean non-broken images).  Is that correct?  If so, then I would say this is not a bug.  Unpredictable results will occur if you use corrupted guest images.  Or can you clarify your summary?

FWIW, the bug is the result of getting a gso type value that tun doesn't recognize (likely a udp type).  The additional check exists upstream, so there may be a patch that can help us, but I don't want to backport it if it's not the problem, and it sounds like there might not be any problem at all.

Comment 2 lihuang 2010-01-24 17:54:51 UTC
Hi Neil,
   Sorry, I didn't update the bug in time.
   We did more testing after filing the bug and found that the problem has nothing to do with the "broken image". It can be reproduced with other, good images (but it is not 100% reproducible).

   It seems to be hardware related. We have reproduced it on breth2/3, which are set up on the Intel 82576 NIC, while breth0/1 are OK.


Thanks.
Lijun Huang

Comment 3 Neil Horman 2010-01-25 00:42:25 UTC
Does the problem reproduce if you use ethtool to disable gso and gro on the physical interfaces on the system to which the guests are attached?

Comment 4 lihuang 2010-01-25 17:19:19 UTC
Hi Neil,
   The host with the Intel 82576 NIC is not available today; I will retest later this week. Before retesting, I want to confirm two things with you:
1. Is the command line to turn off gso/gro "ethtool --offload brethX gso off gro off"?
2. Besides 1, are there any other preconditions?

Thanks
Lijun Huang

Comment 5 Neil Horman 2010-01-25 17:50:04 UTC
yup, that should be it, assuming that brethX is the physical interface (it won't work through the virt interfaces).
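
A minimal sketch of the check discussed above, assuming the physical ports behind the bridges are the ethX devices listed in the brctl output. The helper name offload_off and the DRY_RUN variable are made up for illustration; only the ethtool --offload syntax comes from this bug. Whether the gro keyword is accepted depends on the ethtool and driver versions installed.

```sh
#!/bin/sh
# Hypothetical helper, not part of any package.
# Turns GSO and GRO off on each physical interface given as an argument.
# With DRY_RUN=1 (the default here) it only prints the commands.
DRY_RUN=${DRY_RUN:-1}

offload_off() {
    for dev in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "ethtool --offload $dev gso off gro off"
        else
            ethtool --offload "$dev" gso off gro off || return 1
        fi
    done
}

# Example (prints two commands, changes nothing):
#   offload_off eth2 eth3
# To apply for real and then inspect the result:
#   DRY_RUN=0; offload_off eth2 eth3; ethtool --show-offload eth2
```

Passing the ethX names rather than brethX follows the caveat that the setting has to be made on the physical interface.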

Comment 6 lihuang 2010-01-29 06:52:42 UTC
Hi, Neil 

Disabling gso and gro doesn't help. I got the same oops.

Stable steps to reproduce:
1. Boot the host with the Intel 82576 NIC (it has two ports).

2. Configure the bridge.
/etc/sysconfig/network-scripts/ifcfg-ethX
DEVICE=ethX
BRIDGE=brethX
HWADDR=xx:xx:xx:xx:xx:xx
ONBOOT=yes

/etc/sysconfig/network-scripts/ifcfg-brethX
DEVICE=brethX
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes

/etc/qemu-ifupX
#!/bin/sh
switch=brethX
/sbin/ifconfig $1 0.0.0.0 up
/usr/sbin/brctl addif ${switch} $1

3. Boot two KVM RHEL4u8 guests (any single guest should probably be enough).
/usr/libexec/qemu-kvm -m 2048 -smp 2 -name 4.8.X -usbdevice tablet -net nic,vlan=0,macaddr=00:11:81:03:f6:fX,model=virtio -net tap,vlan=0,script=/etc/qemu-ifup1 -vnc :X -drive file=/data/images/RHEL4.8.X.qcow2 -monitor stdio -uuid a68fe625-409c-4ac9-8a4f-5ae0e5706d9X

4. scp something to guestX.
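
The three files from step 2 can be generated for a given port index with a small script. This is only a sketch: the function name render_bridge_cfg and the idea of writing into a scratch directory first are assumptions, not something the reporter used. The file contents mirror the ones listed in step 2.

```sh
#!/bin/sh
# Sketch only: writes the two ifcfg files and the tap helper script from
# step 2 into OUTDIR instead of /etc, so they can be reviewed before copying.
# Usage: render_bridge_cfg INDEX HWADDR OUTDIR
render_bridge_cfg() {
    idx=$1
    hwaddr=$2
    outdir=$3
    mkdir -p "$outdir" || return 1

    cat > "$outdir/ifcfg-eth$idx" <<EOF
DEVICE=eth$idx
BRIDGE=breth$idx
HWADDR=$hwaddr
ONBOOT=yes
EOF

    cat > "$outdir/ifcfg-breth$idx" <<EOF
DEVICE=breth$idx
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
EOF

    # \$1 stays literal: qemu passes the tap device name as the first argument.
    cat > "$outdir/qemu-ifup$idx" <<EOF
#!/bin/sh
switch=breth$idx
/sbin/ifconfig \$1 0.0.0.0 up
/usr/sbin/brctl addif \${switch} \$1
EOF
    chmod +x "$outdir/qemu-ifup$idx"
}

# Example: render_bridge_cfg 2 xx:xx:xx:xx:xx:xx /tmp/bridge-cfg
```

The ifcfg files would then go to /etc/sysconfig/network-scripts/ and the helper script to /etc/, matching the paths in step 2.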

Comment 7 Neil Horman 2010-03-17 16:39:28 UTC
I don't suppose you already have this set up on a system in Beaker that you can loan to me, do you? It would save me some time.  Thanks!

Comment 9 Neil Horman 2010-06-12 13:01:10 UTC
ok, let me know

Comment 11 Neil Horman 2010-06-28 11:14:44 UTC
Ok, well, this would indicate that something changed between kernel -162 and kernel -194 which would have fixed this problem.  I could take the time to bisect and figure out exactly what happened, or we can just close this as CURRENTRELEASE.  I'm inclined to do the latter since I've got lots of other issues waiting, but if you have a need, I'll dig in to figure out the correcting commit.  Let me know what you want to do.
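
For reference, the bisection mentioned above can be sketched as a binary search over build numbers. This is an illustration only: is_fixed is a hypothetical stand-in that a tester would replace with "install that kernel build, repeat the scp test, report whether the host survived". It also assumes every intermediate build can be installed and that, once fixed, the problem stays fixed in all later builds.

```sh
#!/bin/sh
# Stand-in predicate for demonstration only: pretends the fix first
# appeared in build $PRETEND_FIRST_GOOD. Replace the body with a real test.
is_fixed() {
    [ "$1" -ge "${PRETEND_FIRST_GOOD:-194}" ]
}

# first_fixed BAD GOOD
# BAD is a build known to panic, GOOD a build known to work.
# Prints the lowest build number for which is_fixed succeeds, assuming
# all builds after the first fixed one are fixed as well.
first_fixed() {
    lo=$1
    hi=$2
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if is_fixed "$mid"; then
            hi=$mid
        else
            lo=$mid
        fi
    done
    echo "$hi"
}

# Example: first_fixed 162 194
```

With a gap of 32 builds between -162 and -194, the loop needs five test boots.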

Comment 12 lihuang 2010-06-28 11:57:17 UTC
closing.

