Bug 603193 - Host OOPS when running with vhost and guest transmits to host
Summary: Host OOPS when running with vhost and guest transmits to host
Status: CLOSED DUPLICATE of bug 602927
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel   
Version: 6.0
Hardware: All
OS: Linux
Target Milestone: rc
Assignee: Herbert Xu
QA Contact: Red Hat Kernel QE team
Depends On:
Reported: 2010-06-11 19:58 UTC by Mark Wagner
Modified: 2013-01-09 22:43 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2010-06-15 12:04:45 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments
Screenshot from Avocent of stack trace (121.86 KB, image/jpeg)
2010-06-11 19:58 UTC, Mark Wagner

Description Mark Wagner 2010-06-11 19:58:35 UTC
Created attachment 423384
Screenshot from Avocent of stack trace

Description of problem:
I am running a guest with vhost_net. Running a netperf TCP_STREAM test from the guest to the host causes the host to panic. I have not observed this when vhost_net is not in use.

Version-Release number of selected component (if applicable):
[root@perf22 xen4]# uname -a
Linux perf22.lab.bos.redhat.com 2.6.32-33.el6.x86_64 #1 SMP Thu Jun 3 13:00:03 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
qemu-img.x86_64   2:  
qemu-kvm.x86_64   2:
[root@dhcp47-46 np2.4]# uname -a
Linux dhcp47-46.lab.bos.redhat.com 2.6.32-33.el6.x86_64 #1 SMP Thu Jun 3 13:00:03 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
Every time 

Steps to Reproduce:
1. Load vhost_net (modprobe vhost_net)
2. Bring up the guest. I use the following command line:
/usr/libexec/qemu-kvm  -m 8192 -smp 4 -name kvm1 -uuid f71920c6-5da4-44f3-b675-355708c9a8ae -usbdevice tablet -cpu qemu64,+sse2,+cx16,+ssse3,+popcnt -monitor pty -boot c -drive file=/xen4/rhel6kvm.img,if=ide,index=0,boot=on -net nic,macaddr=00:16:3e:13:24:8a,vlan=0,model=virtio -net tap,script=/etc/qemu-ifup3,vlan=0,ifname=enet13  -net nic,macaddr=00:16:3e:07:9b:48,vlan=1,model=virtio -net tap,script=/etc/qemu-ifup2,vlan=1,ifname=enet12  -netdev tap,script=/etc/qemu-ifup5,id=10g,vhost=on -device virtio-net-pci,netdev=10g,id=enet15,mac=00:16:3e:24:ce:09,bus=pci.0,addr=0x6 -usb -vnc :10 -k en-us &

3. Run netperf from the guest to the host:
./netperf -l 3000 -H -D -- -m 64
4. The host usually crashes within a minute or two
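
Before step 3 it can be worth confirming on the host that the vhost_net module from step 1 is actually loaded. The following is a minimal sketch, not part of the original report: it assumes /proc/modules is readable on the host, and the helper name module_loaded is hypothetical.

```python
def module_loaded(name, proc_modules_text):
    """Return True if `name` is listed as a module in /proc/modules content.

    Each line of /proc/modules starts with the module name, so only the
    first whitespace-separated field is compared (exact match).
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    # Read the live module list; fall back to an empty list if unavailable.
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except OSError:
        text = ""
    print("vhost_net loaded:", module_loaded("vhost_net", text))
```

The match is exact on the module name, so a loaded vhost_net is not mistaken for some other module whose name only shares a prefix.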

Actual results:
Host crashes

Expected results:

Things just work?

Additional info:
I have a 24G vmcore file if you want it, and also a screenshot from the Avocent. Here is the stack trace from the vmcore file:

      KERNEL: vmlinux                           
    DUMPFILE: vmcore
        CPUS: 8
        DATE: Fri Jun 11 16:54:37 2010
      UPTIME: 00:03:35
LOAD AVERAGE: 0.33, 0.19, 0.07
       TASKS: 212
    NODENAME: perf22.lab.bos.redhat.com
     RELEASE: 2.6.32-33.el6.x86_64
     VERSION: #1 SMP Thu Jun 3 13:00:03 EDT 2010
     MACHINE: x86_64  (2933 Mhz)
      MEMORY: 24 GB
       PANIC: "Oops: 0000 [#1] SMP " (check log for details)
         PID: 1991
     COMMAND: "netserver"
        TASK: ffff88061e526080  [THREAD_INFO: ffff880621e5e000]
         CPU: 6

crash> bt
PID: 1991   TASK: ffff88061e526080  CPU: 6   COMMAND: "netserver"
 #0 [ffff880621e5f520] machine_kexec at ffffffff8103689b
 #1 [ffff880621e5f580] crash_kexec at ffffffff810b8538
 #2 [ffff880621e5f650] oops_end at ffffffff814db3c0
 #3 [ffff880621e5f680] no_context at ffffffff8104545b
 #4 [ffff880621e5f6d0] __bad_area_nosemaphore at ffffffff810456e5
 #5 [ffff880621e5f720] bad_area_nosemaphore at ffffffff810457b3
 #6 [ffff880621e5f730] do_page_fault at ffffffff814dcec8
 #7 [ffff880621e5f780] page_fault at ffffffff814da735
    [exception RIP: __br_deliver+100]
    RIP: ffffffffa044d824  RSP: ffff880621e5f838  RFLAGS: 00010292
    RAX: 0000000000000000  RBX: ffff88061dba86c0  RCX: ffff880620ac82c0
    RDX: ffff880622a8629c  RSI: 0000000000000286  RDI: ffff880622a8629c
    RBP: ffff880621e5f858   R8: ffff880622a8629c   R9: ffff880621e5f7c0
    R10: 0000000000000001  R11: 0000000000000004  R12: ffff8805ec8c16c0
    R13: ffff8805ec8c16f8  R14: ffff88060337f4ce  R15: ffff88061dba8000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff880621e5f860] br_deliver at ffffffffa044d8d5
 #9 [ffff880621e5f870] br_dev_xmit at ffffffffa044c5ac
#10 [ffff880621e5f8a0] dev_hard_start_xmit at ffffffff81412848
#11 [ffff880621e5f8f0] dev_queue_xmit at ffffffff81415d86
#12 [ffff880621e5f940] ip_finish_output at ffffffff81451d9c
#13 [ffff880621e5f980] ip_output at ffffffff81452028
#14 [ffff880621e5f9b0] ip_local_out at ffffffff81450fb5
#15 [ffff880621e5f9d0] ip_queue_xmit at ffffffff81451800
#16 [ffff880621e5fa80] tcp_transmit_skb at ffffffff814664b1
#17 [ffff880621e5faf0] tcp_send_ack at ffffffff81467dc9
#18 [ffff880621e5fb10] tcp_cleanup_rbuf at ffffffff81458b86
#19 [ffff880621e5fb30] tcp_recvmsg at ffffffff8145bac9
#20 [ffff880621e5fc40] sock_common_recvmsg at ffffffff81402b89
#21 [ffff880621e5fc80] sock_recvmsg at ffffffff81400533
#22 [ffff880621e5fe40] sys_recvfrom at ffffffff81400881
#23 [ffff880621e5ff80] system_call_fastpath at ffffffff81013172
    RIP: 00007f6a186802d2  RSP: 00007fff84a67480  RFLAGS: 00010217
    RAX: 000000000000002d  RBX: ffffffff81013172  RCX: 00007f6a186802d2
    RDX: 0000000000015554  RSI: 0000000001851df0  RDI: 0000000000000008
    RBP: 0000000000000008   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000624d60
    R13: 00007fff84a696d0  R14: 0000000000000004  R15: 00000000000e4c15
    ORIG_RAX: 000000000000002d  CS: 0033  SS: 002b
crash> q
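
The PANIC line and the exception frame in the output above can be read mechanically. The following is a minimal sketch, not part of the original report: it assumes the standard x86 page-fault error code layout (bit 0 protection, bit 1 write, bit 2 user), and the function names decode_oops and faulting_symbol are hypothetical helpers.

```python
import re

# Low three bits of the x86 page-fault error code.
PF_PROT = 0x1   # 0: page not present, 1: protection violation
PF_WRITE = 0x2  # 0: read access,      1: write access
PF_USER = 0x4   # 0: kernel mode,      1: user mode


def decode_oops(panic_line):
    """Decode the four hex digits after 'Oops:' into a readable summary."""
    m = re.search(r"Oops:\s*([0-9a-fA-F]{4})", panic_line)
    if m is None:
        raise ValueError("no Oops error code found")
    code = int(m.group(1), 16)
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
    }


def faulting_symbol(bt_text):
    """Return the 'symbol+offset' from crash's '[exception RIP: ...]' line."""
    m = re.search(r"\[exception RIP: ([^\]]+)\]", bt_text)
    return m.group(1) if m else None


if __name__ == "__main__":
    print(decode_oops('PANIC: "Oops: 0000 [#1] SMP "'))
    print(faulting_symbol("    [exception RIP: __br_deliver+100]"))
```

Applied to this dump, the error code 0000 decodes as a kernel-mode read of a page that was not present, and the faulting instruction is at __br_deliver+100 in the bridge transmit path.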

Comment 2 RHEL Product and Program Management 2010-06-11 20:12:57 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for

Comment 3 Dor Laor 2010-06-13 09:37:20 UTC
Michael thinks it is a clone of bz 602927

Comment 4 Herbert Xu 2010-06-15 12:04:45 UTC
Yes this is a duplicate.

*** This bug has been marked as a duplicate of bug 602927 ***
