Bug 1074869

Summary: guest crash when do s3 after netperf stress with vhost
Product: Red Hat Enterprise Linux 7 Reporter: Qian Guo <qiguo>
Component: qemu-kvm-rhev Assignee: Vlad Yasevich <vyasevic>
Status: CLOSED WONTFIX QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: medium    
Version: 7.0 CC: amit.shah, chayang, hhuang, juzhang, knoel, michen, rbalakri, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-03-27 22:06:00 UTC Type: Bug
Bug Blocks: 923626    

Description Qian Guo 2014-03-11 07:22:45 UTC
Description of problem:
A RHEL 7 guest crashes when suspended to S3 (pm-suspend) after a netperf stress run, if the guest was booted with vhost=on.

Version-Release number of selected component (if applicable):
# uname -r
3.10.0-99.el7.x86_64
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-1.5.3-50.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with virtio-net and vhost=on:
# /usr/libexec/qemu-kvm -cpu Penryn -m 4G -smp 4 -M pc -enable-kvm  -name testovs  -drive file=/home/rhel7basecp2.qcow2_v3,if=none,format=qcow2,werror=stop,rerror=stop,media=disk,id=drive-blk0-disk0 -device virtio-blk-pci,drive=drive-blk0-disk0,id=virtio-disk0 -nodefaults -nodefconfig -monitor stdio   -netdev tap,id=hostdev0,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,vhost=on -device virtio-net,netdev=hostdev0,mac=54:52:1a:46:0b:01,id=vnet0 -spice port=5900,disable-ticketing -global qxl-vga.vram_size=67108864 -vga qxl -boot menu=on  -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0  -monitor stdio

2. In the guest, run a netperf test with the host acting as the netserver:
# netperf -H 2001::1e6f:65ff:fe06:b42e -l 30

3. In the guest, suspend to S3:
# pm-suspend
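
Steps 2 and 3 can be sketched as one guest-side script. This is only a sketch, not the exact procedure used for this report: it assumes netperf and pm-utils are installed in the guest and that netserver is already running on the host. The names DRY_RUN, run and repro_steps are hypothetical helpers introduced here; the address and duration default to the values from step 2.

```shell
#!/bin/sh
# Guest-side sketch of steps 2 and 3 (hypothetical helper script).
# Assumption: netserver is already listening on the host at $NETSERVER.
NETSERVER="${NETSERVER:-2001::1e6f:65ff:fe06:b42e}"
DURATION="${DURATION:-30}"
# DRY_RUN=1 (the default in this sketch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 2: netperf stress against the host; step 3: suspend to S3.
repro_steps() {
    run netperf -H "$NETSERVER" -l "$DURATION" || return 1
    run pm-suspend
}

repro_steps
```

Run as root in the guest with DRY_RUN=0 to actually execute the two commands; with the default DRY_RUN=1 the script only echoes them.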

Actual results:
Guest crashed:
# crash /usr/lib/debug/lib/modules/3.10.0-99.el7.x86_64/vmlinux vmcore
...
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: /usr/lib/debug/lib/modules/3.10.0-99.el7.x86_64/vmlinux
    DUMPFILE: vmcore  [PARTIAL DUMP]
        CPUS: 4
        DATE: Tue Mar 11 15:13:06 2014
      UPTIME: 00:01:47
LOAD AVERAGE: 0.91, 0.44, 0.17
       TASKS: 337
    NODENAME: localhost.localdomain
     RELEASE: 3.10.0-99.el7.x86_64
     VERSION: #1 SMP Fri Feb 28 13:26:10 EST 2014
     MACHINE: x86_64  (2826 Mhz)
      MEMORY: 4 GB
       PANIC: "Oops: 0000 [#1] SMP " (check log for details)
         PID: 20
     COMMAND: "migration/1"
        TASK: ffff880139b70000  [THREAD_INFO: ffff880139b6c000]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)
crash> bt 
PID: 20     TASK: ffff880139b70000  CPU: 1   COMMAND: "migration/1"
 #0 [ffff880139b6d958] machine_kexec at ffffffff8103f3e2
 #1 [ffff880139b6d9a8] crash_kexec at ffffffff810c75e3
 #2 [ffff880139b6da70] oops_end at ffffffff815cb4e8
 #3 [ffff880139b6da98] no_context at ffffffff815bc5f1
 #4 [ffff880139b6dae0] __bad_area_nosemaphore at ffffffff815bc671
 #5 [ffff880139b6db28] bad_area_nosemaphore at ffffffff815bc7db
 #6 [ffff880139b6db38] __do_page_fault at ffffffff815ce1de
 #7 [ffff880139b6dc30] do_page_fault at ffffffff815ce3ea
 #8 [ffff880139b6dc58] do_async_page_fault at ffffffff815cdaa9
 #9 [ffff880139b6dc70] async_page_fault at ffffffff815ca7b8
#10 [ffff880139b6dd58] native_cpu_disable at ffffffff81037882
#11 [ffff880139b6dd70] take_cpu_down at ffffffff815a70b3
#12 [ffff880139b6dd88] multi_cpu_stop at ffffffff810d669e
#13 [ffff880139b6ddb8] cpu_stopper_thread at ffffffff810d6868
#14 [ffff880139b6de80] smpboot_thread_fn at ffffffff81087fff
#15 [ffff880139b6ded0] kthread at ffffffff8107fc10
#16 [ffff880139b6df50] ret_from_fork at ffffffff815d2b2c


Expected results:
No crash; the guest enters the S3 state successfully.

Additional info:
This is only hit with vhost=on; with vhost=off the issue does not reproduce.

Zerocopy TX was not enabled on the host:
# cat /sys/module/vhost_net/parameters/experimental_zcopytx 
0
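
The same check can be wrapped in a small helper when comparing hosts. This is a minimal sketch: zcopy_status is a hypothetical function name, and the optional path argument exists only so the helper can be pointed at a different file. It defaults to the sysfs parameter shown above.

```shell
# Report whether vhost-net zerocopy TX is enabled (hypothetical helper).
# $1: parameter file; defaults to the vhost_net sysfs parameter above.
zcopy_status() {
    param="${1:-/sys/module/vhost_net/parameters/experimental_zcopytx}"
    # The file is absent when the vhost_net module is not loaded.
    if [ ! -r "$param" ]; then
        echo "unknown"
        return 1
    fi
    case "$(cat "$param")" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}
```

For the host in this report, zcopy_status would print "disabled", matching the 0 shown above.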

Comment 3 Ronen Hod 2014-09-03 18:29:11 UTC
Tried to modify the component to qemu-kvm-rhev. Bugzilla didn't let me.

Comment 4 Vlad Yasevich 2014-09-04 19:45:28 UTC
Moved to qemu-kvm-rhev.

Comment 7 Vlad Yasevich 2015-03-27 22:06:00 UTC
S3 support is deprioritized and all S3-related bugs are being closed as WONTFIX.
Closing.

-vlad