Bug 1005016

Summary: Qemu w/ macvtap backend virtual network hangs when the fd is over 1024
Product: Red Hat Enterprise Linux 6 Reporter: Qian Guo <qiguo>
Component: qemu-kvm Assignee: Vlad Yasevich <vyasevic>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.5 CC: chayang, famz, jen, jgalipea, juzhang, knoel, michen, mkenneth, qzhang, rbalakri, rpacheco, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.462.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-07-22 06:03:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1024684, 1125735, 1199873    

Description Qian Guo 2013-09-06 03:09:51 UTC
Description of problem:
Boot a guest with a macvtap backend and pass the tap device as fd 1025 (over 1024): qemu hangs, and top on the host shows the qemu-kvm process consuming 100% CPU.

With fd=1024, qemu-kvm does not hang, but the qemu process still consumes 100% CPU and the guest cannot boot.

With fd < 1024, everything works.
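The 1024 threshold matches FD_SETSIZE, the capacity of the fd_set bitmap used by select() on typical Linux/glibc systems. The report does not state the root cause, so treating select() as the culprit is an assumption here. The following Python sketch only illustrates the general limit and is not qemu code: the helper name probe_high_fd is hypothetical, and it duplicates a pipe onto a descriptor above 1024 to compare select() with poll().

```python
import fcntl
import os
import resource
import select

# Assumption: FD_SETSIZE is 1024, the usual value with glibc on Linux.
FD_SETSIZE = 1024


def probe_high_fd(min_fd=FD_SETSIZE + 1):
    """Duplicate a readable pipe end onto an fd >= min_fd and report
    (fd, select_ok, poll_ok). Hypothetical helper for illustration only."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = min_fd + 16
    if soft != resource.RLIM_INFINITY and soft < needed:
        # Equivalent of "ulimit -n": raise the soft limit, capped by the hard limit.
        new_soft = needed if hard == resource.RLIM_INFINITY else min(needed, hard)
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))

    r, w = os.pipe()
    # F_DUPFD returns the lowest free descriptor that is >= min_fd.
    high = fcntl.fcntl(r, fcntl.F_DUPFD, min_fd)
    try:
        os.write(w, b"x")  # make the pipe readable

        # select() cannot represent descriptors >= FD_SETSIZE; Python
        # refuses them with ValueError instead of overflowing the fd_set.
        try:
            select.select([high], [], [], 0)
            select_ok = True
        except ValueError:
            select_ok = False

        # poll() takes an array of descriptors and has no such ceiling.
        poller = select.poll()
        poller.register(high, select.POLLIN)
        events = poller.poll(0)
        poll_ok = any(fd == high and (ev & select.POLLIN) for fd, ev in events)
    finally:
        for fd in (r, w, high):
            os.close(fd)
    return high, select_ok, poll_ok


if __name__ == "__main__":
    fd, select_ok, poll_ok = probe_high_fd()
    print("fd:", fd, "select usable:", select_ok, "poll usable:", poll_ok)
```

Because fd 1024 is already outside a 1024-entry set (valid indices are 0 to 1023), this would also be consistent with the fd=1024 case failing while lower descriptors work. The sketch needs a hard open-file limit above roughly 1041, otherwise F_DUPFD fails with an OSError.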


Version-Release number of selected component (if applicable):
# uname -r
2.6.32-416.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.400.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Open a shell and raise the open-file limit to 10240
# ulimit -n 10240

2. In the same shell, launch guest w/ macvtap and fd > 1024
# /usr/libexec/qemu-kvm -cpu SandyBridge -m 2G -smp 2,sockets=1,cores=2,threads=1 -enable-kvm  -name testovs -rtc base=localtime,clock=host,driftfix=slew -drive file=/home/RHEL-Server-6.4-64-virtio.qcow2,if=none,format=qcow2,werror=stop,rerror=stop,cache=none,media=disk,id=drive-scsi0-disk0 -device virtio-scsi-pci,id=scsi0,addr=0x4 -device scsi-hd,scsi-id=0,lun=0,bus=scsi0.0,drive=drive-scsi0-disk0,id=virtio-disk0,bootindex=1 -nodefaults -nodefconfig -monitor stdio   -netdev tap,id=macvtap_netdev,fd=1025 -device virtio-net-pci,netdev=macvtap_netdev,mac=92:65:21:bf:09:3c 1025<>/dev/tap9 -vnc :20 -vga std -boot menu=on
3.

Actual results:
The guest starts booting, and shortly afterwards both qemu and the guest hang.

Expected results:
No hang or error.

Additional info:

Comment 2 Ronen Hod 2013-09-11 18:32:37 UTC
Deferring to 6.6 since it is an old bug (not a regression), and the backport is not trivial. Nobody really needs 1024 FDs.
See https://bugzilla.redhat.com/show_bug.cgi?id=892977

Comment 4 Chao Yang 2014-07-22 09:49:06 UTC
*** Bug 1121589 has been marked as a duplicate of this bug. ***

Comment 6 Ronen Hod 2014-08-20 18:32:37 UTC
A big (24 patches + tests) and sensitive fix. (pulls in all the async I/O that is used by storage/networking ...)
Needs a lot of QA, so we have to push it to 6.7, and maybe a Z-stream.

Comment 8 Fam Zheng 2015-02-27 06:58:57 UTC
*** Bug 1196955 has been marked as a duplicate of this bug. ***

Comment 9 Jeff Nelson 2015-03-12 22:01:16 UTC
Patches posted to rhvirt-patches

Comment 10 Jeff Nelson 2015-03-25 23:06:20 UTC
Fix included in qemu-kvm-0.12.1.2-2.462.el6

Comment 12 Chao Yang 2015-03-27 09:42:24 UTC
Reproduced with qemu-kvm-0.12.1.2-2.458.el6.x86_64.

Steps:
1. set limit of open files to 10240
2. set macvtap up
3. start a qemu-kvm instance with fd > 1024

Actual Result:

(qemu) qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/vl.c:4042: main_loop_wait: Assertion `ioh->fd < 1024' failed.

Program received signal SIGABRT, Aborted.
0x00007ffff4a5e625 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff4a5e625 in raise () from /lib64/libc.so.6
#1  0x00007ffff4a5fe05 in abort () from /lib64/libc.so.6
#2  0x00007ffff4a5774e in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff4a57810 in __assert_fail () from /lib64/libc.so.6
#4  0x00007ffff7db2cd6 in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4042
#5  0x00007ffff7dd622a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2258
#6  0x00007ffff7db74a7 in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4285
#7  main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6742




Verified as passing with qemu-kvm-0.12.1.2-2.462.el6.x86_64. Neither a crash nor a hang occurred, and the guest obtained a usable IP address.

CLI:

/usr/libexec/qemu-kvm -name sriov-test -S -M rhel6.6.0  ... -netdev tap,id=hostnet0,vhost=on,fd=1025 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=b2:8b:46:58:ab:f1,bus=pci.0,addr=0x3 1025<>/dev/tap11 

As per above, this issue has been fixed correctly.

Moving to VERIFIED.

Comment 14 errata-xmlrpc 2015-07-22 06:03:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1275.html