Bug 892977
| Summary: | qemu crashed when using macvtap with fd number is over 1024( already ulimit "open files" to 10240) | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Qian Guo <qiguo> | ||||
| Component: | qemu-kvm | Assignee: | Amos Kong <akong> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 7.0 | CC: | ailan, chayang, hhuang, jasowang, juli, juzhang, michen, mrezanin, mst, pbonzini, rhod, virt-maint | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | qemu-kvm-1.5.0-1.el7 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2014-06-13 09:32:01 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
The system has this default limit:
# ulimit -a|grep open
open files (-n) 1024
So I raised the limit like this:
# ulimit -n 10240
Then I tried to launch a guest with a macvtap device using fd=1032, and qemu crashed like this:
# /usr/libexec/qemu-kvm -cpu SandyBridge -m 2G -smp 2,sockets=1,cores=2,threads=1 -enable-kvm -name testovs -rtc base=localtime,clock=host,driftfix=slew -drive file=/home/rhel6u4_64.qcow2,if=none,format=qcow2,werror=stop,rerror=stop,cache=none,media=disk,id=drive-scsi0-disk0 -device virtio-scsi-pci,id=scsi0,addr=0x4 -device scsi-hd,scsi-id=0,lun=0,bus=scsi0.0,drive=drive-scsi0-disk0,id=virtio-disk0,bootindex=1 -nodefaults -nodefconfig -monitor stdio -netdev tap,id=macvtap_netdev,fd=1032 -device virtio-net-pci,netdev=macvtap_netdev,mac=da:da:d3:7d:60:55 1032<>/dev/tap1032 -vnc :10 -vga std -boot menu=on
*** buffer overflow detected ***: /usr/libexec/qemu-kvm terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7fb2c15374a7]
/lib64/libc.so.6(+0x3805b08620)[0x7fb2c1535620]
/lib64/libc.so.6(+0x3805b0a417)[0x7fb2c1537417]
/usr/libexec/qemu-kvm(+0x19b0dd)[0x7fb2c6afe0dd]
/usr/libexec/qemu-kvm(+0x1a9238)[0x7fb2c6b0c238]
/usr/libexec/qemu-kvm(main+0x1029)[0x7fb2c69e1379]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fb2c144ea05]
/usr/libexec/qemu-kvm(+0x827a9)[0x7fb2c69e57a9]
======= Memory map: ========
7fb22a818000-7fb22a82d000 r-xp 00000000 fd:01 1049991 /usr/lib64/libgcc_s-4.7.2-20121109.so.1
7fb22a82d000-7fb22aa2c000 ---p 00015000 fd:01 1049991 /usr/lib64/libgcc_s-4.7.2-20121109.so.1
.......
7fff74e33000-7fff74e54000 rw-p 00000000 00:00 0 [stack]
7fff74ebc000-7fff74ebd000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted (core dumped)
I will attach this crash message as a txt file.
Created attachment 674733 [details]
qemu crashes when ulimit "fd" to 10240
QEMU does not have direct support for macvtap, so we use the tun/tap configuration interface. The fd number is not related to the character device /dev/tapN (these char devices are created together with the macvtap network interfaces, with N corresponding to the network interface index of the new macvtap endpoint).
***
If I boot a guest like this, no issue occurs:
<qemu-kvm>... -device virtio-net-pci,netdev=macvtap_netdev,mac=da:da:d3:7d:60:59 -netdev tap,id=macvtap_netdev,fd=10 10<>/dev/tap5010 ...
***
So this bug is related only to the file descriptor number, and I edited the bug summary to "qemu crashed when using macvtap with fd number is over 1024( already ulimit "open files" to 10240)".
Create _one_ macvtap device
# ip link add link eth0 name vepa1 type macvtap mode vepa
# ip -d link show vepa1
134: vepa1@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 500
link/ether f2:24:ee:69:4e:ec brd ff:ff:ff:ff:ff:ff
macvtap mode vepa
# ls /dev/tap*
/dev/tap134 /dev/tap7
# file /dev/tap*
/dev/tap134: character special
/dev/tap7: empty
# qemu-kvm -device virtio-net-pci,netdev=macvtap_netdev,mac=f2:24:ee:69:4e:ec -netdev tap,id=macvtap_netdev,fd=1023 1023<>/dev/tap134 -vnc :0
(fine)
# qemu-kvm -device virtio-net-pci,netdev=macvtap_netdev,mac=f2:24:ee:69:4e:ec -netdev tap,id=macvtap_netdev,fd=1024 1024<>/dev/tap134 -vnc :0
bash: 1024: Bad file descriptor
# ulimit -a
open files (-n) 1024
max user processes (-u) 1024
# ulimit -n 1025
# ulimit -u 1025
# qemu-kvm -device virtio-net-pci,netdev=macvtap_netdev,mac=f2:24:ee:69:4e:ec -netdev tap,id=macvtap_netdev,fd=1024 1024<>/dev/tap134 -vnc :0
(fine)
# qemu-kvm -device virtio-net-pci,netdev=macvtap_netdev,mac=f2:24:ee:69:4e:ec -netdev tap,id=macvtap_netdev,fd=1025 1025<>/dev/tap134 -vnc :0
bash: 1025: Bad file descriptor
In fd=1024, '1024' is an index into the fd table, so the number you use must be smaller than the open files limit (ulimit -n).
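The rule observed in the transcript above (fd=1023 works with a limit of 1024, fd=1024 needs a limit of at least 1025) can be sketched in Python as a small check. This is only an illustration: the helper name `fd_within_open_files_limit` and its `soft_limit` parameter are hypothetical and are not part of qemu or bash. It relies on the standard `resource.getrlimit(resource.RLIMIT_NOFILE)` call, which returns the same soft limit that `ulimit -n` reports.

```python
import resource


def fd_within_open_files_limit(fd, soft_limit=None):
    """Return True if `fd` is a usable descriptor number under the soft limit.

    Hypothetical helper: descriptor numbers are indexes into the per-process
    fd table, so valid numbers run from 0 to (limit - 1).
    """
    if soft_limit is None:
        # Same value that `ulimit -n` prints for the current process.
        soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return fd >= 0
    return 0 <= fd < soft_limit


if __name__ == "__main__":
    # The fd numbers tried in the transcript above.
    for fd in (1023, 1024, 1025):
        print(fd, fd_within_open_files_limit(fd))
```

Passing this check only explains the "Bad file descriptor" message from bash; it says nothing about whether qemu itself can handle the descriptor.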
I tried to reproduce the crash in comment #2:
- fedora 18: qemu-kvm-1.2.0-23.fc18.x86_64 (can reproduce)
- rhel6: qemu-kvm-0.12.1.2-2.351.el6.x86_64 (couldn't reproduce)
- upstream qemu: (4b274b1603e1d15ef51aedc8b6b7ebbae0b555ce) (couldn't reproduce)
- rhel7: git://git.app.eng.bos.redhat.com/virt/rhel7/qemu-kvm.git (couldn't reproduce)
>>> rhel7: qemu-kvm-1.2.0-21.el7.x86_64 (official?) (can reproduce)
(gdb) bt
#0 0x00007ffff3356ba5 in raise () from /lib64/libc.so.6
#1 0x00007ffff3358358 in abort () from /lib64/libc.so.6
#2 0x00007ffff33963eb in __libc_message () from /lib64/libc.so.6
#3 0x00007ffff342b4a7 in __fortify_fail () from /lib64/libc.so.6
#4 0x00007ffff3429620 in __chk_fail () from /lib64/libc.so.6
#5 0x00007ffff342b417 in __fdelt_warn () from /lib64/libc.so.6
#6 0x000055555564725d in qemu_iohandler_poll (readfds=readfds@entry=0x555556009b60 <rfds>, writefds=writefds@entry=0x555556009be0 <wfds>, xfds=xfds@entry=0x555556009c60 <xfds>, ret=ret@entry=2) at iohandler.c:156
#7 0x00005555556ecae8 in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:497
#8 0x00005555555cb6e3 in main_loop () at /usr/src/debug/qemu-kvm-1.2.0/vl.c:1643
#9 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /usr/src/debug/qemu-kvm-1.2.0/vl.c:3790
(gdb) frame 6
#6 0x000055555564725d in qemu_iohandler_poll (readfds=readfds@entry=0x555556009b60 <rfds>, writefds=writefds@entry=0x555556009be0 <wfds>, xfds=xfds@entry=0x555556009c60 <xfds>, ret=ret@entry=2) at iohandler.c:156
156 if (!ioh->deleted && ioh->fd_read && FD_ISSET(ioh->fd, readfds)) {
(gdb) l
151 {
152     if (ret > 0) {
153         IOHandlerRecord *pioh, *ioh;
154
155         QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
156             if (!ioh->deleted && ioh->fd_read && FD_ISSET(ioh->fd, readfds)) {
157                 ioh->fd_read(ioh->opaque);
158             }
159             if (!ioh->deleted && ioh->fd_write && FD_ISSET(ioh->fd, writefds)) {
160                 ioh->fd_write(ioh->opaque);
The problem can be reproduced in qemu-upstream.
# ./configure --target-list='x86_64-softmmu'
QEMU adds the tap fd to a set for synchronous I/O, so the fd must be less than FD_SETSIZE (1024 on Linux).
glibc definition of __fdelt_warn:
http://felix-lang.org/$/usr/include/x86_64-linux-gnu/bits/select2.h
So the solution for this bug is to add a limit check when initializing the tap device and setting the fd handler.
Posted a patch:
http://marc.info/?l=qemu-devel&m=135910170408260&w=3
The crash is due to the fixed size of the fd_set type used for select(2) event polling. Stefan posted a series to convert select() to g_poll().
http://marc.info/?l=qemu-devel&m=135962966729930&w=3
http://marc.info/?l=qemu-devel&m=136135632516801
[PATCH v4 00/10] main-loop: switch to g_poll(3) on POSIX hosts
The patchset was applied upstream.
Built in qemu-kvm-1.5.0-1.el7
Reproduce this bug:
Version-Release number of selected component (if applicable):
qemu-kvm-1.4.0-4.el7.x86_64
3.10.0-65.el7.x86_64
---
1. Create macvtap devices until /dev/tap1024 exists.
I created 5000 macvtap devices with this script:
#!/bin/sh
for i in $(seq 5000)
do
ip link add link p4096p4 name vepa$i type macvtap mode vepa
echo $i
done
And under /dev/ there are corresponding tap devices.
One macvtap device (fd=1029) is listed below:
# ip -d link show vepa1020
1029: vepa1020@p4096p4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 500
link/ether ea:ab:13:6f:1e:9f brd ff:ff:ff:ff:ff:ff promiscuity 0
macvtap mode vepa
2. Start qemu-kvm with the macvtap device, using a command line like this:
<cli>:
# gdb --args /usr/libexec/qemu-kvm -cpu SandyBridge -m 2G -smp 2,sockets=1,cores=2,threads=1 -enable-kvm -name testovs -rtc base=localtime,clock=host,driftfix=slew -drive file=/home/RHEL-Server-7.0-64.qcow2_v3,if=none,format=qcow2,werror=stop,rerror=stop,cache=none,media=disk,id=drive-scsi0-disk0 -device virtio-scsi-pci,id=scsi0,addr=0x4 -device scsi-hd,scsi-id=0,lun=0,bus=scsi0.0,drive=drive-scsi0-disk0,id=virtio-disk0,bootindex=1 -nodefaults -nodefconfig -monitor stdio -netdev tap,id=macvtap_netdev,fd=1029 -device virtio-net-pci,netdev=macvtap_netdev,mac=ea:ab:13:6f:1e:9f 1029<>/dev/tap1029 -vnc :10 -vga std -boot menu=on
---
After step 2, qemu-kvm dumps core.
(gdb) bt
#0 0x00007ffff3b35979 in raise () from /lib64/libc.so.6
#1 0x00007ffff3b37088 in abort () from /lib64/libc.so.6
#2 0x00007ffff3b76127 in __libc_message () from /lib64/libc.so.6
#3 0x00007ffff3c0db07 in __fortify_fail () from /lib64/libc.so.6
#4 0x00007ffff3c0bcd0 in __chk_fail () from /lib64/libc.so.6
#5 0x00007ffff3c0da77 in __fdelt_warn () from /lib64/libc.so.6
#6 0x00005555556a6069 in qemu_iohandler_poll ()
#7 0x00005555556ab62e in main_loop_wait ()
#8 0x00005555555bba6d in main ()
-------
So the issue is reproduced.
---------
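The __fdelt_warn frame in the backtrace above is glibc's fortified bounds check on the fd_set index, which aborts the process when the fd is not below FD_SETSIZE. The difference between a select()-style and a poll()-style wait can be sketched with Python's standard `select` module. This is an analogy under stated assumptions, not qemu's code: `FD_SETSIZE = 1024` is assumed from the value given in this bug, and `fits_in_fd_set`, `wait_readable` and `demo` are hypothetical names. CPython's `select.select()` raises ValueError for an out-of-range fd instead of aborting.

```python
import os
import resource
import select

# Assumption: glibc's FD_SETSIZE on Linux, as stated in this bug.
FD_SETSIZE = 1024


def fits_in_fd_set(fd, setsize=FD_SETSIZE):
    """Return True if `fd` can be stored in a fixed-size fd_set."""
    return 0 <= fd < setsize


def wait_readable(fd, timeout_ms=0):
    """Readiness check using poll(), which takes fd numbers of any size."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return bool(poller.poll(timeout_ms))


def demo(high_fd=1029):
    """Try both interfaces on a descriptor number above FD_SETSIZE."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit != resource.RLIM_INFINITY and soft_limit <= high_fd:
        print("raise 'ulimit -n' above", high_fd, "to run this demo")
        return
    read_end, write_end = os.pipe()
    os.dup2(read_end, high_fd)  # same pipe, now also at a high fd number
    os.write(write_end, b"x")
    try:
        select.select([high_fd], [], [], 0)
        print("select() accepted fd", high_fd)
    except ValueError as err:
        # Python refuses the fd; fortified C code aborts at this point.
        print("select() rejected fd", high_fd, "-", err)
    print("poll() reports readable:", wait_readable(high_fd))
    for fd in (high_fd, read_end, write_end):
        os.close(fd)


if __name__ == "__main__":
    demo()
```

Running the demo needs `ulimit -n` raised above the chosen fd, as in the reproduction steps. With the default limit it only prints a hint.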
Verify this bug:
Version-Release number of selected component (if applicable):
qemu-kvm-1.5.3-31.el7.x86_64
3.10.0-65.el7.x86_64
Same steps as in "Reproduce this bug". After step 2, log in to the guest; the guest can get an IP address.
The guest and host work well.
As shown above, this bug has been verified.
This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request. |
Description of problem:
Cannot start qemu-kvm when the command line uses "... -netdev tap,fd= ..." with fd>=1024; it prints:
-bash: 1024: Bad file descriptor
Version-Release number of selected component (if applicable):
# uname -r
3.6.0-0.29.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.3.0-3.el7.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Create macvtap devices until /dev/tap1024 exists.
I created 5000 macvtap devices with this script:
#!/bin/sh
for i in $(seq 5000)
do
ip link add link em1 name vepa$i type macvtap mode vepa
echo $i
done
And under /dev/ there are corresponding tap devices.
One macvtap device (fd=1024) is listed below:
# ip -d link show vepa1020
1024: vepa1020@em1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500
link/ether 52:cd:8d:ce:64:35 brd ff:ff:ff:ff:ff:ff
macvtap mode vepa
2. Start qemu-kvm with the macvtap device, using a command line like this:
# ...-netdev tap,id=macvtap_netdev,fd=1024 -device virtio-net-pci,netdev=macvtap_netdev,mac=52:cd:8d:ce:64:35 1024<>/dev/tap1024 ...
Actual results:
Could not start the qemu process; step 2 returns:
-bash: 1024: Bad file descriptor
Expected results:
Can launch a guest with tap when fd>=1024.
Additional info:
1. If I delete all the macvtap devices (and the corresponding tap devices), then recreate one macvtap and verify that its fd number is still greater than 1024, I hit the same issue. So it is not related to the number of tap devices.
2. When fd<1024, the guest runs well.
BTW, is there a limit on the number of macvtaps/taps per physical NIC and per host?