Bug 1003535 - qemu-kvm core dump when boot vm with more than 32 virtio disks/nics
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: x86_64 Linux
Priority: high    Severity: high
Target Milestone: rc
Assigned To: Marcel Apfelbaum
QA Contact: Virtualization Bugs
Duplicates: 1025680
Reported: 2013-09-02 06:01 EDT by Xu Han
Modified: 2014-06-17 23:35 EDT
CC: 14 users
Fixed In Version: qemu-kvm-1.5.3-39.el7
Doc Type: Bug Fix
Last Closed: 2014-06-13 05:41:44 EDT
Type: Bug

Attachments:
cli - boot vm (15.23 KB, text/plain), 2013-09-02 06:07 EDT, Xu Han

Description Xu Han 2013-09-02 06:01:17 EDT
Description of problem:
qemu-kvm dumps core when booting a VM with 88 virtio disks; both Linux and Windows guests hit the same issue.

Version-Release number of selected component (if applicable):
kernel: 3.10.0-11.el7.x86_64
qemu: qemu-kvm-1.5.2-3.el7.x86_64

How reproducible:
80%

Steps to Reproduce:
1. Boot the VM with the command line in the attachment.

Actual results:
QEMU 1.5.2 monitor - type 'help' for more information
(qemu) [New Thread 0x7fffdb7bb700 (LWP 3638)]
[New Thread 0x7fffdafba700 (LWP 3639)]
[New Thread 0x7fffd91ff700 (LWP 3640)]

(qemu) [Thread 0x7fffeb316700 (LWP 3637) exited]

(qemu) qemu-kvm: /builddir/build/BUILD/qemu-1.5.2/exec.c:748: register_subpage: Assertion `existing->mr->subpage || existing->mr == &io_mem_unassigned' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffdb7bb700 (LWP 3638)]
0x00007ffff32e4999 in raise () from /lib64/libc.so.6
(gdb) bt 
#0  0x00007ffff32e4999 in raise () from /lib64/libc.so.6
#1  0x00007ffff32e60a8 in abort () from /lib64/libc.so.6
#2  0x00007ffff32dd906 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff32dd9b2 in __assert_fail () from /lib64/libc.so.6
#4  0x000055555573b25c in register_subpage ()
#5  0x000055555573b482 in mem_add ()
#6  0x000055555578c032 in address_space_update_topology_pass.isra.5 ()
#7  0x000055555578ce8d in memory_region_transaction_commit ()
#8  0x0000555555681afc in pci_default_write_config ()
#9  0x00005555556bbfaa in virtio_write_config ()
#10 0x000055555578a8b2 in access_with_adjusted_size ()
#11 0x000055555578bd87 in memory_region_iorange_write ()
#12 0x000055555578962d in kvm_cpu_exec ()
#13 0x0000555555734545 in qemu_kvm_cpu_thread_fn ()
#14 0x00007ffff625dde3 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ffff33a50ad in clone () from /lib64/libc.so.6


Expected results:
The VM boots with no errors.

Additional info:
A RHEL 6 host hit this bug as well -> Bug 753692
Comment 1 Xu Han 2013-09-02 06:07:06 EDT
Created attachment 792786 [details]
cli - boot vm
Comment 3 Amos Kong 2013-09-26 05:36:55 EDT
qemu-kvm-rhel6 & rhel7 guest : can't reproduce
qemu-upstream & rhel7 guest : can reproduce
qemu-upstream & rhel6 guest : can't reproduce

The problem occurs when adding the 33rd disk; the qemu crash should be fixed.
Comment 4 Amos Kong 2013-09-26 11:33:49 EDT
This crash is caused by the physical section number unexpectedly growing larger than TARGET_PAGE_SIZE (4096).

The assert check was added in the following commit:

commit 68f3f65b09a1ce8c82fac17911ffc3bb6031ebe4
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Tue May 7 11:30:23 2013 +0200

    memory: assert that PhysPageEntry's ptr does not overflow
    
    While sized to 15 bits in PhysPageEntry, the ptr field is ORed into the
    iotlb entries together with a page-aligned pointer.  The ptr field must
    not overflow into this page-aligned value, assert that it is smaller than
    the page size.
    
    Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

diff --git a/exec.c b/exec.c
index 1355661..8562fca 100644
--- a/exec.c
+++ b/exec.c
@@ -713,6 +713,12 @@ static void destroy_all_mappings(AddressSpaceDispatch *d)
 
 static uint16_t phys_section_add(MemoryRegionSection *section)
 {
+    /* The physical section number is ORed with a page-aligned
+     * pointer to produce the iotlb entries.  Thus it should
+     * never overflow into the page-aligned value.
+     */
+    assert(phys_sections_nb < TARGET_PAGE_SIZE);
+
     if (phys_sections_nb == phys_sections_nb_alloc) {
         phys_sections_nb_alloc = MAX(phys_sections_nb_alloc * 2, 16);
         phys_sections = g_renew(MemoryRegionSection, phys_sections,
Comment 5 Amos Kong 2013-09-26 22:01:14 EDT
This bug can be reproduced by launching a guest with 33 virtio-net NICs.

/home/devel/qemu/x86_64-softmmu/qemu-system-x86_64 --enable-kvm -m 2000 /images/RHEL-Server-6.4-64-virtio.qcow2  \
-monitor stdio \
-netdev tap,id=net-virtio0-0-1 -device virtio-net-pci,netdev=net-virtio0-0-8,id=virti0-0-1,multifunction=on,addr=0x04.0 \
.....
-netdev tap,id=net-virtio0-0-33 -device virtio-net-pci,netdev=net-virtio0-0-33,id=virti0-0-33,multifunction=on,addr=0x08.0 \
Comment 6 Amos Kong 2013-09-26 23:29:51 EDT
I tested with upstream guest kernels (v2.6.12, v2.6.22, v2.6.32, v2.6.38, v3.0, v3.1 ... v3.10); this issue is hit 100% of the time.

It's strange that we can't hit this problem with a RHEL 6 guest (2.6.32-419.el6).
Comment 7 Amos Kong 2013-09-26 23:52:22 EDT
Paolo, any thoughts?
Comment 8 Paolo Bonzini 2013-09-27 06:06:34 EDT
Looks like there are too many BARs.

You could do something like

    if (tcg_enabled()) {
        /* The physical section number is ORed with a page-aligned
         * pointer to produce the iotlb entries.  Thus it should
         * never overflow into the page-aligned value.
         */
        assert(phys_sections_nb < TARGET_PAGE_SIZE);
    } else {
        /* For KVM or Xen we can use the full range of the ptr field
         * in PhysPageEntry.
         */
        assert(phys_sections_nb <= SHRT_MAX);
    }
Comment 9 Paolo Bonzini 2013-09-27 06:07:40 EDT
This should bring the limit up by a factor of 8 (32767 / 4096), i.e. 32*8 = 256.  Some care is still necessary when you have bridges, but it should be much better.
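That estimate can be checked with a back-of-envelope sketch (hypothetical helper names; the accelerator check is reduced to a plain parameter, and the per-device section cost is inferred from the "32 devices under 4096 sections" observation):

```c
#include <limits.h>

#define TARGET_PAGE_SIZE 4096

/* Section cap as a function of the accelerator, mirroring the
 * suggested patch: TCG needs the index to stay below the page
 * size, while KVM/Xen can use the full 15-bit ptr field. */
static int section_limit(int tcg)
{
    return tcg ? TARGET_PAGE_SIZE : SHRT_MAX;
}

/* If ~32 devices fit under a 4096-section cap, each device costs
 * ~128 sections; the same cost under SHRT_MAX allows ~256. */
static int device_limit(int tcg)
{
    int per_device = TARGET_PAGE_SIZE / 32;   /* ~128 sections/device */
    return section_limit(tcg) / per_device;
}
```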
Comment 10 Amos Kong 2013-11-01 05:47:46 EDT
*** Bug 1025680 has been marked as a duplicate of this bug. ***
Comment 11 Amos Kong 2013-11-18 07:50:46 EST
hi xuhan,

Can you reproduce this bug with the latest qemu-kvm-rhel7?
I can't reproduce it with qemu-upstream (1.6.0, 1.5.0, 1.5.1, 1.5.2, 1.5.3).
Thanks
Comment 12 Xu Han 2013-11-18 21:17:39 EST
hi amos,

Tested twice with qemu-kvm-rhev-1.5.3-19.el7.x86_64.

Hit this issue when 87 virtio disks were attached:
qemu-kvm: /builddir/build/BUILD/qemu-1.5.3/exec.c:762: register_subpage: Assertion `existing->mr->subpage || existing->mr == &io_mem_unassigned' failed.
core: line 90:  5852 Aborted                 (core dumped)

But did not hit it with 51 disks attached:
# ls /dev/vd* | wc -l
51

best regards,
xuhan
Comment 13 Amos Kong 2013-11-19 05:52:26 EST
(In reply to xuhan from comment #12)
> hi amos,
> 
> Tested 2 times with qemu-kvm-rhev-1.5.3-19.el7.x86_64 .

Thanks for your confirmation.

I tested with a RHEL 6 guest in comment #11. I can reproduce with a RHEL 7 guest (with both the latest qemu-upstream and qemu-kvm-rhel7).

The internal qemu crash is at this point:
 register_subpage: Assertion `existing->mr->subpage || existing->mr == &io_mem_unassigned' failed.

Upstream qemu has a new check from commit 68f3f65b (memory: assert that PhysPageEntry's ptr does not overflow),
so it crashes at another point:
  phys_section_add: Assertion `next_map.sections_nb < (1 << 12)' failed.
Comment 14 Amos Kong 2013-11-19 06:13:22 EST
Tested with the latest guest kernel (3.10.0-rc5):

1. With the assert() in phys_section_add:
   assert(next_map.sections_nb < TARGET_PAGE_SIZE);

   the crash occurred.

2. With the alternative assert() in phys_section_add:
   assert(next_map.sections_nb < SHRT_MAX);

   the crash occurred.

3. Without any assert on next_map.sections_nb in phys_section_add,
   the crash occurred at register_subpage():

   assert(existing->mr->subpage || existing->mr == &io_mem_unassigned);
Comment 15 Paolo Bonzini 2013-11-19 06:43:54 EST
> 3. without this assert of next_map.sections_nb in phys_section_add
>    crash occurred at register_subpage():
>    assert(existing->mr->subpage || existing->mr == &io_mem_unassigned);

What is the backtrace here?

The cause could be INT64_MAX instead of UINT64_MAX in hw/i386/pc_piix.c:

        memory_region_init(pci_memory, "pci", INT64_MAX);

and similarly in pc_q35.c.
Comment 16 Amos Kong 2013-11-27 10:24:44 EST
(In reply to Paolo Bonzini from comment #15)
> > 3. without this assert of next_map.sections_nb in phys_section_add
> >    crash occurred at register_subpage():
> >    assert(existing->mr->subpage || existing->mr == &io_mem_unassigned);
> 
> What is the backtrace here?

I applied your fix ([PATCH] extend limit of physical sections number),
and qemu then crashes at exec.c:802 in register_subpage:

qemu-system-x86_64: /home/devel/qemu/exec.c:802: register_subpage: Assertion `existing->mr->subpage || existing->mr == &io_mem_unassigned' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffebfff700 (LWP 25608)]
0x00007ffff4391a19 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff4391a19 in raise () from /lib64/libc.so.6
#1  0x00007ffff4393128 in abort () from /lib64/libc.so.6
#2  0x00007ffff438a986 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff438aa32 in __assert_fail () from /lib64/libc.so.6
#4  0x0000555555846786 in register_subpage (d=0x7fffe4052600, section=0x7fffebffe430) at /home/devel/qemu/exec.c:802
#5  0x0000555555846aba in mem_add (listener=0x555563934c68, section=0x7fffebffe5f0) at /home/devel/qemu/exec.c:842
#6  0x00005555558b9e33 in address_space_update_topology_pass (as=0x555563934c30, old_view=0x555565928d20, new_view=0x7fffe793d000, adding=true) at /home/devel/qemu/memory.c:735
#7  0x00005555558ba418 in address_space_update_topology (as=0x555563934c30) at /home/devel/qemu/memory.c:764
#8  0x00005555558ba587 in memory_region_transaction_commit () at /home/devel/qemu/memory.c:799
#9  0x00005555558bcf84 in memory_region_set_enabled (mr=0x555568c6c158, enabled=true) at /home/devel/qemu/memory.c:1503
#10 0x000055555571b28e in pci_default_write_config (d=0x555568c6be60, addr=4, val=0, l=2) at hw/pci/pci.c:1189
#11 0x0000555555781aea in virtio_write_config (pci_dev=0x555568c6be60, address=4, val=7, len=2) at hw/virtio/virtio-pci.c:459
#12 0x0000555555720094 in pci_host_config_write_common (pci_dev=0x555568c6be60, addr=4, limit=256, val=7, len=2) at hw/pci/pci_host.c:57
#13 0x00005555557201e4 in pci_data_write (s=0x555556418c10, addr=2147513092, val=7, len=2) at hw/pci/pci_host.c:84
#14 0x00005555557203a0 in pci_host_data_write (opaque=0x555556416640, addr=0, val=7, len=2) at hw/pci/pci_host.c:137
#15 0x00005555558b86af in memory_region_write_accessor (mr=0x555556418a30, addr=0, value=0x7fffebffeaa8, size=2, shift=0, mask=65535) at /home/devel/qemu/memory.c:440
#16 0x00005555558b87ec in access_with_adjusted_size (addr=0, value=0x7fffebffeaa8, size=2, access_size_min=1, access_size_max=4, access=
    0x5555558b861f <memory_region_write_accessor>, mr=0x555556418a30) at /home/devel/qemu/memory.c:477
#17 0x00005555558badb9 in memory_region_dispatch_write (mr=0x555556418a30, addr=0, data=7, size=2) at /home/devel/qemu/memory.c:984
#18 0x00005555558be040 in io_mem_write (mr=0x555556418a30, addr=0, val=7, size=2) at /home/devel/qemu/memory.c:1748
#19 0x000055555584949c in address_space_rw (as=0x5555561ef780 <address_space_io>, addr=3324, buf=0x7ffff7ff2000 "\a", len=2, is_write=true) at /home/devel/qemu/exec.c:1904
#20 0x00005555558b5075 in kvm_handle_io (port=3324, data=0x7ffff7ff2000, direction=1, size=2, count=1) at /home/devel/qemu/kvm-all.c:1542
#21 0x00005555558b5632 in kvm_cpu_exec (cpu=0x5555563fc3e0) at /home/devel/qemu/kvm-all.c:1680
#22 0x000055555583c3c0 in qemu_kvm_cpu_thread_fn (arg=0x5555563fc3e0) at /home/devel/qemu/cpus.c:872
#23 0x00007ffff625dc53 in start_thread () from /lib64/libpthread.so.0
#24 0x00007ffff4451e1d in clone () from /lib64/libc.so.6

> 
> The cause could be INT64_MAX instead of UINT64_MAX in hw/i386/pc_piix.c:
> 
>         memory_region_init(pci_memory, "pci", INT64_MAX);


After this change, the guest can add about 20 more devices, but it still crashes at the same point.

> 
> and similarly in pc_q35.c.
Comment 17 Amos Kong 2013-11-27 18:59:55 EST
It's a TCG memory-related issue; reassigning to Marcel as discussed on IRC.
Thanks.
Comment 20 Miroslav Rezanina 2014-01-21 04:16:48 EST
Fix included in qemu-kvm-1.5.3-39.el7
Comment 23 Jun Li 2014-02-10 02:57:04 EST
Reproduced this bug:
Version-Release number of selected component (if applicable):
qemu-kvm-1.5.3-37.el7.x86_64
3.10.0-79.el7.x86_64
---
Boot guest using the following script:
# cat bug1003535-mutifunction-on.sh 
#!/bin/bash
CLI="gdb --args /usr/libexec/qemu-kvm -M pc-i440fx-rhel7.0.0 -monitor stdio -enable-kvm -m 5G -smp 2,sockets=1,cores=2,threads=1 -name RHEL-Server-7.0-64 -boot c \
-drive file=/home/juli/rhel7.0.qcow2,if=none,id=drive-ide0-0-0,format=qcow2 -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -net none -spice disable-ticketing,port=5931 -vga qxl -serial unix:/tmp/virtio,server,nowait" 

for ((m=3;m<=13;m++)); do
    for ((i=0;i<=7;i++)); do
        k=`printf "%02x" $m`
        echo $k
        num=$(($i+($m-3)*8))
        echo $num
        CLI="$CLI -drive file=/home/disk/disk$num,if=none,id=drive-virtio0-0-$num,format=qcow2"
        CLI="$CLI -device virtio-blk-pci,drive=drive-virtio0-0-$num,id=virti0-0-$num,multifunction=on,addr=0x$k.$i"
    done
done

$CLI
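The slot/function arithmetic the shell loop performs can be mirrored in a small C sketch (hypothetical helper names, purely illustrative): 11 PCI slots (0x03..0x0d) with 8 functions each yields the 88 multifunction virtio-blk devices.

```c
/* Mirror of the reproducer's loop: device number num maps to
 * PCI slot 0x03 + num/8 and function num%8; 11 slots times 8
 * functions cover devices 0..87. */
static int slot_of(int num) { return 3 + num / 8; }
static int fn_of(int num)   { return num % 8; }

static int device_count(void)
{
    int count = 0;
    for (int slot = 3; slot <= 13; slot++)      /* 0x03 .. 0x0d */
        for (int fn = 0; fn <= 7; fn++)         /* multifunction */
            count++;                            /* one virtio-blk each */
    return count;
}
```

This comfortably exceeds the ~32-device threshold at which the section counter overflows, which is why the script reproduces the crash.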
------
(gdb) bt
#0  0x00007ffff2c9c979 in raise () from /lib64/libc.so.6
#1  0x00007ffff2c9e088 in abort () from /lib64/libc.so.6
#2  0x00007ffff2c958e6 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff2c95992 in __assert_fail () from /lib64/libc.so.6
#4  0x0000555555781aac in register_subpage ()
#5  0x0000555555781cd2 in mem_add ()
#6  0x00005555557d4cf2 in address_space_update_topology_pass.isra.5 ()
#7  0x00005555557d5b4d in memory_region_transaction_commit ()
#8  0x00005555556c1abc in pci_default_write_config ()
#9  0x00005555556f7afa in virtio_write_config ()
#10 0x00005555557d3572 in access_with_adjusted_size ()
#11 0x00005555557d4a47 in memory_region_iorange_write ()
#12 0x00005555557d2355 in kvm_cpu_exec ()
#13 0x000055555577a8c5 in qemu_kvm_cpu_thread_fn ()
#14 0x00007ffff604fde3 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ffff2d5d25d in clone () from /lib64/libc.so.6
-------
Based on the above test, this issue has been reproduced.
==================
Verified this bug:
Version-Release number of selected component (if applicable):
qemu-kvm-1.5.3-45.el7.x86_64
------
Steps as followings:

1,Boot guest using the following script:
# cat bug1003535-mutifunction-on.sh 
#!/bin/bash
CLI="gdb --args /usr/libexec/qemu-kvm -M pc-i440fx-rhel7.0.0 -monitor stdio -enable-kvm -m 5G -smp 2,sockets=1,cores=2,threads=1 -name RHEL-Server-7.0-64 -boot c \
-drive file=/home/juli/rhel7.0.qcow2,if=none,id=drive-ide0-0-0,format=qcow2 -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -net none -spice disable-ticketing,port=5931 -vga qxl -serial unix:/tmp/virtio,server,nowait" 

for ((m=3;m<=13;m++)); do
    for ((i=0;i<=7;i++)); do
        k=`printf "%02x" $m`
        echo $k
        num=$(($i+($m-3)*8))
        echo $num
        CLI="$CLI -drive file=/home/disk/disk$num,if=none,id=drive-virtio0-0-$num,format=qcow2"
        CLI="$CLI -device virtio-blk-pci,drive=drive-virtio0-0-$num,id=virti0-0-$num,multifunction=on,addr=0x$k.$i"
    done
done

$CLI
---------
2. Check the 88 disks inside the guest:
# ls /dev/vd* |wc -l
88
-----
Based on the above test, this issue has been verified.
Comment 25 Ludek Smid 2014-06-13 05:41:44 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
