Bug 895436

Summary: qemu-kvm core dumps when the guest does S3/S4 with the maximum (232) virtio block devices (multifunction=on)
Product: Red Hat Enterprise Linux 7
Reporter: Sibiao Luo <sluo>
Component: qemu-kvm
Assignee: Marcel Apfelbaum <marcel>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: high
Version: 7.0
CC: amit.shah, chayang, flang, hhuang, juli, juzhang, knoel, kwolf, marcel, michen, mrezanin, pbonzini, qzhang, rbalakri, sluo, virt-bugs, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-1.5.3-39.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-03-05 08:00:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 923626    
Attachments:
qemu-kvm command lines. (Flags: none)

Description Sibiao Luo 2013-01-15 08:49:57 UTC
Description of problem:
Boot a guest with the maximum (232) virtio block devices (multifunction=on), suspend it to S3, then resume it: qemu-kvm core dumps on resume.
BTW, I also tried a RHEL 6.4 host, which does not have this issue.

Version-Release number of selected component (if applicable):
host info:
kernel-3.6.0-0.29.el7.x86_64
qemu-kvm-1.3.0-3.el7.x86_64
spice-server-0.12.0-1.el7.x86_64
virt-viewer-0.5.4-2.el7.x86_64
guest info:
RHEL6.4
kernel-2.6.32-353.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with the maximum (232) virtio block devices (multifunction=on).
e.g.: see the attached qemu-kvm command line.
2. Connect to the VM via remote-viewer.
# remote-viewer spice://$host_ip:$port
3. Check that all block devices are recognized in the guest and in the HMP monitor.
guest] ls -l /dev/vd* | wc -l
(qemu) info block
4. Do S3.
# pm-suspend
5. Resume the guest from S3.
e.g.: press any key.
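
For readers without access to the attachment, the device part of such a command line can be generated instead of written by hand. The following is only a minimal sketch and not the attached command line: it assumes slots 0x03 through 0x1f on bus pci.0 are free (29 slots x 8 functions = 232), and the image paths, the IDs and the function name gen_virtio_blk_args are made up for illustration. It prints arguments and does not start QEMU.

```shell
#!/bin/sh
# Print one "-drive ... -device ..." argument pair per PCI function for
# slots 0x03..0x1f (assumed free; 29 slots * 8 functions = 232 devices).
# Image paths and IDs are illustrative only.
gen_virtio_blk_args() {
    slot=3
    while [ "$slot" -le 31 ]; do
        hex=$(printf '%x' "$slot")
        fn=0
        while [ "$fn" -le 7 ]; do
            printf '%s\n' "-drive file=/tmp/disk-$hex-$fn.qcow2,if=none,format=qcow2,id=drv-$hex-$fn -device virtio-blk-pci,drive=drv-$hex-$fn,id=dev-$hex-$fn,bus=pci.0,addr=0x$hex.$fn,multifunction=on"
            fn=$((fn + 1))
        done
        slot=$((slot + 1))
    done
}

# Count the generated device lines (29 * 8).
gen_virtio_blk_args | wc -l
```

The output lines would still have to be appended to a base command line such as the one in the attachment, and the disk images would have to exist first.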
  
Actual results:
After step 3, both the guest and the HMP monitor show 232 disks.
After step 5, qemu-kvm core dumps.
(qemu) qemu-kvm: /builddir/build/BUILD/qemu-1.3.0/exec.c:2273: register_subpage: Assertion `existing->mr->subpage || existing->mr == &io_mem_unassigned' failed.
multifunction_with_virtio_blk.sh: line 234:  4094 Aborted                 (core dumped)

Expected results:
The guest resumes from S3 with the maximum (232) virtio block devices (multifunction=on) without any core dump.

Additional info:

Comment 1 Sibiao Luo 2013-01-15 08:52:00 UTC
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -M pc-1.2 -cpu SandyBridge -enable-kvm -m 4096 -smp 4,soc'.
Program terminated with signal 6, Aborted.
#0  0x00007f13478d1ba5 in raise () from /lib64/libc.so.6

(gdb) bt
#0  0x00007f13478d1ba5 in raise () from /lib64/libc.so.6
#1  0x00007f13478d3358 in abort () from /lib64/libc.so.6
#2  0x00007f13478ca972 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f13478caa22 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f134cff90a6 in register_subpage (d=d@entry=0x7f134e918760, section=section@entry=0x7f133adfb6f0) at /usr/src/debug/qemu-1.3.0/exec.c:2273
#5  0x00007f134cff916c in mem_add (listener=0x7f134e918768, section=<optimized out>) at /usr/src/debug/qemu-1.3.0/exec.c:2313
#6  0x00007f134d048c3c in address_space_update_topology_pass (as=as@entry=0x7f134e911fc0, adding=adding@entry=true, old_view=..., new_view=...)
    at /usr/src/debug/qemu-1.3.0/memory.c:711
#7  0x00007f134d049a9a in address_space_update_topology (as=0x7f134e911fc0) at /usr/src/debug/qemu-1.3.0/memory.c:726
#8  memory_region_transaction_commit () at /usr/src/debug/qemu-1.3.0/memory.c:750
#9  memory_region_transaction_commit () at /usr/src/debug/qemu-1.3.0/memory.c:739
#10 0x00007f134cf181cf in pci_default_write_config (d=d@entry=0x7f134e911da0, addr=addr@entry=4, val=0, l=l@entry=2) at hw/pci.c:1079
#11 0x00007f134cf67325 in virtio_write_config (pci_dev=0x7f134e911da0, address=4, val=<optimized out>, len=2) at hw/virtio-pci.c:456
#12 0x00007f134d047322 in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7f133adfbb38, size=2, access_size_min=<optimized out>, 
    access_size_max=<optimized out>, access=access@entry=0x7f134d047940 <memory_region_write_accessor>, opaque=opaque@entry=0x7f134e844528)
    at /usr/src/debug/qemu-1.3.0/memory.c:364
#13 0x00007f134d048997 in memory_region_iorange_write (iorange=<optimized out>, offset=0, width=2, data=263) at /usr/src/debug/qemu-1.3.0/memory.c:439
#14 0x00007f134d0457c6 in kvm_handle_io (count=1, size=2, direction=1, data=<optimized out>, port=3324) at /usr/src/debug/qemu-1.3.0/kvm-all.c:1426
#15 kvm_cpu_exec (env=env@entry=0x7f134e805e00) at /usr/src/debug/qemu-1.3.0/kvm-all.c:1571
#16 0x00007f134cff26d1 in qemu_kvm_cpu_thread_fn (arg=0x7f134e805e00) at /usr/src/debug/qemu-1.3.0/cpus.c:757
#17 0x00007f134b047d15 in start_thread () from /lib64/libpthread.so.0
#18 0x00007f134798e27d in clone () from /lib64/libc.so.6
(gdb) q
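
As an aid for reading frames #10 to #14: the guest writes a 2-byte value to I/O port 3324, and that write reaches pci_default_write_config() with addr=4. Port 3324 is 0xcfc, the conventional PCI configuration data port, and offset 4 of PCI config space is the command register. The sketch below decodes such a value; the helper name decode_pci_command is hypothetical and it covers only four well-known bits.

```shell
#!/bin/sh
# Decode a 16-bit PCI command register value into the names of a few
# well-known bits (bit 0: I/O space, bit 1: memory space,
# bit 2: bus master, bit 8: SERR# enable). Other bits are ignored.
decode_pci_command() {
    val=$1
    out=""
    [ $((val & 0x001)) -ne 0 ] && out="$out io"
    [ $((val & 0x002)) -ne 0 ] && out="$out mem"
    [ $((val & 0x004)) -ne 0 ] && out="$out busmaster"
    [ $((val & 0x100)) -ne 0 ] && out="$out serr"
    printf '%s\n' "${out# }"
}

printf 'port 0x%x\n' 3324   # I/O port from frame #14
decode_pci_command 263      # data value from frame #13
```

Read this way, the crash happens while the guest is rewriting the command register of a virtio PCI function, which makes QEMU rebuild its memory map (memory_region_transaction_commit in frames #8 and #9).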

Comment 2 Sibiao Luo 2013-01-15 08:54:36 UTC
Created attachment 678676 [details]
qemu-kvm command lines.

Comment 3 Sibiao Luo 2013-01-15 09:15:55 UTC
qemu-kvm also core dumps when the guest does S4 with the maximum (232) virtio block devices (multifunction=on); the backtrace is the same as in comment #1.

(qemu) qemu-kvm: /builddir/build/BUILD/qemu-1.3.0/exec.c:2273: register_subpage: Assertion `existing->mr->subpage || existing->mr == &io_mem_unassigned' failed.
multifunction_with_virtio_blk.sh: line 234:  4362 Aborted                 (core dumped)

Comment 4 Sibiao Luo 2013-01-16 07:59:36 UTC
Hi all,

   I tried to find the cut-off point (i.e. works for n, but fails with n+1).

29*8=232  disk ------> core dump
28*8=224  disk ------> core dump
27*8=216  disk ------> core dump
13*8=104  disk ------> core dump
12*8=96   disk ------> core dump
11*8=88   disk ------> core dump
10*8+7=87 disk ------> core dump  <--------- 87 or more disks: core dump
10*8+6=86 disk ------> S3 success <--------- 86 or fewer disks: OK
10*8+5=85 disk ------> S3 success 
10*8+4=84 disk ------> S3 success 
10*8+3=83 disk ------> S3 success 
10*8+2=82 disk ------> S3 success 
10*8+1=81 disk ------> S3 success
10*8=80   disk ------> S3 success
8*8=64    disk ------> S3 success
5*8=40    disk ------> S3 success
4*8=32    disk ------> S3 success
3*8=24    disk ------> S3 success
1*8=8     disk ------> S3 success

Best Regards.
sluo
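
A cut-off like this can also be located with a bisection instead of testing counts one by one. The sketch below is an illustration only: it assumes that every count up to the cut-off passes and every count above it fails, and the names find_cutoff and s3_survives are hypothetical. A real check would boot the guest with N disks and run an S3 cycle; here a stand-in that mirrors the table above is used.

```shell
#!/bin/sh
# Find the largest device count that still passes.
# $1: name of a check command that takes a count and returns 0 on success
# $2: a count known to pass
# $3: a count known to fail
find_cutoff() {
    check=$1
    lo=$2
    hi=$3
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$check" "$mid"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    printf '%s\n' "$lo"
}

# Stand-in check that mirrors the results reported above.
s3_survives() { [ "$1" -le 86 ]; }

find_cutoff s3_survives 8 232
```

With the stand-in check this needs 8 probes between the known-good 8 and the known-bad 232, compared with the 19 runs listed in the table.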

Comment 5 Jun Li 2013-08-29 09:40:22 UTC
Hit this issue too when hot-plugging many virtio devices with multifunction=on into a win8-64 guest on a RHEL 7.0 host.
(gdb) bt
#0  0x00007ffff32e4999 in raise () from /lib64/libc.so.6
#1  0x00007ffff32e60a8 in abort () from /lib64/libc.so.6
#2  0x00007ffff32dd906 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff32dd9b2 in __assert_fail () from /lib64/libc.so.6
#4  0x000055555573b81c in register_subpage ()
#5  0x000055555573ba42 in mem_add ()
#6  0x000055555578ca8c in address_space_update_topology_pass.isra.5 ()
#7  0x000055555578d61d in memory_region_transaction_commit ()
#8  0x0000555555681c3c in pci_default_write_config ()
#9  0x00005555556bc12a in virtio_write_config ()
#10 0x000055555578b042 in access_with_adjusted_size ()
#11 0x000055555578c517 in memory_region_iorange_write ()
#12 0x0000555555789dbd in kvm_cpu_exec ()
#13 0x0000555555734ab5 in qemu_kvm_cpu_thread_fn ()
#14 0x00007ffff625dde3 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ffff33a50ad in clone () from /lib64/libc.so.6
----
qemu-kvm version is :
   qemu-kvm-1.5.2-4.el7.x86_64
My command line :
/usr/libexec/qemu-kvm -S -M pc-i440fx-rhel7.0.0 -cpu SandyBridge -enable-kvm -m 4G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa61 -rtc base=localtime,clock=host,driftfix=slew \
-drive file=/mnt/win8-64.raw,if=none,cache=none,aio=native,format=raw,id=drive0  \
-device virtio-blk-pci,bus=pci.0,addr=0x8,drive=drive0,scsi=off,config-wce=off,bootindex=0 \
-device virtio-balloon-pci,id=ballooning,addr=0x6 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
-netdev tap,id=hostnet0,vhost=off,queues=4,script=/etc/qemu-ifup \
-device virtio-net-pci,mq=on,vectors=17,netdev=hostnet0,id=virtio-net-pci0,mac=24:be:05:14:0d:52,addr=0x7,bootindex=2 \
-k en-us -boot menu=on,reboot-timeout=-1,strict=on \
-qmp tcp:0:4445,server,nowait -serial unix:/tmp/ttyS0,server,nowait -vnc :3 -spice port=5932,disable-ticketing -vga cirrus \
-monitor stdio -monitor tcp:0:7445,server,nowait -monitor unix:/tmp/monitor1,server,nowait \
-drive file=/mnt/virtio-win-prewhql-0.1-68/virtio-win-prewhql-0.1.iso,if=none,media=cdrom,format=raw,aio=native,id=drive-ide1-0-2 -device ide-drive,drive=drive-ide1-0-2,id=ide1-0-2,bus=ide.1,unit=1

Comment 7 Paolo Bonzini 2013-10-15 09:28:31 UTC
This issue is not related to virtio-blk; it is in the core of QEMU (where it builds the memory map).  The virtio-blk reproducer is still very useful.  Thanks!

Comment 8 Marcel Apfelbaum 2013-12-01 15:50:27 UTC
It is probably a duplicate of BZ 1003535.
QE, can you please attach the Qemu command line? (or attach the config file)

Comment 9 Sibiao Luo 2013-12-02 02:44:51 UTC
(In reply to Marcel Apfelbaum from comment #8)
> It is probably a duplicate of BZ 1003535.
> QE, can you please attach the Qemu command line? (or attach the config file)
Please refer to attachment 678676 [details]. It can be triggered by 'S3/S4 + multifunction=on' or by 'multifunction=on + hot-plugging'. I am not sure whether it is the same issue, but my bug was reported earlier than bug 1003535. Thanks in advance.

Best Regards,
sluo

Comment 12 Sibiao Luo 2014-01-10 07:59:45 UTC
Reproduced this issue by hot-plugging many virtio devices with multifunction=on.
host info:
# uname -r && rpm -q qemu-kvm-rhev
3.10.0-66.el7.x86_64.debug
qemu-kvm-rhev-1.5.3-31.el7.x86_64
guest info:
# uname -r
3.10.0-66.el7.x86_64.debug

qemu-kvm command line:
# /usr/libexec/qemu-kvm -M pc -S -cpu SandyBridge -enable-kvm -m 4096 -smp 4,sockets=2,cores=2,threads=1 -no-kvm-pit-reinjection -drive file=iscsi://10.66.90.100:3260/iqn.2001-05.com.equallogic:0-8a0906-4c41f7d03-453f49b421052a57-s2-sluo-270305-1/0,if=none,id=drive-system-disk,cache=none,aio=native -iscsi id=iqn0 -device ide-drive,bus=ide.0,unit=0,drive=drive-system-disk,id=system-disk,bootindex=1 -k en-us -boot menu=on -qmp tcp:0:4444,server,nowait -serial unix:/tmp/ttyS0,server,nowait -vnc :1 -spice disable-ticketing,port=5931 -net none -monitor stdio -monitor unix:/tmp/monitor2,server,nowait

# cat repeat_hotplug_multifunction.sh
# Outer loop: PCI slot numbers in hex. Inner loop: functions 1..7, then function 0 last.
for i in `seq 3 9` a b c d e f 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f;do
#for i in `seq 5 5`;do
    for j in `seq 1 7` 0;do
        # Create a small backing image for this slot.function.
        qemu-img create /tmp/resize$i$j.qcow2 1M -f qcow2
        sleep 2
        # Echo each monitor command for logging, then send it to the HMP socket.
        echo __com.redhat_drive_add id=drv$i$j,file=/tmp/resize$i$j.qcow2
        echo __com.redhat_drive_add id=drv$i$j,file=/tmp/resize$i$j.qcow2 | nc -U /tmp/monitor2
        #echo drive_add $i.$j id=drv$i$j,file=/tmp/resize$i$j.qcow2,if=none
        #echo drive_add $i.$j id=drv$i$j,file=/tmp/resize$i$j.qcow2,if=none | nc -U /tmp/monitor1
        sleep 2
        echo device_add virtio-blk-pci,id=dev$i$j,drive=drv$i$j,addr=0x$i.$j,multifunction=on
        echo device_add virtio-blk-pci,id=dev$i$j,drive=drv$i$j,addr=0x$i.$j,multifunction=on | nc -U /tmp/monitor2
    done
done
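
To see which addresses the loop above covers without creating images or talking to a monitor, the same slot and function lists can be run as a dry run. This is only a sketch and the function name list_hotplug_addrs is hypothetical.

```shell
#!/bin/sh
# Dry run of the hot-plug loop: print every slot.function address it would
# use, in the same order, without side effects.
list_hotplug_addrs() {
    for i in $(seq 3 9) a b c d e f 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f; do
        for j in $(seq 1 7) 0; do
            printf '0x%s.%s\n' "$i" "$j"
        done
    done
}

# Number of device_add commands the loop would issue.
list_hotplug_addrs | wc -l
```

The list has 28 slots with 8 functions each, so a complete run would plug 224 devices. Slot 0x10 does not appear in the list.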

Results:
After hot-plugging for a while, QEMU core dumps.
(qemu) qemu-kvm: /builddir/build/BUILD/qemu-1.5.3/exec.c:762: register_subpage: Assertion `existing->mr->subpage || existing->mr == &io_mem_unassigned' failed.
Aborted (core dumped)

(gdb) bt
#0  0x00007fe03a680979 in raise () from /lib64/libc.so.6
#1  0x00007fe03a682088 in abort () from /lib64/libc.so.6
#2  0x00007fe03a6798e6 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007fe03a679992 in __assert_fail () from /lib64/libc.so.6
#4  0x00007fe03fbf9a1c in register_subpage (d=d@entry=0x7fe041fc8e10, section=section@entry=0x7fe032586730)
    at /usr/src/debug/qemu-1.5.3/exec.c:762
#5  0x00007fe03fbf9c42 in mem_add (listener=0x7fe041fc8e18, section=<optimized out>)
    at /usr/src/debug/qemu-1.5.3/exec.c:822
#6  0x00007fe03fc4cc6c in address_space_update_topology_pass (as=as@entry=0x7fe051a48be0, adding=adding@entry=true, 
    old_view=..., new_view=...) at /usr/src/debug/qemu-1.5.3/memory.c:697
#7  0x00007fe03fc4d7fd in address_space_update_topology (as=0x7fe051a48be0) at /usr/src/debug/qemu-1.5.3/memory.c:726
#8  memory_region_transaction_commit () at /usr/src/debug/qemu-1.5.3/memory.c:750
#9  0x00007fe03fb3872c in pci_default_write_config (d=d@entry=0x7fe044617f70, addr=addr@entry=4, val=0, l=l@entry=2)
    at hw/pci/pci.c:1167
#10 0x00007fe03fb6e24a in virtio_write_config (pci_dev=0x7fe044617f70, address=4, val=<optimized out>, len=2)
    at hw/virtio/virtio-pci.c:464
#11 0x00007fe03fc4b222 in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7fe032586b58, size=2, 
    access_size_min=<optimized out>, access_size_max=<optimized out>, 
    access=access@entry=0x7fe03fc4b7e0 <memory_region_write_accessor>, opaque=opaque@entry=0x7fe041ecb548)
    at /usr/src/debug/qemu-1.5.3/memory.c:364
#12 0x00007fe03fc4c6f7 in memory_region_iorange_write (iorange=<optimized out>, offset=0, width=2, data=7)
    at /usr/src/debug/qemu-1.5.3/memory.c:439
#13 0x00007fe03fc4a005 in kvm_handle_io (count=1, size=2, direction=1, data=<optimized out>, port=3324)
    at /usr/src/debug/qemu-1.5.3/kvm-all.c:1512
#14 kvm_cpu_exec (env=env@entry=0x7fe041ea1f60) at /usr/src/debug/qemu-1.5.3/kvm-all.c:1661
#15 0x00007fe03fbf28f5 in qemu_kvm_cpu_thread_fn (arg=0x7fe041ea1f60) at /usr/src/debug/qemu-1.5.3/cpus.c:793
#16 0x00007fe03da23de3 in start_thread () from /lib64/libpthread.so.0
#17 0x00007fe03a74125d in clone () from /lib64/libc.so.6
(gdb)

Comment 13 Sibiao Luo 2014-01-10 08:14:09 UTC
Tried the private build (qemu-kvm-1.5.3-35.el7.test.src.rpm, x86_64) with the same steps as comment #12.

host info:
# uname -r && rpm -q qemu-kvm-rhev
3.10.0-66.el7.x86_64.debug
qemu-kvm-1.5.3-35.el7.test.x86_64
guest info:
# uname -r
3.10.0-66.el7.x86_64.debug

Results:
QEMU did not core dump, so this private build (qemu-kvm-1.5.3-35.el7.test.src.rpm, x86_64) fixes this issue.

Best Regards,
sluo

Comment 18 Sibiao Luo 2014-11-11 07:47:29 UTC
Verified this issue on both qemu-kvm-rhev-2.1.2-7.el7.x86_64 and qemu-kvm-1.5.3-57.el7.x86_64 with the same steps as comment #12; QEMU no longer core dumps.

########qemu-kvm-1.5.3-57.el7.x86_64
host info:
# uname -r && rpm -q qemu-kvm
3.10.0-183.el7.x86_64
qemu-kvm-1.5.3-57.el7.x86_64

########qemu-kvm-rhev-2.1.2-7.el7.x86_64
host info:
# uname -r && rpm -q qemu-kvm-rhev
3.10.0-183.el7.x86_64
qemu-kvm-rhev-2.1.2-7.el7.x86_64

Based on the above, moving it to VERIFIED status. Please correct me if I made any mistake, thanks.

Best Regards,
sluo
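
For checking whether a given qemu-kvm build is at or past the "Fixed In Version" of this bug (qemu-kvm-1.5.3-39.el7), a rough comparison can be scripted. The sketch below is an assumption-laden illustration: it relies on GNU sort -V, which only approximates RPM's own version comparison, it applies to the qemu-kvm package only (qemu-kvm-rhev has its own version stream), and the function name version_at_least is hypothetical.

```shell
#!/bin/sh
# Return success if version-release "$1" is at least "$2".
# Uses GNU sort -V, which approximates but does not reproduce rpm's ordering.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

fixed="1.5.3-39.el7"   # from the "Fixed In Version" field
for v in 1.3.0-3.el7 1.5.3-57.el7; do
    if version_at_least "$v" "$fixed"; then
        echo "$v: at or after the fixed version"
    else
        echo "$v: before the fixed version"
    fi
done
```

On a host, the version-release string to compare could be taken from the output of rpm -q qemu-kvm, as in the host info above.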

Comment 20 errata-xmlrpc 2015-03-05 08:00:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0349.html