Remark: The error I'm reporting here occurs in qemu 2.1.3 on Fedora 21. As far as I understand, the same issue should exist up to qemu 2.4, and therefore qemu 2.3.1 from the OVirt repos on Redhat/Centos 7.2 should be affected too.

Description of problem:
About once every three months, one of our VMs crashes with the following error.

# dmesg -T
...
[Fr Jan 29 03:30:34 2016] qemu-system-x86[4783]: segfault at 7f0de0000000 ip 00007f0dd66aa0fc sp 00007f0dc8f04878 error 4 in libc-2.18.so[7f0dd662a000+1b4000]

# gdb core.4780.1454034416.dump
...
(gdb) bt
#0  0x00007f0dd66aa0fc in free () from /lib64/libc.so.6
#1  0x00007f0ddf8c4f7f in g_free () from /lib64/libglib-2.0.so.0
#2  0x00007f0de1630ad0 in machine_finalize (obj=<optimized out>) at hw/core/machine.c:294
#3  0x00007f0de16d34ea in object_deinit (type=<optimized out>, obj=0x7f0de3fcf1e0) at qom/object.c:408
#4  object_finalize (data=0x7f0de3fcf1e0) at qom/object.c:421
#5  object_unref (obj=0x7f0de3fcf1e0) at qom/object.c:729
#6  0x00007f0de14a2151 in phys_section_destroy (mr=0x7f0de3fd0a80) at /usr/src/debug/qemu-2.1.3/exec.c:917
#7  phys_sections_free (map=<optimized out>) at /usr/src/debug/qemu-2.1.3/exec.c:930
#8  mem_commit (listener=0x7f0de1bd14d8 <address_space_io+56>) at /usr/src/debug/qemu-2.1.3/exec.c:1879
#9  0x00007f0de14d7f81 in memory_region_transaction_commit () at /usr/src/debug/qemu-2.1.3/memory.c:812
#10 0x00007f0de14eb476 in vga_update_memory_access (s=0x7f0de4198148) at /usr/src/debug/qemu-2.1.3/hw/display/vga.c:207
#11 0x00007f0de14d569a in access_with_adjusted_size (addr=addr@entry=4, value=value@entry=0x7f0dc8f04a30, size=size@entry=2,
    access_size_min=<optimized out>, access_size_max=<optimized out>, access=0x7f0de14d5790 <memory_region_write_accessor>,
    mr=0x7f0de41259e0) at /usr/src/debug/qemu-2.1.3/memory.c:481
#12 0x00007f0de14da247 in memory_region_dispatch_write (size=2, data=3842, addr=4, mr=0x7f0de41259e0) at /usr/src/debug/qemu-2.1.3/memory.c:1143
#13 io_mem_write (mr=mr@entry=0x7f0de41259e0, addr=4, val=<optimized out>, size=2) at /usr/src/debug/qemu-2.1.3/memory.c:1976
#14 0x00007f0de14a4be3 in address_space_rw (as=0x7f0de1bd14a0 <address_space_io>, addr=addr@entry=964,
    buf=0x7f0de1a7d000 <error: Cannot access memory at address 0x7f0de1a7d000>, len=len@entry=2, is_write=is_write@entry=true)
    at /usr/src/debug/qemu-2.1.3/exec.c:2086
#15 0x00007f0de14d4ab0 in kvm_handle_io (count=1, size=2, direction=<optimized out>, data=<optimized out>, port=964) at /usr/src/debug/qemu-2.1.3/kvm-all.c:1599
#16 kvm_cpu_exec (cpu=cpu@entry=0x7f0de40684b0) at /usr/src/debug/qemu-2.1.3/kvm-all.c:1741
#17 0x00007f0de14c32d2 in qemu_kvm_cpu_thread_fn (arg=0x7f0de40684b0) at /usr/src/debug/qemu-2.1.3/cpus.c:883
#18 0x00007f0ddffafee5 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f0dd671ed1d in clone () from /lib64/libc.so.6

Version-Release number of selected component (if applicable):
qemu 2.1.3 from Fedora 21

How reproducible:
Very rare. About once every three months on one of ~100 running VMs.

Steps to Reproduce:
Unable to reproduce.

Actual results:
VM crashes. Not amusing if a critical server is affected :(

Expected results:
VM should keep running.

Additional info:
If you think that https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg00391.html fixes my issue, it should be included in the qemu 2.3.1 that is provided for OVirt 3.6 + Redhat/Centos 7.2 environments.
Here it is:

commit 55b4e80b047300e1512df02887b7448ba3786b62
Author: Don Slutz <don.slutz>
Date:   Mon Nov 30 17:11:04 2015 -0500

    exec: Stop using memory after free

diff --git a/exec.c b/exec.c
index de1cf19..0bf0a6e 100644
--- a/exec.c
+++ b/exec.c
@@ -1064,9 +1064,11 @@ static uint16_t phys_section_add(PhysPageMap *map,
 
 static void phys_section_destroy(MemoryRegion *mr)
 {
+    bool have_sub_page = mr->subpage;
+
     memory_region_unref(mr);
 
-    if (mr->subpage) {
+    if (have_sub_page) {
         subpage_t *subpage = container_of(mr, subpage_t, iomem);
         object_unref(OBJECT(&subpage->iomem));
         g_free(subpage);
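The ordering issue that the patch addresses can be illustrated outside QEMU. The following is only a minimal sketch: `Region`, `region_new`, `region_unref`, `region_destroy` and `regions_freed` are hypothetical names invented for this illustration and are not QEMU code. It assumes a plain reference count where dropping the last reference frees the object, so any field needed afterwards has to be copied before the unref, as the patch does with `have_sub_page`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object.
 * This is NOT QEMU's MemoryRegion, only an illustration. */
typedef struct Region {
    int refcount;
    bool subpage;
} Region;

/* Counts how many regions have actually been freed (for the example only). */
static int regions_freed = 0;

static Region *region_new(bool subpage)
{
    Region *r = malloc(sizeof(*r));
    if (!r) {
        abort();
    }
    r->refcount = 1;
    r->subpage = subpage;
    return r;
}

/* Drops one reference; the memory is released when the count reaches zero.
 * After this call the caller must assume that *r is no longer valid. */
static void region_unref(Region *r)
{
    if (--r->refcount == 0) {
        free(r);
        regions_freed++;
    }
}

/* Safe ordering: copy the flag into a local BEFORE dropping the reference.
 * Reading r->subpage after region_unref(r) would be a use after free
 * whenever this was the last reference. */
static bool region_destroy(Region *r)
{
    bool had_subpage = r->subpage;

    region_unref(r);

    return had_subpage;
}
```

Calling `region_destroy(region_new(true))` returns `true` without touching freed memory. If the flag were read after `region_unref`, the result would depend on whatever the allocator had done with that block in the meantime, which would fit a crash that shows up only rarely.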
Thanks Miroslav and Paolo.

QE has confirmed that the code of commit 55b4e80b047300e1512df02887b7448ba3786b62 is included in qemu-kvm-rhev-2.6.0-22.el7.src.rpm.

# cat exec.c
...
static void phys_section_destroy(MemoryRegion *mr)
{
    bool have_sub_page = mr->subpage;

    memory_region_unref(mr);

    if (have_sub_page) {
        subpage_t *subpage = container_of(mr, subpage_t, iomem);
        object_unref(OBJECT(&subpage->iomem));
        g_free(subpage);
    }
}
...
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2673.html