Bug 1277482

Summary: qemu 2.1.3 abort in bdrv_error_action()
Product: [Fedora] Fedora
Reporter: Markus Stockhausen <mst>
Component: qemu
Assignee: Fedora Virtualization Maintainers <virt-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 22
CC: amit.shah, berrange, cfergeau, crobinso, dwmw2, itamar, pbonzini, rjones, virt-maint
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-30 20:40:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Markus Stockhausen 2015-11-03 12:47:32 UTC
We are running hypervisors on FC21 with qemu 2.1.3. About once a month one of our VMs crashes. We therefore enabled core dumps and found that the crash comes from the following call stack.

#0  0x00007f0805cf5877 in raise () from /lib64/libc.so.6
#1  0x00007f0805cf6f68 in abort () from /lib64/libc.so.6
#2  0x00007f0805cee7d6 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f0805cee882 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f0810da7a38 in bdrv_error_action (bs=<optimized out>, action=action@entry=BLOCK_ERROR_ACTION_STOP, is_read=is_read@entry=true,
    error=error@entry=-1701061243) at block.c:3598
#5  0x00007f0810b7ac6a in virtio_blk_handle_rw_error (req=req@entry=0x7f08164668a0, error=error@entry=-1701061243, is_read=true)
    at /usr/src/debug/qemu-2.1.3/hw/block/virtio-blk.c:81
#6  0x00007f0810b7ad2a in virtio_blk_rw_complete (opaque=0x7f08164668a0, ret=1701061243) at /usr/src/debug/qemu-2.1.3/hw/block/virtio-blk.c:94
#7  0x00007f0810d9fa1e in bdrv_co_em_bh (opaque=0x7f07ec002e70) at block.c:4685
#8  0x00007f0810d9c1c7 in aio_bh_poll (ctx=ctx@entry=0x7f0812d79600) at async.c:82
#9  0x00007f0810daa416 in aio_poll (ctx=0x7f0812d79600, blocking=blocking@entry=false) at aio-posix.c:215
#10 0x00007f0810d9c050 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at async.c:212
#11 0x00007f080ef552a6 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#12 0x00007f0810da8fb8 in glib_pollfds_poll () at main-loop.c:190
#13 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:235
#14 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:484
#15 0x00007f0810b3276e in main_loop () at vl.c:2008
#16 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4540
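
The values in the trace above can be checked with a short sketch. Frame #6 shows ret=1701061243 and frame #5 shows the negated value, error=-1701061243. The sketch below assumes (this is not stated in the report) that the assertion at block.c:3598 requires a non-negative error code; the helper name would_assert is hypothetical. It also prints the raw bytes of ret, assuming a little-endian x86-64 host, only to show that the value does not look like an ordinary errno.

```python
import struct

# Value of `ret` in frame #6 of the backtrace (virtio_blk_rw_complete).
ret = 1701061243

# Frame #5 receives the negated value, matching error=-1701061243 in the trace.
error = -ret


def would_assert(error: int) -> bool:
    """Hypothetical stand-in for the failing check.

    Assumption: bdrv_error_action() asserts that `error` is non-negative,
    so a negative value aborts the process.
    """
    return not (error >= 0)


# View the 32-bit value as raw bytes (little-endian host assumed).
raw = struct.pack("<i", ret)

print(would_assert(error))  # True: a negative error would trip the check
print(hex(ret))             # 0x6564227b
print(raw)                  # b'{"de'
```

An errno is normally a small positive number, so a completion value of this size is more likely stray data than a genuine I/O error. That is only an inference from the numbers in the trace.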

After searching the Red Hat documentation I found https://access.redhat.com/solutions/1459913, which perfectly matches our problem. It shows the very same error return value. It mentions BZ 1142857, which I have no access to. Maybe someone could give me a hint on how this can be fixed.

Most important would be to check whether FC22 shows the same symptoms. If it does, this bug should track the inclusion of fixes into the qemu versions of both releases.

Comment 1 Markus Stockhausen 2015-11-03 13:14:35 UTC
Just to be precise: VMs are affected randomly - so not only a single VM.

Comment 2 Cole Robinson 2015-11-03 14:55:42 UTC
bug 1142857 is supposed to be fixed by this commit:

commit 3bbf572345c65813f86a8fc434ea1b23beb08e16
Author: Paolo Bonzini <pbonzini>
Date:   Wed Jun 3 14:21:20 2015 +0200

    atomics: add explicit compiler fence in __atomic memory barriers

Which is lacking in f21 and f22.

FYI though F21 is end-of-life in a month...

Comment 3 Fedora End Of Life 2015-11-04 15:49:50 UTC
This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 21 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 4 Fedora Update System 2015-12-07 21:29:58 UTC
qemu-2.3.1-8.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2015-686f289aa5

Comment 5 Fedora Update System 2015-12-08 23:51:16 UTC
qemu-2.3.1-8.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
$ su -c 'dnf --enablerepo=updates-testing update qemu'
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-686f289aa5

Comment 6 Cole Robinson 2015-12-30 20:40:18 UTC
Update is in stable now