Description of problem:
The issue occurs on RHEL 7.4: qemu-kvm crashes with the following dump:
Program terminated with signal 11, Segmentation fault.
#0 qemu_co_queue_run_restart (co=co@entry=0x5602801e8080) at util/qemu-coroutine-lock.c:83
Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux Server release 7.4 (Maipo)
qemu-kvm-rhev-2.9.0-16.el7_4.3.x86_64
How reproducible:
There are no specific steps, but the crash seems to occur under high I/O load.
Steps to Reproduce:
N/A
Actual results:
qemu-kvm-rhev segfaults with the following backtrace:
--------------------------------------
#0 qemu_co_queue_run_restart (co=co@entry=0x5602801e8080) at util/qemu-coroutine-lock.c:83
#1 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602801e8080) at util/qemu-coroutine.c:127
#2 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#3 0x000056027a8c5955 in qemu_co_queue_run_restart (co=co@entry=0x56027f36d480) at util/qemu-coroutine-lock.c:84
#4 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027f36d480) at util/qemu-coroutine.c:127
#5 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#6 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027cd281c0) at util/qemu-coroutine-lock.c:84
#7 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027cd281c0) at util/qemu-coroutine.c:127
#8 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#9 0x000056027a8c5955 in qemu_co_queue_run_restart (co=co@entry=0x56027d7eaf80) at util/qemu-coroutine-lock.c:84
#10 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027d7eaf80) at util/qemu-coroutine.c:127
#11 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#12 0x000056027a8c5955 in qemu_co_queue_run_restart (co=co@entry=0x56027f390080) at util/qemu-coroutine-lock.c:84
#13 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027f390080) at util/qemu-coroutine.c:127
#14 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#15 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027cd27cc0) at util/qemu-coroutine-lock.c:84
#16 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027cd27cc0) at util/qemu-coroutine.c:127
#17 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#18 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x5602801e9840) at util/qemu-coroutine-lock.c:84
#19 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602801e9840) at util/qemu-coroutine.c:127
#20 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#21 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027e9d0800) at util/qemu-coroutine-lock.c:84
#22 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027e9d0800) at util/qemu-coroutine.c:127
#23 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#24 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027d7eb700) at util/qemu-coroutine-lock.c:84
#25 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027d7eb700) at util/qemu-coroutine.c:127
#26 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#27 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027e9cdc00) at util/qemu-coroutine-lock.c:84
#28 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027e9cdc00) at util/qemu-coroutine.c:127
#29 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#30 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027fbc6bc0) at util/qemu-coroutine-lock.c:84
#31 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027fbc6bc0) at util/qemu-coroutine.c:127
#32 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#33 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x5602809bd7c0) at util/qemu-coroutine-lock.c:84
#34 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602809bd7c0) at util/qemu-coroutine.c:127
#35 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#36 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x5602809bca00) at util/qemu-coroutine-lock.c:84
#37 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602809bca00) at util/qemu-coroutine.c:127
#38 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#39 0x000056027a8c5955 in qemu_co_queue_run_restart (co=co@entry=0x5602801e8800) at util/qemu-coroutine-lock.c:84
#40 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602801e8800) at util/qemu-coroutine.c:127
#41 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#42 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x5602801e8f80) at util/qemu-coroutine-lock.c:84
#43 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602801e8f80) at util/qemu-coroutine.c:127
#44 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#45 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027f36bf40) at util/qemu-coroutine-lock.c:84
#46 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027f36bf40) at util/qemu-coroutine.c:127
#47 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#48 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027cd29200) at util/qemu-coroutine-lock.c:84
#49 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027cd29200) at util/qemu-coroutine.c:127
#50 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#51 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027cd28bc0) at util/qemu-coroutine-lock.c:84
#52 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027cd28bc0) at util/qemu-coroutine.c:127
#53 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#54 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027f38f040) at util/qemu-coroutine-lock.c:84
#55 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=ctx@entry=0x56027c8d9700, co=co@entry=0x56027f38f040) at util/qemu-coroutine.c:127
#56 0x000056027a8b0d78 in aio_co_enter (ctx=<optimized out>, co=0x56027f38f040) at util/async.c:472
#57 0x000056027a8b0dbc in aio_co_wake (co=<optimized out>) at util/async.c:456
#58 0x000056027a83b8cd in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:103
#59 0x000056027a83b95c in qemu_laio_process_completions (s=s@entry=0x56027c8a39f0) at block/linux-aio.c:221
#60 0x000056027a83bbb9 in qemu_laio_process_completions_and_submit (s=0x56027c8a39f0) at block/linux-aio.c:236
#61 0x000056027a8b2c68 in aio_dispatch_handlers (ctx=ctx@entry=0x56027c8d9700) at util/aio-posix.c:399
---------------------------------------------------------------------
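The backtrace is dominated by a repeating three-frame cycle (qemu_co_queue_run_restart, qemu_aio_coroutine_enter, qemu_coroutine_enter). A small script can tally the frames per function so the nesting depth is easier to read off. This is only a rough sketch for triage: the helper name summarize_backtrace and the regular expression are my own and assume gdb's usual "#N 0xADDR in func (...)" frame layout.

```python
import re
from collections import Counter

# Matches gdb frame lines with or without an address, for example:
#   "#3 0x000056027a8c5955 in qemu_co_queue_run_restart (co=...) at file.c:84"
#   "#0 qemu_co_queue_run_restart (co=...) at file.c:83"
# The leading '#' is optional because it is sometimes lost when pasting.
FRAME_RE = re.compile(r"^#?(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(")


def summarize_backtrace(text):
    """Return a Counter mapping function name -> number of frames."""
    counts = Counter()
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize_bt.py < backtrace.txt
    for func, n in summarize_backtrace(sys.stdin.read()).most_common():
        print(f"{n:4d}  {func}")
```

Run against the backtrace above, this should count 19 frames of qemu_co_queue_run_restart, that is, 19 coroutines entered one inside another from a single linux-aio completion.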
Expected results:
qemu-kvm-rhev should not segfault.
Additional info:
I found a fix on qemu-devel that appears to be included in qemu-kvm-rhev 2.10, the version available by default on RHEL 7.5:
https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg06570.html
Can the fix be backported to 7.4, or does the customer need to upgrade to 7.5? The customer is currently on 7.4.
I will attach the coredump for further analysis shortly.
Thanks & Regards,
Nirav Dave
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:1104