Bug 1542045

Summary: qemu-kvm-rhev seg-faults at qemu_co_queue_run_restart (co=co@entry=0x5602801e8080) at util/qemu-coroutine-lock.c:83)
Product: Red Hat Enterprise Linux 7
Reporter: Nirav Dave <ndave>
Component: qemu-kvm-rhev
Assignee: Stefan Hajnoczi <stefanha>
Status: CLOSED ERRATA
QA Contact: Gu Nini <ngu>
Severity: high
Docs Contact:
Priority: high
Version: 7.4
CC: chayang, coli, ddepaula, juzhang, knoel, lmiksik, michen, mrezanin, mtessun, ngu, qzhang, stefanha, toneata, virt-maint
Target Milestone: rc
Keywords: Reopened, TestOnly, ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: QEMU 2.10.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1544333
Environment:
Last Closed: 2018-04-11 01:01:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1544333    

Description Nirav Dave 2018-02-05 12:58:40 UTC
Description of problem:
The issue occurs on RHEL 7.4: qemu-kvm crashes with the following dump:

Program terminated with signal 11, Segmentation fault.
#0  qemu_co_queue_run_restart (co=co@entry=0x5602801e8080) at util/qemu-coroutine-lock.c:83


Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux Server release 7.4 (Maipo)
qemu-kvm-rhev-2.9.0-16.el7_4.3.x86_64

How reproducible:
There are no specific steps to reproduce, but the crash seems to occur under high I/O load.

Steps to Reproduce:
N/A


Actual results:
qemu-kvm-rhev segfaults with the following backtrace:

--------------------------------------
#0  qemu_co_queue_run_restart (co=co@entry=0x5602801e8080) at util/qemu-coroutine-lock.c:83
#1  0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602801e8080) at util/qemu-coroutine.c:127
#2  0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#3  0x000056027a8c5955 in qemu_co_queue_run_restart (co=co@entry=0x56027f36d480) at util/qemu-coroutine-lock.c:84
#4  0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027f36d480) at util/qemu-coroutine.c:127
#5  0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#6  0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027cd281c0) at util/qemu-coroutine-lock.c:84
#7  0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027cd281c0) at util/qemu-coroutine.c:127
#8  0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#9  0x000056027a8c5955 in qemu_co_queue_run_restart (co=co@entry=0x56027d7eaf80) at util/qemu-coroutine-lock.c:84
#10 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027d7eaf80) at util/qemu-coroutine.c:127
#11 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#12 0x000056027a8c5955 in qemu_co_queue_run_restart (co=co@entry=0x56027f390080) at util/qemu-coroutine-lock.c:84
#13 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027f390080) at util/qemu-coroutine.c:127
#14 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#15 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027cd27cc0) at util/qemu-coroutine-lock.c:84
#16 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027cd27cc0) at util/qemu-coroutine.c:127
#17 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#18 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x5602801e9840) at util/qemu-coroutine-lock.c:84
#19 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602801e9840) at util/qemu-coroutine.c:127
#20 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#21 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027e9d0800) at util/qemu-coroutine-lock.c:84
#22 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027e9d0800) at util/qemu-coroutine.c:127
#23 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#24 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027d7eb700) at util/qemu-coroutine-lock.c:84
#25 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027d7eb700) at util/qemu-coroutine.c:127
#26 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#27 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027e9cdc00) at util/qemu-coroutine-lock.c:84
#28 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027e9cdc00) at util/qemu-coroutine.c:127
#29 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#30 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027fbc6bc0) at util/qemu-coroutine-lock.c:84
#31 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027fbc6bc0) at util/qemu-coroutine.c:127
#32 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#33 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x5602809bd7c0) at util/qemu-coroutine-lock.c:84
#34 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602809bd7c0) at util/qemu-coroutine.c:127
#35 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#36 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x5602809bca00) at util/qemu-coroutine-lock.c:84
#37 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602809bca00) at util/qemu-coroutine.c:127
#38 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#39 0x000056027a8c5955 in qemu_co_queue_run_restart (co=co@entry=0x5602801e8800) at util/qemu-coroutine-lock.c:84
#40 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602801e8800) at util/qemu-coroutine.c:127
#41 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#42 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x5602801e8f80) at util/qemu-coroutine-lock.c:84
#43 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602801e8f80) at util/qemu-coroutine.c:127
#44 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#45 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027f36bf40) at util/qemu-coroutine-lock.c:84
#46 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027f36bf40) at util/qemu-coroutine.c:127
#47 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#48 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027cd29200) at util/qemu-coroutine-lock.c:84
#49 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027cd29200) at util/qemu-coroutine.c:127
#50 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#51 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027cd28bc0) at util/qemu-coroutine-lock.c:84
#52 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x56027cd28bc0) at util/qemu-coroutine.c:127
#53 0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#54 0x000056027a8c5974 in qemu_co_queue_run_restart (co=co@entry=0x56027f38f040) at util/qemu-coroutine-lock.c:84
#55 0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=ctx@entry=0x56027c8d9700, co=co@entry=0x56027f38f040) at util/qemu-coroutine.c:127
#56 0x000056027a8b0d78 in aio_co_enter (ctx=<optimized out>, co=0x56027f38f040) at util/async.c:472
#57 0x000056027a8b0dbc in aio_co_wake (co=<optimized out>) at util/async.c:456
#58 0x000056027a83b8cd in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:103
#59 0x000056027a83b95c in qemu_laio_process_completions (s=s@entry=0x56027c8a39f0) at block/linux-aio.c:221
#60 0x000056027a83bbb9 in qemu_laio_process_completions_and_submit (s=0x56027c8a39f0) at block/linux-aio.c:236
#61 0x000056027a8b2c68 in aio_dispatch_handlers (ctx=ctx@entry=0x56027c8d9700) at util/aio-posix.c:399
---------------------------------------------------------------------
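The backtrace above repeats the same three frames many times, which makes it hard to read by eye. Below is a minimal sketch of a helper that summarizes such a trace. It assumes gdb frame lines in the form shown above. The names `summarize_backtrace`, `FRAME_RE`, `CO_RE` and `SAMPLE` are hypothetical and not part of QEMU or gdb. `SAMPLE` holds only the first four frames of this report.

```python
import re
from collections import Counter

# Matches gdb frame lines such as:
#   #3  0x... in qemu_co_queue_run_restart (co=...) at util/qemu-coroutine-lock.c:84
# The leading '#' and the "0x... in" part are optional, since the innermost
# frame is printed without an address and may lose its '#' when pasted.
FRAME_RE = re.compile(
    r"^#?(?P<num>\d+)\s+(?:0x[0-9a-f]+\s+in\s+)?(?P<func>\w+)\s+"
    r"\((?P<args>.*)\)\s+at\s+(?P<loc>\S+)$"
)
# Extracts the coroutine pointer from "co=0x..." or "co=co@entry=0x...".
CO_RE = re.compile(r"\bco=(?:co@entry=)?(0x[0-9a-f]+)")

# First four frames of the backtrace in this report.
SAMPLE = """\
#0  qemu_co_queue_run_restart (co=co@entry=0x5602801e8080) at util/qemu-coroutine-lock.c:83
#1  0x000056027a8c570f in qemu_aio_coroutine_enter (ctx=<optimized out>, co=0x5602801e8080) at util/qemu-coroutine.c:127
#2  0x000056027a8c5835 in qemu_coroutine_enter (co=<optimized out>) at util/qemu-coroutine.c:144
#3  0x000056027a8c5955 in qemu_co_queue_run_restart (co=co@entry=0x56027f36d480) at util/qemu-coroutine-lock.c:84
"""


def summarize_backtrace(text):
    """Return (frames per function, distinct coroutine addresses in order)."""
    funcs = Counter()
    coroutines = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if not m:
            continue  # skip separators and any non-frame text
        funcs[m.group("func")] += 1
        co = CO_RE.search(m.group("args"))
        if co and co.group(1) not in coroutines:
            coroutines.append(co.group(1))
    return funcs, coroutines


if __name__ == "__main__":
    funcs, coroutines = summarize_backtrace(SAMPLE)
    for name, count in funcs.most_common():
        print(f"{count:3d}  {name}")
    print("distinct coroutines:", len(coroutines))
```

Run against the full backtrace, the same function would report how many nested qemu_co_queue_run_restart frames and how many distinct coroutine addresses the trace contains.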


Expected results:
qemu-kvm-rhev should not segfault.


Additional info:
I found a fix on qemu-devel that appears to be included in qemu-kvm-rhev 2.10, the version available by default on RHEL 7.5:

https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg06570.html

Can the fix be backported to 7.4, or does the customer need to upgrade to 7.5? The customer is currently on 7.4.

I will attach the coredump for further analysis shortly.

Thanks & Regards,
Nirav Dave

Comment 18 errata-xmlrpc 2018-04-11 01:01:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104