Bug 1256541

Summary:	qemu-img hangs forever in aio_poll when used to convert some images
Product:	Red Hat Enterprise Linux 7	Reporter:	Benjamin Schmaus <bschmaus>
Component:	qemu-kvm-rhev	Assignee:	Hanna Czenczek <hreitz>
Status:	CLOSED ERRATA	QA Contact:	Virtualization Bugs <virt-bugs>
Severity:	high	Docs Contact:
Priority:	urgent
Version:	7.1	CC:	aparsons, areis, berrange, bschmaus, dasmith, eglynn, huding, juzhang, kchamart, knoel, kwolf, ndipanov, pbrady, qiguo, rbryant, sbauza, sferdjao, sgordon, stefanha, tlavigne, virt-maint, vromanso, wlehman, xfu, yeylon
Target Milestone:	pre-dev-freeze
Target Release:	7.2
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	qemu-kvm-rhev-2.3.0-25.el7	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2015-12-04 16:54:32 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Benjamin Schmaus 2015-08-24 21:42:45 UTC

Description of problem:

Some qcow2 images that a tenant is deploying cause the qemu-img process to hang during the first deployment of that image.  The process will never terminate on its own. 

Manually converting the qcow2 image seems to work fine and additional attempts to deploy the same image eventually lead to successful deployments.

The qemu-img process is stuck in the ppoll() syscall and will not terminate on its own.  A stack trace reveals:

[root@slnec1cmp8 ~]# pstack 26628
#0  0x00007f9cbfc1dc3f in ppoll () from /lib64/libc.so.6
#1  0x00007f9cc4a63fdb in qemu_poll_ns ()
#2  0x00007f9cc4a64d30 in aio_poll ()
#3  0x00007f9cc4a5e7ed in bdrv_prwv_co ()
#4  0x00007f9cc4a5e8e3 in bdrv_rw_co ()
#5  0x00007f9cc4a551dc in img_convert ()
#6  0x00007f9cbfb53af5 in __libc_start_main () from /lib64/libc.so.6
#7  0x00007f9cc4a5194d in _start ()

And an strace:
[root@slnec1cmp8 ~]# strace -p 26628
Process 26628 attached
ppoll([{fd=5, events=POLLIN|POLLERR|POLLHUP}], 1, NULL, NULL, 8^CProcess 26628 detached
 <detached ...>

A possible related upstream commit to resolve the issue:
http://git.qemu.org/?p=qemu.git;a=commit;h=edec47cfef96209987cb7922286cb384916aae02


Version-Release number of selected component (if applicable):


How reproducible:
Sporadic but consistent enough


Steps to Reproduce:
1.for i in {1..10}; do echo $i; time qemu-img convert -p -O raw <qcow2> <outfile> && rm -f <outfile>; done


Actual results:
Hang that needs to be killed

Expected results:
qemu-img should exit

Additional info:

File used for testing:

The file uploaded to dropbox.redhat.com is named case-01497239-06a1d2bf16cb8e442a474b342eaf5cb32eee597f.qcow2.

Comment 8 Hanna Czenczek 2015-09-07 18:45:37 UTC

Cannot reproduce so far (done about 100 iterations on RHEL 7.1 in a VM, neither with qemu-img built from qemu-kvm-rhev-2.3.0-22.el7, nor qemu-kvm-rhev-2.1.2-23.el7_1.9, nor RHEL's qemu-kvm (1.5.3)).

I do see some small ~5 s hang-ups, but these seem to be related to caching effects, i.e. the OS cache filling up until that point and then having to be drained. When I disable caching with -T none -t none, these disappear, same when writing the output to tmpfs (the VM is running from an HDD).

(And by “~5 s hang-up” I mean that progress is stalled for about 5 seconds and then continues normally, until conversion is complete)

The hang-ups generally appear when a ppoll() without a timeout is executed, as stated in this BZ's report. Actually, there is a pretty large number of timeout-less ppoll() calls, which has been reduced upstream by 690c7301600162421b928c7f26fd488fd8fa464e. Backporting this does not “fix” these small hang-ups, though, as there are still some timeout-less ppoll() calls which will still hang for some seconds.

This seems intentional, however. qemu uses a timeout only if there is some event that will need attention then. During qemu-img convert, and thus without system emulation going on, there aren't many such events, therefore blocking (timeout-less) ppoll() calls are to be expected.

So unless I will yet be able to reproduce the issue, the best thing I can do is blindly backport the commit mentioned in this BZ's report.

Max

Comment 9 Hanna Czenczek 2015-09-07 20:05:12 UTC

Just saw this thread on the upstream list: http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg06168.html

There, Paolo mentions that what the series edec47cfef96209987cb7922286cb384916aae02 is belonging to is trying to fix is a bug only visible on aarch64 hosts in very rare cases. He then adds that the bug fixed by the patch itself wasn't actually hit so far and thus seems to be theoretical only.

Peter Lieven then describes qemu-img sometimes hanging for him when using NFS (that is, libnfs). However, this was apparently a bug in libnfs and should be fixed now; also, Paolo says that edec47cfef cannot be related to qemu-img, as qemu-img does not use the main loop.

So, all in all, edec47cfef will most probably not resolve this issue.

Benjamin, do you have more information on how the images are deployed, that is, what the exact qemu-img command line is? Since you apparently have been able to reproduce the issue yourself, which medium did the qcow2 image come from in your reproducer line, and where was the output written to?


Thank you,

Max

Comment 10 Benjamin Schmaus 2015-09-08 14:33:34 UTC

Here was the simple command line test we did against a series of images:

qemu-img convert -p -O raw <qcow2> <outfile>

Executed in  a loop:

for i in {1..10}; do echo $i; time qemu-img convert -p -O raw <qcow2> <outfile> && rm -f <outfile>; done

These images were large greater then 50GB where we saw the behaviour.  Smaller images did not experience the issue.  

The images themselves were custom images and/or appliances.  I can try to get more detail from the client.

Comment 12 Hanna Czenczek 2015-09-08 21:50:05 UTC

Hi Benjamin,

So the qcow2 images are located on an HDD or SSD, and the output is written to an HDD or SSD, too?

Unless I can reproduce the issue myself, I'll try to create a Brew build with some upstream patches to qemu/qemu-img convert so you can try whether that fixes anything for you.

Max

Comment 14 Benjamin Schmaus 2015-09-09 12:44:40 UTC

Max - Yes image is read from HDD and written to HDD.

Comment 15 Hanna Czenczek 2015-09-09 17:06:28 UTC

Hi Benjamin,

OK. I made a brew build with the commit mentioned in your report and 690c7301600162421b928c7f26fd488fd8fa464e, an upstream patch rewriting the qemu-img convert logic: https://brewweb.devel.redhat.com/taskinfo?taskID=9816793 – can you please try whether the qemu-img from that build still suffers from the issue?


Thank you,

Max

Comment 17 Qian Guo 2015-09-10 09:30:49 UTC

components used by comment 16:
# uname -r
3.10.0-315.el7.x86_64
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.3.0-22.el7.x86_64
# rpm -q qemu-img-rhev
qemu-img-rhev-2.3.0-22.el7.x86_64

Comment 18 Benjamin Schmaus 2015-09-10 12:38:31 UTC

Max,

Which packages do I pull from that patch?

-Benjamin

Comment 19 Benjamin Schmaus 2015-09-10 13:05:55 UTC

Qian,

Do you have anything larger from a hardware perspective to test this on?  The guys in the field were doing converts in ~20secs whereas your converts are taking 9 minutes.  The reasoning is that the issue is hit when using more threads/cores.

-Benjamin

Comment 21 Hanna Czenczek 2015-09-10 18:25:35 UTC

Benjamin,

The qemu-img-rhev package, I presume, i.e. http://download.devel.redhat.com/brewroot/work/tasks/6793/9816793/qemu-img-rhev-2.3.0-12.el7.test.x86_64.rpm.

Max

Comment 22 Qian Guo 2015-09-11 03:04:20 UTC

(In reply to Benjamin Schmaus from comment #19)
> Qian,
> 
> Do you have anything larger from a hardware perspective to test this on? 
> The guys in the field were doing converts in ~20secs whereas your converts
> are taking 9 minutes.  The reasoning is that the issue is hit when using
> more threads/cores.
> 
> -Benjamin

Hi, Benjamin

Important infos, I will try to reserve such machine to try.

Thanks,
Qian.

Comment 23 Qian Guo 2015-09-11 05:01:26 UTC

(In reply to Benjamin Schmaus from comment #10)
> Here was the simple command line test we did against a series of images:
> 
> qemu-img convert -p -O raw <qcow2> <outfile>
> 
> Executed in  a loop:
> 
> for i in {1..10}; do echo $i; time qemu-img convert -p -O raw <qcow2>
> <outfile> && rm -f <outfile>; done
> 
> These images were large greater then 50GB where we saw the behaviour. 
> Smaller images did not experience the issue.  
> 
> The images themselves were custom images and/or appliances.  I can try to
> get more detail from the client.

Benjamin,

Could you help provide the qemu-img info of the original image, like:

# qemu-img info xxx(the guest image)

Thanks,
Qian

Comment 24 Kevin Wolf 2015-09-11 08:46:16 UTC

(In reply to Benjamin Schmaus from comment #19)
> Do you have anything larger from a hardware perspective to test this on? 
> The guys in the field were doing converts in ~20secs whereas your converts
> are taking 9 minutes.  The reasoning is that the issue is hit when using
> more threads/cores.

qemu-img doesn't parallelise things, so that's unlikely to make a difference.
Moreover, converting an image is I/O bound and not CPU bound. If you want it to
go faster, you need faster storage (can be emulated with a ramdisk), not a
faster CPU.

Comment 25 Benjamin Schmaus 2015-09-11 12:50:28 UTC

$ qemu-img info case-01497239-06a1d2bf16cb8e442a474b342eaf5cb32eee597f.qcow2
image: case-01497239-06a1d2bf16cb8e442a474b342eaf5cb32eee597f.qcow2
file format: qcow2
virtual size: 50G (53687091200 bytes)
disk size: 1.4G
cluster_size: 4096
Format specific information:
    compat: 0.10
    refcount bits: 16

Comment 26 Kevin Wolf 2015-09-11 13:35:30 UTC

Where does that image come from? Unless the option were intentionally set like
this, this looks like a very old image ("compat: 0.10" hints at pre-RHEL-7, and
a cluster size of 4k hasn't even been the default since RHEL 5.5).

If this was indeed created by (much) older versions of qemu, it may not even be
possible to reproduce the problem with images produced by newer versions.

Comment 27 Benjamin Schmaus 2015-09-11 13:46:31 UTC

@Kevin, it is a vendor provided appliance.  So it might be advantageous to have the client experiencing the problem go back to their vendor and press them to generate a new image with qemu tools?

Comment 28 Kevin Wolf 2015-09-11 14:03:42 UTC

There is nothing inherently wrong with such images, and if we can't handle them
correctly, that's a bug that must be fixed. I just haven't seen an image with
4k clusters in ages, so I was curious how this image was created. If you don't
already know, it's probably not worth investigating about this further, though.

Comment 30 Ademar Reis 2015-09-16 13:59:20 UTC

(In reply to Max Reitz from comment #21)
> Benjamin,
> 
> The qemu-img-rhev package, I presume, i.e.
> http://download.devel.redhat.com/brewroot/work/tasks/6793/9816793/qemu-img-
> rhev-2.3.0-12.el7.test.x86_64.rpm.
> 
> Max

Benjamin, any luck with the scratch build above?

Comment 31 Benjamin Schmaus 2015-09-16 14:12:31 UTC

@Ademar - Sorry responded on the case but not here - the scratch build rpm did fix the issue on the client side.  They were running with it in the environment.

Comment 33 Miroslav Rezanina 2015-09-23 13:26:46 UTC

Fix included in qemu-kvm-rhev-2.3.0-25.el7

Comment 37 juzhang 2015-09-25 03:14:28 UTC

Hi Benjamin,

Seems QE can not reproduce it, would you verify this issue by using qemu-kvm-rhev-2.3.0-25.el7 or qemu-kvm-rhev-2.3.0-26.el7 build?

Best Regards,
Junyi

Comment 39 Ademar Reis 2015-09-25 14:42:15 UTC

(In reply to juzhang from comment #37)
> Hi Benjamin,
> 
> Seems QE can not reproduce it, would you verify this issue by using
> qemu-kvm-rhev-2.3.0-25.el7 or qemu-kvm-rhev-2.3.0-26.el7 build?
> 

Benjamin, you can download the packages from here:

http://download.eng.bos.redhat.com/brewroot/packages/qemu-kvm-rhev/2.3.0/26.el7/x86_64/

Comment 42 juzhang 2015-10-13 10:11:46 UTC

According to comment38 and comment41, I'm going to set this issue as verified.

Hi Benjamin,

Free to update your testing result once you get.

Best Regards,
Junyi

Comment 43 Benjamin Schmaus 2015-10-20 12:56:47 UTC

Sorry for the delay having trouble getting client confirmation.

Comment 45 errata-xmlrpc 2015-12-04 16:54:32 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html