Bug 665111 - libvirt snapshots use dd inefficiently
Summary: libvirt snapshots use dd inefficiently
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libvirt
Version: 5.6
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
: 5.7
Assignee: Eric Blake
QA Contact: Virtualization Bugs
Depends On:
TreeView+ depends on / blocked
Reported: 2010-12-22 19:15 UTC by Eric Blake
Modified: 2011-04-22 16:26 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2011-04-22 16:26:10 UTC
Target Upstream Version:

Attachments (Terms of Use)

Description Eric Blake 2010-12-22 19:15:09 UTC
Description of problem:
libvirt uses the qemu monitor migration command with an argument of 'exec:compressor | { dd ... && dd bs=1M ...; }'.  This means that the second dd process is receiving input from a pipe, and since 1M is larger than PIPE_MAX, it is possible to get short reads regardless of the behavior of compressor.  If we instead use dd ibs=PIPE_MAX obs=1M, there's still the issue that a short write in the compressor will necessarily result in a short read in dd, due to the atomicity requirements of write/read on pipes.  Either way, the POSIX requirements on dd state that if dd gets a short read, it must pad the buffer with NUL and write the result, rather than aggregating short reads to the full output buffer size.  This, unfortunately, can result in data corruption in the snapshot image, and is more likely as system load increases the likelihood of a short read from the pipe.

Version-Release number of selected component (if applicable):

How reproducible:
data corruption is not 100% predictable - it all depends on whether the kernel ever has a load where dd can get a short read() because it is reading faster than the compressor is calling write().

Steps to Reproduce:
1. any action that creates a qemu snapshot, such as the libvirt-guests init script, or 'virsh snapshot-create'
Actual results:
If the short read happens, the snapshot file will contain blocks of NUL used to pad out the short read to the output block size rather than the expected data from qemu.  dd outputs to stderr whether any short reads happened, but libvirt is not even paying attention to that statistic.

Expected results:
The only way to safely use dd without the risk of short reads is to use the GNU extension of iflag=fullblock.

Additional info:
See upstream discussion:

Comment 1 Eric Blake 2010-12-22 19:31:56 UTC
Update - POSIX only requires short read padding when using 'dd bs=n'; when using 'dd ibs=x obs=y', POSIX requires short read aggregation.  Therefore, it is possible to resolve this bug without relying on the GNU extension of iflag=fullblock.

Comment 2 Eric Blake 2010-12-22 21:00:01 UTC
Further update - POSIX only specifies padding on short reads when using the conv=sync option alongside bs=; but since libvirt is not using conv=sync, the result is that dd ends up doing short writes, but does not truncate data.

Changing to dd ibs=x obs=y may improve efficiency, but as this is no longer a correctness problem, it no longer needs to be backported into 5.6.  Retargeting to 5.7.

Comment 4 RHEL Program Management 2011-01-11 20:45:37 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 5 RHEL Program Management 2011-01-11 23:00:18 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 6 Eric Blake 2011-04-22 16:26:10 UTC
This is not a correctness issue; the existing command works, and the difference in speed is negligible.  I'd rather avoid the risk of backporting the change.  Closing this as WONTFIX for RHEL 5.  Meanwhile, the fix is indeed upstream:

commit 8e42c50bd4e60ebcdcdafb195204383856fdaf2d
Author: Eric Blake <eblake@redhat.com>
Date:   Wed Dec 22 13:48:05 2010 -0700

    qemu: improve efficiency of dd during snapshots
but superceded by even newer commits that avoid dd altogether, by using fd: migration instead of exec: migration.

Note You need to log in before you can comment on or make changes to this bug.