Red Hat Bugzilla – Bug 223491
virt-install does not use O_DIRECT for creating non-sparse files, so it fills the pagecache
Last modified: 2007-11-30 17:07:40 EST
Description of problem:
The summary pretty much says it all; since virt-install doesn't use O_DIRECT
when you pass --non-sparse (which is now recommended), it fills the pagecache
with zeros, basically. Besides being useless, this also exacerbates another Xen bug.
BZ 222467 describes a bug where the user gets a "Cannot allocate memory" error
from the hypervisor when trying to start a fully-virtualized domain. To get out
of this state the user generally has to reboot the system to get fully
virtualized guests booting again. Basically this bug boils down to the
hypervisor reading uninitialized memory as valid page tables; most of the time
you get lucky and nothing bad comes of it, but sometimes you fail.
However, *this* bug exacerbates the problem in BZ 222467. I'm not 100%
sure what is going on, but by filling the page cache (and hence, possibly,
moving memory structures around, or zeroing out certain pages), it is *far* more
likely to hit 222467. So, because we are recommending that everyone use
non-sparse files (for performance and ENOSPC issues), anyone who follows our
recommendation while using virt-install has a better chance of failing to start
with "Cannot allocate memory".
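One rough way to observe the page cache growth described above is to compare the "Cached:" value in /proc/meminfo before and after creating a non-sparse file. The sketch below is only an illustration; the function names are hypothetical and are not part of virt-install.

```python
def cached_kb(meminfo_text):
    """Return the 'Cached:' value (in kB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        # 'SwapCached:' does not match because the line must start with 'Cached:'.
        if line.startswith("Cached:"):
            return int(line.split()[1])
    raise ValueError("no 'Cached:' line found")


def read_cached_kb(path="/proc/meminfo"):
    """Read the current page cache size in kB (Linux only)."""
    with open(path) as f:
        return cached_kb(f.read())
```

Calling read_cached_kb() before and after a buffered write of a large zero-filled file should show the value rise by roughly the file size, memory permitting; with O_DIRECT it should stay about the same.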
Fixing this bug will only make the "Cannot allocate memory" error less
likely; the real fix (as discussed in 222467) is another upstream
Created attachment 146255 [details]
Make virt-install use a dd with O_DIRECT to avoid filling the page cache
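The attachment itself is not reproduced in this report. As a minimal sketch of the approach it describes, assuming GNU dd (whose oflag=direct option opens the output file with O_DIRECT), the file could be created roughly as follows. The function names and the 1 MiB block size are assumptions for illustration, not the actual patch.

```python
import subprocess


def build_dd_command(path, size_mb, block_mb=1):
    """Build a dd argv that writes size_mb MiB of zeros to path using O_DIRECT."""
    if size_mb <= 0 or block_mb <= 0:
        raise ValueError("sizes must be positive")
    if size_mb % block_mb:
        raise ValueError("size_mb must be a multiple of block_mb")
    return [
        "dd",
        "if=/dev/zero",
        "of=%s" % path,
        "bs=%dM" % block_mb,                 # block size; a multiple of the sector size
        "count=%d" % (size_mb // block_mb),  # number of blocks to write
        "oflag=direct",                      # GNU dd: open the output with O_DIRECT
    ]


def create_nonsparse_file(path, size_mb):
    """Run dd to create the file; raises CalledProcessError if dd fails."""
    subprocess.check_call(build_dd_command(path, size_mb))
```

Note that O_DIRECT is not supported on every filesystem (tmpfs, for example), so a caller would probably want to fall back to buffered writes if dd fails.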
I'm fine including the patch from comment #4 in the RHEL-5 GA build of
python-virtinst, but only with the understanding that we will fix the HV
properly at the soonest opportunity. This patch isn't something we can carry
long-term because it will not play nicely with forthcoming UI improvements, such
as progress feedback while creating the files.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release. Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release. This request is not yet committed for
inclusion in a release.
QE ack for RHEL5. This creates lots of problems with our hardware cert suite.
Chris Lalancette has tested the patch in comment #4 and seen it dramatically
reduce the incidence of the 'Cannot allocate memory' bug described in comment #1.
The patch has also been reviewed by Hugh Brock and me. Finally, the QA guest
installation tests all use the virt-install tool. Currently they always create
sparse files, and they will shortly also create non-sparse files. Thus the QA
guest install tests will be able to quickly validate that this patch does not
introduce any regressions.
Fix committed to CVS:
* Wed Jan 24 2007 Daniel Berrange <email@example.com> - 0.99.0-2.el5
- use dd with o_direct to create non-sparse files (bz #223491)
And built in brew in the dist-5E-qu-candidate collection:
$ brew latest-pkg dist-5E-qu-candidate python-virtinst
Build                                    Tag                  Built by
---------------------------------------- -------------------- ----------------
python-virtinst-0.99.0-2.el5             dist-5E-qu-candidate berrange
I have tested the following:
virt-install using sparse file
virt-install using non-sparse file
virt-manager using sparse file
virt-manager using non-sparse file
All four cases successfully completed file creation, and the non-sparse cases
correctly used O_DIRECT.
I'm reopening this issue based on the information in comment 13 and comment 14.
Marking as candidate for 5.1. This was marked as a blocker for RHEL5 GA, so
proposing for 5.1 blocker. QE ack for RHEL5.1.
The comments in #13 & #14 are not related to the problem reported in this bug -
in fact they confirm that virt-install is using O_DIRECT correctly. A new BZ
should be opened to track the different problem reported in comment #13/14 since
it is a hypervisor/kernel issue, rather than a tools issue.