Bug 1535637
| Field | Value |
| --- | --- |
| Summary | Free disk space check (for raw image) ignores sparseness |
| Product | Red Hat OpenStack |
| Component | openstack-ironic |
| Version | 16.1 (Train) |
| Target Milestone | beta |
| Target Release | 17.0 |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Hardware | Unspecified |
| OS | Unspecified |
| Keywords | Reopened, Triaged |
| Reporter | Ian Pilcher <ipilcher> |
| Assignee | Steve Baker <sbaker> |
| QA Contact | mlammon |
| CC | akaris, bfournie, jkreger, mburns, pweeks, rhel-osp-director-maint, sbaker, shdunne, srevivo |
| Fixed In Version | openstack-ironic-17.0.4-0.20210828041812.43cc2b6.el8 |
| Type | Bug |
| Last Closed | 2022-09-21 12:07:40 UTC |
Description (Ian Pilcher, 2018-01-17 19:17:37 UTC)

Related doc bug - https://bugzilla.redhat.com/show_bug.cgi?id=1532745

Comment 2 (Dmitry Tantsur)

Thanks for the report. To be honest, I'm not sure how to solve it. Do you by chance know a way to figure out the raw image size before converting it?

(In reply to Dmitry Tantsur from comment #2)
> Thanks for the report. To be honest, I'm not sure how to solve it. Do you by
> chance know a way to figure out the raw image size before converting it?

I don't think that there's any way to know *exactly* how much space the sparse raw image will use, but I think that the size of the QCOW file (assuming that's what we're dealing with) is a reasonable proxy. Maybe check for 2X the size of the QCOW file and/or change from a fatal error to a warning?

Yeah, this may work.
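As an aside, `qemu-img info` can report an image's fully-allocated (virtual) size alongside its on-disk size; the virtual size is exactly the apparent size a raw conversion will have, and an upper bound on what it can allocate. A minimal sketch (the image path is hypothetical):

```python
# Sketch: compare a QCOW2 image's on-disk size with the virtual (raw) size
# that `qemu-img info` reports. The virtual size is the upper bound a raw
# conversion can occupy; actual usage may be less if the result is sparse.
import json
import subprocess

def image_sizes(path):
    """Return (actual_size, virtual_size) in bytes as reported by qemu-img."""
    out = subprocess.check_output(
        ["qemu-img", "info", "--output=json", path])
    info = json.loads(out)
    return info["actual-size"], info["virtual-size"]

# Hypothetical image path, for illustration only.
actual, virtual = image_sizes("overcloud-full.qcow2")
print(f"on disk: {actual} bytes, fully-allocated raw size: {virtual} bytes")
```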
Comment 5 (Ilya Etingof)

> I don't think that there's any way to know *exactly* how much space the
> sparse raw image will use, but I think that the size of the QCOW file
> (assuming that's what we're dealing with) is a reasonable proxy. Maybe
> check for 2X the size of the QCOW file and/or change from a fatal error
> to a warning?
Just wanted to note some edge cases for us to consider:

1. Generally, the image size growth factor depends on the source and
destination image formats. For instance, if our source image is already
`raw`, the factor would be 1X rather than 2X.

2. I am guessing that even for the `qcow` format, the 2X growth factor may
not always hold. My concern is that if you take a freshly built `qcow` file
and do a bunch of writes, it will grow (because of copy-on-write stacking).
However, when you convert such a "used" `qcow` into `raw`, the overwritten
data may collapse (only the latest version of each block ends up in the
`raw` image).

3. Finally, `qcow` files can optionally be compressed, which may influence
their growth when dumped into `raw`.

4. Purely theoretically (noting just for posterity), not all filesystems/OSes
support sparse files, though modern filesystems and Linux seem to handle
sparse files pretty well (see the sketch after this list). On the other hand,
Solaris and network filesystems might not be that advanced in that regard.
Does this require further research?
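To illustrate the sparseness point in item 4: on a filesystem that supports holes, a file's apparent size and its actual block allocation can differ dramatically. A minimal sketch (the file path is hypothetical):

```python
# Sketch: create a 1 GiB sparse file and compare its apparent size
# (st_size) with the space actually allocated (st_blocks * 512).
import os

path = "/tmp/sparse-demo.raw"  # hypothetical path
with open(path, "wb") as f:
    f.truncate(1024 ** 3)  # 1 GiB apparent size; no blocks written

st = os.stat(path)
print(f"apparent size: {st.st_size} bytes")          # ~1073741824
print(f"allocated:     {st.st_blocks * 512} bytes")  # ~0 on ext4/XFS
os.remove(path)
```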
Given all these complications, I am thinking that maybe we could just bite
the bullet and try writing the destination image regardless of its expected
size, while watching the free space being consumed. When we [somehow]
observe that free space is running low, we just abort the conversion and
clean up the remnants?
(In reply to Ilya Etingof from comment #5)
> Given all these complications, I am thinking that maybe we could just bite
> the bullet and try writing the destination image regardless of its expected
> size, while watching the free space being consumed. When we [somehow]
> observe that free space is running low, we just abort the conversion and
> clean up the remnants?

Makes sense to me.

Comment 7 (Ilya Etingof)

The tricky part, however, is how to watch the growing space allocation on the file system. I can imagine SIGSTOP'ing the running qemu-img periodically, checking free space, cleaning up some caches, and SIGCONT'ing qemu-img for some more time. Besides the general weirdness of such a design, it seems non-trivial to implement within the oslo process execution facilities. Therefore, please welcome the quick and straightforward fix: https://review.openstack.org/#/c/544839/

(In reply to Ilya Etingof from comment #7)
> The tricky part, however, is how to watch the growing space allocation on
> the file system.

There's always the wait-for-it-to-die-and-clean-up approach ...

> Therefore, please welcome the quick and straightforward fix:
>
> https://review.openstack.org/#/c/544839/

I, for one, welcome our new heuristic overlords! ;-)

Patches seem to have stalled; do we still plan on making this change?

I've re-read the original bug filing, and I'm wondering if we're overthinking this issue. The issue at hand is that we have guard rails for the size check that don't account for virtual vs. physical size. At what point do we "really" need a size check beyond having some percentage of free space? I guess this heads into the debate of "gracefully fail" vs. "hard failure" vs. "failure where corrective action can be taken or a forward path identified (i.e. an alarm goes off in a monitoring system) and maybe a periodic task begins to delete old files."

Upstream patches have stalled; we're closing this as won't-fix for now based on Julia's comment 11.

I just encountered this in OSP 16.1, and the upstream patch changed recently, in October: https://review.opendev.org/c/openstack/ironic/+/544839 I am reopening this issue; feel free to close it again, but given that the upstream bug moved, it might make sense to keep this open?

I refreshed the upstream patch; we'll have a talk about what to do with this bug.

The RHOS-15 flag is probably wrong, but otherwise, what's missing here to move this bug to the next step?

The community originally pushed back against the original code proposal, and the patches stalled until someone else (sbaker in this case) picked it up and revised the check based on the discussion. That patch landed in time for OSP 17, so I don't believe any additional action is really required. I anticipate we would possibly get pushback if we tried to backport this upstream down to Train; as such, I've updated the tag for 17, and I'm moving it to MODIFIED state and removing the triaged flag so the team revisits this item for discussion.

I'll set this back to MODIFIED when I've dug up a package with the fix.

Steve, reminder to follow up on this, please.

I've confirmed this change is in the latest RHOSP-17.0 build package.
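For posterity, the SIGSTOP/SIGCONT scheme Ilya floated in comment 7 (and dismissed as awkward) would look roughly like this; a sketch only, with a hypothetical qemu-img invocation and threshold:

```python
# Sketch: periodically pause a running qemu-img with SIGSTOP, check free
# space while it is frozen, then resume it with SIGCONT. Rejected upstream
# as awkward; shown only to illustrate the idea being discussed.
import shutil
import signal
import subprocess
import time

MIN_FREE = 1 * 1024 ** 3  # hypothetical 1 GiB floor

proc = subprocess.Popen(
    ["qemu-img", "convert", "-O", "raw", "image.qcow2", "/var/lib/image.raw"])
while proc.poll() is None:
    time.sleep(5)                       # let it convert for a while
    if proc.poll() is not None:
        break                           # finished while we slept
    proc.send_signal(signal.SIGSTOP)    # freeze qemu-img
    free = shutil.disk_usage("/var/lib").free
    if free < MIN_FREE:
        proc.kill()                     # abort instead of resuming
        proc.wait()
        break
    proc.send_signal(signal.SIGCONT)    # let it run some more
```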
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543