1274065 – [RFE] Hosted-Engine: Use a qcow2 image for the appliance

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	ovirt-hosted-engine-setup
Classification:	oVirt
Component:	RFEs
Sub Component:
Version:	---
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Yedidyah Bar David
QA Contact:	meital avital
Docs Contact:
URL:
Whiteboard:
Depends On:	1198507 1528987
Blocks:

Reported:	2015-10-21 19:51 UTC by Fabian Deutsch
Modified:	2019-04-28 14:22 UTC
CC List:	9 users
Fixed In Version:
Clone Of:
Clones:	1528987
Environment:
Last Closed:	2018-02-05 10:56:53 UTC
oVirt Team:	Integration
Embargoed:
Dependent Products:
Flags:	fdeutsch: ovirt-future? rule-engine: planning_ack? rule-engine: devel_ack? rule-engine: testing_ack?

Description Fabian Deutsch 2015-10-21 19:51:56 UTC

Description of problem:
When using the appliance flow in HE setup, extracting the appliance takes a large portion of the setup time.

Version-Release number of selected component (if applicable):
3.6

How reproducible:
always

Steps to Reproduce:
1. Use appliance flow in
2.
3.

Actual results:
The extraction process takes a long time

Expected results:
Extraction process is quick

Additional info:

Comment 1 Fabian Deutsch 2015-10-22 11:35:22 UTC

Raising the priority, because the flow could be nice, but is really slowed down due to this bug.

Ryan, could you quantify the slow down compared to the whole installation duration?

Comment 2 Simone Tiraboschi 2015-10-22 11:42:03 UTC

(In reply to Fabian Deutsch from comment #1)
> Raising the priority, because the flow could be nice, but is really slowed
> down due to this bug.
> 
> Ryan, could you quantify the slow down compared to the whole installation
> duration?

It's not really a bug, it's an RFE: it could be faster but it's working.
The issue is that Python does not handle sparse files efficiently, so the real gain depends only on how sparse the image is.
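
As an illustration of what sparse-aware handling means, here is a minimal sketch of a copy routine that seeks over all-zero blocks instead of writing them. It is not the code used by hosted-engine setup; the function names `copy_sparse` and `allocated_bytes` and the 64 KiB block size are assumptions chosen for the example.

```python
import os


def copy_sparse(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero_block = bytes(block_size)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(block_size)
            if not chunk:
                break
            if chunk == zero_block[:len(chunk)]:
                # Leave a hole: advance the write position without writing data.
                dst.seek(len(chunk), os.SEEK_CUR)
            else:
                dst.write(chunk)
        # Set the final length explicitly in case the file ends with a hole.
        dst.truncate(src.tell())


def allocated_bytes(path):
    """Bytes allocated on disk (st_blocks counts 512-byte units on Linux)."""
    return os.stat(path).st_blocks * 512
```

On a filesystem that supports holes, comparing `os.path.getsize(path)` with `allocated_bytes(path)` for the copy shows how much space the skipped blocks saved. A plain read/write loop would instead write every zero byte, which is where the extra time goes.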

Comment 3 Fabian Deutsch 2015-10-22 11:47:25 UTC

Agreed, it's an RFE.

Comment 4 Yaniv Kaul 2015-10-22 19:10:01 UTC

Are we running virt-sparsify and virt-sysprep on the image before packing it?

Comment 5 Fabian Deutsch 2017-01-27 09:51:28 UTC

Yes, IIRC.

The files should be sparse, but as Simone says: Python does not handle those efficiently when extracting tars.

Comment 6 Yaniv Kaul 2017-11-16 13:42:09 UTC

Simone, does the new installation flow in 4.2 make a difference here?

Comment 7 Simone Tiraboschi 2017-11-16 14:14:21 UTC

(In reply to Yaniv Kaul from comment #6)
> Simone, does the new installation flow in 4.2 make a difference here?

We are now using the system tar via Ansible with the --sparse option:
https://github.com/oVirt/ovirt-hosted-engine-setup/blob/master/src/ansible/bootstrap_local_vm.yml#L36

On my test system with a 7200 rpm disk, it takes about 50 seconds to extract a 2.4G sparse qcow2 image from an OVA file of about 800M:

 2017-11-16 12:14:51,291+0100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.v2_playbook_on_task_start:164 TASK [Extract appliance to local vm dir]
 2017-11-16 12:15:41,756+0100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.v2_runner_on_ok:120 changed: [localhost]
 
 [root@c74he20171031h1 ~]# ls -lh /usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.2-20171114.1.el7.centos.ova
 -rw-r--r--. 1 root root 808M 14 nov 14.59 /usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.2-20171114.1.el7.centos.ova
 [root@c74he20171031h1 ~]# du -h /var/tmp/localvm/images/d1746233-6cf3-42b8-9efb-b481954a8d3f/d7cdd432-9bc6-47f6-a6d8-73340e59647b
 2,4G	/var/tmp/localvm/images/d1746233-6cf3-42b8-9efb-b481954a8d3f/d7cdd432-9bc6-47f6-a6d8-73340e59647b
 [root@c74he20171031h1 ~]# file /var/tmp/localvm/images/d1746233-6cf3-42b8-9efb-b481954a8d3f/d7cdd432-9bc6-47f6-a6d8-73340e59647b
 /var/tmp/localvm/images/d1746233-6cf3-42b8-9efb-b481954a8d3f/d7cdd432-9bc6-47f6-a6d8-73340e59647b: QEMU QCOW Image (v3), 53687091200 bytes

If we want to improve on this, I think we should evaluate dropping the RPM(OVA(QCOW2)) packaging and instead ship the qcow2 disk directly in an RPM, using it in place with a snapshot so that we can revert if something goes wrong.
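
The figures quoted above can be checked from the log excerpt itself. The following is a small sketch, not part of the setup code; the helper name `elapsed_seconds` and the format string are assumptions based on the timestamp layout shown in the two log lines.

```python
from datetime import datetime

# Timestamp layout as it appears in the log lines quoted above (assumed).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f%z"


def elapsed_seconds(start, end):
    """Seconds between two log timestamps in the format above."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Start and end of the "Extract appliance to local vm dir" task.
task_start = "2017-11-16 12:14:51,291+0100"
task_ok = "2017-11-16 12:15:41,756+0100"
extract_time = elapsed_seconds(task_start, task_ok)
print(round(extract_time, 3))  # roughly 50.5 seconds

# Virtual size reported by `file` for the extracted qcow2 image.
virtual_size_gib = 53687091200 / 1024**3
print(virtual_size_gib)  # 50.0
```

So the image has a 50 GiB virtual size, occupies about 2.4G on disk after extraction, and is shipped in an OVA of about 800M.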

Comment 8 Yaniv Kaul 2017-11-16 18:38:54 UTC

Alternatively, we could use the RHEL cloud image and virt-customize it on the spot with the latest Engine, etc.
It would take a lot more time, but it would reduce the initial size and ensure we use the latest and greatest.

Comment 9 Yedidyah Bar David 2017-12-25 14:37:20 UTC

85710 is a patch to package the appliance also as a qcow2 image.

Keeping current bug for using this image.

Comment 10 Yedidyah Bar David 2017-12-25 14:42:02 UTC

Based on some internal discussions, this is the current idea:

1. Package the appliance (also) as a qcow2 image, to be extracted directly to the local disk, so that one can start a libvirt vm from it immediately, without further extraction/copying. This is bug 1528987.

2. Use this image in hosted-engine setup. Currently we'll only do this in "node-zero". The local engine vm will run from the qcow2 image (probably with a snapshot, so that we can easily revert). Still TBD how to create the final engine vm disk image, perhaps some variation(s) on qemu-img copying.
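
One possible reading of "with a snapshot" in step 2 is a qcow2 overlay whose backing file is the packaged image, so that reverting means discarding the overlay. The thread does not say which mechanism would be used, so this is only a sketch under that assumption; the function name and the file paths are hypothetical.

```python
def overlay_create_command(base_image, overlay_image):
    """Build a qemu-img command that creates a qcow2 overlay on top of base_image.

    Writes made by the local engine VM would land in the overlay, leaving the
    packaged base image untouched. Reverting is then a matter of deleting the
    overlay and creating a fresh one.
    """
    return [
        "qemu-img", "create",
        "-f", "qcow2",        # format of the new overlay file
        "-b", base_image,     # backing file: the packaged appliance image
        "-F", "qcow2",        # format of the backing file
        overlay_image,
    ]


# Hypothetical paths, for illustration only.
cmd = overlay_create_command("/path/to/appliance.qcow2", "/path/to/localvm-overlay.qcow2")
print(" ".join(cmd))
```

The command list can be passed to `subprocess.run(cmd, check=True)` on a host with qemu-img installed. How the final engine VM disk would be produced from the overlay is still open, as noted above.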

Comment 11 Yedidyah Bar David 2017-12-27 10:53:32 UTC

On a test deploy (on a VM on my laptop, with an SSD), "Extract appliance to local vm dir" took 55 seconds, which is the time I hoped to save. Installing "ovirt-engine-appliance-qcow2-4.2-20171226.1.el7.centos.noarch.rpm", generated by jenkins for 85710, took 74 seconds. Installing the regular appliance took 9 seconds. So I am not sure we are saving much with this.
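
Putting the three measurements side by side makes the comparison explicit. This sketch assumes that the qcow2 flow needs only the RPM install and no further extraction, and that the current flow is the regular RPM install followed by the OVA extraction; the variable names are made up for the example.

```python
# Measurements from the test deploy described above, in seconds.
extract_ova_s = 55          # "Extract appliance to local vm dir"
install_qcow2_rpm_s = 74    # installing the qcow2 appliance RPM
install_regular_rpm_s = 9   # installing the regular appliance RPM

# Assumed flows: regular RPM + extraction versus qcow2 RPM alone.
current_flow_s = install_regular_rpm_s + extract_ova_s
qcow2_flow_s = install_qcow2_rpm_s
net_saving_s = current_flow_s - qcow2_flow_s

print(current_flow_s, qcow2_flow_s, net_saving_s)  # 64 74 -10
```

Under these assumptions the qcow2 package is about 10 seconds slower overall on this machine, because the time saved on extraction is more than offset by the longer RPM install.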

Comment 12 Yedidyah Bar David 2017-12-28 07:21:46 UTC

The patch is for the appliance, so I am removing it from this bug. Also moving to NEW: this might be a simple case of premature optimization. We need to measure first on several different machines and see whether we can find a flow with a significant saving; otherwise, we should close the bug.
