Bug 1274065

Summary: [RFE] Hosted-Engine: Use a qcow2 image for the appliance
Product: [oVirt] ovirt-hosted-engine-setup Reporter: Fabian Deutsch <fdeutsch>
Component: RFEs Assignee: Yedidyah Bar David <didi>
Status: CLOSED NOTABUG QA Contact: meital avital <mavital>
Severity: medium Docs Contact:
Priority: high    
Version: --- CC: bugs, didi, fdeutsch, lsurette, lveyde, rbarry, sbonazzo, stirabos, ykaul
Target Milestone: --- Keywords: FutureFeature
Target Release: --- Flags: fdeutsch: ovirt-future?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 1528987 Environment:
Last Closed: 2018-02-05 10:56:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1198507, 1528987    
Bug Blocks:    

Description Fabian Deutsch 2015-10-21 19:51:56 UTC
Description of problem:
When using the appliance flow in HE setup, extracting the appliance takes up a large portion of the setup time.

Version-Release number of selected component (if applicable):
3.6

How reproducible:
always

Steps to Reproduce:
1. Use appliance flow in
2.
3.

Actual results:
Extraction process takes long

Expected results:
Extraction process is quick

Additional info:

Comment 1 Fabian Deutsch 2015-10-22 11:35:22 UTC
Raising the priority, because the flow could be nice, but is really slowed down due to this bug.

Ryan, could you quantify the slowdown compared to the whole installation duration?

Comment 2 Simone Tiraboschi 2015-10-22 11:42:03 UTC
(In reply to Fabian Deutsch from comment #1)
> Raising the priority, because the flow could be nice, but is really slowed
> down due to this bug.
> 
> Ryan, could you quantify the slowdown compared to the whole installation
> duration?

It's not really a bug, it's an RFE: it could be faster, but it works.
The issue is that Python does not handle sparse files efficiently, so the real gain depends only on how sparse the image is.

Comment 3 Fabian Deutsch 2015-10-22 11:47:25 UTC
Agreed, it's an RFE.

Comment 4 Yaniv Kaul 2015-10-22 19:10:01 UTC
Are we running virt-sparsify and virt-sysprep on the image before packing it?

Comment 5 Fabian Deutsch 2017-01-27 09:51:28 UTC
Yes, IIRC.

The files should be sparse, but as Simone says: Python does not handle those efficiently when extracting tars.
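
To illustrate what "handling sparse files" means here, the following is a minimal Python sketch of a sparse-aware copy: blocks that contain only zero bytes are skipped with a seek instead of being written, so a filesystem that supports holes does not have to allocate them. This is an illustration only, not the code used by hosted-engine setup or by Python's tarfile module; the names copy_sparse and allocated_bytes and the block size are assumptions made for the example.

```python
import os


def copy_sparse(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero_block = bytes(block_size)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block == zero_block[:len(block)]:
                # Leave a hole: advance the write position without writing data.
                dst.seek(len(block), os.SEEK_CUR)
            else:
                dst.write(block)
        # If the source ended with zeros, extend the file to its full length.
        dst.truncate(dst.tell())


def allocated_bytes(path):
    """Space actually allocated on disk (POSIX only: st_blocks is in 512-byte units)."""
    return os.stat(path).st_blocks * 512
```

Comparing os.path.getsize(path) with allocated_bytes(path) on the copy shows how much of the apparent size is holes; the result depends on the filesystem. A copy that writes every zero block pays the full I/O cost of the apparent size, which is the inefficiency described above.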

Comment 6 Yaniv Kaul 2017-11-16 13:42:09 UTC
Simone, does the new installation flow in 4.2 make a difference here?

Comment 7 Simone Tiraboschi 2017-11-16 14:14:21 UTC
(In reply to Yaniv Kaul from comment #6)
> Simone, does the new installation flow in 4.2 make a difference here?

We are using the system tar via Ansible with the --sparse option:
https://github.com/oVirt/ovirt-hosted-engine-setup/blob/master/src/ansible/bootstrap_local_vm.yml#L36

On my test system with a 7200 rpm disk, it takes about 50 seconds to extract a 2.4G sparse qcow2 image from an 800M OVA file:

 2017-11-16 12:14:51,291+0100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.v2_playbook_on_task_start:164 TASK [Extract appliance to local vm dir]
 2017-11-16 12:15:41,756+0100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.v2_runner_on_ok:120 changed: [localhost]
 
 [root@c74he20171031h1 ~]# ls -lh /usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.2-20171114.1.el7.centos.ova
 -rw-r--r--. 1 root root 808M 14 nov 14.59 /usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.2-20171114.1.el7.centos.ova
 [root@c74he20171031h1 ~]# du -h /var/tmp/localvm/images/d1746233-6cf3-42b8-9efb-b481954a8d3f/d7cdd432-9bc6-47f6-a6d8-73340e59647b
 2,4G	/var/tmp/localvm/images/d1746233-6cf3-42b8-9efb-b481954a8d3f/d7cdd432-9bc6-47f6-a6d8-73340e59647b
 [root@c74he20171031h1 ~]# file /var/tmp/localvm/images/d1746233-6cf3-42b8-9efb-b481954a8d3f/d7cdd432-9bc6-47f6-a6d8-73340e59647b
 /var/tmp/localvm/images/d1746233-6cf3-42b8-9efb-b481954a8d3f/d7cdd432-9bc6-47f6-a6d8-73340e59647b: QEMU QCOW Image (v3), 53687091200 bytes
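
The "about 50 seconds" figure can be checked against the two log timestamps above. A small Python sketch, assuming the log timestamp layout shown in those lines (the helper name elapsed_seconds is made up for the example):

```python
from datetime import datetime

# Timestamp layout as it appears in the setup log lines above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f%z"


def elapsed_seconds(start, end):
    """Return the number of seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


task_start = "2017-11-16 12:14:51,291+0100"  # TASK [Extract appliance to local vm dir]
task_ok = "2017-11-16 12:15:41,756+0100"     # changed: [localhost]
seconds = elapsed_seconds(task_start, task_ok)
print(f"extraction took {seconds:.3f} s")  # 50.465 s
```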

If we want to improve on this, I think we should evaluate dropping the RPM(OVA(QCOW2)) packaging and just shipping the qcow2 disk in an RPM, using it in place with a snapshot so that we can revert on issues.

Comment 8 Yaniv Kaul 2017-11-16 18:38:54 UTC
Alternatively, we should use the RHEL cloud image and virt-customize it on the spot with the latest Engine, etc.
That would take a lot more time, but it would reduce the initial size and ensure we use the latest and greatest.

Comment 9 Yedidyah Bar David 2017-12-25 14:37:20 UTC
85710 is a patch to package the appliance also as a qcow2 image.

Keeping the current bug for using this image.

Comment 10 Yedidyah Bar David 2017-12-25 14:42:02 UTC
Based on some internal discussions, this is the current idea:

1. Package the appliance (also) as a qcow2 image, to be extracted directly to the local disk, so that one can start a libvirt vm from it immediately, without further extraction/copying. This is bug 1528987.

2. Use this image in hosted-engine setup. Currently we'll only do this in "node-zero". The local engine vm will run from the qcow2 image (probably with a snapshot, so that we can easily revert). Still TBD how to create the final engine vm disk image, perhaps some variation(s) on qemu-img copying.
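
One possible reading of "with a snapshot" in step 2 is a qcow2 overlay whose backing file is the packaged image, so the packaged image itself is never modified and reverting means discarding the overlay. The Python sketch below only builds the qemu-img command line for that; it is an assumption about the approach, not what hosted-engine setup does, and both paths are hypothetical.

```python
import shlex


def overlay_create_cmd(base_image, overlay_image):
    """Build a qemu-img command that creates a qcow2 overlay on top of base_image."""
    return [
        "qemu-img", "create",
        "-f", "qcow2",      # format of the new overlay
        "-b", base_image,   # backing file: the packaged appliance image
        "-F", "qcow2",      # format of the backing file
        overlay_image,
    ]


# Hypothetical paths, for illustration only.
cmd = overlay_create_cmd(
    "/usr/share/ovirt-engine-appliance/appliance.qcow2",
    "/var/tmp/localvm/overlay.qcow2",
)
print(shlex.join(cmd))
```

The local engine VM would then run from the overlay. How to produce the final engine VM disk image from it is still the open question mentioned above.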

Comment 11 Yedidyah Bar David 2017-12-27 10:53:32 UTC
On a test deploy (on a VM on my laptop, with an SSD), "Extract appliance to local vm dir" took 55 seconds, which is the time I hoped to save. Installing "ovirt-engine-appliance-qcow2-4.2-20171226.1.el7.centos.noarch.rpm", generated by Jenkins for 85710, took 74 seconds. Installing the regular appliance took 9 seconds. So I am not sure we are saving much with this.
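
Putting the three measurements from this comment side by side, as a short Python calculation. It assumes the qcow2 flow needs no extraction step at all and that nothing else differs between the two flows:

```python
# Timings from the test deploy described in this comment, in seconds.
extract_ova_s = 55          # "Extract appliance to local vm dir"
install_regular_rpm_s = 9   # installing the regular appliance RPM
install_qcow2_rpm_s = 74    # installing the qcow2 appliance RPM

# Current flow: install the regular RPM, then extract the OVA.
current_flow_s = install_regular_rpm_s + extract_ova_s

# qcow2 flow: install the qcow2 RPM; assumed to need no extraction afterwards.
qcow2_flow_s = install_qcow2_rpm_s

difference_s = qcow2_flow_s - current_flow_s
print(f"current: {current_flow_s} s, qcow2: {qcow2_flow_s} s, difference: {difference_s:+d} s")
```

On this one machine the qcow2 flow comes out 10 seconds slower (74 s against 64 s), because the time saved on extraction is spent installing the larger RPM.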

Comment 12 Yedidyah Bar David 2017-12-28 07:21:46 UTC
The patch is for the appliance, so I am removing it from here. Also moving to NEW: this might be a simple case of premature optimization. We need to measure first on various machines and see whether we can find a flow with a significant saving; otherwise, we should close the bug.