Bug 1368420 - [RFE] Improve update speed
Summary: [RFE] Improve update speed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-node
Classification: oVirt
Component: Installation & Update
Version: 4.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.1.3
Target Release: 4.1
Assignee: Ryan Barry
QA Contact: Huijuan Zhao
URL:
Whiteboard:
Depends On: 1457111 1457670
Blocks:
 
Reported: 2016-08-19 10:37 UTC by Fabian Deutsch
Modified: 2017-07-06 13:16 UTC
CC: 13 users

Fixed In Version: imgbased-0.9.28-0
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-07-06 13:16:50 UTC
oVirt Team: Node
Embargoed:
rule-engine: ovirt-4.1?
fdeutsch: ovirt-4.2?
fdeutsch: planning_ack?
sbonazzo: devel_ack+
ycui: testing_ack+




Links (oVirt gerrit)
ID     Branch     Status  Summary                                               Last Updated
77206  master     MERGED  upgrade: use tar instead of rsync                     2017-05-25 16:52:29 UTC
77262  master     MERGED  osupdater: thread as many operations as possible      2017-05-25 16:53:24 UTC
77267  master     MERGED  osupdater: use rpm -qa --queryformat for RPM scripts  2017-05-25 16:53:47 UTC
77358  ovirt-4.1  MERGED  upgrade: use tar instead of rsync                     2017-05-25 16:54:39 UTC
77362  ovirt-4.1  MERGED  osupdater: thread as many operations as possible      2017-05-25 17:31:27 UTC
77363  ovirt-4.1  MERGED  osupdater: use rpm -qa --queryformat for RPM scripts  2017-05-25 17:31:53 UTC

Description Fabian Deutsch 2016-08-19 10:37:16 UTC
Description of problem:
Currently, updating Node takes quite a long time due to three main factors:
1. Large RPM download
2. Slow syncing to disk
3. initramfs rebuild

This RFE is about improving 2 and 3.

Item 2 might be improved by tuning how the data is written (e.g. dropping ionice, using different mount flags, writing asynchronously, …), or by changing how snapshots are taken.

Item 3 might be addressed by starting the rebuild in parallel or at a different time.

Comment 1 Fabian Deutsch 2016-10-20 06:43:04 UTC
Closing this bug for now, as it is too generic.

Comment 2 Ryan Barry 2017-05-23 15:32:16 UTC
There are a few areas for improvement here:

Once the initial filesystem is synced, we perform several steps:

* /etc is kept in sync across all layers to make it behave like RPM
* /var is synced

These steps can happen in parallel. Once that is done:

* Users and groups are updated to match (UIDs, GIDs, new users)
  - RPM ownership is updated
* selinux file contexts are restored (setfiles)
* dracut regenerates the initramfs

These steps can happen in parallel. 

dracut only needs /usr/lib in order to build.
selinux needs /etc present.
uid/gid drift can be corrected, followed immediately by RPM ownership updates.

In 4.1.3, we'll also have:

collation of RPM %post scripts, which also only depends on /usr/share being present. This can also happen in parallel.
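
For illustration, here is a minimal sketch of what that collation could look like, assuming it shells out to a single "rpm -qa --queryformat" call (in line with the merged gerrit change of that name); the helper name and record separator are made up:

import subprocess

# Hedged sketch: collect the %post scriptlets of every installed package in
# one rpm invocation. %{NAME} and %{POSTIN} are standard rpm query tags;
# POSTIN holds the %post scriptlet body, or "(none)" if the package has none.
def collect_post_scripts():
    fmt = "%{NAME}\n%{POSTIN}\n--RECORD-END--\n"
    out = subprocess.check_output(
        ["rpm", "-qa", "--queryformat", fmt],
        universal_newlines=True,
    )
    scripts = {}
    for record in out.split("--RECORD-END--\n"):
        record = record.strip("\n")
        if not record:
            continue
        name, _, body = record.partition("\n")
        if body and body != "(none)":
            scripts[name] = body
    return scripts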

Basically, I see a single thread for the initial steps (squashfs, NIST partitions).

Then two threads (/etc and /var syncing).

Then four threads (UID/GID, setfiles_t, dracut, RPM %post).

This may take a little tweaking to ensure that we aren't exhausting the system, but these are primarily limited by context switching or CPU.

The other risk is having the same filesystem mounted all over the place. We'll need to wait for all the threads to join(), but hopefully we won't need any mutexes/semaphores.
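
As a rough, non-authoritative sketch of that threading model (the step functions below are placeholders, not imgbased's real osupdater API):

import threading

def run_parallel(steps):
    # Joining the threads is the only synchronization: each step touches a
    # distinct part of the new layer (/etc, /var, /usr/lib, ...), so no
    # mutexes/semaphores should be needed.
    threads = [threading.Thread(target=step) for step in steps]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Placeholder steps (illustrative only).
def sync_etc(): pass
def sync_var(): pass
def fix_uid_gid_and_rpm_ownership(): pass
def run_setfiles(): pass
def regenerate_initramfs(): pass
def run_rpm_post_scripts(): pass

# A single thread handles the initial steps (squashfs, NIST partitions); then:
run_parallel([sync_etc, sync_var])                   # two threads
run_parallel([fix_uid_gid_and_rpm_ownership,         # four threads
              run_setfiles,
              regenerate_initramfs,
              run_rpm_post_scripts])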

Comment 3 Red Hat Bugzilla Rules Engine 2017-05-23 15:32:25 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.

Comment 4 Huijuan Zhao 2017-06-02 06:03:52 UTC
Blocker Bug 1457111 in imgbased-0.9.28-0.1.el7ev blocks verification of this bug.

For another 4.1.3 build, imgbased-0.9.30-0.1.el7ev, blocker bug 1457670 also blocks verification.

So this bug cannot be verified in the current 4.1.3 builds; QE will verify it once a new 4.1.3 build is available.

Changing the status to MODIFIED.

Comment 5 Huijuan Zhao 2017-06-13 10:52:35 UTC
Tested several scenarios with redhat-virtualization-host-4.1-20170609.2; the upgrade time differs depending on the disk configuration.

Here, "upgrade time" means the installation time shown below, mainly step 2/2 of the transaction:

# yum update
----------------
Running transaction
  Installing : redhat-virtualization-host-image-update-4.1-20170609.2.el7_3.noarch                                                            1/2 
  Erasing    : redhat-virtualization-host-image-update-placeholder-4.1-2.1.el7.noarch                                                         2/2 
  Verifying  : redhat-virtualization-host-image-update-4.1-20170609.2.el7_3.noarch                                                            1/2 
  Verifying  : redhat-virtualization-host-image-update-placeholder-4.1-2.1.el7.noarch                                                         2/2 

Installed:
  redhat-virtualization-host-image-update.noarch 0:4.1-20170609.2.el7_3                                                                           

Replaced:
  redhat-virtualization-host-image-update-placeholder.noarch 0:4.1-2.1.el7                                                                        

Complete!
----------------


Test version:
From: rhvh-4.1-0.20170522.0 (or rhvh-4.0-0.20170307.0)
To:   rhvh-4.1-0.20170609.0

Test steps:
1. Install a rhvh-4.1.2 or rhvh-4.0.7 release build.
2. Add the host to the engine.
3. Log in to the host, set up local repos, and upgrade to rhvh-4.1-0.20170609.0:
   # yum update
4. Watch the upgrade process and measure the upgrade time.


Test results:
1. Tested with only 1 disk (600GB SAS, 1TB SATA, or 300GB FC): the upgrade time is 5 to 7 minutes.
2. Tested with 3 FC disks (300GB, 150GB, 150GB): the upgrade time is 10 minutes.
3. Tested with 21 iSCSI disks (200GB, 100GB*20) and 1 local disk (3TB): the upgrade time is 30 minutes.


So when RHVH is installed across multiple disks the upgrade takes a long time, and it looks like the more disks there are, the longer it takes.

Ryan, according to the test results, could I verify this bug?

Comment 6 Ryan Barry 2017-06-13 13:33:15 UTC
That appears to be a different bug.

Can you provide a test environment? Ideally one with as many disks as possible. I won't have a lab available for about a month.

For verification, please compare the time taken to upgrade from an earlier version.

Hypothetically:

Install rhvh-4.0-0.20170307.0
Upgrade to rhvh-4.1-20170522.0

Compare the time to:

Install rhvh-4.0-0.20170307.0
Upgrade to rhvh-4.1-20170609.0

We are limited by disk speed and CPU in many cases (particularly syncing the new image for the first time, selinux, and running dracut), but there should be a definite improvement.
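
One hypothetical way for QE to capture that comparison is to time the yum transaction itself; the helper below is purely illustrative and assumes the local repo already points at the single target RHVH build, as in the test steps from comment 5:

import subprocess
import time

def time_yum_update():
    # Returns the wall-clock duration of the whole yum transaction.
    start = time.monotonic()
    subprocess.run(["yum", "update", "-y"], check=True)
    return time.monotonic() - start

# Run once on a host upgrading to 20170522.0 and once on an identically
# installed host upgrading to 20170609.0, then compare the two durations.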

Comment 7 Huijuan Zhao 2017-06-13 16:02:24 UTC
Ryan, I have already sent the info for two test environments (test results 2 and 3 in comment 5) to you via email.


I tested the upgrade time from rhvh-4.0-0.20170307.0 to rhvh-4.1-20170522.0:

1. On one local disk, it is almost the same as upgrading to rhvh-4.1-20170609.0: about 5 minutes.

2. I will test the upgrade time on multiple disks from rhvh-4.0-0.20170307.0 to rhvh-4.1-20170522.0 tomorrow, after you finish your checks in the multiple-disk environment.

But according to the test results in comment 5, the upgrade time is longer with more disks, so no matter what the upgrade time was in the old builds, could we accept the current upgrade time on multiple disks? If yes, I think we can verify this bug.

Comment 8 Ryan Barry 2017-06-13 20:10:50 UTC
Yeah, the upgrade time with more disks is ok. That's not a new issue.

Was the upgrade to 20170522.0 on the same system?

We added a lot of stuff in 4.1.3 (NIST partitioning, selinux/semodule fixes back in) which takes time, but I'd still hope it would be less. 

In testing, the update time on my system dropped from 10 min to 4 min, but that's with all of those patches in rather than 4.0 to 4.1.2.

I'm still curious whether it's faster on the same system with a single disk.

Either way, we can VERIFY the bug, since the upgrade time is now low enough to fit into the engine's window with everything added back in.

Comment 9 Huijuan Zhao 2017-06-14 03:32:28 UTC
(In reply to Ryan Barry from comment #6)

> For verification, please compare the time taken to upgrade from an earlier
> version.
> 
> Hypothetically:
> 
> Install rhvh-4.0-0.20170307.0
> Upgrade to rhvh-4.1-20170522.0
> 
> Compare the time to:
> 
> Install rhvh-4.0-0.20170307.0
> Upgrade to rhvh-4.1-20170609.0
> 

Tested on the same two machines as in comment 5:
Install rhvh-4.0-0.20170307.0
Upgrade to rhvh-4.1-20170522.0

Test results:
1. Tested twice with only 1 disk (1TB SATA): the upgrade time is 6 to 7 minutes.
2. Tested with 3 FC disks (300GB, 150GB, 150GB): the upgrade time is 11 minutes.

So the upgrade time is a bit shorter when upgrading to rhvh-4.1-20170609.0.


According to the above test results, comment 5, and comment 8, I will change the status to VERIFIED.

