Bug 1724483

Summary: master0 server installation takes a long time
Product: Red Hat Enterprise Virtualization Manager
Reporter: David Vaanunu <dvaanunu>
Component: ovirt-engine-metrics
Assignee: Rich Megginson <rmeggins>
Status: CLOSED NEXTRELEASE
QA Contact: Guilherme Santos <gdeolive>
Severity: medium
Docs Contact:
Priority: medium
Version: unspecified
CC: bugs, lleistne, rmeggins, sradco
Target Milestone: ovirt-4.3.11
Target Release: 4.3.11
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-07-16 13:54:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Metrics
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1827177
Bug Blocks:

Description David Vaanunu 2019-06-27 07:40:52 UTC
Description of problem:

The installation of master0 takes a long time, about 2.5 hours (2:28:48 in the run shown under "Additional info").
Can it be made faster?

How reproducible:


Steps to Reproduce:
1. Run installation
2. Review the output



Actual results:


Expected results:


Additional info:

Time breakdown below:

INSTALLER STATUS ***************************************************************************************************************************
Health Check                : Complete (0:03:48)
Node Bootstrap Preparation  : Complete (0:56:13)
etcd Install                : Complete (0:02:18)
Master Install              : Complete (0:13:23)
Master Additional Install   : Complete (0:24:06)
Node Join                   : Complete (0:00:22)
Hosted Install              : Complete (0:01:20)
Web Console Install         : Complete (0:00:49)
Console Install             : Complete (0:00:47)
metrics-server Install      : Complete (0:00:02)
Initialization              : Complete (0:00:21)
Logging Install             : Complete (0:05:26)
Monday 24 June 2019  07:51:52 -0400 (0:00:00.987)       2:28:48.018 *********** 
=============================================================================== 
openshift_node : install needed rpm(s) ------------------------------------------------------------------------------------------- 2565.93s
cockpit : Install cockpit-ws ----------------------------------------------------------------------------------------------------- 1393.28s
container_runtime : Install Docker ----------------------------------------------------------------------------------------------- 1293.17s
openshift_node : Install node, clients, and conntrack packages -------------------------------------------------------------------- 456.32s
Ensure openshift-ansible installer package deps are installed --------------------------------------------------------------------- 406.75s
Run health checks (install) - EL -------------------------------------------------------------------------------------------------- 227.81s
os_firewall : Install firewalld packages ------------------------------------------------------------------------------------------ 219.24s
openshift_ca : Install the base package for admin tooling ------------------------------------------------------------------------- 203.53s
openshift_node : Install dnsmasq -------------------------------------------------------------------------------------------------- 184.69s
openshift_cli : Install clients --------------------------------------------------------------------------------------------------- 145.77s
openshift_control_plane : Wait for control plane pods to appear -------------------------------------------------------------------- 68.78s
openshift_repos : Disable all repositories ----------------------------------------------------------------------------------------- 45.29s
openshift_control_plane : Wait for all control plane pods to become ready ---------------------------------------------------------- 44.44s
openshift_repos : Enable RHEL repositories ----------------------------------------------------------------------------------------- 40.92s
openshift_cli : Install bash completion for oc tools ------------------------------------------------------------------------------- 37.05s
etcd : Install openssl ------------------------------------------------------------------------------------------------------------- 37.00s
nickhammond.logrotate : nickhammond.logrotate | Install logrotate ------------------------------------------------------------------ 36.94s
etcd : Install openssl ------------------------------------------------------------------------------------------------------------- 36.83s
openshift_repos : Ensure libselinux-python is installed ---------------------------------------------------------------------------- 36.81s
openshift_node : Install NFS storage plugin dependencies --------------------------------------------------------------------------- 36.32s
[root@vm-71-139 ~]#
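
As a rough cross-check of the breakdown above, the phase lines can be parsed and ranked by duration. This is only a sketch: the regular expression and the helper names `parse_phases` and `fmt` are illustrative and not part of openshift-ansible; the input is the INSTALLER STATUS text copied from this report.

```python
import re

# Phase lines copied from the INSTALLER STATUS section of this report.
STATUS = """\
Health Check                : Complete (0:03:48)
Node Bootstrap Preparation  : Complete (0:56:13)
etcd Install                : Complete (0:02:18)
Master Install              : Complete (0:13:23)
Master Additional Install   : Complete (0:24:06)
Node Join                   : Complete (0:00:22)
Hosted Install              : Complete (0:01:20)
Web Console Install         : Complete (0:00:49)
Console Install             : Complete (0:00:47)
metrics-server Install      : Complete (0:00:02)
Initialization              : Complete (0:00:21)
Logging Install             : Complete (0:05:26)
"""

# Hypothetical pattern for "<phase> : Complete (H:MM:SS)" lines.
LINE = re.compile(
    r"^(?P<phase>.+?)\s*:\s*Complete \((?P<h>\d+):(?P<m>\d\d):(?P<s>\d\d)\)$"
)


def parse_phases(text):
    """Return a list of (phase, seconds) tuples parsed from status lines."""
    phases = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            secs = int(match["h"]) * 3600 + int(match["m"]) * 60 + int(match["s"])
            phases.append((match["phase"], secs))
    return phases


def fmt(seconds):
    """Format a whole number of seconds as H:MM:SS."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


phases = parse_phases(STATUS)
total = sum(secs for _, secs in phases)

# Slowest phase first, with its share of the summed phase time.
for phase, secs in sorted(phases, key=lambda p: p[1], reverse=True):
    print(f"{phase:<28}{fmt(secs):>8}  {100 * secs / total:5.1f}%")
print(f"{'Sum of listed phases':<28}{fmt(total):>8}")
```

On this data the listed phases add up to 1:48:55, with Node Bootstrap Preparation accounting for roughly half of that. The remaining 40 minutes or so of the 2:28:48 wall-clock total are not attributed to any listed phase.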

Comment 1 Sandro Bonazzola 2019-06-27 08:29:52 UTC
> openshift_node : install needed rpm(s) ------------------------------------------------------------------------------------------- 2565.93s

Looks like you have a very slow network here. Maybe use a local proxy?
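
To put a number on how much of the run went to the slowest tasks, the profile lines can be summed and compared with the total. A minimal sketch, assuming the five slowest entries from the description (dash runs shortened) and the reported total of 2:28:48.018; `parse_tasks` is a hypothetical helper, not an Ansible API.

```python
import re

# Five slowest tasks from the description; the dash padding is shortened here.
PROFILE = """\
openshift_node : install needed rpm(s) ----- 2565.93s
cockpit : Install cockpit-ws ----- 1393.28s
container_runtime : Install Docker ----- 1293.17s
openshift_node : Install node, clients, and conntrack packages ----- 456.32s
Ensure openshift-ansible installer package deps are installed ----- 406.75s
"""

# Total run time reported in the summary line: 2:28:48.018.
TOTAL_SECONDS = 2 * 3600 + 28 * 60 + 48.018

# Hypothetical pattern for "<task name> ---- <seconds>s" lines.
TASK = re.compile(r"^(?P<task>.+?) -+ (?P<secs>\d+\.\d+)s$")


def parse_tasks(text):
    """Return a list of (task, seconds) tuples parsed from profile lines."""
    tasks = []
    for line in text.splitlines():
        match = TASK.match(line.strip())
        if match:
            tasks.append((match["task"], float(match["secs"])))
    return tasks


tasks = parse_tasks(PROFILE)
slow_total = sum(secs for _, secs in tasks)
share = 100 * slow_total / TOTAL_SECONDS

for task, secs in tasks:
    print(f"{secs:9.2f}s  {task}")
print(f"{slow_total:9.2f}s  total, {share:.1f}% of the run")
```

These five tasks, which by their names all install packages, add up to about 6115 seconds, roughly 68% of the run. That is consistent with slow package downloads being the main cost.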

Comment 2 Sandro Bonazzola 2019-06-27 08:30:26 UTC
Rich, is there any task we can skip, like dropping the cockpit-ws install?

Comment 3 Rich Megginson 2019-06-27 15:51:53 UTC
(In reply to Sandro Bonazzola from comment #2)
> Rich, is there any task we can skip, like dropping the cockpit-ws install?

Yes, and there are probably others as well.

Comment 4 Rich Megginson 2019-06-28 01:41:43 UTC
I'm doing a local VM install of Origin 3.11 with logging.  This is what it looks like with the default parameters.  Note that the performance characteristics are _wildly_ different than the ones posted above.

PLAY RECAP *********************************************************************
localhost                  : ok=819  changed=362  unreachable=0    failed=0


INSTALLER STATUS ***************************************************************
Initialization               : Complete (0:00:08)
Health Check                 : Complete (0:00:25)
Node Bootstrap Preparation   : Complete (0:02:05)
etcd Install                 : Complete (0:00:29)
Master Install               : Complete (0:03:38)
Master Additional Install    : Complete (0:00:28)
Node Join                    : Complete (0:00:06)
Hosted Install               : Complete (0:00:37)
Cluster Monitoring Operator  : Complete (0:01:17)
Web Console Install          : Complete (0:00:34)
Console Install              : Complete (0:00:31)
metrics-server Install       : Complete (0:00:00)
Logging Install              : Complete (0:02:27)
Service Catalog Install      : Complete (0:01:47)
Thursday 27 June 2019  18:31:59 +0000 (0:00:00.047)       0:14:48.319 *********
===============================================================================
openshift_control_plane : Wait for all control plane pods to become ready -- 71.83s
/usr/share/ansible/openshift-ansible/roles/openshift_control_plane/tasks/main.yml:227
openshift_cluster_monitoring_operator : Wait for the ServiceMonitor CRD to be created -- 65.26s
/usr/share/ansible/openshift-ansible/roles/openshift_cluster_monitoring_operator/tasks/install.yaml:115
openshift_control_plane : Wait for control plane pods to appear -------- 48.89s
/usr/share/ansible/openshift-ansible/roles/openshift_control_plane/tasks/main.yml:175
openshift_node : Install node, clients, and conntrack packages --------- 36.83s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/install.yml:2 -
template_service_broker : Verify that TSB is running ------------------- 28.41s
/usr/share/ansible/openshift-ansible/roles/template_service_broker/tasks/deploy.yml:52
openshift_web_console : Verify that the console is running ------------- 26.99s
/usr/share/ansible/openshift-ansible/roles/openshift_web_console/tasks/start.yml:2
Run health checks (install) - EL --------------------------------------- 25.29s
/usr/share/ansible/openshift-ansible/playbooks/openshift-checks/private/install.yml:24
openshift_console : Waiting for console rollout to complete ------------ 24.15s
/usr/share/ansible/openshift-ansible/roles/openshift_console/tasks/start.yml:2
openshift_node : Install Ceph storage plugin dependencies -------------- 23.06s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/storage_plugins/ceph.yml:2
openshift_ca : Install the base package for admin tooling -------------- 14.93s
/usr/share/ansible/openshift-ansible/roles/openshift_ca/tasks/main.yml:6 ------
openshift_service_catalog : oc_process --------------------------------- 13.93s
/usr/share/ansible/openshift-ansible/roles/openshift_service_catalog/tasks/install.yml:44
openshift_logging : Annotate Operations Projects for data prefix ------- 12.00s
/usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/annotate_ops_projects.yaml:24
openshift_node_group : Wait for the sync daemonset to become ready and available -- 10.95s
/usr/share/ansible/openshift-ansible/roles/openshift_node_group/tasks/sync.yml:65
openshift_hosted : Create OpenShift router ------------------------------ 8.44s
/usr/share/ansible/openshift-ansible/roles/openshift_hosted/tasks/router.yml:85
openshift_node : install needed rpm(s) ---------------------------------- 8.15s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/install_rpms.yml:2
etcd : Install etcd ----------------------------------------------------- 7.62s
/usr/share/ansible/openshift-ansible/roles/etcd/tasks/certificates/fetch_server_certificates_from_ca.yml:2
openshift_node : Install iSCSI storage plugin dependencies -------------- 7.56s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/storage_plugins/iscsi.yml:2
openshift_service_catalog : Set Controller Manager service -------------- 6.45s
/usr/share/ansible/openshift-ansible/roles/openshift_service_catalog/tasks/install.yml:177
openshift_node : Install GlusterFS storage plugin dependencies ---------- 6.36s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/glusterfs.yml:2
openshift_manageiq : Configure role/user permissions -------------------- 6.29s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:45


Then, I ran it again, but added the following parameters to the [OSEv3:vars] section of the ansible inventory file:

# optional components disabled here for a fast install
openshift_install_examples=False
openshift_cluster_monitoring_operator_install=False
openshift_enable_service_catalog=False
template_service_broker_install=False
osm_use_cockpit=False
openshift_hosted_registry_selector='useregistry=false'
openshift_hosted_manage_registry=False
openshift_hosted_manage_registry_console=False
openshift_metrics_install_metrics=False
osn_storage_plugin_deps=[]
openshift_use_manageiq=False

I don't know what all of these do, but it seems that RHV either doesn't use them or doesn't need them.  Of course, this would have to be fully QE tested to make sure it works for the RHV use case.
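
For repeatable testing, adding these variables to an inventory could be scripted. The following is a minimal sketch, not part of openshift-ansible: `add_fast_install_vars` and the sample inventory are hypothetical, and the function assumes a plain INI-style inventory with an `[OSEv3:vars]` section. Variables already set in that section are left untouched.

```python
# Variables listed in this comment for a faster install.
FAST_INSTALL_VARS = [
    "openshift_install_examples=False",
    "openshift_cluster_monitoring_operator_install=False",
    "openshift_enable_service_catalog=False",
    "template_service_broker_install=False",
    "osm_use_cockpit=False",
    "openshift_hosted_registry_selector='useregistry=false'",
    "openshift_hosted_manage_registry=False",
    "openshift_hosted_manage_registry_console=False",
    "openshift_metrics_install_metrics=False",
    "osn_storage_plugin_deps=[]",
    "openshift_use_manageiq=False",
]

HEADER = "[OSEv3:vars]"


def add_fast_install_vars(inventory_text, new_vars=FAST_INSTALL_VARS):
    """Return inventory_text with new_vars appended to the [OSEv3:vars] section."""
    lines = inventory_text.splitlines()
    start = next((i for i, line in enumerate(lines) if line.strip() == HEADER), None)
    if start is None:
        raise ValueError(f"{HEADER} section not found")

    # The section runs until the next "[...]" header or the end of the file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].lstrip().startswith("[")),
        len(lines),
    )

    # Keys already set in the section are kept as they are.
    existing = {
        line.split("=", 1)[0].strip()
        for line in lines[start + 1:end]
        if "=" in line and not line.lstrip().startswith("#")
    }
    additions = [v for v in new_vars if v.split("=", 1)[0] not in existing]

    # Insert before any blank lines that trail the section.
    insert_at = end
    while insert_at > start + 1 and not lines[insert_at - 1].strip():
        insert_at -= 1
    return "\n".join(lines[:insert_at] + additions + lines[insert_at:]) + "\n"


# Hypothetical single-host inventory, for illustration only.
SAMPLE = """\
[OSEv3:children]
masters
nodes

[OSEv3:vars]
ansible_connection=local

[masters]
localhost
"""

updated = add_fast_install_vars(SAMPLE)
print(updated)
```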

This is the result:

PLAY RECAP *********************************************************************
localhost                  : ok=660  changed=270  unreachable=0    failed=0


INSTALLER STATUS ***************************************************************
Initialization              : Complete (0:00:08)
Health Check                : Complete (0:00:23)
Node Bootstrap Preparation  : Complete (0:01:33)
etcd Install                : Complete (0:00:28)
Master Install              : Complete (0:03:17)
Master Additional Install   : Complete (0:00:09)
Node Join                   : Complete (0:00:06)
Hosted Install              : Complete (0:00:20)
Web Console Install         : Complete (0:00:40)
Console Install             : Complete (0:00:22)
metrics-server Install      : Complete (0:00:01)
Logging Install             : Complete (0:02:08)
Friday 28 June 2019  00:33:54 +0000 (0:00:00.034)       0:09:49.161 ***********
===============================================================================
openshift_control_plane : Wait for all control plane pods to become ready -- 55.32s
/usr/share/ansible/openshift-ansible/roles/openshift_control_plane/tasks/main.yml:227
openshift_control_plane : Wait for control plane pods to appear -------- 48.49s
/usr/share/ansible/openshift-ansible/roles/openshift_control_plane/tasks/main.yml:175
openshift_web_console : Verify that the console is running ------------- 31.88s
/usr/share/ansible/openshift-ansible/roles/openshift_web_console/tasks/start.yml:2
openshift_node : Install node, clients, and conntrack packages --------- 31.23s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/install.yml:2 -
openshift_node : install needed rpm(s) --------------------------------- 25.10s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/install_rpms.yml:2
Run health checks (install) - EL --------------------------------------- 23.25s
/usr/share/ansible/openshift-ansible/playbooks/openshift-checks/private/install.yml:24
openshift_console : Waiting for console rollout to complete ------------ 16.20s
/usr/share/ansible/openshift-ansible/roles/openshift_console/tasks/start.yml:2
openshift_ca : Install the base package for admin tooling -------------- 13.49s
/usr/share/ansible/openshift-ansible/roles/openshift_ca/tasks/main.yml:6 ------
openshift_logging : Annotate Operations Projects for data prefix ------- 12.22s
/usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/annotate_ops_projects.yaml:24
openshift_node_group : Wait for the sync daemonset to become ready and available -- 10.90s
/usr/share/ansible/openshift-ansible/roles/openshift_node_group/tasks/sync.yml:65
openshift_hosted : Create OpenShift router ------------------------------ 8.07s
/usr/share/ansible/openshift-ansible/roles/openshift_hosted/tasks/router.yml:85
etcd : Install etcd ----------------------------------------------------- 6.91s
/usr/share/ansible/openshift-ansible/roles/etcd/tasks/certificates/fetch_server_certificates_from_ca.yml:2
openshift_control_plane : Wait for APIs to become available ------------- 4.18s
/usr/share/ansible/openshift-ansible/roles/openshift_control_plane/tasks/check_master_api_is_ready.yml:2
openshift_logging : Run JKS generation script --------------------------- 4.04s
/usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_jks.yaml:112
openshift_excluder : Install openshift excluder - yum ------------------- 3.61s
/usr/share/ansible/openshift-ansible/roles/openshift_excluder/tasks/install.yml:34
openshift_hosted : Create default projects ------------------------------ 3.57s
/usr/share/ansible/openshift-ansible/roles/openshift_hosted/tasks/create_projects.yml:2
Approve node certificates when bootstrapping ---------------------------- 3.08s
/usr/share/ansible/openshift-ansible/playbooks/openshift-node/private/join.yml:40
openshift_logging : Gather OpenShift Logging Facts ---------------------- 2.98s
/usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/install_logging.yaml:2
openshift_excluder : Install docker excluder - yum ---------------------- 2.74s
/usr/share/ansible/openshift-ansible/roles/openshift_excluder/tasks/install.yml:9
openshift_cli : Install clients ----------------------------------------- 2.35s
/usr/share/ansible/openshift-ansible/roles/openshift_cli/tasks/main.yml:2 -----

The total time went down from 14+ minutes (0:14:48) to 9+ minutes (0:09:49).

Some of the long-duration tasks got shorter, e.g. "openshift_control_plane : Wait for all control plane pods to become ready" came down from 71.83 seconds to 55.32 seconds.  And some of the long-running tasks, such as "openshift_cluster_monitoring_operator : Wait for the ServiceMonitor CRD to be created" (65.26 seconds), are now gone.  But it didn't really make a difference in the task "openshift_control_plane : Wait for control plane pods to appear" (48.89 seconds before, 48.49 seconds after).
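
The comparison above can be reproduced from the two profiles. A minimal sketch, with the totals and a few task timings copied by hand from the default and trimmed runs in this comment; the variable names are illustrative only.

```python
# Totals from the "0:14:48.319" and "0:09:49.161" summary lines, in seconds.
DEFAULT_TOTAL = 14 * 60 + 48.319
TRIMMED_TOTAL = 9 * 60 + 49.161

# (default run, trimmed run) durations in seconds for tasks in both profiles.
TASKS = {
    "openshift_control_plane : Wait for all control plane pods to become ready": (71.83, 55.32),
    "openshift_control_plane : Wait for control plane pods to appear": (48.89, 48.49),
    "openshift_node : Install node, clients, and conntrack packages": (36.83, 31.23),
    "openshift_web_console : Verify that the console is running": (26.99, 31.88),
    "openshift_node : install needed rpm(s)": (8.15, 25.10),
}

saved = DEFAULT_TOTAL - TRIMMED_TOTAL
percent_saved = 100 * saved / DEFAULT_TOTAL
print(
    f"total: {DEFAULT_TOTAL:.1f}s -> {TRIMMED_TOTAL:.1f}s "
    f"(saved {saved:.1f}s, {percent_saved:.1f}%)"
)

# Per-task change, largest improvement first (negative means faster).
for task, (before, after) in sorted(TASKS.items(), key=lambda kv: kv[1][1] - kv[1][0]):
    print(f"{after - before:+7.2f}s  {task}")
```

The trimmed run saves about 299 seconds, roughly a third of the default run. Note that "openshift_node : install needed rpm(s)" went up from 8.15s to 25.10s between the two runs, so some of the per-task differences may just be run-to-run noise.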