+++ This bug is an upstream to downstream clone. The original bug is: +++
+++ bug 1378087 +++
======================================================================

Description of problem:
Adding an additional host to a Hosted Engine setup on a RHEL7.3 + RHV4.0.3 environment leaves the host in installed-failed state. The issue occurs because ovirt-host-deploy tries to set the rhs-virtualization tuned profile, which is not present on the system; the command gets stuck there and times out after 10 minutes. Up to RHEL7.2 this problem is not seen, because when tuned fails to set a non-existent profile it returns immediately and does not wait for the timeout.

Version-Release number of selected component (if applicable):
RHEL7.3
ovirt-engine-4.0.3-0.1.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Have three hosts with glusterfs as the storage domain.
2. Deploy hosted engine on the first host.
3. Try adding the second host from the UI.

Actual results:
Adding the second host gets stuck at updating the hosted-engine configuration and fails with "Failed to install Host <host_name>. Processing stopped due to timeout."

Expected results:
The second host should be added successfully.
Additional info:
I strongly suspect that the issue is due to this:

2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd systemd.state:130 starting service tuned
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'start', 'tuned.service'), executable='None', cwd='None', env=None
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start', 'tuned.service'), rc=0
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/bin/systemctl', 'start', 'tuned.service') stdout:
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/bin/systemctl', 'start', 'tuned.service') stderr:
2016-09-20 20:05:01 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.executeRaw:813 execute: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization'), executable='None', cwd='None', env=None
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.executeRaw:863 execute-result: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization'), rc=1
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.execute:921 execute-output: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization') stdout:
Operation timed out after waiting 600 seconds(s), you may try to increase timeout by using --timeout command line option or using --async.
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.execute:926 execute-output: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization') stderr:
Requested profile 'rhs-virtualization' doesn't exist.
2016-09-20 20:15:02 WARNING otopi.plugins.ovirt_host_deploy.tune.tuned tuned._misc:105 Cannot set tuned profile
2016-09-20 20:15:02 DEBUG otopi.context context._executeMethod:128 Stage misc METHOD otopi.plugins.ovirt_host_deploy.vdsm.bridge.Plugin._misc
2016-09-20 20:15:02 DEBUG otopi.context context._executeMethod:134 condition False

(Originally by Kasturi Narra)
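To make the delay in the log excerpt explicit, the gap between the `execute` and `execute-result` entries for tuned-adm can be computed from their timestamps. This is only a small illustration; the helper name `blocked_seconds` is made up for this example and is not part of otopi or ovirt-host-deploy.

```python
from datetime import datetime

# Timestamp layout used at the start of each host-deploy log line.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def blocked_seconds(start_stamp, end_stamp):
    """Return the whole seconds elapsed between two log timestamps."""
    start = datetime.strptime(start_stamp, LOG_TIME_FORMAT)
    end = datetime.strptime(end_stamp, LOG_TIME_FORMAT)
    return int((end - start).total_seconds())


# Timestamps of the tuned-adm 'execute' and 'execute-result' log entries.
print(blocked_seconds("2016-09-20 20:05:01", "2016-09-20 20:15:02"))  # 601
```

The result, 601 seconds, is consistent with the 600-second wait reported by tuned-adm plus about one second of overhead.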
This issue is not really about deploying a second host in a Hosted Engine setup. It applies to all standard 'Add Host' flows to a Cluster with Gluster Service enabled. When host-deploy tries to set a non-existent tuned profile on a RHEL-7.3 system, it is blocked for 600 seconds by default. As a result, the engine just times out.

Root cause: In RHEL-7.3 the /sbin/tuned-adm command has a -t option to specify the timeout value, and the default value for this option is 600 seconds. But host-deploy calls this command without the timeout option. As a result, it is blocked for 10 minutes.

Fix: We should update host-deploy to use a reasonable timeout value so that the engine will not time out.

Note: This problem should not happen if we set a valid tuned profile at the cluster level.

(Originally by Ramesh Nachimuthu)
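One way to picture the proposed fix is a caller-side timeout around the external command, which does not depend on whether the installed tuned-adm understands -t. The sketch below is not the actual ovirt-host-deploy change; `run_bounded` is a hypothetical helper, and the short sleep only stands in for a command that hangs.

```python
import subprocess
import sys


def run_bounded(argv, timeout_seconds):
    """Run argv and give up after timeout_seconds.

    Returns (returncode, stdout, stderr), or None if the command did not
    finish in time. subprocess.run kills the child before raising
    TimeoutExpired, so no process is left behind.
    """
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode, completed.stdout, completed.stderr


# Stand-in for a command that blocks; a deployer would pass something like
# ['/sbin/tuned-adm', 'profile', name] instead (illustrative only).
hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_bounded(hanging, timeout_seconds=1))  # None
```

A `None` result would then be handled like any other failure to set the profile: log a warning and continue, rather than letting the engine-side timeout expire.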
tuned-adm from 7.2 doesn't provide the -t option but, on the other hand, it doesn't hang for 10 minutes if the profile is missing.

[stirabos@c72he20160830h1 ~]$ time sudo /sbin/tuned-adm profile rhs-virtualization
Requested profile 'rhs-virtualization' doesn't exist.

real 0m0.442s
user 0m0.046s
sys 0m0.019s

(Originally by Simone Tiraboschi)
I think it also affects fedora hosts for the same reason. (Originally by Simone Tiraboschi)
(In reply to Simone Tiraboschi from comment #2)
> tuned-adm from 7.2 doesn't provide -t option but, on the other side, it
> doesn't hang for 10 minutes if the profile is missing.
>

The -t option is introduced in RHEL7.3. I think we have to handle both systems that support this option and systems that do not.

> [stirabos@c72he20160830h1 ~]$ time sudo /sbin/tuned-adm profile
> rhs-virtualization
> Requested profile 'rhs-virtualization' doesn't exist.
>
> real 0m0.442s
> user 0m0.046s
> sys 0m0.019s

(Originally by Ramesh Nachimuthu)
This seems to be a regression in tuned. See bz#1369502. (Originally by Ramesh Nachimuthu)
How come the profile doesn't exist? What do we expect to happen if a non-existing profile is trying to be set? Does it matter if we fail quickly or time-out? We should set a profile, and it should exist. (Originally by Yaniv Kaul)
(In reply to Yaniv Kaul from comment #6)
> How come the profile doesn't exist? What do we expect to happen if a
> non-existing profile is trying to be set? Does it matter if we fail quickly
> or time-out? We should set a profile, and it should exist.

We have some RHEL-6 specific profiles for gluster which are not available in RHEL-7. As a result, we may end up with a non-existent profile. When a non-existent profile is set, host-deploy reports a warning but the host comes up without any issue. Because of this regression, however, the host installation now fails.

Considering that oVirt doesn't support RHEL-6 nodes, let me remove all RHEL-6 specific profiles and change the default profile to a valid RHEL-7 profile.

(Originally by Ramesh Nachimuthu)
Is this going to 4.0.5? (Originally by Yaniv Kaul)
RHEL-6 profiles are already removed. But we will still have invalid tuned profiles in the following cases.

1. Enabling 'Gluster Service' (Edit Cluster) using the REST API. The REST API is not yet updated to support the tuned profile, so no profile will be set in this case.
2. No tuned profile set in the 'Edit Cluster' dialog. This happens only for the 'Default' cluster. While editing the Cluster to enable 'Gluster Service', no tuned profile is selected by default; the user has to select a profile explicitly. For other clusters (both Edit and New), 'rhgs-sequential-io' is selected by default whenever 'Gluster Service' is enabled.
3. Both tuned profiles, 'rhgs-random-io' and 'rhgs-sequential-io', are not available in upstream gluster releases.

Workaround: The following workarounds can be used until this is fixed.

1. The user has to always ensure that a correct profile is set on the Cluster.
2. If there is no profile set on the cluster, or the selected profile is not available, the user has to create a dummy tuned profile on the hosts with the name 'rhs-virtualization' or the selected profile name.

Steps to create a dummy tuned profile:

1. Create a folder under "/usr/lib/tuned" with your profile name.
# mkdir /usr/lib/tuned/<profile-name>
2. Create an empty file named 'tuned.conf' under the above folder.
# touch /usr/lib/tuned/<profile-name>/tuned.conf

(Originally by Ramesh Nachimuthu)
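The workaround steps above can be sketched in Python as a pre-check plus a dummy-profile creator. This is only an illustration: `profile_exists` and `create_dummy_profile` are hypothetical helpers, and the sketch assumes, as the workaround implies, that a profile counts as present when its folder contains a 'tuned.conf' file. The demo runs against a temporary directory rather than /usr/lib/tuned.

```python
import os
import tempfile


def profile_exists(name, search_dirs=("/usr/lib/tuned",)):
    """Return True if any search dir holds <name>/tuned.conf."""
    return any(
        os.path.isfile(os.path.join(base, name, "tuned.conf"))
        for base in search_dirs
    )


def create_dummy_profile(name, base_dir="/usr/lib/tuned"):
    """Create an empty profile, mirroring the mkdir + touch steps."""
    profile_dir = os.path.join(base_dir, name)
    os.makedirs(profile_dir, exist_ok=True)
    conf_path = os.path.join(profile_dir, "tuned.conf")
    # Append mode creates the file if missing and leaves existing
    # content untouched, like 'touch'.
    with open(conf_path, "a"):
        pass
    return conf_path


# Demo in a scratch directory so nothing on the real system is modified.
with tempfile.TemporaryDirectory() as scratch:
    print(profile_exists("rhs-virtualization", (scratch,)))  # False
    create_dummy_profile("rhs-virtualization", scratch)
    print(profile_exists("rhs-virtualization", (scratch,)))  # True
```

On a real host the functions would be called with their default base directory and would need root privileges, just like the shell commands.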
(In reply to Yaniv Kaul from comment #8)
> Is this going to 4.0.5?

Considering that we still won't have a proper fix that works on all distros, and given the workaround in comment#9, I would like to move this bug to the 4.0.6 release.

(Originally by Ramesh Nachimuthu)
We missed including it in 4.0.6. But RHEL bug bz#1392942 is fixed, so ideally we should not hit this issue anymore in RHEL-7.3. (Originally by Ramesh Nachimuthu)
4.0.6 has been the last oVirt 4.0 release, please re-target this bug. (Originally by Sandro Bonazzola)
(In reply to Sandro Bonazzola from comment #12)
> 4.0.6 has been the last oVirt 4.0 release, please re-target this bug.

This is already fixed in ovirt-host-deploy-1.5. So this can be moved to ON_QA.

(Originally by Ramesh Nachimuthu)
Verified and works fine with ovirt-host-deploy-1.5.4-2.el7ev.noarch.

With this fix on a gluster cluster, when there is no tuned profile selected on the cluster, ovirt-host-deploy does not set anything and the host gets added successfully to the cluster.

When a tuned profile that does not exist on the system is selected on the cluster, ovirt-host-deploy tries to set it and fails, which triggers an event message saying "Host host1 installation in progress . Cannot set tuned profile." and the host gets added successfully.

Once the host is added successfully, I do see an error while executing tuned-adm list on the host. Below is the error I see; any idea why this happens?

[root@dhcp37-169 ~]# tuned-adm list
2017-03-07 18:27:00,715 ERROR dbus.proxies: Introspect error on :1.19:/Tuned: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
ERROR:dbus.proxies:Introspect error on :1.19:/Tuned: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
DBus call to Tuned daemon failed
Available profiles:
- balanced - General non-specialized tuned profile
- desktop - Optmize for the desktop use-case
- latency-performance - Optimize for deterministic performance at the cost of increased power consumption
- network-latency - Optimize for deterministic performance at the cost of increased power consumption, focused on low latency network performance
- network-throughput - Optimize for streaming network throughput. Generally only necessary on older CPUs or 40G+ networks.
- powersave - Optimize for low power consumption
- throughput-performance - Broadly applicable tuning that provides excellent performance across a variety of common server workloads. This is the default profile for RHEL7.
- virtual-guest - Optimize for running inside a virtual guest.
- virtual-host - Optimize for running KVM guests
Current active profile: virtual-guest
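The behaviour verified above (skip when no profile is selected, warn and continue when setting it fails) can be summarised as a small decision function. This is a sketch of the observed behaviour only, not the actual ovirt-host-deploy code; `apply_tuned_profile` and the `set_profile` callback are hypothetical names introduced for illustration.

```python
def apply_tuned_profile(profile, set_profile):
    """Decide what to do with the cluster's tuned profile.

    profile: profile name selected on the cluster, or None/'' if unset.
    set_profile: callable taking the profile name and returning True on
        success, False on failure (hypothetical hook for this sketch).

    Returns 'skipped', 'applied', or 'warning'. None of the outcomes is
    fatal, so host installation continues in every case.
    """
    if not profile:
        return "skipped"
    if set_profile(profile):
        return "applied"
    return "warning"


# Stand-in that pretends only 'virtual-host' exists on the system.
def fake_setter(name):
    return name == "virtual-host"


print(apply_tuned_profile(None, fake_setter))                  # skipped
print(apply_tuned_profile("virtual-host", fake_setter))        # applied
print(apply_tuned_profile("rhs-virtualization", fake_setter))  # warning
```

The 'warning' outcome corresponds to the "Cannot set tuned profile" event message, after which the host is still added to the cluster.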
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0550.html
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days