Description of problem:
Adding an additional host to Hosted Engine on a RHEL7.3 + RHV4.0.3 environment leaves the host in installed-failed state. The issue occurs because ovirt-host-deploy tries to set the rhs-virtualization tuned profile on the host. That profile is not present on the system, so the call gets stuck and times out after 10 minutes. Up to RHEL7.2 this problem is not seen, because when tuned fails to set a non-existing profile it returns immediately instead of waiting for a timeout.

Version-Release number of selected component (if applicable):
RHEL7.3
ovirt-engine-4.0.3-0.1.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Have three hosts with glusterfs as the storage domain.
2. Deploy hosted engine on the first host.
3. Try adding the second host from the UI.

Actual results:
Adding the second host gets stuck at updating the hosted-engine configuration and fails with "Failed to install Host <host_name>. Processing stopped due to timeout."

Expected results:
The second host should be added successfully.

Additional info:
I strongly suspect that the issue is due to this:

2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd systemd.state:130 starting service tuned
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'start', 'tuned.service'), executable='None', cwd='None', env=None
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start', 'tuned.service'), rc=0
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/bin/systemctl', 'start', 'tuned.service') stdout:
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/bin/systemctl', 'start', 'tuned.service') stderr:
2016-09-20 20:05:01 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.executeRaw:813 execute: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization'), executable='None', cwd='None', env=None
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.executeRaw:863 execute-result: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization'), rc=1
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.execute:921 execute-output: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization') stdout:
Operation timed out after waiting 600 seconds(s), you may try to increase timeout by using --timeout command line option or using --async.
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.execute:926 execute-output: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization') stderr:
Requested profile 'rhs-virtualization' doesn't exist.
2016-09-20 20:15:02 WARNING otopi.plugins.ovirt_host_deploy.tune.tuned tuned._misc:105 Cannot set tuned profile
2016-09-20 20:15:02 DEBUG otopi.context context._executeMethod:128 Stage misc METHOD otopi.plugins.ovirt_host_deploy.vdsm.bridge.Plugin._misc
2016-09-20 20:15:02 DEBUG otopi.context context._executeMethod:134 condition False
This issue is not really about deploying a second host in a Hosted Engine setup. It applies to all standard 'Add Host' flows to a Cluster with Gluster Service enabled. When host-deploy tries to set a non-existent tuned profile on a RHEL-7.3 system, it is blocked for 600 seconds by default. As a result, the engine just times out.

Root cause: In RHEL-7.3 the /sbin/tuned-adm command has a -t option to specify the timeout value, and the default value for this option is 600 seconds. But host-deploy calls this command without the timeout option. As a result, it is blocked for 10 minutes.

Fix: We should update host-deploy to use a reasonable timeout value so that the engine will not time out.

Note: This problem should not happen if we set a valid tuned profile at the cluster level.
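
The "reasonable timeout" idea can be illustrated with a caller-side limit that does not depend on any tuned-adm option. This is only a minimal sketch, not the actual ovirt-host-deploy code: the helper name `run_with_timeout` and the 30-second default are assumptions made for illustration.

```python
import subprocess


def run_with_timeout(cmd, timeout_seconds=30):
    """Run cmd and return (rc, stdout, stderr); rc is None on timeout.

    Hypothetical helper for illustration; the timeout value is arbitrary.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing
        # is left blocking the caller here.
        return None, "", ""
    return proc.returncode, proc.stdout, proc.stderr
```

A caller could pass `['/sbin/tuned-adm', 'profile', name]` and treat `rc is None` the same way as a non-zero return code: log a warning and continue with the installation.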
tuned-adm from 7.2 doesn't provide -t option but, on the other side, it doesn't hang for 10 minutes if the profile is missing.

[stirabos@c72he20160830h1 ~]$ time sudo /sbin/tuned-adm profile rhs-virtualization
Requested profile 'rhs-virtualization' doesn't exist.

real 0m0.442s
user 0m0.046s
sys 0m0.019s
I think it also affects Fedora hosts for the same reason.
(In reply to Simone Tiraboschi from comment #2)
> tuned-adm from 7.2 doesn't provide -t option but, on the other side, it
> doesn't hang for 10 minutes if the profile is missing.
>

The -t option was introduced in RHEL7.3. I think we have to handle both kinds of systems: those that support this option and those that don't.

> [stirabos@c72he20160830h1 ~]$ time sudo /sbin/tuned-adm profile
> rhs-virtualization
> Requested profile 'rhs-virtualization' doesn't exist.
>
> real 0m0.442s
> user 0m0.046s
> sys 0m0.019s
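
One possible way to handle both kinds of systems is to inspect the help output of tuned-adm and add the timeout option only when it is advertised. The sketch below is an illustration under assumptions: the function names are hypothetical, `help_text` is expected to be the captured output of `tuned-adm --help`, the option is assumed to be accepted before the `profile` subcommand, and the 10-second value is arbitrary.

```python
def supports_timeout(help_text):
    """Return True if the given `tuned-adm --help` output mentions --timeout."""
    return "--timeout" in help_text


def build_profile_command(profile, help_text, timeout_seconds=10):
    """Build the argv list for setting a tuned profile.

    Assumption: --timeout is a global option placed before the subcommand.
    """
    cmd = ["/sbin/tuned-adm"]
    if supports_timeout(help_text):
        cmd += ["--timeout", str(timeout_seconds)]
    cmd += ["profile", profile]
    return cmd
```

On a system whose help output lacks `--timeout`, the resulting command is the same one host-deploy runs in the log above, so older hosts keep their current behaviour.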
This seems to be a regression in tuned. See bz#1369502.
How come the profile doesn't exist? What do we expect to happen if a non-existing profile is trying to be set? Does it matter if we fail quickly or time-out? We should set a profile, and it should exist.
(In reply to Yaniv Kaul from comment #6)
> How come the profile doesn't exist? What do we expect to happen if a
> non-existing profile is trying to be set? Does it matter if we fail quickly
> or time-out? We should set a profile, and it should exist.

We have some RHEL-6 specific profiles for gluster which are not available in RHEL-7. As a result, we may end up with a non-existent profile. When a non-existent profile is set, host-deploy reports a warning but the host comes up without any issue. But now, because of this regression, the host installation fails.

Considering that oVirt doesn't support RHEL-6 nodes, let me remove all RHEL-6 specific profiles and change the default profile to a valid RHEL-7 profile.
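
As a side illustration of "it should exist": a profile's presence can be checked on disk before attempting to set it. This is a hedged sketch, not what host-deploy does. The function name is hypothetical, and it assumes that profiles live as `<dir>/<profile>/tuned.conf` under /etc/tuned or /usr/lib/tuned.

```python
import os

# Assumed default profile locations for this sketch.
DEFAULT_PROFILE_DIRS = ("/etc/tuned", "/usr/lib/tuned")


def profile_exists(profile, profile_dirs=DEFAULT_PROFILE_DIRS):
    """Return True if <dir>/<profile>/tuned.conf exists in any profile dir."""
    if not profile:
        # No profile selected: nothing to look up.
        return False
    return any(
        os.path.isfile(os.path.join(base, profile, "tuned.conf"))
        for base in profile_dirs
    )
```

A caller could skip the tuned-adm call and emit the existing "Cannot set tuned profile" warning when `profile_exists(name)` is false.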
Is this going to 4.0.5?
RHEL-6 profiles are already removed. But we will still have invalid tuned profiles in the following cases:

1. Enable 'Gluster Service' (Edit Cluster) using the REST API. The REST API is not yet updated to support tuned profiles, so no profile is set in this case.
2. No tuned profile set in the 'Edit Cluster' dialog. This happens only for the 'Default' cluster. While editing the cluster to enable 'Gluster Service', no tuned profile is selected by default; the user has to select a profile explicitly. But for other clusters (both Edit and New), whenever we enable 'Gluster Service', 'rhgs-sequential-io' is selected by default.
3. Both tuned profiles, 'rhgs-random-io' and 'rhgs-sequential-io', are not available in upstream gluster releases.

Workaround: The following workarounds can be used until this is fixed.

1. The user has to always ensure that a correct profile is set on the cluster.
2. If there is no profile set on the cluster, or the selected profile is not available, the user has to create a dummy tuned profile on the hosts, named 'rhs-virtualization' or after the selected profile.

Steps to create a dummy tuned profile:

1. Create a folder under "/usr/lib/tuned" with your profile name.
   # mkdir /usr/lib/tuned/<profile-name>
2. Create an empty file named 'tuned.conf' under the above folder.
   # touch /usr/lib/tuned/<profile-name>/tuned.conf
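
The dummy-profile steps above can also be scripted, for example when several hosts need the workaround. The following is a minimal sketch; the function name `create_dummy_profile` is hypothetical, and writing under /usr/lib/tuned requires root.

```python
import os


def create_dummy_profile(profile, base_dir="/usr/lib/tuned"):
    """Create <base_dir>/<profile>/tuned.conf as an empty file; return its path."""
    profile_dir = os.path.join(base_dir, profile)
    # Same effect as mkdir, but tolerates a directory that already exists.
    os.makedirs(profile_dir, exist_ok=True)
    conf_path = os.path.join(profile_dir, "tuned.conf")
    # Append mode creates the file if missing and leaves any existing
    # content untouched, similar to touch.
    with open(conf_path, "a"):
        pass
    return conf_path
```

For example, `create_dummy_profile("rhs-virtualization")` would produce /usr/lib/tuned/rhs-virtualization/tuned.conf, matching the manual steps.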
(In reply to Yaniv Kaul from comment #8)
> Is this going to 4.0.5?

Considering that we still won't have a proper fix which works in all distros, and given the workaround in comment#9, I would like to move this bug to the 4.0.6 release.
We missed including it in 4.0.6. But RHEL bug bz#1392942 is fixed, so ideally we should not hit this issue anymore in RHEL-7.3.
4.0.6 has been the last oVirt 4.0 release, please re-target this bug.
(In reply to Sandro Bonazzola from comment #12)
> 4.0.6 has been the last oVirt 4.0 release, please re-target this bug.

This is already fixed in ovirt-host-deploy-1.5. So this can be moved to ON_QA.
Verified and works fine with ovirt-host-deploy-1.5.4-2.el7ev.noarch.

With this fix, on a gluster cluster where no tuned profile is selected on the cluster, ovirt-host-deploy does not set anything and the host gets added successfully to the cluster.

When a tuned profile is selected on the cluster which does not exist on the system, ovirt-host-deploy tries to set it and fails, which triggers an event message saying "Host host1 installation in progress . Cannot set tuned profile." and the host gets added successfully.

Once the host is added successfully, I do see an error while executing tuned-adm list on the host. Below is the error I see. Any idea why this happens?

[root@dhcp37-169 ~]# tuned-adm list
2017-03-07 18:27:00,715 ERROR dbus.proxies: Introspect error on :1.19:/Tuned: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
ERROR:dbus.proxies:Introspect error on :1.19:/Tuned: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
DBus call to Tuned daemon failed
Available profiles:
- balanced - General non-specialized tuned profile
- desktop - Optmize for the desktop use-case
- latency-performance - Optimize for deterministic performance at the cost of increased power consumption
- network-latency - Optimize for deterministic performance at the cost of increased power consumption, focused on low latency network performance
- network-throughput - Optimize for streaming network throughput. Generally only necessary on older CPUs or 40G+ networks.
- powersave - Optimize for low power consumption
- throughput-performance - Broadly applicable tuning that provides excellent performance across a variety of common server workloads. This is the default profile for RHEL7.
- virtual-guest - Optimize for running inside a virtual guest.
- virtual-host - Optimize for running KVM guests
Current active profile: virtual-guest
(In reply to RamaKasturi from comment #15)
> Verified and works fine with ovirt-host-deploy-1.5.4-2.el7ev.noarch
>
> with this fix on a gluster cluster when there is no tuned profile selected
> on the cluster, ovirt-host-deploy does not set anything and host gets added
> successfully to the cluster.
>
> when there is a tuned profile selected on the cluster which does not exist
> on the system ovirt-host-deploy tries to set and fails which triggers an
> event message saying "Host host1 installation in progress . Cannot set
> tuned profile." and host gets added successfully.
>
> Once the host is added successfully i do see an error while executing
> tuned-adm list on the host. Below is the error i see, any idea why this
> happens ?
>
> [root@dhcp37-169 ~]# tuned-adm list
> 2017-03-07 18:27:00,715 ERROR dbus.proxies: Introspect error on
> :1.19:/Tuned: dbus.exceptions.DBusException:
> org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes
> include: the remote application did not send a reply, the message bus
> security policy blocked the reply, the reply timeout expired, or the network
> connection was broken.
> ERROR:dbus.proxies:Introspect error on :1.19:/Tuned:
> dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not
> receive a reply. Possible causes include: the remote application did not
> send a reply, the message bus security policy blocked the reply, the reply
> timeout expired, or the network connection was broken.
> DBus call to Tuned daemon failed
> Available profiles:
> - balanced - General non-specialized tuned profile
> - desktop - Optmize for the desktop use-case
> - latency-performance - Optimize for deterministic performance at
> the cost of increased power consumption
> - network-latency - Optimize for deterministic performance at
> the cost of increased power consumption, focused on low latency network
> performance
> - network-throughput - Optimize for streaming network throughput.
> Generally only necessary on older CPUs or 40G+ networks.
> - powersave - Optimize for low power consumption
> - throughput-performance - Broadly applicable tuning that provides
> excellent performance across a variety of common server workloads. This is
> the default profile for RHEL7.
> - virtual-guest - Optimize for running inside a virtual guest.
> - virtual-host - Optimize for running KVM guests
> Current active profile: virtual-guest

I am not sure why we see this error. This could be an issue with the 'tuned-adm' command itself. Was 'tuned-adm list' working before you tried to set an invalid profile? What happens when you try to set an invalid profile using the tuned-adm command?