Bug 1378087 - Unable to add host when tuned profile applied is not present on RHEL7.3 node
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-host-deploy
Classification: oVirt
Component: Plugins.tune
Version: 1.5.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.1.0-rc
Target Release: ---
Assignee: Ramesh N
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On: 1369502
Blocks: 1425759
Reported: 2016-09-21 13:04 UTC by RamaKasturi
Modified: 2017-03-16 14:51 UTC

Fixed In Version: ovirt-host-deploy-1.5
Clone Of:
: 1425759 (view as bug list)
Environment:
Last Closed: 2017-03-16 14:51:04 UTC
oVirt Team: Gluster
Embargoed:
rule-engine: ovirt-4.1+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1369502 | 0 | high | CLOSED | tuned-adm profile NONEXISTENT returns to the shell after timeout | 2021-02-22 00:41:40 UTC
oVirt gerrit | 65836 | 0 | master | ABANDONED | engine: set default tuned profile in create cluster | 2021-01-28 07:51:56 UTC
oVirt gerrit | 65838 | 0 | master | MERGED | add tuned profile to cluster model | 2021-01-28 07:51:56 UTC
oVirt gerrit | 65845 | 0 | master | MERGED | restapi: add tuned profile to REST API | 2021-01-28 07:51:56 UTC
oVirt gerrit | 65966 | 0 | master | MERGED | Remove default tuned profile for Gluster | 2021-01-28 07:51:56 UTC
oVirt gerrit | 66160 | 0 | ovirt-host-deploy-1.5 | MERGED | Remove default tuned profile for Gluster | 2021-01-28 07:51:56 UTC

Internal Links: 1369502

Description RamaKasturi 2016-09-21 13:04:16 UTC
Description of problem:

Adding an additional host to a Hosted Engine setup in a RHEL7.3 + RHV4.0.3 environment leaves the host in the installed-failed state.

This issue occurs because ovirt-host-deploy tries to set the rhs-virtualization profile on the host, where that profile is not present. The call gets stuck and times out after 10 minutes. Up to RHEL7.2 this problem is not seen because, when tuned fails to set a non-existent profile, it returns immediately instead of waiting for the timeout.

Version-Release number of selected component (if applicable):
RHEL7.3
ovirt-engine-4.0.3-0.1.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Have three hosts with glusterfs as storage domain
2. Now deploy hosted engine on first host
3. Then try adding the second host from the UI

Actual results:
Adding the second host gets stuck at updating the hosted-engine configuration and fails with "Failed to install Host <host_name>. Processing stopped due to timeout."

Expected results:
The second host should be added successfully.

Additional info:

I strongly suspect that the issue is caused by this:

2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd systemd.state:130 starting service tuned
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'start', 'tuned.service'), executable='None', cwd='None', env=None
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start', 'tuned.service'), rc=0
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/bin/systemctl', 'start', 'tuned.service') stdout:


2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/bin/systemctl', 'start', 'tuned.service') stderr:


2016-09-20 20:05:01 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.executeRaw:813 execute: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization'), executable='None', cwd='None', env=None
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.executeRaw:863 execute-result: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization'), rc=1
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.execute:921 execute-output: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization') stdout:
Operation timed out after waiting 600 seconds(s), you may try to increase timeout by using --timeout command line option or using --async.

2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.execute:926 execute-output: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization') stderr:
Requested profile 'rhs-virtualization' doesn't exist.

2016-09-20 20:15:02 WARNING otopi.plugins.ovirt_host_deploy.tune.tuned tuned._misc:105 Cannot set tuned profile
2016-09-20 20:15:02 DEBUG otopi.context context._executeMethod:128 Stage misc METHOD otopi.plugins.ovirt_host_deploy.vdsm.bridge.Plugin._misc
2016-09-20 20:15:02 DEBUG otopi.context context._executeMethod:134 condition False

Comment 1 Ramesh N 2016-09-21 13:18:33 UTC
This issue is not really about deploying a second host in a Hosted Engine setup. It applies to all standard 'Add Host' flows to a cluster with Gluster Service enabled. When host-deploy tries to set a non-existent tuned profile on a RHEL-7.3 system, it is blocked for 600 seconds by default. As a result, the engine just times out.

Root cause: In RHEL-7.3 the /sbin/tuned-adm command has a -t option to specify the timeout value, and the default value for this option is 600 seconds. But host-deploy calls this command without the timeout option. As a result, it is blocked for 10 minutes.

Fix: We should update host-deploy to use a reasonable timeout value so that the engine will not time out.
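
The idea of bounding the call can be sketched as follows. This is a minimal illustration, not the actual host-deploy change: it assumes Python 3 and enforces the limit on the caller's side with the standard subprocess timeout, so it does not depend on whether the installed tuned-adm has the -t option. The function name run_with_timeout is hypothetical.

```python
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run argv and give up after timeout_s seconds.

    Returns (rc, stdout, stderr), or None if the command did not
    finish within the limit.
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode output as text
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before raising.
        return None
    return proc.returncode, proc.stdout, proc.stderr
```

A caller could then invoke run_with_timeout(['/sbin/tuned-adm', 'profile', 'rhs-virtualization'], 30) and treat a None result as a warning rather than blocking for 10 minutes. The 30-second limit is an arbitrary example value.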


Note: This problem should not happen if we set a valid tuned profile at the cluster level.

Comment 2 Simone Tiraboschi 2016-09-21 13:56:15 UTC
tuned-adm from 7.2 doesn't provide the -t option but, on the other hand, it doesn't hang for 10 minutes if the profile is missing.

[stirabos@c72he20160830h1 ~]$ time sudo /sbin/tuned-adm profile rhs-virtualization
Requested profile 'rhs-virtualization' doesn't exist.

real	0m0.442s
user	0m0.046s
sys	0m0.019s

Comment 3 Simone Tiraboschi 2016-09-21 14:05:16 UTC
I think it also affects Fedora hosts for the same reason.

Comment 4 Ramesh N 2016-09-21 14:15:07 UTC
(In reply to Simone Tiraboschi from comment #2)
> tuned-adm from 7.2 doesn't provide -t option but, on the other side, it
> doesn't hang for 10 minutes if the profile is missing.
>

The -t option was introduced in RHEL7.3. I think we have to handle both systems that support this option and systems that don't.
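
One version-independent way to handle both kinds of system, sketched here as an assumption rather than as what host-deploy does, is to check that the profile exists before calling tuned-adm at all. The sketch assumes profiles live in directories containing a tuned.conf file under /usr/lib/tuned or /etc/tuned. The search directories and the name profile_exists are illustrative.

```python
import os

# Assumed profile locations; adjust for the distribution in use.
DEFAULT_PROFILE_DIRS = ("/etc/tuned", "/usr/lib/tuned")


def profile_exists(profile_name, search_dirs=DEFAULT_PROFILE_DIRS):
    """Return True if <dir>/<profile_name>/tuned.conf exists in any search dir."""
    return any(
        os.path.isfile(os.path.join(directory, profile_name, "tuned.conf"))
        for directory in search_dirs
    )
```

With such a check, a missing profile can be reported as a warning immediately, without depending on how a given tuned-adm version reacts to it.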


> [stirabos@c72he20160830h1 ~]$ time sudo /sbin/tuned-adm profile
> rhs-virtualization
> Requested profile 'rhs-virtualization' doesn't exist.
> 
> real	0m0.442s
> user	0m0.046s
> sys	0m0.019s

Comment 5 Ramesh N 2016-10-24 10:42:03 UTC
This seems to be a regression in tuned. See bz#1369502.

Comment 6 Yaniv Kaul 2016-10-25 08:28:35 UTC
How come the profile doesn't exist? What do we expect to happen if we try to set a non-existent profile? Does it matter if we fail quickly or time out? We should set a profile, and it should exist.

Comment 7 Ramesh N 2016-10-25 09:39:55 UTC
(In reply to Yaniv Kaul from comment #6)
> How come the profile doesn't exist? What do we expect to happen if a
> non-existing profile is trying to be set? Does it matter if we fail quickly
> or time-out? We should set a profile, and it should exist.

We have some RHEL-6 specific profiles for gluster which are not available in RHEL-7. As a result, we may end up with a non-existent profile. When a non-existent profile is set, host-deploy reports a warning but the host comes up without any issue. But now, because of this regression, the host fails to install.


Considering oVirt doesn't support RHEL-6 nodes, let me remove all RHEL-6 specific profiles and change the default profile to a valid RHEL-7 profile.

Comment 8 Yaniv Kaul 2016-10-30 09:19:26 UTC
Is this going to 4.0.5?

Comment 9 Ramesh N 2016-11-02 09:12:12 UTC
RHEL-6 profiles are already removed. But we will still have invalid tuned profiles in the following cases.

1. Enabling 'Gluster Service' (Edit Cluster) using the REST API. The REST API is not yet updated to support tuned profiles, so no profile will be set in this case.

2. No tuned profile set in the 'Edit Cluster' dialog. This happens only for the 'Default' cluster. When editing that cluster to enable 'Gluster Service', no tuned profile is selected by default, and the user has to select a profile explicitly. For other clusters (both Edit and New), 'rhgs-sequential-io' is selected by default whenever 'Gluster Service' is enabled.

3. Neither of the tuned profiles 'rhgs-random-io' and 'rhgs-sequential-io' is available in upstream gluster releases.

Workaround:

The following workarounds can be used until this is fixed.

1. The user always has to ensure that a correct profile is set on the cluster.
2. If there is no profile set on the cluster, or the selected profile is not available, the user has to create a dummy tuned profile on the hosts, named 'rhs-virtualization' or after the selected profile.

Steps to create a dummy tuned profile:

1. Create a folder under "/usr/lib/tuned" with your profile name.
   # mkdir /usr/lib/tuned/<profile-name>
2. Create an empty file named 'tuned.conf' in that folder.
   # touch /usr/lib/tuned/<profile-name>/tuned.conf
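
For hosts that are prepared by script, the two steps above can be sketched in Python. This is an illustrative helper, not part of ovirt-host-deploy. The function name create_dummy_profile is hypothetical, and base_dir is a parameter only so the helper can be tried against a scratch directory.

```python
import os


def create_dummy_profile(profile_name, base_dir="/usr/lib/tuned"):
    """Create <base_dir>/<profile_name>/tuned.conf and return its path."""
    profile_dir = os.path.join(base_dir, profile_name)
    # Step 1: create the profile folder (no error if it already exists).
    os.makedirs(profile_dir, exist_ok=True)
    conf_path = os.path.join(profile_dir, "tuned.conf")
    # Step 2: create the file if it is missing; append mode leaves any
    # existing content untouched.
    with open(conf_path, "a"):
        pass
    return conf_path
```

Writing under /usr/lib/tuned requires root privileges, as with the shell commands above.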

Comment 10 Ramesh N 2016-11-02 09:14:47 UTC
(In reply to Yaniv Kaul from comment #8)
> Is this going to 4.0.5?

Considering that we still don't have a proper fix that works in all distros, and given the workaround in comment#9, I would like to move this bug to the 4.0.6 release.

Comment 11 Ramesh N 2016-12-12 12:17:46 UTC
We missed including it in 4.0.6. But RHEL bug bz#1392942 is fixed, so ideally we should not hit this issue anymore on RHEL-7.3.

Comment 12 Sandro Bonazzola 2017-01-25 07:57:12 UTC
4.0.6 has been the last oVirt 4.0 release, please re-target this bug.

Comment 13 Ramesh N 2017-01-25 08:09:43 UTC
(In reply to Sandro Bonazzola from comment #12)
> 4.0.6 has been the last oVirt 4.0 release, please re-target this bug.

This is already fixed in ovirt-host-deploy-1.5. So this can be moved to ON_QA.

Comment 15 RamaKasturi 2017-03-14 13:58:37 UTC
Verified and works fine with ovirt-host-deploy-1.5.4-2.el7ev.noarch.

With this fix, on a gluster cluster where no tuned profile is selected, ovirt-host-deploy does not set anything and the host gets added successfully to the cluster.

When a tuned profile that does not exist on the system is selected on the cluster, ovirt-host-deploy tries to set it and fails, which triggers an event message saying "Host host1 installation in progress . Cannot set tuned profile." and the host still gets added successfully.
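
The two verified behaviours can be summarised in a small sketch. It is a simplified model and not the ovirt-host-deploy source: apply_tuned_profile, set_profile and warn are hypothetical names, with set_profile standing in for whatever actually runs tuned-adm and returning True on success.

```python
def apply_tuned_profile(profile, set_profile, warn):
    """Try to apply a tuned profile without ever failing the deployment.

    Returns True only if a profile was selected and set_profile succeeded.
    """
    if not profile:
        # No profile selected on the cluster: do nothing.
        return False
    if set_profile(profile):
        return True
    # Profile selected but it could not be set: warn and carry on.
    warn("Cannot set tuned profile")
    return False
```

In both failure paths the caller continues with the rest of the installation, which matches the host being added successfully in the two cases above.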

Once the host is added successfully, I do see an error while executing tuned-adm list on the host. Below is the error I see. Any idea why this happens?

[root@dhcp37-169 ~]# tuned-adm list
2017-03-07 18:27:00,715 ERROR    dbus.proxies: Introspect error on :1.19:/Tuned: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
ERROR:dbus.proxies:Introspect error on :1.19:/Tuned: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
DBus call to Tuned daemon failed
Available profiles:
- balanced                    - General non-specialized tuned profile
- desktop                     - Optmize for the desktop use-case
- latency-performance         - Optimize for deterministic performance at the cost of increased power consumption
- network-latency             - Optimize for deterministic performance at the cost of increased power consumption, focused on low latency network performance
- network-throughput          - Optimize for streaming network throughput.  Generally only necessary on older CPUs or 40G+ networks.
- powersave                   - Optimize for low power consumption
- throughput-performance      - Broadly applicable tuning that provides excellent performance across a variety of common server workloads.  This is the default profile for RHEL7.
- virtual-guest               - Optimize for running inside a virtual guest.
- virtual-host                - Optimize for running KVM guests
Current active profile: virtual-guest

Comment 16 Ramesh N 2017-03-15 05:35:30 UTC
(In reply to RamaKasturi from comment #15)
> Verified and works fine with ovirt-host-deploy-1.5.4-2.el7ev.noarch
> 
> with this fix on a gluster cluster when there is no tuned profile selected
> on the cluster, ovirt-host-deploy does not set anything and host gets added
> successfully to the cluster.
> 
> when there is a tuned profile selected on the cluster which does not exist
> on the system   ovirt-host-deploy tries to set and fails which triggers an
> event message saying  "Host host1 installation in progress . Cannot set
> tuned profile." and host gets added successfully.
> 
> Once the host is added successfully i do see an error while executing
> tuned-adm list on the host. Below is the error i see, any idea why this
> happens ?
> 
> [root@dhcp37-169 ~]# tuned-adm list
> 2017-03-07 18:27:00,715 ERROR    dbus.proxies: Introspect error on
> :1.19:/Tuned: dbus.exceptions.DBusException:
> org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes
> include: the remote application did not send a reply, the message bus
> security policy blocked the reply, the reply timeout expired, or the network
> connection was broken.
> ERROR:dbus.proxies:Introspect error on :1.19:/Tuned:
> dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not
> receive a reply. Possible causes include: the remote application did not
> send a reply, the message bus security policy blocked the reply, the reply
> timeout expired, or the network connection was broken.
> DBus call to Tuned daemon failed
> Available profiles:
> - balanced                    - General non-specialized tuned profile
> - desktop                     - Optmize for the desktop use-case
> - latency-performance         - Optimize for deterministic performance at
> the cost of increased power consumption
> - network-latency             - Optimize for deterministic performance at
> the cost of increased power consumption, focused on low latency network
> performance
> - network-throughput          - Optimize for streaming network throughput. 
> Generally only necessary on older CPUs or 40G+ networks.
> - powersave                   - Optimize for low power consumption
> - throughput-performance      - Broadly applicable tuning that provides
> excellent performance across a variety of common server workloads.  This is
> the default profile for RHEL7.
> - virtual-guest               - Optimize for running inside a virtual guest.
> - virtual-host                - Optimize for running KVM guests
> Current active profile: virtual-guest

I am not sure why we see this error. This could be an issue with the 'tuned-adm' command itself. Does the command 'tuned-adm list' work before trying to set an invalid profile? What happens when you try to set an invalid profile using the tuned-adm command?

