Bug 1425759 - [downstream clone - 4.0.7] Unable to add host when tuned profile applied is not present on RHEL7.3 node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-host-deploy
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.0.7
Target Release: ---
Assignee: Ramesh N
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On: 1378087
Blocks: 1425756
 
Reported: 2017-02-22 10:32 UTC by rhev-integ
Modified: 2023-09-14 03:54 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when there was no tuned profile defined on the Gluster cluster, ovirt-host-deploy used an old nonexistent profile called 'rhs-virtualization', and the 'tuned-adm profile NONEXISTENT-PROFILE' command returned to the shell after timeout, which blocked the ovirt-host-deploy execution for 10 minutes. This meant that adding a host failed when there was no tuned profile selected on the cluster. Now, ovirt-host-deploy does not try to set a tuned profile when there is no tuned profile defined on the Gluster cluster, so hosts can be added to a Gluster cluster when there is no tuned profile defined for the cluster.
Clone Of: 1378087
Environment:
Last Closed: 2017-03-16 15:41:13 UTC
oVirt Team: Gluster
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0550 0 normal SHIPPED_LIVE ovirt-host-deploy bug fix update for RHV 4.0.7 2017-03-16 19:27:10 UTC
oVirt gerrit 65836 0 master ABANDONED engine: set default tuned profile in create cluster 2017-02-22 10:33:44 UTC
oVirt gerrit 65838 0 master POST add tuned profile to cluster model 2017-02-22 10:33:44 UTC
oVirt gerrit 65845 0 master MERGED restapi: add tuned profile to REST API 2017-02-22 10:33:44 UTC
oVirt gerrit 65966 0 master POST Remove default tuned profile for Gluster 2017-02-22 10:33:44 UTC
oVirt gerrit 66160 0 ovirt-host-deploy-1.5 POST Remove default tuned profile for Gluster 2017-02-22 10:33:44 UTC

Description rhev-integ 2017-02-22 10:32:00 UTC
+++ This bug is an upstream to downstream clone. The original bug is: +++
+++   bug 1378087 +++
======================================================================

Description of problem:

Adding an additional host to a Hosted Engine setup in a RHEL7.3 + RHV4.0.3 environment leaves the host in installed-failed state.

This issue occurs because ovirt-host-deploy tries to set the rhs-virtualization tuned profile, which does not exist on the host, and then hangs until it times out after 10 minutes. Up to RHEL7.2 this problem is not seen, because when tuned fails to set a non-existent profile it returns immediately and does not wait for the timeout.

Version-Release number of selected component (if applicable):
RHEL7.3
ovirt-engine-4.0.3-0.1.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Have three hosts with glusterfs as storage domain
2. Now deploy hosted engine on first host
3. Then try adding the second host from the UI

Actual results:
Adding second host gets stuck at updating hosted-engine configuration and fails with "Failed to install Host <host_name>. Processing stopped due to timeout."

Expected results:
The second host should be added successfully.

Additional info:

I strongly suspect that the issue is due to this:

2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd systemd.state:130 starting service tuned
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'start', 'tuned.service'), executable='None', cwd='None', env=None
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start', 'tuned.service'), rc=0
2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/bin/systemctl', 'start', 'tuned.service') stdout:


2016-09-20 20:05:01 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/bin/systemctl', 'start', 'tuned.service') stderr:


2016-09-20 20:05:01 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.executeRaw:813 execute: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization'), executable='None', cwd='None', env=None
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.executeRaw:863 execute-result: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization'), rc=1
2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.execute:921 execute-output: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization') stdout:
Operation timed out after waiting 600 seconds(s), you may try to increase timeout by using --timeout command line option or using --async.

2016-09-20 20:15:02 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned plugin.execute:926 execute-output: ('/sbin/tuned-adm', 'profile', 'rhs-virtualization') stderr:
Requested profile 'rhs-virtualization' doesn't exist.

2016-09-20 20:15:02 WARNING otopi.plugins.ovirt_host_deploy.tune.tuned tuned._misc:105 Cannot set tuned profile
2016-09-20 20:15:02 DEBUG otopi.context context._executeMethod:128 Stage misc METHOD otopi.plugins.ovirt_host_deploy.vdsm.bridge.Plugin._misc
2016-09-20 20:15:02 DEBUG otopi.context context._executeMethod:134 condition False
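
As a rough check of the gap visible in the log above, the timestamps on the tuned-adm execute and execute-result lines can be subtracted. This is only a minimal sketch: the helper name elapsed_seconds is hypothetical, and it assumes every log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp, as in this excerpt.

```python
from datetime import datetime


def elapsed_seconds(start_line, end_line):
    """Return the seconds between the timestamps that prefix two log lines."""
    fmt = "%Y-%m-%d %H:%M:%S"
    # The first 19 characters hold the "YYYY-MM-DD HH:MM:SS" timestamp.
    start = datetime.strptime(start_line[:19], fmt)
    end = datetime.strptime(end_line[:19], fmt)
    return (end - start).total_seconds()


# Timestamps taken from the execute / execute-result lines for tuned-adm.
started = "2016-09-20 20:05:01 DEBUG execute"
finished = "2016-09-20 20:15:02 DEBUG execute-result"
print(elapsed_seconds(started, finished))  # prints 601.0
```

The result is about 600 seconds, which matches the timeout reported in the tuned-adm output.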

(Originally by Kasturi Narra)

Comment 1 rhev-integ 2017-02-22 10:32:11 UTC
This issue is not really about deploying a second host in a Hosted Engine setup. It applies to all standard 'Add Host' flows to a Cluster with Gluster Service enabled. When host-deploy tries to set a non-existent tuned profile on a RHEL-7.3 system, it is blocked for 600 seconds by default. As a result, the engine just times out.

Root cause: In RHEL-7.3 the /sbin/tuned-adm command has a -t option to specify the timeout value, and the default value for this option is 600 seconds. But host-deploy calls this command without the timeout option. As a result, it is blocked for 10 minutes.

Fix: We should update host-deploy to use a reasonable timeout value so that the engine will not time out.
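
One way to bound a blocking command from the caller's side, independent of any tuned-adm option, is a timeout on the subprocess call itself. The following is a minimal sketch assuming Python 3.5 or later; run_with_timeout is a hypothetical helper, not the code used in ovirt-host-deploy.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run argv; return (rc, stdout, stderr), or (None, '', '') on timeout."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None, "", ""
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Harmless demonstration that does not touch tuned at all.
    rc, out, _ = run_with_timeout([sys.executable, "-c", "print('ok')"], 10)
    print(rc, out.strip())
```

On a host, the same helper could wrap the call seen in the log, for example run_with_timeout(['/sbin/tuned-adm', 'profile', 'rhs-virtualization'], 30), where the 30-second value is an arbitrary assumption.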


Note: This problem should not happen if we set a valid tuned profile at the cluster level.

(Originally by Ramesh Nachimuthu)

Comment 3 rhev-integ 2017-02-22 10:32:18 UTC
tuned-adm from 7.2 doesn't provide the -t option but, on the other hand, it doesn't hang for 10 minutes if the profile is missing.

[stirabos@c72he20160830h1 ~]$ time sudo /sbin/tuned-adm profile rhs-virtualization
Requested profile 'rhs-virtualization' doesn't exist.

real	0m0.442s
user	0m0.046s
sys	0m0.019s

(Originally by Simone Tiraboschi)

Comment 4 rhev-integ 2017-02-22 10:32:26 UTC
I think it also affects Fedora hosts for the same reason.

(Originally by Simone Tiraboschi)

Comment 5 rhev-integ 2017-02-22 10:32:32 UTC
(In reply to Simone Tiraboschi from comment #2)
> tuned-adm from 7.2 doesn't provide -t option but, on the other side, it
> doesn't hang for 10 minutes if the profile is missing.
>

The -t option was introduced in RHEL7.3. I think we have to handle both systems that support this option and systems that don't.
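
Handling both cases could look roughly like the sketch below: inspect the help text of the installed tuned-adm and add the timeout option only when it is advertised. This is an assumption-laden illustration. build_tuned_cmd is a hypothetical helper, the help text is assumed to come from 'tuned-adm --help', and placing --timeout before the 'profile' subcommand is an assumption that should be checked against the installed version.

```python
def build_tuned_cmd(profile, help_text, timeout_seconds=30):
    """Build a tuned-adm argv, adding --timeout only if the help text lists it."""
    argv = ["/sbin/tuned-adm"]
    if "--timeout" in help_text:
        # Newer tuned-adm: pass an explicit, shorter timeout.
        argv += ["--timeout", str(timeout_seconds)]
    # Older tuned-adm: no timeout option, so call it as before.
    argv += ["profile", profile]
    return argv


# Example with made-up help strings standing in for real --help output.
print(build_tuned_cmd("virtual-host", "usage: tuned-adm [-h] [--timeout TIMEOUT]"))
print(build_tuned_cmd("virtual-host", "usage: tuned-adm [-h]"))
```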


> [stirabos@c72he20160830h1 ~]$ time sudo /sbin/tuned-adm profile
> rhs-virtualization
> Requested profile 'rhs-virtualization' doesn't exist.
> 
> real	0m0.442s
> user	0m0.046s
> sys	0m0.019s

(Originally by Ramesh Nachimuthu)

Comment 6 rhev-integ 2017-02-22 10:32:39 UTC
This seems to be a regression in tuned. See bz#1369502.

(Originally by Ramesh Nachimuthu)

Comment 7 rhev-integ 2017-02-22 10:32:45 UTC
How come the profile doesn't exist? What do we expect to happen if a non-existing profile is trying to be set? Does it matter if we fail quickly or time-out? We should set a profile, and it should exist.

(Originally by Yaniv Kaul)

Comment 8 rhev-integ 2017-02-22 10:32:51 UTC
(In reply to Yaniv Kaul from comment #6)
> How come the profile doesn't exist? What do we expect to happen if a
> non-existing profile is trying to be set? Does it matter if we fail quickly
> or time-out? We should set a profile, and it should exist.

We have some RHEL-6-specific profiles for gluster which are not available in RHEL-7. As a result, we may end up with a non-existent profile. When a non-existent profile is set, host-deploy reports a warning but the host comes up without any issue. Now, because of this regression, host installation fails.


Considering that oVirt doesn't support RHEL-6 nodes, let me remove all RHEL-6-specific profiles and change the default profile to a valid RHEL-7 profile.

(Originally by Ramesh Nachimuthu)

Comment 9 rhev-integ 2017-02-22 10:32:58 UTC
Is this going to 4.0.5?

(Originally by Yaniv Kaul)

Comment 10 rhev-integ 2017-02-22 10:33:05 UTC
The RHEL-6 profiles have already been removed. But we will still end up with an invalid tuned profile in the following cases.

1. 'Gluster Service' is enabled (Edit Cluster) using the REST API. The REST API has not yet been updated to support tuned profiles, so no profile is set in this case.

2. No tuned profile is set in the 'Edit Cluster' dialog. This happens only for the 'Default' cluster: when editing that cluster to enable 'Gluster Service', no tuned profile is selected by default, and the user has to select one explicitly. For other clusters (both Edit and New), 'rhgs-sequential-io' is selected by default whenever 'Gluster Service' is enabled.

3. Neither of the tuned profiles 'rhgs-random-io' and 'rhgs-sequential-io' is available in upstream gluster releases.

Workaround:

The following workarounds can be used until this is fixed.

1. The user must always ensure that a correct profile is set on the cluster.
2. If there is no profile set on the cluster, or the selected profile is not available, the user has to create a dummy tuned profile on the hosts, named 'rhs-virtualization' or after the selected profile.
 
Steps to create a dummy tuned profile:

1. Create a folder under folder "/usr/lib/tuned" with your profile name
   # mkdir /usr/lib/tuned/<profile-name>
2. Create an empty file with name 'tuned.conf' under the above folder.
   # touch  /usr/lib/tuned/<profile-name>/tuned.conf
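
For hosts that are prepared by script, the two steps above can be sketched in Python as follows. create_dummy_profile is a hypothetical helper; the default base directory is the /usr/lib/tuned path from the steps, and the demonstration writes to a temporary directory so that it can be run without root.

```python
import os
import tempfile


def create_dummy_profile(name, base_dir="/usr/lib/tuned"):
    """Create <base_dir>/<name>/tuned.conf as an empty file; return its path."""
    profile_dir = os.path.join(base_dir, name)
    os.makedirs(profile_dir, exist_ok=True)  # same effect as mkdir -p
    conf_path = os.path.join(profile_dir, "tuned.conf")
    # Append mode leaves an existing file untouched, like touch.
    with open(conf_path, "a"):
        pass
    return conf_path


# Demonstration against a throwaway directory instead of /usr/lib/tuned.
with tempfile.TemporaryDirectory() as tmp:
    path = create_dummy_profile("rhs-virtualization", base_dir=tmp)
    print(os.path.exists(path), os.path.getsize(path))
```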

(Originally by Ramesh Nachimuthu)

Comment 11 rhev-integ 2017-02-22 10:33:12 UTC
(In reply to Yaniv Kaul from comment #8)
> Is this going to 4.0.5?

Considering that we still don't have a proper fix that works on all distros, and given the workaround in comment#9, I would like to move this bug to the 4.0.6 release.

(Originally by Ramesh Nachimuthu)

Comment 12 rhev-integ 2017-02-22 10:33:18 UTC
We missed including it in 4.0.6. But RHEL bug bz#1392942 is fixed, so ideally we should not hit this issue anymore on RHEL-7.3.

(Originally by Ramesh Nachimuthu)

Comment 13 rhev-integ 2017-02-22 10:33:25 UTC
4.0.6 has been the last oVirt 4.0 release, please re-target this bug.

(Originally by Sandro Bonazzola)

Comment 14 rhev-integ 2017-02-22 10:33:31 UTC
(In reply to Sandro Bonazzola from comment #12)
> 4.0.6 has been the last oVirt 4.0 release, please re-target this bug.

This is already fixed in ovirt-host-deploy-1.5. So this can be moved to ON_QA.

(Originally by Ramesh Nachimuthu)

Comment 17 RamaKasturi 2017-03-07 13:12:53 UTC
Verified and works fine with ovirt-host-deploy-1.5.4-2.el7ev.noarch

With this fix, on a gluster cluster where no tuned profile is selected, ovirt-host-deploy does not set anything and the host gets added successfully to the cluster.

When a tuned profile that does not exist on the system is selected on the cluster, ovirt-host-deploy tries to set it and fails, which triggers an event message saying "Host host1 installation in progress . Cannot set tuned profile." and the host still gets added successfully.
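
The two verified behaviours can be summarised in a small sketch. This is an illustration only, not the ovirt-host-deploy code: apply_tuned_profile and the set_profile callable are hypothetical names, and the warning text is copied from the log message quoted in this bug.

```python
def apply_tuned_profile(profile, set_profile):
    """Decide what to do with a cluster's tuned profile during host deploy.

    set_profile is a callable that returns True when the profile was applied.
    """
    if not profile:
        # No profile defined on the cluster: do not call tuned-adm at all.
        return "skipped"
    if set_profile(profile):
        return "applied"
    # A failure is reported as a warning and does not abort the deployment.
    return "warning: Cannot set tuned profile"


# Stand-in callables instead of a real tuned-adm invocation.
print(apply_tuned_profile(None, lambda name: True))
print(apply_tuned_profile("virtual-host", lambda name: True))
print(apply_tuned_profile("missing-profile", lambda name: False))
```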

Once the host is added successfully, I do see an error when executing tuned-adm list on the host. Below is the error I see. Any idea why this happens?

[root@dhcp37-169 ~]# tuned-adm list
2017-03-07 18:27:00,715 ERROR    dbus.proxies: Introspect error on :1.19:/Tuned: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
ERROR:dbus.proxies:Introspect error on :1.19:/Tuned: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
DBus call to Tuned daemon failed
Available profiles:
- balanced                    - General non-specialized tuned profile
- desktop                     - Optmize for the desktop use-case
- latency-performance         - Optimize for deterministic performance at the cost of increased power consumption
- network-latency             - Optimize for deterministic performance at the cost of increased power consumption, focused on low latency network performance
- network-throughput          - Optimize for streaming network throughput.  Generally only necessary on older CPUs or 40G+ networks.
- powersave                   - Optimize for low power consumption
- throughput-performance      - Broadly applicable tuning that provides excellent performance across a variety of common server workloads.  This is the default profile for RHEL7.
- virtual-guest               - Optimize for running inside a virtual guest.
- virtual-host                - Optimize for running KVM guests
Current active profile: virtual-guest

Comment 19 errata-xmlrpc 2017-03-16 15:41:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0550.html

Comment 20 Red Hat Bugzilla 2023-09-14 03:54:02 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

