Bug 1111268

Summary: Auto-config fails to execute as NRPE is NOT set to restart after 'Add Host'
Product: [oVirt] ovirt-host-deploy Reporter: Chris Pelland <cpelland>
Component: Plugins.Gluster Assignee: Alon Bar-Lev <alonbl>
Status: CLOSED CURRENTRELEASE QA Contact: SATHEESARAN <sasundar>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 1.2.0 CC: acathrow, adahms, alonbl, bazulay, bugs, cpelland, dnarayan, dougsland, dpati, eedri, esammons, gklein, iheim, nlevinki, pprakash, pstehlik, Rhev-m-bugs, rhs-bugs, rhsc-qe-bugs, sasundar, sherold, yeylon
Target Milestone: --- Keywords: ZStream
Target Release: 1.2.1 Flags: cpelland: devel_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previously, running the auto-configuration script on a Red Hat Storage controller could fail with an SSL handshake error. This occurred when Red Hat Storage nodes were added to a Red Hat Enterprise Virtualization environment: the NRPE configuration was edited during the host deployment operation, but the NRPE service was not restarted to pick up those changes. Now, the NRPE service is restarted whenever the NRPE configuration is changed during a host deployment operation, so the changes take effect and auto-configuration can complete successfully under these conditions.
Story Points: ---
Clone Of: 1111053 Environment:
Last Closed: 2014-09-04 15:17:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1111053    
Bug Blocks: 1110623, 1123858    

Description Chris Pelland 2014-06-19 15:21:13 UTC
+++ This bug was initially created as a clone of Bug #1111053 +++

+++ This bug was initially created as a clone of Bug #1110623 +++

Description of problem:

Auto-config fails with "Error : CHECK_NRPE: Error - Could not complete SSL handshake" when RHSC engine uses a resolvable hostname. See below:

-----------
#  /usr/lib64/nagios/plugins/gluster/discovery.py -c 34cluster -H 10.70.42.229
Failed to execute NRPE command 'discoverhostparams' in host '10.70.42.203' 
Error : CHECK_NRPE: Error - Could not complete SSL handshake.
Make sure NPRE server in host '10.70.42.203' is configured to accept requests from Nagios server
-----------
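
The hint in the output above ("configured to accept requests from Nagios server") usually points at the allowed_hosts directive in nrpe.cfg, since check_nrpe commonly reports a failed SSL handshake when the running NRPE daemon does not accept the caller's address. The report does not show the node's nrpe.cfg, so the following is only a minimal sketch for inspecting that directive. The function names and the sample addresses are illustrative assumptions; hostname entries are skipped rather than resolved.

```python
import ipaddress
from typing import List


def parse_allowed_hosts(nrpe_cfg_text: str) -> List[str]:
    """Return the entries of the last allowed_hosts= line found in nrpe.cfg text."""
    entries: List[str] = []
    for raw in nrpe_cfg_text.splitlines():
        line = raw.strip()
        # Skip comments and lines that are not key=value pairs.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "allowed_hosts":
            entries = [v.strip() for v in value.split(",") if v.strip()]
    return entries


def is_allowed(address: str, entries: List[str]) -> bool:
    """Check whether an IP address matches any IP or CIDR entry."""
    addr = ipaddress.ip_address(address)
    for entry in entries:
        try:
            if addr in ipaddress.ip_network(entry, strict=False):
                return True
        except ValueError:
            # Entry is a hostname (not an IP/CIDR); this sketch does not resolve it.
            continue
    return False


if __name__ == "__main__":
    # Example values only; not taken from the affected nodes.
    sample_cfg = "# sample\nallowed_hosts=127.0.0.1, 10.70.42.0/24\n"
    hosts = parse_allowed_hosts(sample_cfg)
    print(hosts, is_allowed("10.70.42.229", hosts))
```

Note that this only inspects the file on disk. In this bug the file could already be correct while the running daemon still used the old settings, which is why a restart was needed.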

Version-Release number of selected component (if applicable):

rhsc-3.0.0-0.10.el6_5.noarch
nagios-server-addons-0.1.3-3.el6rhs.x86_64


How reproducible: 100%


Steps to Reproduce:
1. Install and setup RHSC + Nagios Server by following http://rhsm.pad.engineering.redhat.com/rhsc-nagios-release-denali-7
2. Make sure that the RHSC engine has a DNS-resolvable hostname (e.g. dhcp43-180.lab.eng.blr.redhat.com)
3. Add a few RHS nodes to a 3.4 cluster from the UI
4. Now, execute the following auto-config script from the engine:
 # /usr/lib64/nagios/plugins/gluster/discovery.py -c <cluster-name> -H <ip-address>


Actual results: Auto-config script fails with the error mentioned above.


Expected results: Auto-config script should execute successfully and detect the changes in the cluster configurations.


Additional info: Restarting nrpe in ALL the RHS nodes seems to resolve the issue.
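
The workaround suggests that the running NRPE daemon had not picked up the configuration written during host deployment. A minimal sketch of "restart only when the configuration actually changed" logic follows. This is not the ovirt-host-deploy implementation; the function name and the restart callback are hypothetical and chosen for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def _digest(path: Path) -> str:
    """SHA-256 of the file contents, or an empty string if the file is missing."""
    return hashlib.sha256(path.read_bytes()).hexdigest() if path.exists() else ""


def write_config_and_restart(
    path: Path, new_text: str, restart: Callable[[], None]
) -> bool:
    """Write new_text to path and call restart() only if the contents changed.

    Returns True when a restart was triggered.
    """
    before = _digest(path)
    path.write_text(new_text)
    changed = _digest(path) != before
    if changed:
        restart()
    return changed
```

On an el6 node the callback could, as an assumption, wrap something like `subprocess.check_call(["service", "nrpe", "restart"])`. Keeping it as a parameter lets the logic be exercised without touching a real service.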

--- Additional comment from Alon Bar-Lev on 2014-06-18 10:31:21 EDT ---

Please move/duplicate to rhev/ovirt-host-deploy so I can add this to errata.

--- Additional comment from Pavel Stehlik on 2014-06-19 05:30:12 EDT ---

Guys, are you able to verify this? 
We don't have environment for testing this.
Thank you, P.

--- Additional comment from errata-xmlrpc on 2014-06-19 09:39:21 EDT ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2014:18082-01
https://errata.devel.redhat.com/advisory/18082

Comment 3 Eyal Edri 2014-08-05 07:53:21 UTC
pavel, why was this bug removed from errata for 3.4.1?
afaiu this fix was already released with ovirt-host-deploy.
wasn't it verified?

Comment 5 SATHEESARAN 2014-08-27 07:10:59 UTC
Verified this bug with RHS 3.0 RC ( glusterfs-3.6.0.27-1.el6rhs ) and
RHEVM 3.4.2 ( av11 ) 3.4.2-0.1.el6ev

1. Stopped nrpe in RHSS Node
2. Added the node to RHEV
Observation - Found that nrpe got started after adding the node to RHEVM

Also I see that nrpe was listening on port 5666
[Wed Aug 27 06:34:20 UTC 2014 root.37.138:~ ] # netstat -tulp | grep 5666
tcp        0      0 *:5666                      *:*                         LISTEN      17205/nrpe   

Based on the above observation, marking this bug as VERIFIED
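
The netstat check above can also be run remotely from the engine or Nagios server. A minimal sketch, assuming plain TCP reachability is enough for the check; the function name is hypothetical, and a successful connect only shows that something is listening on the port, not that the NRPE SSL handshake will succeed.

```python
import socket


def is_port_listening(host: str, port: int = 5666, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; substitute the address of the RHS node.
    print(is_port_listening("127.0.0.1"))
```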