Bug 1405447

Summary: [RFE] Provide the configuration check for Grafton deployment
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: SATHEESARAN <sasundar>
Component: gdeploy Assignee: Sachidananda Urs <surs>
Status: CLOSED ERRATA QA Contact: SATHEESARAN <sasundar>
Severity: medium Docs Contact:
Priority: unspecified    
Version: rhgs-3.2 CC: amukherj, knarra, rcyriac, rhinduja, rhs-bugs, rnachimu, sabose, smohan, storage-qa-internal, surs
Target Milestone: --- Keywords: FutureFeature
Target Release: RHGS 3.2.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: gdeploy-2.0.1-11 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-23 05:09:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1277939, 1351503    

Description SATHEESARAN 2016-12-16 14:37:29 UTC
Description of problem:
-----------------------
Provide the pre-flight check for the Grafton deployments

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
gdeploy-2.0.1-5

How reproducible:
-----------------
NA

Steps to Reproduce:
--------------------
NA

Actual results:
---------------
Currently there is no configuration check available

Expected results:
-----------------
A configuration check would help the customer recognize that something is misconfigured and would serve as a warning before Grafton is installed.

Comment 1 SATHEESARAN 2016-12-16 14:49:43 UTC
The following checks are required:

1. Disks (RAID volumes) should be blank and must not contain a partition table. If MBR or GPT partition information is present, gdeploy should print a clear warning and stop.

2. Network configuration.
The hostname of the server should resolve to the same IP address on all the servers.

In one customer case, we found that the hostname resolved to a different IP address on one particular host. This should be caught well before installation, and gdeploy should stop when it finds such a problem.

Hint: 'nslookup' should help in getting the hostnames & IPs of that host
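
The blank-disk requirement in check 1 could be sketched roughly as below. This is only an illustration, not the code gdeploy ships: the function names `detect_partition_table` and `check_disk` are hypothetical, and a 512-byte logical sector is assumed. Note that the 0x55AA boot signature also appears in some filesystem boot sectors, so an "mbr" result only means the disk is not blank.

```python
from typing import Optional

SECTOR = 512  # assumption: 512-byte logical sectors


def detect_partition_table(head: bytes) -> Optional[str]:
    """Return "gpt", "mbr", or None for the first bytes of a block device."""
    # A GPT header sits at LBA 1 and starts with the ASCII signature "EFI PART".
    if head[SECTOR:SECTOR + 8] == b"EFI PART":
        return "gpt"
    # An MBR (including the protective MBR of a GPT disk) ends with 0x55 0xAA.
    if head[510:512] == b"\x55\xaa":
        return "mbr"
    return None


def check_disk(path: str) -> int:
    """Return 0 if the device looks blank, 1 otherwise (script-style status)."""
    with open(path, "rb") as dev:
        head = dev.read(2 * SECTOR)
    kind = detect_partition_table(head)
    if kind is not None:
        print(f"{path}: found {kind.upper()} signature, refusing to continue")
        return 1
    return 0
```

A wrapper script would call `check_disk` on each RAID volume and exit non-zero on the first failure, so that the deployment stops.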

Comment 3 Sahina Bose 2016-12-21 07:08:14 UTC
We should be able to invoke a script before the deployment, and gdeploy should fail if script returns a non-zero return value.

Can this script be included in extras in gdeploy rpm for 3.2?

Comment 4 Sachidananda Urs 2016-12-21 07:15:25 UTC
Sahina, I think we can't put this in the 3.2.0 release; however, we can make another release before Grafton GA.

Comment 8 Sahina Bose 2016-12-22 08:54:13 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1386443 - is there a way to provide a check to avoid this too?

Comment 9 Sachidananda Urs 2016-12-22 10:02:39 UTC
This happens during hosted engine setup, and there is no way to know whether it will fail unless we run `setup' or the setup process provides --dry-run or a similar option to detect errors.

Comment 11 Sachidananda Urs 2017-01-05 10:10:41 UTC
(In reply to Sahina Bose from comment #3)
> We should be able to invoke a script before the deployment, and gdeploy
> should fail if script returns a non-zero return value.

We can now set ignore_script_errors=no; if the script returns a non-zero
status, gdeploy will exit.
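
The behaviour described here, where a non-zero script status stops the run unless errors are ignored, can be illustrated with a small Python sketch. It is not gdeploy's implementation; `run_preflight` and its `ignore_script_errors` parameter are hypothetical names that only mirror the option mentioned above.

```python
import subprocess
import sys
from typing import Sequence


def run_preflight(cmd: Sequence[str], ignore_script_errors: bool = False) -> int:
    """Run a check script; return its exit status, or 0 when errors are ignored."""
    rc = subprocess.run(cmd).returncode
    if rc != 0 and not ignore_script_errors:
        # Report the failure and hand the status back so the caller can stop.
        print(f"pre-flight check failed with status {rc}, stopping", file=sys.stderr)
        return rc
    return 0
```

A caller would pass the returned value to `sys.exit()` so that the deployment does not continue after a failed check.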

Comment 12 RamaKasturi 2017-01-10 14:50:58 UTC
Proposing this as a blocker because this script is required for Grafton deployments to avoid installation/setup failures.

Comment 13 Sahina Bose 2017-01-13 09:21:33 UTC
Requirements are as per Comment 1.

For the network validation, if a host has multiple IP addresses, ensure that:
1. Each IP address resolves to a separate FQDN.
2. Any host FQDN resolves consistently to the same IP address from all hosts, i.e. if x.domain.com resolves to ip1 from host1, it should also resolve to ip1 from host2.
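
The two rules above can be expressed as a check over the answers collected from every host. The sketch below is only an illustration: it assumes the resolution results have already been gathered into a dictionary (for example by running a resolver lookup on each host), and the name `find_resolution_problems` and the data layout are hypothetical.

```python
from typing import Dict, List, Set

# view[observer][fqdn] = IP address that `observer` resolved `fqdn` to
View = Dict[str, Dict[str, str]]


def find_resolution_problems(view: View) -> List[str]:
    """Return human-readable problems; an empty list means both rules hold."""
    problems: List[str] = []

    # Rule 2: every FQDN must resolve to one IP regardless of the observer.
    answers: Dict[str, Set[str]] = {}
    for table in view.values():
        for fqdn, ip in table.items():
            answers.setdefault(fqdn, set()).add(ip)
    for fqdn, ips in sorted(answers.items()):
        if len(ips) > 1:
            problems.append(f"{fqdn} resolves inconsistently: {sorted(ips)}")

    # Rule 1: each IP address must belong to a separate FQDN.
    owners: Dict[str, Set[str]] = {}
    for fqdn, ips in answers.items():
        for ip in ips:
            owners.setdefault(ip, set()).add(fqdn)
    for ip, fqdns in sorted(owners.items()):
        if len(fqdns) > 1:
            problems.append(f"{ip} is shared by {sorted(fqdns)}")

    return problems
```

A pre-flight script could exit non-zero whenever the returned list is not empty.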

Comment 15 Sachidananda Urs 2017-01-23 07:07:37 UTC
Commits:

https://github.com/gluster/gdeploy/commit/0433977fe
https://github.com/gluster/gdeploy/commit/63703e7bd

Should resolve the issues.

Comment 17 SATHEESARAN 2017-02-06 15:39:32 UTC
When executed there was an error.

PLAY [gluster_servers] *********************************************************

TASK [Run a shell script] ******************************************************
fatal: [dhcp37-196.lab.eng.blr.redhat.com]: FAILED! => {"failed": true, "msg": "The conditional check 'result.rc != 0' failed. The error was: error while evaluating conditional (result.rc != 0): 'dict object' has no attribute 'rc'"}
fatal: [dhcp37-188.lab.eng.blr.redhat.com]: FAILED! => {"failed": true, "msg": "The conditional check 'result.rc != 0' failed. The error was: error while evaluating conditional (result.rc != 0): 'dict object' has no attribute 'rc'"}
fatal: [dhcp37-169.lab.eng.blr.redhat.com]: FAILED! => {"failed": true, "msg": "The conditional check 'result.rc != 0' failed. The error was: error while evaluating conditional (result.rc != 0): 'dict object' has no attribute 'rc'"}
	to retry, use: --limit @/tmp/tmpYjqYOD/run-script.retry

PLAY RECAP *********************************************************************
dhcp37-169.lab.eng.blr.redhat.com : ok=0    changed=0    unreachable=0    failed=1   
dhcp37-188.lab.eng.blr.redhat.com : ok=0    changed=0    unreachable=0    failed=1   
dhcp37-196.lab.eng.blr.redhat.com : ok=0    changed=0    unreachable=0    failed=1  

Also, the script /usr/share/ansible/gdeploy/scripts/grafton-sanity-check.sh uses the 'ifconfig' command, which was obsoleted in favor of the 'ip' command. The 'ifconfig' command is not available in RHEL 7.3.

So replace 'ifconfig' command with 'ip addr show'
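
For illustration, here is one way to collect a host's IPv4 addresses from 'ip' output instead of 'ifconfig'. This is a Python sketch rather than the shell code in grafton-sanity-check.sh; the function names are hypothetical, and it assumes the usual `inet ADDRESS/PREFIX` form printed by `ip -o -4 addr show`.

```python
import re
import subprocess
from typing import List

# Matches "inet 192.0.2.10/24"; "inet6" lines do not match because of the space.
_INET = re.compile(r"\binet (\d{1,3}(?:\.\d{1,3}){3})/\d+")


def parse_ipv4_addresses(ip_output: str) -> List[str]:
    """Extract IPv4 addresses from `ip addr show` output, skipping loopback."""
    return [addr for addr in _INET.findall(ip_output) if not addr.startswith("127.")]


def local_ipv4_addresses() -> List[str]:
    """Run `ip -o -4 addr show` on this host and parse its output."""
    out = subprocess.run(
        ["ip", "-o", "-4", "addr", "show"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ipv4_addresses(out)
```

Keeping the parsing separate from the command invocation makes it possible to test the parser on captured output without running 'ip'.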

Based on the above observations this bug failed verification.

Comment 18 Sachidananda Urs 2017-02-11 09:54:27 UTC
https://github.com/gluster/gdeploy/commit/a2227 should fix the issue

Comment 19 Sachidananda Urs 2017-02-11 09:55:10 UTC
The error:

TASK [Run a shell script] ******************************************************
fatal: [dhcp37-196.lab.eng.blr.redhat.com]: FAILED! => {"failed": true, "msg": "The conditional check 'result.rc != 0' failed. The error was: error while evaluating 

Was due to the missing `ifconfig' command.
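
The message itself says that the registered result had no `rc` key when the conditional was evaluated. As a rough Python analogue (not Ansible's or gdeploy's code; `script_failed` is a hypothetical name), a check can treat a missing status as a failure instead of erroring on the lookup:

```python
from typing import Any, Mapping


def script_failed(result: Mapping[str, Any]) -> bool:
    """Treat an explicit failure or a missing "rc" as a failed script run."""
    if result.get("failed"):
        return True
    # Default to a non-zero status when the script never produced one.
    return result.get("rc", 1) != 0
```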

Comment 20 SATHEESARAN 2017-02-11 17:04:01 UTC
(In reply to Sachidananda Urs from comment #19)
> The error:
> 
> TASK [Run a shell script]
> ******************************************************
> fatal: [dhcp37-196.lab.eng.blr.redhat.com]: FAILED! => {"failed": true,
> "msg": "The conditional check 'result.rc != 0' failed. The error was: error
> while evaluating 
> 
> Was due to the missing `ifconfig' command.

True. The 'ifconfig' command is deprecated in RHEL 7 in favor of the 'ip' command, as mentioned in comment 17. Please make the script use the 'ip addr show' command.

Comment 21 SATHEESARAN 2017-02-17 13:42:08 UTC
Tested with gdeploy-2.0.1-11.el7rhgs

1. When the disks are partitioned with GPT, the gdeploy check detects this and, with ignore_script_errors present, gdeploy stops and does not proceed further.

2. When one of the hosts is not reachable from the master node (the node from which gdeploy is run), the gdeploy check detects this and, with ignore_script_errors present, gdeploy stops and does not proceed further.

3. Created a situation where, from host2, host1's hostname resolves to a different IP that is not reachable (by adding entries to /etc/hosts). Here too the gdeploy check detects this and, with ignore_script_errors present, gdeploy stops and does not proceed further.

Based on this observation, moving this bug to verified state

Comment 23 errata-xmlrpc 2017-03-23 05:09:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0482.html