Description of problem:
-----------------------
Provide a pre-flight check for Grafton deployments.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
gdeploy-2.0.1-5

How reproducible:
-----------------
NA

Steps to Reproduce:
--------------------
NA

Actual results:
---------------
Currently there is no configuration check available.

Expected results:
-----------------
A configuration check helps the customer see that something is incorrect and serves as a warning before Grafton is installed.
The following checks are required:

1. Disks (RAID volumes) should be blank and should not contain any partition table. If MBR or GPT partition information is present, gdeploy should print a clear warning and stop.

2. Network configuration. The hostname of each server should resolve to the same IP address on all the servers. In this particular customer case, we found that the hostname resolved to a different IP on one host. This should be caught well before installation, and gdeploy should stop when it finds a problem.

Hint: 'nslookup' should help in getting the hostnames and IPs of that host.
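To illustrate check 1, here is a minimal sketch in Python of how a partition table could be detected from the first bytes of a disk. It is not the script discussed in this bug; the function names and the 512-byte default sector size are assumptions. It looks for the GPT header magic ("EFI PART" at the start of LBA 1) and the MBR boot signature (0x55 0xAA at bytes 510-511). A boot signature alone can also come from a filesystem boot sector, so treat an "mbr" result as "not blank" rather than as proof of a partition table.

```python
from typing import Optional

MBR_SIGNATURE = b"\x55\xaa"   # boot signature at bytes 510-511 of sector 0
GPT_SIGNATURE = b"EFI PART"   # GPT header magic at the start of LBA 1


def detect_partition_table(head: bytes, sector_size: int = 512) -> Optional[str]:
    """Return "gpt", "mbr", or None for the leading bytes of a disk.

    GPT is tested first because a GPT disk also carries a protective MBR.
    """
    if head[sector_size:sector_size + 8] == GPT_SIGNATURE:
        return "gpt"
    if head[510:512] == MBR_SIGNATURE:
        return "mbr"
    return None


def read_disk_head(path: str, length: int = 8192) -> bytes:
    """Read the first `length` bytes of a block device (hypothetical helper)."""
    with open(path, "rb") as dev:
        return dev.read(length)
```

A pre-flight script built on this would exit with a non-zero status whenever detect_partition_table() returns something other than None for a disk that is expected to be blank.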
We should be able to invoke a script before the deployment, and gdeploy should fail if the script returns a non-zero return value. Can this script be included in extras in the gdeploy rpm for 3.2?
Sahina, I don't think we can put this in the 3.2.0 release; however, we can make another release before Grafton GA.
https://bugzilla.redhat.com/show_bug.cgi?id=1386443 - is there a way to provide a check to avoid this too?
This happens during hosted engine setup, and there is no way to know whether it will fail unless we run `setup' or the setup process provides --dry-run or something similar to detect errors.
(In reply to Sahina Bose from comment #3)
> We should be able to invoke a script before the deployment, and gdeploy
> should fail if script returns a non-zero return value.

We can now include ignore_script_errors=no. If the script then returns a non-zero status, gdeploy will exit.
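The intended behaviour can be sketched as follows. This is only a Python illustration of the "stop on non-zero exit status" rule, not gdeploy's implementation; run_preflight and should_proceed are hypothetical names, and the ignore_script_errors parameter merely mirrors the option mentioned above.

```python
import subprocess
import sys
from typing import Sequence


def run_preflight(command: Sequence[str]) -> int:
    """Run a pre-flight check command and return its exit status."""
    return subprocess.run(command).returncode


def should_proceed(command: Sequence[str], ignore_script_errors: bool = False) -> bool:
    """Return False when the check fails and errors are not being ignored."""
    status = run_preflight(command)
    if status != 0 and not ignore_script_errors:
        print(f"pre-flight check exited with status {status}; stopping",
              file=sys.stderr)
        return False
    return True
```

With ignore_script_errors left at False (the equivalent of ignore_script_errors=no), any non-zero status from the check stops the run before deployment starts.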
Proposing this as a blocker because this script is required for Grafton deployments to avoid installation/setup failures.
Requirements are as per Comment 1.

For the network validation, if a host has multiple IP addresses, ensure:

1. Each IP address resolves to a separate FQDN.
2. Any host FQDN resolves consistently to the same IP address from all hosts, i.e. if x.domain.com resolves to ip1 from host1, it should also resolve to ip1 from host2.
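The two network rules can be expressed as a small sketch over lookup results that have already been collected. It assumes the reverse lookups for one host arrive as an IP-to-FQDN mapping and the forward lookups as a per-observer FQDN-to-IP mapping. The function names and data shapes are illustrative only, and the addresses in the usage note are documentation-range placeholders.

```python
from collections import defaultdict
from typing import Dict, List


def check_separate_fqdns(ip_to_fqdn: Dict[str, str]) -> List[str]:
    """Rule 1: every IP address of a host should map to its own FQDN."""
    ips_by_fqdn = defaultdict(list)
    for ip, fqdn in ip_to_fqdn.items():
        ips_by_fqdn[fqdn].append(ip)
    return [
        f"{fqdn} is shared by {', '.join(sorted(ips))}"
        for fqdn, ips in sorted(ips_by_fqdn.items())
        if len(ips) > 1
    ]


def check_consistent_resolution(view: Dict[str, Dict[str, str]]) -> List[str]:
    """Rule 2: an FQDN should resolve to the same IP from every observer.

    view[observer][fqdn] is the IP that `observer` resolved `fqdn` to.
    """
    ips_by_fqdn = defaultdict(set)
    for answers in view.values():
        for fqdn, ip in answers.items():
            ips_by_fqdn[fqdn].add(ip)
    return [
        f"{fqdn} resolves inconsistently: {', '.join(sorted(ips))}"
        for fqdn, ips in sorted(ips_by_fqdn.items())
        if len(ips) > 1
    ]
```

For example, if host1 resolves x.domain.com to 192.0.2.1 and host2 resolves it to 192.0.2.9, check_consistent_resolution() returns one problem entry, and the pre-flight script should exit non-zero.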
Commits:

https://github.com/gluster/gdeploy/commit/0433977fe
https://github.com/gluster/gdeploy/commit/63703e7bd

should resolve the issues.
When executed there was an error.

PLAY [gluster_servers] *********************************************************

TASK [Run a shell script] ******************************************************
fatal: [dhcp37-196.lab.eng.blr.redhat.com]: FAILED! => {"failed": true, "msg": "The conditional check 'result.rc != 0' failed. The error was: error while evaluating conditional (result.rc != 0): 'dict object' has no attribute 'rc'"}
fatal: [dhcp37-188.lab.eng.blr.redhat.com]: FAILED! => {"failed": true, "msg": "The conditional check 'result.rc != 0' failed. The error was: error while evaluating conditional (result.rc != 0): 'dict object' has no attribute 'rc'"}
fatal: [dhcp37-169.lab.eng.blr.redhat.com]: FAILED! => {"failed": true, "msg": "The conditional check 'result.rc != 0' failed. The error was: error while evaluating conditional (result.rc != 0): 'dict object' has no attribute 'rc'"}
	to retry, use: --limit @/tmp/tmpYjqYOD/run-script.retry

PLAY RECAP *********************************************************************
dhcp37-169.lab.eng.blr.redhat.com : ok=0 changed=0 unreachable=0 failed=1
dhcp37-188.lab.eng.blr.redhat.com : ok=0 changed=0 unreachable=0 failed=1
dhcp37-196.lab.eng.blr.redhat.com : ok=0 changed=0 unreachable=0 failed=1

Also, the script /usr/share/ansible/gdeploy/scripts/grafton-sanity-check.sh uses the 'ifconfig' command, which was obsoleted in favor of the 'ip' command. 'ifconfig' is not available in RHEL 7.3, so replace it with 'ip addr show'.

Based on the above observations, this bug failed verification.
https://github.com/gluster/gdeploy/commit/a2227 should fix the issue
The error:

TASK [Run a shell script] ******************************************************
fatal: [dhcp37-196.lab.eng.blr.redhat.com]: FAILED! => {"failed": true, "msg": "The conditional check 'result.rc != 0' failed. The error was: error while evaluating

was due to the missing `ifconfig' command.
(In reply to Sachidananda Urs from comment #19)
> The error:
>
> TASK [Run a shell script]
> ******************************************************
> fatal: [dhcp37-196.lab.eng.blr.redhat.com]: FAILED! => {"failed": true,
> "msg": "The conditional check 'result.rc != 0' failed. The error was: error
> while evaluating
>
> Was due to the missing `ifconfig' command.

True. The 'ifconfig' command is deprecated in RHEL 7 in favor of the 'ip' command, as mentioned in comment 17. Please make the script use the 'ip addr show' command.
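For reference, the IPv4 addresses can be pulled out of 'ip addr show' output with a simple pattern match on the "inet" lines. The sketch below is in Python and is not the change made to grafton-sanity-check.sh; SAMPLE_OUTPUT is an illustrative placeholder, not captured from any host in this bug, and the function name is hypothetical.

```python
import re
from typing import List

# Matches lines such as "    inet 192.0.2.10/24 brd ..." but not "inet6 ...".
INET_RE = re.compile(r"^\s*inet\s+(\d{1,3}(?:\.\d{1,3}){3})/\d+", re.MULTILINE)

# Illustrative placeholder in the usual `ip addr show` layout.
SAMPLE_OUTPUT = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.10/24 brd 192.0.2.255 scope global eth0
"""


def ipv4_addresses(ip_addr_output: str, include_loopback: bool = False) -> List[str]:
    """Extract IPv4 addresses from the text printed by `ip addr show`."""
    addresses = INET_RE.findall(ip_addr_output)
    if not include_loopback:
        addresses = [a for a in addresses if not a.startswith("127.")]
    return addresses
```

Point-to-point entries of the form "inet A peer B/NN" are not matched by this pattern and would need separate handling.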
Tested with gdeploy-2.0.1-11.el7rhgs

1. When the disks are partitioned with GPT, the gdeploy check detects this and, with ignore_script_errors present, gdeploy stops and doesn't proceed further.

2. When one of the hosts is not reachable from the master node (the node from which gdeploy is run), the gdeploy check detects this and, with ignore_script_errors present, gdeploy stops and doesn't proceed further.

3. Created a situation where, from host2, host1's hostname resolves to some other IP which is not reachable (by adding entries in /etc/hosts). Here too the gdeploy check detects this and, with ignore_script_errors present, gdeploy stops and doesn't proceed further.

Based on these observations, moving this bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2017-0482.html