Bug 1518525 - Tendrl-ansible setup script fails if the server has 2 IP addresses [NEEDINFO]
Summary: Tendrl-ansible setup script fails if the server has 2 IP addresses
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: web-admin-tendrl-ansible
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Timothy Asir
QA Contact: Daniel Horák
URL:
Whiteboard:
Depends On:
Blocks: 1503134
 
Reported: 2017-11-29 06:57 UTC by Sweta Anandpara
Modified: 2018-09-04 07:00 UTC (History)

Fixed In Version: tendrl-ansible-1.6.1-2.el7rhgs.noarch.rpm, tendrl-api-1.6.1-1.el7rhgs.noarch.rpm, tendrl-commons-1.6.1-1.el7rhgs.noarch.rpm, tendrl-monitoring-integration-1.6.1-1.el7rhgs.noarch.rpm, tendrl-node-agent-1.6.1-1.el7, tendrl-ui-1.6.1-1.el7rhgs.noarch.rpm
Doc Type: Known Issue
Doc Text:
Previously, during Red Hat Gluster Storage Web Administration installation, if the server had multiple active IP addresses, tendrl-ansible failed to automatically choose the correct one, causing installation failure. In this version, the user has to set all the required variables for tendrl-ansible as per the installation instructions.
Clone Of:
Environment:
Last Closed: 2018-09-04 06:59:21 UTC
Target Upstream Version:
rghatvis: needinfo? (tjeyasin)


Attachments
console logs (146.33 KB, application/octet-stream)
2017-11-29 06:57 UTC, Sweta Anandpara


Links
System ID Priority Status Summary Last Updated
Github Tendrl tendrl-ansible issues 74 None None None 2018-03-08 12:53:28 UTC
Red Hat Product Errata RHSA-2018:2616 None None None 2018-09-04 07:00:23 UTC

Description Sweta Anandpara 2017-11-29 06:57:06 UTC
Created attachment 1360185
console logs

Description of problem:
=======================
My setup consisted of Beaker (perf) machines, for the storage nodes as well as the Tendrl server. 'ip a' on the Tendrl server showed the output below. Even after specifying the hostname in the site.yml file, 'etcd_ip_address' was getting updated to a different IP, and that resulted in a failure while creating the Tendrl admin user. The error output seen on the console is pasted below. I suppose we need better resolution of the IP address in the line: etcd_ip_address: "{{ lookup('dig', ansible_fqdn) }}"

TASK [tendrl-ansible.tendrl-server : Create Tendrl admin user] ********************
fatal: [gqas012.sbu.lab.eng.bos.redhat.com]: FAILED! => {"changed": true, "cmd": "RACK_ENV=production rake etcd:load_admin", "delta": "0:00:00.606007", "end": "2017-11-28 20:49:48.948206", "failed": true, "failed_when_result": true, "rc": 1, "start": "2017-11-28 20:49:48.342199", "stderr": "rake aborted!\nConnection refused - connect(2)\n\nTasks: TOP => etcd:load_admin\n(See full trace by running task with --trace)", "stderr_lines": ["rake aborted!", "Connection refused - connect(2)", "", "Tasks: TOP => etcd:load_admin", "(See full trace by running task with --trace)"], "stdout": "\"Generating default Tendrl admin\"", "stdout_lines": ["\"Generating default Tendrl admin\""]}



1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 78:2b:cb:74:46:ac brd ff:ff:ff:ff:ff:ff
    inet 10.16.156.33/21 brd 10.16.159.255 scope global dynamic em1
       valid_lft 57514sec preferred_lft 57514sec
    inet6 fe80::7a2b:cbff:fe74:46ac/64 scope link 
       valid_lft forever preferred_lft forever
3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
    link/ether 78:2b:cb:74:46:ad brd ff:ff:ff:ff:ff:ff
4: p2p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP qlen 1000
    link/ether 90:e2:ba:07:a2:84 brd ff:ff:ff:ff:ff:ff
    inet 192.168.96.143/24 brd 192.168.96.255 scope global p2p1
       valid_lft forever preferred_lft forever
    inet6 fe80::92e2:baff:fe07:a284/64 scope link 
       valid_lft forever preferred_lft forever
5: p2p2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
    link/ether 90:e2:ba:07:a2:85 brd ff:ff:ff:ff:ff:ff
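
The ambiguity in the output above can be made explicit with a small sketch. It is not part of tendrl-ansible; the function name and the regular expression are illustrative assumptions. It only lists the global-scope IPv4 addresses found in 'ip a' output, which here yields two candidates for 'etcd_ip_address':

```python
import re

# Matches IPv4 lines of `ip a` output that carry a global-scope address, e.g.
# "inet 10.16.156.33/21 brd 10.16.159.255 scope global dynamic em1"
INET_RE = re.compile(
    r"^\s*inet (\d+\.\d+\.\d+\.\d+)/\d+ .*\bscope global\b.* (\S+)\s*$"
)


def global_ipv4_addresses(ip_a_output):
    """Return (interface, address) pairs for every global-scope IPv4 address."""
    found = []
    for line in ip_a_output.splitlines():
        match = INET_RE.match(line)
        if match:
            address, interface = match.groups()
            found.append((interface, address))
    return found


# The three "inet" lines copied from the output in this report.
SAMPLE = """\
    inet 127.0.0.1/8 scope host lo
    inet 10.16.156.33/21 brd 10.16.159.255 scope global dynamic em1
    inet 192.168.96.143/24 brd 192.168.96.255 scope global p2p1
"""

candidates = global_ipv4_addresses(SAMPLE)
print(candidates)
```

With more than one candidate, any automatic default has to guess which network etcd should listen on.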




Version-Release number of selected component (if applicable):
============================================================
tendrl-ansible-1.5.4-2.el7rhgs.noarch

How reproducible:
=================
1:1


Steps to Reproduce:
===================
1. Have storage nodes and a rhel7 machine for web-admin server, which has 2 IP addresses.
2. Run 'ansible-playbook -i hosts site.yml' from the tendrl-server

Actual results:
===============
Step 2 fails partway through, at the task 'Create Tendrl admin user'.


Expected results:
================
tendrl-ansible should succeed in resolving and picking the right IP address.


Additional info:
================
PFA the console logs.

Comment 2 Martin Bukatovic 2017-11-30 07:12:13 UTC
I need to review this. Reply is expected no sooner than on Monday.

Comment 4 Nishanth Thomas 2017-12-04 16:56:53 UTC
Since a workaround is available, moving this out of this release.

Comment 6 Martin Bukatovic 2017-12-11 17:12:43 UTC
Sorry, I didn't have the time to try to replicate this.

That said, is this an issue caused by (previously) unclear downstream
documentation, or an actual issue which one would hit even when following
upstream installation docs and README file from rpm package?

Context: The docs state that one needs to update variables in the example
playbook file when needed. The defaults there are not expected to just work
in all circumstances.

Comment 7 Martin Bukatovic 2017-12-11 17:13:35 UTC
Fixing typo:

(In reply to Martin Bukatovic from comment #6)
> upstream installation docs and README file from rpm package?

s/upstream/downstream/

Comment 10 Nishanth Thomas 2017-12-15 02:55:00 UTC
Remove 'and not in the tendrl-ansible configuration file'

Comment 13 Nishanth Thomas 2017-12-15 12:18:05 UTC
Updated, please check now.

Comment 15 Timothy Asir 2018-01-29 04:50:18 UTC
The IP address of a machine can change at any time unless it is a static IP.
If the IP changes, we may need to update every place where the system has been configured to use it. So it's better to use a static IP, or the host FQDN, which can be configured to resolve to the IP at the top level.

@Sweta Anandpara
Could you please post the results of the following output:

$ ip a
$ nslookup <hostname>

Comment 17 Timothy Asir 2018-01-29 08:17:39 UTC
OK, that means even ping <hostname/fqdn> might return an incorrect IP. I hope this is not an ansible or tendrl-ansible issue. It could be a DNS issue: for some reason (e.g. a DNS cache), the host may be unable to provide the correct IP address for the given FQDN. I will try to reproduce this, update here, and come up with a possible solution.
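
A quick way to tell these cases apart is to compare what the resolver returns for the FQDN with the addresses actually configured on the host. The following is only a sketch under assumptions: the function names and the four result labels are made up for illustration and are not part of ansible or tendrl-ansible.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def check_resolution(resolved, local):
    """Classify resolver output against the addresses configured on the host."""
    resolved, local = set(resolved), set(local)
    if not resolved:
        return "unresolved"
    if not resolved <= local:
        return "mismatch"   # DNS points at an address this host does not own
    if len(resolved) > 1:
        return "ambiguous"  # several valid answers; a default would have to guess
    return "ok"


# Addresses taken from the 'ip a' output in the description.
LOCAL = ["10.16.156.33", "192.168.96.143"]
print(check_resolution(["10.16.156.33"], LOCAL))
```

On the affected server one would call check_resolution(resolve_ipv4(fqdn), LOCAL) with the server's own FQDN; a "mismatch" would point to DNS, while "ambiguous" would point to the default lookup having more than one valid choice.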

@Sweta, did you install etcd on a different server, or on the same server where you installed tendrl-api and tendrl-ui?

Comment 20 Daniel Horák 2018-07-25 13:56:28 UTC
I'm verifying this bug based on the following points:

* All mandatory variables have to be configured manually by the user. No
  automated guess is performed, so it is up to the user to configure
  the variables properly.

* If a mandatory variable is not configured, execution of the tendrl-ansible
  site.yml playbook fails with a proper error message:

    You need to define all mandatory ansible variables to run this
    playbook, see README file for guidance.

* The required mandatory variables are described in the README and will be
  described in documentation (Bug 1584737).
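
The behaviour verified in the points above amounts to a fail-fast check with no fallback guess. A minimal sketch of that logic follows; it is not the tendrl-ansible implementation, and the MANDATORY tuple is a hypothetical list, since 'etcd_ip_address' is the only variable named in this report. The value 10.16.156.33 is used purely as an illustration.

```python
# Error text quoted from the verification comment above.
MESSAGE = (
    "You need to define all mandatory ansible variables to run this "
    "playbook, see README file for guidance."
)

# Hypothetical list: only etcd_ip_address is named in this bug report.
MANDATORY = ("etcd_ip_address",)


def missing_variables(variables, mandatory=MANDATORY):
    """Return the mandatory variable names that are absent or empty."""
    return [name for name in mandatory if not variables.get(name)]


def validate(variables, mandatory=MANDATORY):
    """Fail fast instead of guessing a value for an undefined variable."""
    if missing_variables(variables, mandatory):
        raise ValueError(MESSAGE)
    return True


print(validate({"etcd_ip_address": "10.16.156.33"}))
```

Calling validate({}) raises ValueError with the message above, so a host with several addresses is never configured by guesswork.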

Also, this bug doesn't cover the case where the Gluster cluster uses a different
network than the monitoring (a similar scenario is covered by Bug 1573075).

Tested with:
  tendrl-ansible-1.6.3-5.el7rhgs.noarch

>> VERIFIED

Comment 25 errata-xmlrpc 2018-09-04 06:59:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2616

