Bug 1267740 - deploying an etcd cluster with mixed default interface types fails
Status: CLOSED DUPLICATE of bug 1267733
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.0.0
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Assigned To: Scott Dodson
QA Contact: Ma xiaoqiang
Depends On:
Blocks:
Reported: 2015-09-30 16:17 EDT by Scott Dodson
Modified: 2016-07-03 20:45 EDT
CC List: 6 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-10-09 08:46:58 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Scott Dodson 2015-09-30 16:17:56 EDT
The template that builds the etcd config uses the local variable etcd_interface, which defaults to ansible_default_ipv4.interface, to look up each peer's address. So on a host whose default interface is eth0, it tries to look up the IP address of the other host via ansible_eth0.ipv4.address. If the other host's default interface has a different name (ens192 in the output below), that fact doesn't exist and the play fails.

TASK: [etcd | Validate permissions on peer certificate files] *****************
ok: [host1.example.com] => (item=/etc/etcd/ca.crt)
changed: [host1.example.com] => (item=/etc/etcd/peer.crt)
changed: [host1.example.com] => (item=/etc/etcd/peer.key)
ok: [host2.example.com] => (item=/etc/etcd/ca.crt)
changed: [host2.example.com] => (item=/etc/etcd/peer.crt)
changed: [host2.example.com] => (item=/etc/etcd/peer.key)

TASK: [etcd | Write etcd global config file] **********************************
fatal: [host1.example.com] => {'msg': "AnsibleUndefinedVariable: One or more undefined variables: 'dict object' has no attribute u'ansible_ens192'", 'failed': True}
fatal: [host1.example.com] => {'msg': "AnsibleUndefinedVariable: One or more undefined variables: 'dict object' has no attribute u'ansible_ens192'", 'failed': True}
fatal: [host2.example.com] => {'msg': "AnsibleUndefinedVariable: One or more undefined variables: 'dict object' has no attribute u'ansible_eth0'", 'failed': True}
fatal: [host2.example.com] => {'msg': "AnsibleUndefinedVariable: One or more undefined variables: 'dict object' has no attribute u'ansible_eth0'", 'failed': True}
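
To illustrate the failure above, here is a minimal Python sketch of the lookup as described. It is not the actual Jinja2 template. The hostvars contents and the 192.0.2.x addresses are made-up illustrative values, and peer_addresses and missing_fact are hypothetical helper names.

```python
# Hypothetical facts for two hosts. Interface names mirror the log above;
# the addresses are invented for illustration only.
hostvars = {
    "host1.example.com": {
        "ansible_default_ipv4": {"interface": "ens192"},
        "ansible_ens192": {"ipv4": {"address": "192.0.2.11"}},
    },
    "host2.example.com": {
        "ansible_default_ipv4": {"interface": "eth0"},
        "ansible_eth0": {"ipv4": {"address": "192.0.2.12"}},
    },
}


def peer_addresses(local_host):
    """Resolve every peer's address using the LOCAL host's interface name."""
    etcd_interface = hostvars[local_host]["ansible_default_ipv4"]["interface"]
    fact = "ansible_" + etcd_interface
    # This assumes every peer has a fact with the same name, which only
    # holds when all hosts share the same default interface name.
    return {peer: facts[fact]["ipv4"]["address"] for peer, facts in hostvars.items()}


def missing_fact(local_host):
    """Return the name of the fact that could not be found, or None."""
    try:
        peer_addresses(local_host)
    except KeyError as exc:
        return exc.args[0]
    return None


if __name__ == "__main__":
    for host in hostvars:
        print(host, "->", missing_fact(host))
```

Each host looks for its own interface name in the other host's facts, which matches the pattern in the fatal messages: host1 reports ansible_ens192 missing and host2 reports ansible_eth0 missing.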

I believe this is fixed by https://github.com/openshift/openshift-ansible/pull/437 which switches the behaviour to use hostnames rather than looking up the ip addresses.
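
As a rough sketch of the hostname-based idea (not the code from the PR), the peer list can be built without any per-interface fact lookups. The initial_cluster helper is a hypothetical name, and the https scheme and peer port 2380 are assumptions based on etcd's usual defaults, not values taken from this bug.

```python
def initial_cluster(hostnames, peer_port=2380, scheme="https"):
    """Build an etcd-style 'name=url' peer list from hostnames only.

    No interface facts are consulted, so differing interface names
    across hosts cannot cause an undefined-variable error here.
    """
    return ",".join(
        "{0}={1}://{0}:{2}".format(host, scheme, peer_port) for host in hostnames
    )


if __name__ == "__main__":
    print(initial_cluster(["host1.example.com", "host2.example.com"]))
```

This does rely on the hostnames being resolvable from every etcd member.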

I'm not entirely sure how to set up an environment with varying interface names, but testing across a bare metal host and a virtual machine may be one way. If setting up such an environment proves difficult, verifying that the change doesn't cause regressions should be sufficient.
We've had one customer verify that this addresses the issue for them.
Comment 2 Gaoyun Pei 2015-10-09 04:57:50 EDT
QE could reproduce this bug on a set of machines which have different interface names.

After manually merging the PR into the latest openshift-ansible master branch, an HA etcd cluster could be installed, but not all of the etcd services started.
Comment 4 Brenton Leanhardt 2015-10-09 08:46:58 EDT

*** This bug has been marked as a duplicate of bug 1267733 ***
