Bug 1267740 - deploying an etcd cluster with mixed default interface types fails
Summary: deploying an etcd cluster with mixed default interface types fails
Keywords:
Status: CLOSED DUPLICATE of bug 1267733
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Scott Dodson
QA Contact: Ma xiaoqiang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-09-30 20:17 UTC by Scott Dodson
Modified: 2016-07-04 00:45 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-09 12:46:58 UTC
Target Upstream Version:
Embargoed:



Description Scott Dodson 2015-09-30 20:17:56 UTC
The template that builds the etcd config uses the local variable etcd_interface, which defaults to ansible_default_ipv4.interface, to build the etcd configuration. So a host whose default interface is eth0 tries to look up the IP address of the other host via ansible_eth0.ipv4.address. That fact does not exist on a host whose default interface has a different name, so the template fails with an undefined variable error.

TASK: [etcd | Validate permissions on peer certificate files] *****************
ok: [host1.example.com] => (item=/etc/etcd/ca.crt)
changed: [host1.example.com] => (item=/etc/etcd/peer.crt)
changed: [host1.example.com] => (item=/etc/etcd/peer.key)
ok: [host2.example.com] => (item=/etc/etcd/ca.crt)
changed: [host2.example.com] => (item=/etc/etcd/peer.crt)
changed: [host2.example.com] => (item=/etc/etcd/peer.key)

TASK: [etcd | Write etcd global config file] **********************************
fatal: [host1.example.com] => {'msg': "AnsibleUndefinedVariable: One or more undefined variables: 'dict object' has no attribute u'ansible_ens192'", 'failed': True}
fatal: [host1.example.com] => {'msg': "AnsibleUndefinedVariable: One or more undefined variables: 'dict object' has no attribute u'ansible_ens192'", 'failed': True}
fatal: [host2.example.com] => {'msg': "AnsibleUndefinedVariable: One or more undefined variables: 'dict object' has no attribute u'ansible_eth0'", 'failed': True}
fatal: [host2.example.com] => {'msg': "AnsibleUndefinedVariable: One or more undefined variables: 'dict object' has no attribute u'ansible_eth0'", 'failed': True}

I believe this is fixed by https://github.com/openshift/openshift-ansible/pull/437, which switches the behaviour to use hostnames rather than looking up the IP addresses.
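
To make the failure mode concrete, here is a minimal Python sketch of the lookup described above. It is not the actual template or the code from the PR; the facts dictionary, the addresses, the function names, and the use of port 2380 are all assumptions made for illustration. It only models the idea that a fact key derived from the local host's interface name is applied to every peer, and contrasts it with a hostname-based list.

```python
# Hypothetical facts for two hosts; interface assignments mirror the log
# above, addresses are illustrative documentation addresses only.
HOSTVARS = {
    "host1.example.com": {
        "ansible_default_ipv4": {"interface": "ens192"},
        "ansible_ens192": {"ipv4": {"address": "192.0.2.10"}},
    },
    "host2.example.com": {
        "ansible_default_ipv4": {"interface": "eth0"},
        "ansible_eth0": {"ipv4": {"address": "192.0.2.11"}},
    },
}


def peer_urls_by_local_interface(local_host, hostvars):
    """Model the failing lookup: the local interface name is used for every peer."""
    iface = hostvars[local_host]["ansible_default_ipv4"]["interface"]
    key = "ansible_" + iface
    # Raises KeyError as soon as a peer has no fact named after the
    # local host's interface (the analogue of AnsibleUndefinedVariable).
    return [
        "https://%s:2380" % facts[key]["ipv4"]["address"]
        for facts in hostvars.values()
    ]


def peer_urls_by_hostname(hostvars):
    """Hostname-based alternative: no per-interface fact is consulted."""
    return ["https://%s:2380" % host for host in sorted(hostvars)]


if __name__ == "__main__":
    try:
        peer_urls_by_local_interface("host1.example.com", HOSTVARS)
    except KeyError as missing:
        print("lookup failed, missing fact:", missing)
    print(peer_urls_by_hostname(HOSTVARS))
```

In this sketch the interface-based lookup fails on host1 because host2 has no ansible_ens192 fact, which corresponds to the error shown in the log, while the hostname-based list never touches interface facts at all.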

I'm not entirely sure how to set up an environment that varies the interface names, but testing between bare metal and a virtual machine may be one way. Ultimately, testing that this doesn't cause regressions should be sufficient if setting up an environment with different interface names proves to be difficult.
We've had one customer verify that this addresses the issue for them.

Comment 2 Gaoyun Pei 2015-10-09 08:57:50 UTC
QE could reproduce this bug on a set of machines that have different interface names.

After manually merging the PR into the latest openshift-ansible master branch, an HA etcd cluster could be installed, but not all of the etcd services started.

Comment 4 Brenton Leanhardt 2015-10-09 12:46:58 UTC

*** This bug has been marked as a duplicate of bug 1267733 ***

