Bug 1254389 - Can no longer run packstack to maintain cluster
Summary: Can no longer run packstack to maintain cluster
Keywords:
Status: CLOSED EOL
Alias: None
Product: RDO
Classification: Community
Component: openstack-packstack
Version: trunk
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
: Kilo
Assignee: Ivan Chavero
QA Contact: Shai Revivo
Reported: 2015-08-17 23:58 UTC by Bryce Nordgren
Modified: 2016-05-19 15:31 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-19 15:31:40 UTC


Attachments
autogenerated/edited answer file (40.61 KB, text/plain)
2015-08-17 23:58 UTC, Bryce Nordgren
Rebuild log on RDO Liberty . Adding patch to openstack-packstack-2015.2-0.1.dev1654.gcbbf46e.el7.src.rpm (36.06 KB, text/plain)
2016-01-23 18:14 UTC, Boris Derzhavets

Description Bryce Nordgren 2015-08-17 23:58:24 UTC
Created attachment 1064102
autogenerated/edited answer file

Description of problem:

Created an RDO Kilo cluster around the beginning of July by generating, then editing an answer file. In the interim, I did a "yum upgrade", and packstack was updated. Now when I run packstack, it consistently fails during "Adding Neutron API Manifest Entries" with 'ERROR : "Couldn't detect ipaddress of interface ten on node 10.0.2.13"'

This occurs even with an identical answer file to what was used to produce the working system a month ago. 

The main question here is: how do I need to change my old answer file such that it will work with the new packstack? Or is the recommended approach to hold packstack at the version used to install your system because breaking changes should be expected?



Version-Release number of selected component (if applicable):

The code which emits this error message was committed to the repo after I originally generated the answer file: 

https://github.com/stackforge/packstack/commit/d1211af056549ec803ebd33217ba2257cbd4b7bd


How reproducible:

Always, on my system.


Steps to Reproduce:
1. packstack --answer-file=openstack-answer-file.txt
2.
3.

Actual results:
ERROR : "Couldn't detect ipaddress of interface ten on node 10.0.2.13"

Expected results:
Successful completion.


Additional info:
Further documentation on ask.openstack.org

https://ask.openstack.org/en/question/79980/packstack-complains-of-not-being-able-to-detect-ipaddress-of-interface/

Comment 3 Bryce Nordgren 2015-08-20 23:25:50 UTC
Rather than update my answer file, I downgraded packstack and all seems to be well. 

There probably should be some communication with upstream about when breaking changes should be expected. The enterprise linux packaging, at least, should hold off on packaging those updates. Kinda violates expectations.

Comment 4 Ivan Chavero 2015-12-03 08:26:21 UTC
You have the node 10.0.2.13 listed in EXCLUDE_SERVERS, but it seems packstack still tries to act on that node.

Comment 5 Etsuji Nakai 2015-12-05 13:59:14 UTC
My guess as to how this happened:

1. Since the controller node 10.0.2.13 is listed in EXCLUDE_SERVERS, plugins/prescript_000.py skips collecting its host information into config['HOST_DETAILS'] in preinstall_and_discover().

2. plugins/neutron_350.py still tries to use config['HOST_DETAILS'] for this node as a peer of the VXLAN tunnel in create_manifests().

I'm not sure what the best way to fix it is. Adding the controller (and existing compute nodes) to EXCLUDE_SERVERS is very common when adding new compute nodes to an existing cluster, so I think neutron_350.py should collect the peer IP addresses without using config['HOST_DETAILS'].
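
The sequence guessed at above can be illustrated with a minimal Python sketch. This is not packstack's code: the function names discover_hosts and tunnel_peer_ip, the "ipaddress_<interface>" key layout, and the sample addresses are all hypothetical, chosen only to show how skipping excluded hosts during discovery can lead to a failed lookup later.

```python
# Hypothetical reconstruction of the suspected failure mode.
# None of these names or data layouts are taken from packstack itself.

def discover_hosts(all_hosts, excluded):
    """Collect per-host details, skipping excluded hosts entirely."""
    details = {}
    for host in all_hosts:
        if host in excluded:
            # Excluded hosts are never probed, so no entry is created.
            continue
        # Stand-in for a real probe: invent a tunnel address per host.
        last_octet = host.rsplit(".", 1)[1]
        details[host] = {"ipaddress_eth1": "192.168.0." + last_octet}
    return details


def tunnel_peer_ip(host_details, host, interface):
    """Look up the tunnel address of a peer from the collected details."""
    try:
        return host_details[host]["ipaddress_" + interface]
    except KeyError:
        raise RuntimeError(
            "Couldn't detect ipaddress of interface %s on node %s"
            % (interface, host)
        )


if __name__ == "__main__":
    hosts = ["10.0.2.13", "10.0.2.14"]
    details = discover_hosts(hosts, excluded={"10.0.2.13"})
    for host in hosts:
        try:
            print(host, tunnel_peer_ip(details, host, "eth1"))
        except RuntimeError as err:
            print("ERROR :", err)
```

In this sketch the excluded node is still iterated over as a tunnel peer, which is the mismatch described in points 1 and 2.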

Comment 6 Etsuji Nakai 2015-12-05 14:02:56 UTC
(In reply to Etsuji Nakai from comment #5)

> I'm not sure what's the best way to fix it. At least, adding the controller
> (and existing compute nodes) in EXCLUDE_SERVERS is very common when adding
> new compute nodes in the existing cluster. So I think neutron_350.py should
> collect the peer IP addresses without using config['HOST_DETAILS'].

Alternatively, plugins/prescript_000.py could be changed so that it still collects information about hosts in EXCLUDE_SERVERS without modifying their configuration.

Comment 7 Bryce Nordgren 2015-12-07 19:27:27 UTC
The answer on ask.openstack.org seems to indicate that it's just looking in slightly the wrong place? Maybe the easy solution is to just make it look at br-eth1 instead of eth1? 

https://ask.openstack.org/en/question/79980/packstack-complains-of-not-being-able-to-detect-ipaddress-of-interface/?answer=82694#post-id-82694

Comment 8 Etsuji Nakai 2015-12-13 11:14:19 UTC
(In reply to Ivan Chavero from comment #4)
> you have the node: 10.0.2.13 added in EXCLUDE_SERVERS it seems like
> packstack still tries to do stuff on the node.

I submitted a patch upstream that allows you to use subnets for IP filtering of tunneling packets, so that you can safely add existing nodes to EXCLUDE_SERVERS.

https://review.openstack.org/#/c/257033/
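
As a rough illustration of the idea of filtering by subnet rather than by per-host address (this is not the content of the patch), the check could look like the Python sketch below. The function name allowed_by_subnets and the sample subnet are assumptions made for the example; only the standard-library ipaddress module is used.

```python
import ipaddress


def allowed_by_subnets(source_ip, subnets):
    """Return True if source_ip falls inside any of the given subnets.

    Hypothetical helper: with a subnet rule, no per-host address lookup
    is needed, so excluded nodes do not have to be probed.
    """
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(subnet) for subnet in subnets)


if __name__ == "__main__":
    tunnel_subnets = ["10.0.2.0/24"]  # sample value, not from the bug report
    for ip in ("10.0.2.13", "10.0.3.1"):
        print(ip, allowed_by_subnets(ip, tunnel_subnets))
```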

Comment 9 Alan Pevec (Fedora) 2016-01-23 11:54:23 UTC
Ivan, please backport this fix to RDO Kilo.

Comment 10 Boris Derzhavets 2016-01-23 17:27:13 UTC
We are experiencing the same problem that Bryce Nordgren originally described when the bug was opened against RDO Kilo 2015.1.1.

I attempted to rebuild openstack-packstack-2015.1-0.16.dev1637.g2bb5c1d.el7.src.rpm (the most recent available), adding a fourth patch

https://review.openstack.org/gitweb?p=openstack/packstack.git;a=patch;h=04e3572e618713828ffafb1ce24790f26499719e

as 004-Fix-exclude-servers.patch.

The build failed with these errors:

+ echo 'Patch #1 (0001-Do-not-enable-Keystone-in-httpd-by-default.patch):'
Patch #1 (0001-Do-not-enable-Keystone-in-httpd-by-default.patch):
+ /usr/bin/cat /root/rpmbuild/SOURCES/0001-Do-not-enable-Keystone-in-httpd-by-default.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file packstack/plugins/keystone_100.py
Hunk #1 succeeded at 168 (offset 16 lines).
+ echo 'Patch #2 (0002-Do-not-enable-EPEL-when-installing-RDO.patch):'
Patch #2 (0002-Do-not-enable-EPEL-when-installing-RDO.patch):
+ /usr/bin/cat /root/rpmbuild/SOURCES/0002-Do-not-enable-EPEL-when-installing-RDO.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file packstack/plugins/prescript_000.py
Hunk #1 succeeded at 1113 (offset 21 lines).
+ echo 'Patch #3 (0003-Fix-nagios-service-configuration.patch):'
Patch #3 (0003-Fix-nagios-service-configuration.patch):
+ /usr/bin/cat /root/rpmbuild/SOURCES/0003-Fix-nagios-service-configuration.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file packstack/puppet/modules/packstack/manifests/nagios_config_wrapper.pp

+ echo 'Patch #4 (004-Fix-exclude-servers.patch):'
Patch #4 (004-Fix-exclude-servers.patch):  <==== my patch placed in SOURCES 
+ /usr/bin/cat /root/rpmbuild/SOURCES/004-Fix-exclude-servers.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file docs/packstack.rst
Hunk #1 succeeded at 838 (offset -19 lines).
patching file packstack/plugins/neutron_350.py
Hunk #2 succeeded at 515 (offset -59 lines).
Hunk #3 FAILED at 670.
1 out of 3 hunks FAILED -- saving rejects to file packstack/plugins/neutron_350.py.rej
error: Bad exit status from /var/tmp/rpm-tmp.ZVNRrt (%prep)

Comment 11 Boris Derzhavets 2016-01-23 18:14:27 UTC
Created attachment 1117469
Rebuild log on RDO Liberty . Adding patch to  openstack-packstack-2015.2-0.1.dev1654.gcbbf46e.el7.src.rpm

The patch https://review.openstack.org/gitweb?p=openstack/packstack.git;a=patch;h=04e3572e618713828ffafb1ce24790f26499719e
applies cleanly to openstack-packstack-2015.2-0.1.dev1654.gcbbf46e.el7.src.rpm (i.e. on RDO Liberty).
However, it fails for openstack-packstack-2015.1-0.16.dev1637.g2bb5c1d.el7.src.rpm (i.e. on RDO Kilo).

Comment 13 Chandan Kumar 2016-05-19 15:31:40 UTC
This bug is against a version which has reached End of Life.
If it is still present in a supported release (http://releases.openstack.org), please update the Version field and reopen.

