Bug 1254389 - Can no longer run packstack to maintain cluster
Status: CLOSED EOL
Product: RDO
Classification: Community
Component: openstack-packstack
Version: trunk
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: Kilo
Assigned To: Ivan Chavero
QA Contact: Shai Revivo
Keywords: Triaged
Reported: 2015-08-17 19:58 EDT by Bryce Nordgren
Modified: 2016-05-19 11:31 EDT
Doc Type: Bug Fix
Last Closed: 2016-05-19 11:31:40 EDT
Type: Bug

Attachments
autogenerated/edited answer file (40.61 KB, text/plain)
2015-08-17 19:58 EDT, Bryce Nordgren
Rebuild log on RDO Liberty. Adding patch to openstack-packstack-2015.2-0.1.dev1654.gcbbf46e.el7.src.rpm (36.06 KB, text/plain)
2016-01-23 13:14 EST, Boris Derzhavets

Description Bryce Nordgren 2015-08-17 19:58:24 EDT
Created attachment 1064102
autogenerated/edited answer file

Description of problem:

Created an RDO Kilo cluster around the beginning of July by generating, then editing an answer file. In the interim, I did a "yum upgrade", and packstack was updated. Now when I run packstack, it consistently fails during "Adding Neutron API Manifest Entries" with 'ERROR : "Couldn't detect ipaddress of interface ten on node 10.0.2.13"'

This occurs even with an identical answer file to what was used to produce the working system a month ago. 

The main question here is: how should I change my old answer file so that it works with the new packstack? Or is the recommended approach to hold packstack at the version used to install the system, because breaking changes should be expected?



Version-Release number of selected component (if applicable):

The code which emits this error message was committed to the repo after I originally generated the answer file: 

https://github.com/stackforge/packstack/commit/d1211af056549ec803ebd33217ba2257cbd4b7bd


How reproducible:

Always, on my system.


Steps to Reproduce:
1. packstack --answer-file=openstack-answer-file.txt

Actual results:
ERROR : "Couldn't detect ipaddress of interface ten on node 10.0.2.13"

Expected results:
Successful completion.


Additional info:
Further documentation on ask.openstack.org

https://ask.openstack.org/en/question/79980/packstack-complains-of-not-being-able-to-detect-ipaddress-of-interface/
Comment 3 Bryce Nordgren 2015-08-20 19:25:50 EDT
Rather than update my answer file, I downgraded packstack and all seems to be well. 

There probably should be some communication with upstream about when breaking changes should be expected. The enterprise linux packaging, at least, should hold off on packaging those updates. Kinda violates expectations.
Comment 4 Ivan Chavero 2015-12-03 03:26:21 EST
You have the node 10.0.2.13 listed in EXCLUDE_SERVERS, but it seems packstack still tries to do stuff on that node.
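
A quick way to check which hosts are both excluded and still referenced elsewhere in an answer file is to parse it. This is only a minimal sketch, assuming the usual INI layout with a [general] section; the sample values and the helper names are made up for illustration and are not taken from the attached answer file or from packstack itself.

```python
import configparser

# Illustrative answer-file fragment (assumed values, not the attached file).
SAMPLE_ANSWERS = """\
[general]
CONFIG_CONTROLLER_HOST=10.0.2.13
CONFIG_COMPUTE_HOSTS=10.0.2.14,10.0.2.15
EXCLUDE_SERVERS=10.0.2.13
"""


def split_hosts(value):
    """Split a comma-separated host list, dropping blanks."""
    return [h.strip() for h in value.split(",") if h.strip()]


def excluded_but_referenced(answer_text):
    """Return hosts listed in EXCLUDE_SERVERS that a *_HOST(S) option still names."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names upper-case
    parser.read_string(answer_text)
    general = parser["general"]
    excluded = set(split_hosts(general.get("EXCLUDE_SERVERS", "")))
    referenced = set()
    for key, value in general.items():
        if key.endswith("_HOST") or key.endswith("_HOSTS"):
            referenced.update(split_hosts(value))
    return sorted(excluded & referenced)


print(excluded_but_referenced(SAMPLE_ANSWERS))
```

Any host this prints is one that packstack is told to leave alone but that other options still point at, which is the situation described here.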
Comment 5 Etsuji Nakai 2015-12-05 08:59:14 EST
My guess at how this happened:

1. Since the controller node 10.0.2.13 is listed in EXCLUDE_SERVERS, plugins/prescript_000.py skips collecting the host information into config['HOST_DETAILS'] in preinstall_and_discover().

2. plugins/neutron_350.py still tries to use config['HOST_DETAILS'] for this node as a peer of the VXLAN tunnel in create_manifests().

I'm not sure what's the best way to fix it. At least, adding the controller (and existing compute nodes) in EXCLUDE_SERVERS is very common when adding new compute nodes in the existing cluster. So I think neutron_350.py should collect the peer IP addresses without using config['HOST_DETAILS'].
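
To make the guess concrete, here is a toy model of the two steps. It is not the packstack code: discover_hosts and tunnel_peer_ip are hypothetical stand-ins for the discovery and manifest steps, and the "ipaddress_<iface>" key and 192.0.2.x addresses are assumptions for illustration.

```python
def discover_hosts(all_hosts, excluded):
    """Stand-in for discovery: excluded hosts get no entry at all."""
    details = {}
    for host in all_hosts:
        if host in excluded:
            continue  # skipped, so nothing is recorded for this host
        # Placeholder fact; real discovery would query the node.
        details[host] = {"ipaddress_eth1": "192.0.2." + host.rsplit(".", 1)[1]}
    return details


def tunnel_peer_ip(host_details, host, iface):
    """Stand-in for tunnel setup: look up the peer's address on iface."""
    try:
        return host_details[host]["ipaddress_" + iface]
    except KeyError:
        raise RuntimeError(
            "Couldn't detect ipaddress of interface %s on node %s" % (iface, host)
        )


hosts = ["10.0.2.13", "10.0.2.14"]
details = discover_hosts(hosts, excluded={"10.0.2.13"})
print(tunnel_peer_ip(details, "10.0.2.14", "eth1"))
try:
    tunnel_peer_ip(details, "10.0.2.13", "eth1")
except RuntimeError as exc:
    print(exc)
```

The lookup for the excluded controller fails only because discovery never recorded it, not because the interface is missing on the node.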
Comment 6 Etsuji Nakai 2015-12-05 09:02:56 EST
(In reply to Etsuji Nakai from comment #5)

> I'm not sure what's the best way to fix it. At least, adding the controller
> (and existing compute nodes) in EXCLUDE_SERVERS is very common when adding
> new compute nodes in the existing cluster. So I think neutron_350.py should
> collect the peer IP addresses without using config['HOST_DETAILS'].

You may be able to change plugins/prescript_000.py so that it still collects information about hosts in EXCLUDE_SERVERS without modifying their configuration.
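
Roughly, the idea is to separate "hosts we read facts from" from "hosts we apply changes to". A minimal sketch under that assumption; plan_run and fake_probe are hypothetical names, and the probe is a placeholder for whatever read-only fact gathering the real plugin would do.

```python
def plan_run(all_hosts, excluded, probe):
    """Gather details for every host, but only schedule changes on non-excluded ones."""
    host_details = {host: probe(host) for host in all_hosts}
    to_configure = [host for host in all_hosts if host not in excluded]
    return host_details, to_configure


def fake_probe(host):
    """Hypothetical read-only probe; a real one would query the node for facts."""
    return {"ipaddress_eth1": "192.0.2." + host.rsplit(".", 1)[1]}


details, targets = plan_run(["10.0.2.13", "10.0.2.14"], {"10.0.2.13"}, fake_probe)
print(sorted(details))  # both hosts have details
print(targets)          # only the non-excluded host is configured
```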
Comment 7 Bryce Nordgren 2015-12-07 14:27:27 EST
The answer on ask.openstack.org seems to indicate that it's just looking in slightly the wrong place? Maybe the easy solution is to just make it look at br-eth1 instead of eth1? 

https://ask.openstack.org/en/question/79980/packstack-complains-of-not-being-able-to-detect-ipaddress-of-interface/?answer=82694#post-id-82694
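
Something like the following fallback is what I have in mind. It is a sketch only: the "ipaddress_<iface>" fact naming (with "-" stored as "_") and the "br-<iface>" bridge naming are assumptions, and find_ip is a hypothetical helper, not a packstack function.

```python
def fact_name(iface):
    """Build a fact key for an interface, assuming '-' is stored as '_'."""
    return "ipaddress_" + iface.replace("-", "_")


def find_ip(facts, iface):
    """Try the interface itself, then a bridge named 'br-<iface>'."""
    for candidate in (iface, "br-" + iface):
        ip = facts.get(fact_name(candidate))
        if ip:
            return ip
    return None


facts = {"ipaddress_br_eth1": "192.0.2.13"}  # address lives on the bridge
print(find_ip(facts, "eth1"))
```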
Comment 8 Etsuji Nakai 2015-12-13 06:14:19 EST
(In reply to Ivan Chavero from comment #4)
> you have the node: 10.0.2.13 added in EXCLUDE_SERVERS it seems like
> packstack still tries to do stuff on the node.

I submitted a patch upstream that allows you to use subnets for IP filtering of tunneling packets, so that you can safely add existing nodes to EXCLUDE_SERVERS.

https://review.openstack.org/#/c/257033/
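
The reason a subnet helps is that the filter no longer needs the address of each individual peer, so no per-host details are required for excluded nodes. A minimal illustration of the matching logic only, using Python's standard ipaddress module; tunnel_source_allowed and the 10.0.2.0/24 subnet are assumptions for the example, not what the patch actually contains.

```python
import ipaddress


def tunnel_source_allowed(source_ip, allowed_subnets):
    """Return True if source_ip falls inside any of the allowed subnets."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed_subnets)


allowed = ["10.0.2.0/24"]  # assumed tunnel subnet
print(tunnel_source_allowed("10.0.2.13", allowed))  # excluded controller still matches
print(tunnel_source_allowed("10.0.3.5", allowed))   # outside the subnet
```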
Comment 9 Alan Pevec 2016-01-23 06:54:23 EST
Ivan, please backport this fix to RDO Kilo.
Comment 10 Boris Derzhavets 2016-01-23 12:27:13 EST
We are experiencing the same problem as originally described by Bryce Nordgren when the bug was opened on RDO Kilo 2015.1.1.

I attempted to rebuild openstack-packstack-2015.1-0.16.dev1637.g2bb5c1d.el7.src.rpm (the most recent available), adding a fourth patch

https://review.openstack.org/gitweb?p=openstack/packstack.git;a=patch;h=04e3572e618713828ffafb1ce24790f26499719e

as 004-Fix-exclude-servers.patch.

The build failed with these errors:

+ echo 'Patch #1 (0001-Do-not-enable-Keystone-in-httpd-by-default.patch):'
Patch #1 (0001-Do-not-enable-Keystone-in-httpd-by-default.patch):
+ /usr/bin/cat /root/rpmbuild/SOURCES/0001-Do-not-enable-Keystone-in-httpd-by-default.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file packstack/plugins/keystone_100.py
Hunk #1 succeeded at 168 (offset 16 lines).
+ echo 'Patch #2 (0002-Do-not-enable-EPEL-when-installing-RDO.patch):'
Patch #2 (0002-Do-not-enable-EPEL-when-installing-RDO.patch):
+ /usr/bin/cat /root/rpmbuild/SOURCES/0002-Do-not-enable-EPEL-when-installing-RDO.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file packstack/plugins/prescript_000.py
Hunk #1 succeeded at 1113 (offset 21 lines).
+ echo 'Patch #3 (0003-Fix-nagios-service-configuration.patch):'
Patch #3 (0003-Fix-nagios-service-configuration.patch):
+ /usr/bin/cat /root/rpmbuild/SOURCES/0003-Fix-nagios-service-configuration.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file packstack/puppet/modules/packstack/manifests/nagios_config_wrapper.pp

+ echo 'Patch #4 (004-Fix-exclude-servers.patch):'
Patch #4 (004-Fix-exclude-servers.patch):  <==== my patch placed in SOURCES 
+ /usr/bin/cat /root/rpmbuild/SOURCES/004-Fix-exclude-servers.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file docs/packstack.rst
Hunk #1 succeeded at 838 (offset -19 lines).
patching file packstack/plugins/neutron_350.py
Hunk #2 succeeded at 515 (offset -59 lines).
Hunk #3 FAILED at 670.
1 out of 3 hunks FAILED -- saving rejects to file packstack/plugins/neutron_350.py.rej
error: Bad exit status from /var/tmp/rpm-tmp.ZVNRrt (%prep)
Comment 11 Boris Derzhavets 2016-01-23 13:14 EST
Created attachment 1117469
Rebuild log on RDO Liberty. Adding patch to openstack-packstack-2015.2-0.1.dev1654.gcbbf46e.el7.src.rpm

The patch https://review.openstack.org/gitweb?p=openstack/packstack.git;a=patch;h=04e3572e618713828ffafb1ce24790f26499719e
works for openstack-packstack-2015.2-0.1.dev1654.gcbbf46e.el7.src.rpm (i.e. on RDO Liberty).
However, it fails for openstack-packstack-2015.1-0.16.dev1637.g2bb5c1d.el7.src.rpm (i.e. on RDO Kilo).
Comment 13 Chandan Kumar 2016-05-19 11:31:40 EDT
This bug is against a Version which has reached End of Life.
If it's still present in a supported release (http://releases.openstack.org), please update Version and reopen.
