Bug 1254389

Summary: Can no longer run packstack to maintain cluster
Product: [Community] RDO Reporter: Bryce Nordgren <bnordgren>
Component: openstack-packstack Assignee: Ivan Chavero <ichavero>
Status: CLOSED EOL QA Contact: Shai Revivo <srevivo>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: trunk CC: aortega, apevec, bderzhavets, derekh, enakai, ichavero, jpena, srevivo
Target Milestone: --- Keywords: Triaged
Target Release: Kilo   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-05-19 15:31:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description | Flags
autogenerated/edited answer file | none
Rebuild log on RDO Liberty. Adding patch to openstack-packstack-2015.2-0.1.dev1654.gcbbf46e.el7.src.rpm | none

Description Bryce Nordgren 2015-08-17 23:58:24 UTC
Created attachment 1064102 [details]
autogenerated/edited answer file

Description of problem:

Created an RDO Kilo cluster around the beginning of July by generating, then editing an answer file. In the interim, I did a "yum upgrade", and packstack was updated. Now when I run packstack, it consistently fails during "Adding Neutron API Manifest Entries" with 'ERROR : "Couldn't detect ipaddress of interface ten on node 10.0.2.13"'

This occurs even with an identical answer file to what was used to produce the working system a month ago. 

The main question here is: how do I need to change my old answer file such that it will work with the new packstack? Or is the recommended approach to hold packstack at the version used to install your system because breaking changes should be expected?



Version-Release number of selected component (if applicable):

The code which emits this error message was committed to the repo after I originally generated the answer file: 

https://github.com/stackforge/packstack/commit/d1211af056549ec803ebd33217ba2257cbd4b7bd


How reproducible:

Always, on my system.


Steps to Reproduce:
1. packstack --answer-file=openstack-answer-file.txt

Actual results:
ERROR : "Couldn't detect ipaddress of interface ten on node 10.0.2.13"

Expected results:
Successful completion.


Additional info:
Further documentation on ask.openstack.org

https://ask.openstack.org/en/question/79980/packstack-complains-of-not-being-able-to-detect-ipaddress-of-interface/

Comment 3 Bryce Nordgren 2015-08-20 23:25:50 UTC
Rather than update my answer file, I downgraded packstack and all seems to be well. 

There probably should be some communication with upstream about when breaking changes should be expected. The enterprise linux packaging, at least, should hold off on packaging those updates. Kinda violates expectations.

Comment 4 Ivan Chavero 2015-12-03 08:26:21 UTC
You have the node 10.0.2.13 listed in EXCLUDE_SERVERS, but it seems packstack still tries to act on that node.
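
One way to confirm which nodes an answer file excludes is sketched below. It assumes the answer file is INI-style with a [general] section and a comma-separated EXCLUDE_SERVERS value. The helper name excluded_servers and the second address in the sample are illustrative only and are not part of packstack or of the attached answer file.

```python
import configparser


def excluded_servers(answer_text, section="general"):
    """Return the set of hosts listed in EXCLUDE_SERVERS (hypothetical helper).

    Assumes an INI-style answer file with a comma-separated value.
    """
    # Disable interpolation so literal '%' characters in other values are harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(answer_text)
    raw = parser.get(section, "EXCLUDE_SERVERS", fallback="")
    return {host.strip() for host in raw.split(",") if host.strip()}


# Illustrative sample, not the attached answer file.
sample = """[general]
EXCLUDE_SERVERS=10.0.2.13,10.0.2.14
"""
print("10.0.2.13" in excluded_servers(sample))
```

If the node named in the error message is in the returned set, the situation matches the one described in this comment.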

Comment 5 Etsuji Nakai 2015-12-05 13:59:14 UTC
My guess at how this happened:

1. Since the controller node 10.0.2.13 is listed in EXCLUDE_SERVERS, plugins/prescript_000.py skips collecting that host's information into config['HOST_DETAILS'] in preinstall_and_discover().

2. plugins/neutron_350.py still tries to use config['HOST_DETAILS'] for this node as a peer of the VXLAN tunnel in create_manifests().

I'm not sure what the best way to fix it is. At the very least, adding the controller (and existing compute nodes) to EXCLUDE_SERVERS is very common when adding new compute nodes to an existing cluster. So I think neutron_350.py should collect the peer IP addresses without using config['HOST_DETAILS'].
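
The suspected interaction can be illustrated with a small standalone sketch. This is not packstack code: the function names, the "ipaddress_<iface>" key layout, the second host, and the 192.0.2.x addresses are all assumptions made for illustration. It only models the two steps guessed at above, a discovery pass that skips excluded hosts and a later lookup that expects every peer to be present.

```python
class InterfaceLookupError(Exception):
    """Raised when a host has no recorded address for an interface."""


def collect_host_details(hosts, excluded):
    """Stand-in for the discovery step: excluded hosts are skipped entirely."""
    details = {}
    for index, host in enumerate(hosts, start=1):
        if host in excluded:
            continue
        # Placeholder facts; real discovery would query the node itself.
        details[host] = {"ipaddress_ten": "192.0.2.%d" % index}
    return details


def tunnel_peer_ip(host_details, host, iface):
    """Stand-in for the tunnel setup: look up a peer's interface address."""
    try:
        return host_details[host]["ipaddress_" + iface]
    except KeyError:
        raise InterfaceLookupError(
            "Couldn't detect ipaddress of interface %s on node %s" % (iface, host)
        ) from None


hosts = ["10.0.2.13", "10.0.2.21"]  # the second host is illustrative
details = collect_host_details(hosts, excluded={"10.0.2.13"})
try:
    tunnel_peer_ip(details, "10.0.2.13", "ten")
except InterfaceLookupError as exc:
    print(exc)
```

In this model the lookup fails only for the excluded host, which is consistent with the reported error naming the controller.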

Comment 6 Etsuji Nakai 2015-12-05 14:02:56 UTC
(In reply to Etsuji Nakai from comment #5)

> I'm not sure what's the best way to fix it. At least, adding the controller
> (and existing compute nodes) in EXCLUDE_SERVERS is very common when adding
> new compute nodes in the existing cluster. So I think neutron_350.py should
> collect the peer IP addresses without using config['HOST_DETAILS'].

You may be able to change plugins/prescript_000.py so that it still collects information about hosts in EXCLUDE_SERVERS without modifying their configuration.
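
A minimal sketch of that idea, under the assumption that discovery and configuration can be separated: every host is discovered, but only non-excluded hosts are marked for configuration. The names plan_run and fake_discover, the fact key, and the second host are hypothetical and do not come from packstack.

```python
def plan_run(hosts, excluded, discover):
    """Discover every host, but mark only non-excluded hosts for configuration."""
    host_details = {host: discover(host) for host in hosts}
    to_configure = [host for host in hosts if host not in excluded]
    return host_details, to_configure


def fake_discover(host):
    """Placeholder for a read-only fact-gathering call against a node."""
    return {"ipaddress_ten": "tunnel-address-of-" + host}


details, targets = plan_run(["10.0.2.13", "10.0.2.21"], {"10.0.2.13"}, fake_discover)
print(sorted(details))
print(targets)
```

With this split, the excluded controller still has an entry to look up as a tunnel peer, while no changes are planned for it.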

Comment 7 Bryce Nordgren 2015-12-07 19:27:27 UTC
The answer on ask.openstack.org seems to indicate that it's just looking in slightly the wrong place? Maybe the easy solution is to just make it look at br-eth1 instead of eth1? 

https://ask.openstack.org/en/question/79980/packstack-complains-of-not-being-able-to-detect-ipaddress-of-interface/?answer=82694#post-id-82694

Comment 8 Etsuji Nakai 2015-12-13 11:14:19 UTC
(In reply to Ivan Chavero from comment #4)
> you have the node: 10.0.2.13 added in EXCLUDE_SERVERS it seems like
> packstack still tries to do stuff on the node.

I submitted a patch upstream that allows subnets to be used for IP filtering of tunneling packets, so that you can safely add existing nodes to EXCLUDE_SERVERS.

https://review.openstack.org/#/c/257033/
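
The general idea of filtering by subnet rather than by per-host address can be sketched with Python's standard ipaddress module. This is an illustration only: the function name and the 192.0.2.0/24 subnet are assumptions, and the actual option name and behavior are whatever the linked review defines.

```python
import ipaddress


def peer_allowed(peer_ip, tunnel_subnet):
    """Return True if peer_ip falls inside tunnel_subnet (hypothetical helper)."""
    return ipaddress.ip_address(peer_ip) in ipaddress.ip_network(tunnel_subnet)


# No per-host lookup is needed, so excluded nodes do not have to be discovered.
print(peer_allowed("192.0.2.13", "192.0.2.0/24"))
print(peer_allowed("198.51.100.7", "192.0.2.0/24"))
```

Because the check depends only on the subnet, it does not require an entry in config['HOST_DETAILS'] for each peer.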

Comment 9 Alan Pevec (Fedora) 2016-01-23 11:54:23 UTC
Ivan, please backport this fix to RDO Kilo.

Comment 10 Boris Derzhavets 2016-01-23 17:27:13 UTC
We are experiencing the same problem that Bryce Nordgren originally described when the bug was opened, on RDO Kilo 2015.1.1.

I attempted to rebuild openstack-packstack-2015.1-0.16.dev1637.g2bb5c1d.el7.src.rpm (the most recent available), adding a fourth patch

https://review.openstack.org/gitweb?p=openstack/packstack.git;a=patch;h=04e3572e618713828ffafb1ce24790f26499719e

as 004-Fix-exclude-servers.patch.

The build failed with the following errors:

+ echo 'Patch #1 (0001-Do-not-enable-Keystone-in-httpd-by-default.patch):'
Patch #1 (0001-Do-not-enable-Keystone-in-httpd-by-default.patch):
+ /usr/bin/cat /root/rpmbuild/SOURCES/0001-Do-not-enable-Keystone-in-httpd-by-default.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file packstack/plugins/keystone_100.py
Hunk #1 succeeded at 168 (offset 16 lines).
+ echo 'Patch #2 (0002-Do-not-enable-EPEL-when-installing-RDO.patch):'
Patch #2 (0002-Do-not-enable-EPEL-when-installing-RDO.patch):
+ /usr/bin/cat /root/rpmbuild/SOURCES/0002-Do-not-enable-EPEL-when-installing-RDO.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file packstack/plugins/prescript_000.py
Hunk #1 succeeded at 1113 (offset 21 lines).
+ echo 'Patch #3 (0003-Fix-nagios-service-configuration.patch):'
Patch #3 (0003-Fix-nagios-service-configuration.patch):
+ /usr/bin/cat /root/rpmbuild/SOURCES/0003-Fix-nagios-service-configuration.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file packstack/puppet/modules/packstack/manifests/nagios_config_wrapper.pp

+ echo 'Patch #4 (004-Fix-exclude-servers.patch):'
Patch #4 (004-Fix-exclude-servers.patch):  <==== my patch placed in SOURCES 
+ /usr/bin/cat /root/rpmbuild/SOURCES/004-Fix-exclude-servers.patch
+ /usr/bin/patch -p1 --fuzz=0
patching file docs/packstack.rst
Hunk #1 succeeded at 838 (offset -19 lines).
patching file packstack/plugins/neutron_350.py
Hunk #2 succeeded at 515 (offset -59 lines).
Hunk #3 FAILED at 670.
1 out of 3 hunks FAILED -- saving rejects to file packstack/plugins/neutron_350.py.rej
error: Bad exit status from /var/tmp/rpm-tmp.ZVNRrt (%prep)

Comment 11 Boris Derzhavets 2016-01-23 18:14:27 UTC
Created attachment 1117469 [details]
Rebuild log on RDO Liberty . Adding patch to  openstack-packstack-2015.2-0.1.dev1654.gcbbf46e.el7.src.rpm

The patch https://review.openstack.org/gitweb?p=openstack/packstack.git;a=patch;h=04e3572e618713828ffafb1ce24790f26499719e
works for openstack-packstack-2015.2-0.1.dev1654.gcbbf46e.el7.src.rpm (e.g. on RDO Liberty).
However, it fails for openstack-packstack-2015.1-0.16.dev1637.g2bb5c1d.el7.src.rpm (e.g. on RDO Kilo).

Comment 13 Chandan Kumar 2016-05-19 15:31:40 UTC
This bug is against a Version which has reached End of Life.
If it's still present in a supported release (http://releases.openstack.org), please update Version and reopen.