Bug 1430757 - [Docs][Upgrades] OSP10 -> OSP11 upgrade fails for deployments using external loadbalancer
Summary: [Docs][Upgrades] OSP10 -> OSP11 upgrade fails for deployments using external ...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ga
Target Release: 11.0 (Ocata)
Assignee: Dan Macpherson
QA Contact: Marius Cornea
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-03-09 14:26 UTC by Marius Cornea
Modified: 2017-05-18 08:03 UTC

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Under certain circumstances, such as when OVS is upgraded or the network service is restarted, the OVS bridges are torn down and rebuilt. This interrupts the existing network flows, so network traffic stops forwarding until the flows are rebuilt, which can take some time in a complex deployment. To avoid any possible downtime, do not place the control plane networks on an OVS bridge. The Control Plane (Provisioning), Internal API, and Storage Management networks should instead use dedicated interfaces or VLAN interfaces that are not on a bridge. For instance, one interface or bond could carry the control plane VLANs, while another interface or bond is placed on an OVS bridge for tenant network data. As long as the control plane interfaces are not on an OVS bridge, any network downtime is limited to the Tenant data plane.
Clone Of:
Environment:
Last Closed: 2017-05-18 08:03:49 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Bugzilla
ID: 1416355
Private: 0
Priority: unspecified
Status: CLOSED
Summary: OSP11: environment file used for external loadbalancer deployment uses new FixedIPs parameters
Last Updated: 2021-02-22 00:41:40 UTC

Internal Links: 1416355

Description Marius Cornea 2017-03-09 14:26:44 UTC
Description of problem:
OSP10 -> OSP11 upgrade fails for deployments using external loadbalancer. 

Deployment fails with the following error:

stdout: overcloud.ControlVirtualIP:
  resource_type: OS::Neutron::Port
  physical_resource_id: ea69f3c5-a40a-4bd7-bb7b-4ba27dee7141
  status: UPDATE_FAILED
  status_reason: |
    InvalidIpForNetworkClient: resources.ControlVirtualIP: IP address 192.0.2.251 is not a valid IP for any of the subnets on the specified network.
    Neutron server returns request_ids: ['req-de73ef1b-d2c8-4843-83a6-7470cc3c9c3d']


Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-6.0.0-0.20170303152752.0rc1.el7ost.noarch.rpm  

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP10 with an external loadbalancer.
Deploy command and environment files:
http://paste.openstack.org/show/602074/

2. Run the composable upgrade steps:

source ~/stackrc
export THT=/usr/share/openstack-tripleo-heat-templates/

openstack overcloud deploy --templates $THT \
-r ~/openstack_deployment/roles/roles_data.yaml \
-e $THT/environments/network-isolation.yaml \
-e $THT/environments/network-management.yaml \
-e $THT/environments/storage-environment.yaml \
-e $THT/environments/external-loadbalancer-vip.yaml \
-e ~/openstack_deployment/environments/nodes.yaml \
-e ~/openstack_deployment/environments/network-environment.yaml \
-e ~/openstack_deployment/environments/disk-layout.yaml \
-e ~/openstack_deployment/environments/neutron-settings.yaml \
-e ~/openstack_deployment/environments/external-lb.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps.yaml \
-e ~/repo.yaml \
--log-file overcloud_deployment.log &> overcloud_install.log

Actual results:
The upgrade fails with the UPDATE_FAILED error on overcloud.ControlVirtualIP shown above.

Expected results:
Successful upgrade.

Additional info:
It looks like the parameters inside external-loadbalancer-vip.yaml were changed, so we most probably need to adapt them during the upgrade:

https://github.com/openstack/tripleo-heat-templates/commit/a3f03eb307797ac5eef1251b9252e642db326e07

Comment 1 Sofer Athlan-Guyot 2017-03-24 13:00:06 UTC
Checking with dsneddon on the best course of action here.

Comment 2 Marius Cornea 2017-03-24 15:40:06 UTC
So I think we need to test this by adjusting the environment files during the upgrade, using the file which worked for an OSP11 fresh deployment (https://bugzilla.redhat.com/show_bug.cgi?id=1416355#c10).

One question that I have: in OSP10 we didn't have to pass the ControlPlaneIP parameter, but it looks like in OSP11 passing ControlFixedIPs is mandatory; otherwise the default is used, which might not match the existing ctlplane subnet. Can we get a clarification on what ControlFixedIPs is used for? Does it need to be configured on the external loadbalancer?
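
A minimal sketch of the mismatch described here, assuming the "not a valid IP for any of the subnets" error means the requested fixed IP lies outside every subnet of the ctlplane network. The address 192.0.2.251 comes from the error in the description; the subnet CIDR and the helper name ip_in_any_subnet are hypothetical and only serve the illustration.

```python
import ipaddress


def ip_in_any_subnet(ip, cidrs):
    """Return True if ip falls inside at least one of the given CIDRs."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


# Hypothetical ctlplane subnet of an existing deployment (not from this bug).
existing_ctlplane_subnets = ["192.168.24.0/24"]

# Address from the error message: outside the hypothetical subnet.
print(ip_in_any_subnet("192.0.2.251", existing_ctlplane_subnets))     # False
# An address inside the subnet would pass the same check.
print(ip_in_any_subnet("192.168.24.251", existing_ctlplane_subnets))  # True
```

Running a check like this against the real ctlplane CIDR before the upgrade would show whether the value passed for ControlFixedIPs can be allocated on the existing network.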

Comment 3 Sofer Athlan-Guyot 2017-03-29 10:40:17 UTC
Hi,

After discussing with Marius, we think there are two issues here:

 - one is a documentation problem: we need to make the user aware of
   the breaking changes introduced in OSP11 for ext-lb:
   - parameter name changes;
   - new required modules;

This would be solved by a documentation bug.  Marius is currently
investigating what should go into this documentation.

 - the other one is what seems to be an unnecessary requirement for
   ControlFixedIPs.  Marius is validating that as well and we will
   open a new bz based on his findings to track it.

Bottom line: we're going to solve this one as documentation.

Adding HardProv for awareness and for help getting this right.

Comment 5 Sofer Athlan-Guyot 2017-03-31 12:34:55 UTC
Hi Lucas,

Hmm ... it's because the change was made by a member of your team, so I just made a simple association here.  But all in all, you're right, it makes more sense to put the networking squad in the loop.

Thanks,

Comment 6 Assaf Muller 2017-04-03 10:38:19 UTC
(In reply to Sofer Athlan-Guyot from comment #5)
> Hi Lucas,
> 
> hum ... it's because the change has been made by a member of your team so I
> just made a simple association here.  But all in all, you're right it makes
> more sense to put the networking squad in the loop.
> 
> Thanks,

It seems like the issue is a backwards-incompatible change to the TripleO Heat Templates. Is there anything you expect from the Network team here?

Comment 7 Jaromir Coufal 2017-04-03 13:19:49 UTC
Based on today's sync (and Sofer), this is going to be a documentation effort.

Dan, can you please sync with Sofer and Marius on the update of our documentation?

Comment 9 Marius Cornea 2017-04-03 14:48:37 UTC
In addition to bug 1416355, which tracks the parameter changes, we need to document that the external loadbalancer configuration must be updated with the new services introduced in OSP11: the Nova Placement API and the Panko API.

HAProxy example configs:

listen nova_placement
  bind 10.0.0.120:8778 transparent
  bind 172.16.18.120:8778 transparent
  mode http
  http-request set-header X-Forwarded-Proto https if { ssl_fc }
  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
  server overcloud-serviceapi-0.internalapi.localdomain 10.0.0.124:8778 check fall 5 inter 2000 rise 2
  server overcloud-serviceapi-1.internalapi.localdomain 10.0.0.125:8778 check fall 5 inter 2000 rise 2
  server overcloud-serviceapi-2.internalapi.localdomain 10.0.0.126:8778 check fall 5 inter 2000 rise 2

listen panko
  bind 10.0.0.120:8779 transparent
  bind 172.16.18.120:8779 transparent
  http-request set-header X-Forwarded-Proto https if { ssl_fc }
  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
  server overcloud-serviceapi-0.internalapi.localdomain 10.0.0.124:8779 check fall 5 inter 2000 rise 2
  server overcloud-serviceapi-1.internalapi.localdomain 10.0.0.125:8779 check fall 5 inter 2000 rise 2
  server overcloud-serviceapi-2.internalapi.localdomain 10.0.0.126:8779 check fall 5 inter 2000 rise 2
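
A minimal sketch for checking that an external loadbalancer configuration already covers the two new services, assuming it is enough to look for bind lines on ports 8778 and 8779 as in the stanzas above. The helper names bound_ports and missing_services are hypothetical, and the parser only handles plain "address:port" bind lines, not port ranges or comma-separated lists.

```python
import re

# Ports taken from the example stanzas above.
REQUIRED_PORTS = {8778: "nova_placement", 8779: "panko"}


def bound_ports(haproxy_cfg):
    """Collect every port that appears on a 'bind <address>:<port>' line."""
    ports = set()
    for raw in haproxy_cfg.splitlines():
        words = raw.split()
        # Commented-out lines start with '#', so they never match 'bind'.
        if len(words) >= 2 and words[0] == "bind":
            match = re.search(r":(\d+)$", words[1])
            if match:
                ports.add(int(match.group(1)))
    return ports


def missing_services(haproxy_cfg):
    """Return the names of required services whose port is not bound."""
    present = bound_ports(haproxy_cfg)
    return [name for port, name in sorted(REQUIRED_PORTS.items())
            if port not in present]


# Illustrative config that has the placement stanza but no Panko stanza.
sample = """
listen nova_placement
  bind 10.0.0.120:8778 transparent
  mode http
"""
print(missing_services(sample))  # ['panko']
```

An empty list means both ports are bound somewhere in the file; it does not validate the backend server lines.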

Comment 13 Lucy Bopf 2017-04-06 01:42:25 UTC
Assigning to Dan to cover as part of work on upgrades.

Dan, feel free to close this one out if you have a central tracker you'd prefer to use.

Comment 18 Dan Macpherson 2017-05-18 01:43:21 UTC
Thanks, Marius.

@Sofer, did you have any feedback?

Comment 19 Sofer Athlan-Guyot 2017-05-18 07:57:29 UTC
Hi Dan,

From comment #3, the first part (doc) is good.  The second point (ControlFixedIPs) has been answered as well.  So all is good on my side.

Thanks Dan.

Comment 20 Dan Macpherson 2017-05-18 08:03:49 UTC
Thanks, Sofer!

