Bug 1768412 - [THT]Minor update is blocked for overcloud with pre-provisioned nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: All
OS: All
Priority: high
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: RHOS Maint
QA Contact: Sasha Smolyak
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-04 11:50 UTC by Alex Stupnikov
Modified: 2024-01-06 04:27 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.4.1-21.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-10 11:22:07 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 627963 | 0 | None | MERGED | Fall back service_net_map to ctlplane | 2020-08-26 12:27:34 UTC
Red Hat Issue Tracker | OSP-28288 | 0 | None | None | None | 2023-09-07 20:58:33 UTC
Red Hat Product Errata | RHBA-2020:0760 | 0 | None | None | None | 2020-03-10 11:22:45 UTC

Comment 9 Bob Fournier 2019-11-06 19:13:34 UTC
Alex - the first step should be, as you indicated, to remove the invalid template files network_data.yaml and network-environment.yaml. Do you have the templates that were used in the original deploy? Those should be reused.

The next step is that you may need to restore the heat DB, but it's not clear whether that is necessary. When switching between network-isolation and non-network-isolation and redeploying, you will often get "Physical network XXX is in use" errors, for example [1]. I don't think you will have that problem when using "openstack overcloud update prepare", but it's worth checking the neutron log files for similar errors from the previous deployment. If you find them and you did not save the heat DB, it will have to be restored using a method similar to [1]; the heat team can help with this.

Next is to determine why the "openstack overcloud update prepare" command is failing; you should look for errors in the heat and mistral logs. Sorry to farm this out, but as I mentioned earlier, the Upgrades DFG would probably be the most helpful with this update command and with minor updates in general.

It's good that this is a staging environment. Are they using this to test for a production update?


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1483246
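
The log checks suggested above could be scripted roughly as follows. This is a minimal sketch, not a supported tool: scan_logs is a hypothetical helper name, and the log paths in the usage comments are assumptions about a typical OSP 13 undercloud that may differ on your system.

```shell
#!/usr/bin/env bash
# Sketch: print the last matching lines from a set of log files.
# scan_logs is a hypothetical helper, not part of any OpenStack tooling.

scan_logs() {
    local pattern="$1"; shift
    local f
    for f in "$@"; do
        # Skip files that are missing or unreadable instead of failing.
        [ -r "$f" ] || continue
        # -H prefixes each match with the file name; keep the last 20 matches per file.
        grep -H -- "$pattern" "$f" | tail -n 20
    done
    return 0
}

# Example usage (paths are assumptions, adjust to your undercloud):
#   scan_logs 'ERROR' /var/log/heat/heat-engine.log /var/log/mistral/engine.log
#   scan_logs 'is in use' /var/log/neutron/server.log
```

The helper only narrows down where to look; the matching lines still need to be read in context alongside the failing command's own output.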

Comment 11 Bob Fournier 2019-11-07 13:00:05 UTC
Thanks Ilyas, yes I suggest that they redeploy this test environment instead of attempting to recover the DB.

Comment 12 Alex Stupnikov 2019-11-07 17:03:07 UTC
Hi Bob!

Huge thanks for your follow-ups. We will try to re-run the "openstack overcloud update prepare ..." command without network isolation and work out what the issue is there.

At the same time, I would like to ask for your help in understanding the current options for the networking setup of pre-provisioned nodes in RHOSP 13. Let me explain. We know that the old behavior was changed [1] somewhere around the Z5 minor release, and network isolation now works differently. So I am wondering whether we can still configure overcloud templates for pre-provisioned nodes in a way that allows us to:

1. Use assigned IP addresses for CTLPLANE network [2]
2. Have isolated storage network (and this network ONLY) [2]

If there is a way to configure this setup, what should the templates look like?

Kind Regards, Alex S.

[1]
https://access.redhat.com/solutions/4300061
https://bugzilla.redhat.com/show_bug.cgi?id=1643423 

[2]
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/director_installation_and_usage/index#sect-Configuring_Network_Interfaces_for_the_Control_Plane

Comment 13 Dan Sneddon 2019-11-07 19:21:41 UTC
Alex, you can assign the IPs for both the Control Plane and Storage networks. See openstack-tripleo-heat-templates/environments/ips-from-pool-all.yaml, but create a custom version where you remove all references to networks other than ctlplane and storage. Each node will have to be specified with ctlplane and storage fixed IPs in the custom ips-from-pool-all.yaml.

In order to deploy with only those two networks, you will need a custom network_data.yaml where the Storage network is defined with the proper IP subnet, and all other networks are marked with "enabled: false".
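
As a rough illustration of the shape such a file might take, the sketch below writes a trimmed network_data.yaml from a shell function. The function name write_network_data, the allocation pool range, and the minimal fields on the disabled networks are all assumptions; in practice it is probably safer to copy the network_data.yaml shipped with the templates and add "enabled: false" to every entry except Storage.

```shell
#!/usr/bin/env bash
# Sketch: generate a network_data.yaml in which only Storage is active.
# write_network_data is a hypothetical helper; the values are illustrative.

write_network_data() {
    local out="$1"
    cat > "$out" <<'EOF'
# Hypothetical sketch: Storage is the only active network.
- name: Storage
  name_lower: storage
  vip: true
  ip_subnet: '172.16.1.0/24'
  # Assumed pool range, chosen for illustration only.
  allocation_pools: [{'start': '172.16.1.100', 'end': '172.16.1.200'}]
- name: External
  name_lower: external
  enabled: false
- name: InternalApi
  name_lower: internal_api
  enabled: false
- name: StorageMgmt
  name_lower: storage_mgmt
  enabled: false
- name: Tenant
  name_lower: tenant
  enabled: false
- name: Management
  name_lower: management
  enabled: false
EOF
}

# Example usage:
#   write_network_data /home/stack/templates/network_data.yaml
```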

For instance, your ips-from-pool-all.yaml would look something like this if you had 3 controllers and 5 computes (add other roles if applicable):

resource_registry:
  OS::TripleO::Controller::Ports::StoragePort: ../network/ports/storage_from_pool.yaml
  OS::TripleO::Compute::Ports::StoragePort: ../network/ports/storage_from_pool.yaml
parameter_defaults:
  ControllerIPs:
    # Each controller will get an IP from the lists below, first controller, first IP
    ctlplane:
    - 192.168.24.249
    - 192.168.24.250
    - 192.168.24.251
    storage:
    - 172.16.1.249
    - 172.16.1.250
    - 172.16.1.251
  ComputeIPs:
    # Each compute will get an IP from the lists below, first compute, first IP
    ctlplane:
    - 192.168.24.10
    - 192.168.24.11
    - 192.168.24.12
    - 192.168.24.13
    - 192.168.24.14
    storage:
    - 172.16.1.10
    - 172.16.1.11
    - 172.16.1.12
    - 172.16.1.13
    - 172.16.1.14
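
Because every node takes its address by position in these lists, an address typed twice would end up assigned to two nodes. A quick sanity check before deploying might look like the sketch below; find_duplicate_ips is a hypothetical helper, and it matches anything that looks like an IPv4 address, including addresses that appear in comments.

```shell
#!/usr/bin/env bash
# Sketch: list IPv4-looking strings that occur more than once in a file.
# find_duplicate_ips is a hypothetical helper, not part of any OpenStack tooling.

find_duplicate_ips() {
    # -o prints only the matched text, one match per line;
    # uniq -d then prints each value that was repeated.
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" | sort | uniq -d
}

# Example usage (path is an assumption):
#   find_duplicate_ips /home/stack/templates/ips-from-pool-all.yaml
```

Empty output means no address was repeated; any printed address should be fixed before running the update.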

Your custom network-isolation.yaml will look like this (you don't need port definitions for the roles, since that is covered in ips-from-pool-all.yaml):

resource_registry:
  # networks as defined in network_data.yaml
  OS::TripleO::Network::Storage: ../network/storage.yaml
  # Port assignments for the VIPs
  OS::TripleO::Network::Ports::StorageVipPort: ../network/ports/storage.yaml

You will need a custom deployed-server-roles-data.yaml file that will include references to only the Storage network in all applicable roles:

- name: ControllerDeployedServer
  CountDefault: 1
  disable_constraints: True
  tags:
    - primary
    - controller
  networks:
    - Storage
[...]

- name: ComputeDeployedServer
  CountDefault: 1
  HostnameFormatDefault: '%stackname%-novacompute-%index%'
  disable_constraints: True
  disable_upgrade_deployment: True
  networks:
    - Storage



When you deploy, you need to point to the custom network_data.yaml, deployed-server-roles-data.yaml, and network-isolation.yaml:
  openstack overcloud update prepare -v \
    --log-file "$LOGFILE" \
    --disable-validations \
    --templates /usr/share/openstack-tripleo-heat-templates \
    --stack overcloud \
    -r /home/stack/templates/deployed-server-roles-data.yaml \
    -n /home/stack/templates/network_data.yaml \
    -e /home/stack/templates/network-isolation.yaml \
[...]
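
Since the command depends on several custom files, it may help to confirm they are all readable before starting. The following is only a sketch: check_templates is a hypothetical helper, and the file list in the usage comment simply mirrors the paths in the command above.

```shell
#!/usr/bin/env bash
# Sketch: verify that every template file passed as an argument is readable.
# check_templates is a hypothetical helper, not part of the openstack CLI.

check_templates() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            missing=1
        fi
    done
    # Non-zero status if at least one file could not be read.
    return "$missing"
}

# Example usage:
#   check_templates \
#     /home/stack/templates/deployed-server-roles-data.yaml \
#     /home/stack/templates/network_data.yaml \
#     /home/stack/templates/network-isolation.yaml \
#     || echo "fix the template paths before running update prepare"
```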

Comment 14 Bob Fournier 2019-11-13 13:14:04 UTC
Alex - closing this for now as Dan has provided the recommended network configuration. Please reopen if you need more info.

Comment 30 Jad Haj Yahya 2020-03-05 13:28:07 UTC
Verified according to steps described above

Comment 33 errata-xmlrpc 2020-03-10 11:22:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0760

Comment 34 Red Hat Bugzilla 2024-01-06 04:27:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

