Bug 1945280 - Step to export ceph configuration in a spine/leaf fails if the deployment uses a collapsed network topology
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z7
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: John Fulton
QA Contact: Alfredo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-03-31 15:18 UTC by Darin Sorrentino
Modified: 2021-12-09 20:18 UTC (History)

Fixed In Version: python-tripleoclient-12.3.2-1.20210505144302.ae58329
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-09 20:18:25 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
TGZ of /var/lib/mistral on undercloud as requested by John Fulton (11.72 MB, application/gzip)
2021-04-06 13:32 UTC, Darin Sorrentino
central inventory without storage_ip (9.23 KB, text/plain)
2021-04-06 14:15 UTC, John Fulton


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1922788 0 None None None 2021-04-06 19:18:32 UTC
OpenStack gerrit 785048 0 None NEW Use ctlplane_ip if storage_ip is unavailable when exporting Ceph 2021-04-06 21:03:22 UTC
OpenStack gerrit 786419 0 None NEW Use ceph_mon_network to find Ceph monitor IPs for export 2021-04-15 15:13:22 UTC
Red Hat Issue Tracker OSP-1678 0 None None None 2021-11-18 11:31:32 UTC
Red Hat Product Errata RHBA-2021:3762 0 None None None 2021-12-09 20:18:54 UTC

Description Darin Sorrentino 2021-03-31 15:18:09 UTC
Description of problem:

Exporting ceph configuration data for a DCN deployment fails when using a collapsed network topology.

Isolated networks are not required when deploying a spine/leaf topology; it is possible to deploy with all networks collapsed into the provisioning network.  With this deployment topology, the command to export the Ceph data from the central/leaf0 location fails:

Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 32, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_export_ceph.py", line 105, in take_action
    config_download_dir))
  File "/usr/lib/python3.6/site-packages/tripleoclient/export.py", line 171, in export_ceph
    mon_ips = export_storage_ips(stack, config_download_dir)
  File "/usr/lib/python3.6/site-packages/tripleoclient/export.py", line 158, in export_storage_ips
    ip = inventory_data[mon_role]['hosts'][hostname]['storage_ip']
KeyError: 'storage_ip'
'storage_ip'
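
The traceback shows that export_storage_ips indexes 'storage_ip' unconditionally, so a host entry without that key raises KeyError. A minimal sketch of a more defensive lookup follows. It assumes the host entry also carries a 'ctlplane_ip' key, as the title of the linked gerrit change 785048 suggests. The helper name pick_mon_ip, the host names, and the addresses are hypothetical, and this is not the actual tripleoclient code.

```python
def pick_mon_ip(host_vars):
    """Return storage_ip when present, otherwise fall back to ctlplane_ip.

    Hypothetical helper: host_vars is one host's dict, i.e. the value of
    inventory_data[mon_role]['hosts'][hostname] in the traceback above.
    """
    for key in ("storage_ip", "ctlplane_ip"):
        ip = host_vars.get(key)
        if ip:
            return ip
    raise KeyError("neither storage_ip nor ctlplane_ip found for host")


# Made-up fragment shaped like inventory_data[mon_role]['hosts'].
hosts = {
    # Collapsed topology: no storage_ip, only the assumed ctlplane_ip key.
    "central-controller-0": {"ctlplane_ip": "192.168.24.8"},
    # Isolated networks: storage_ip is present and takes precedence.
    "central-controller-1": {
        "storage_ip": "172.16.1.5",
        "ctlplane_ip": "192.168.24.23",
    },
}

mon_ips = [pick_mon_ip(host_vars) for host_vars in hosts.values()]
print(",".join(mon_ips))
```

Trying the keys in order keeps the existing behaviour whenever storage_ip exists and only changes the outcome for hosts that lack it.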


Version-Release number of selected component (if applicable):
16.1

How reproducible:
Every time.

Steps to Reproduce:
1. Deploy DCN/Spine Leaf with ceph in central & edge without isolated networks.
2. Execute command:

sudo -E openstack overcloud export ceph \
--stack central \
--config-download-dir /var/lib/mistral \
--output-file ~/dcn-common/central_ceph_external.yaml

This is from section 5.3, step 2 here:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/distributed_compute_node_and_storage_deployment/assembly_deploying-storage-at-the-edge#deploying_edge_sites_with_storage

3.

Actual results:
Traceback shown above.

Expected results:
Creation of the ~/dcn-common/central_ceph_external.yaml file, which contains the Ceph credential information.  Example:

parameter_defaults:
  CephExternalMultiConfig:
  - ceph_conf_overrides:
      client:
        keyring: /etc/ceph/central.client.openstack.keyring
    cluster: central
    dashboard_enabled: false
    external_cluster_mon_ips: 10.20.0.10,10.20.0.11,10.20.0.12
    fsid: 12345678-1234-1234-1234-1234567890ab
    keys:
    - caps:
        mgr: allow *
        mon: profile rbd
        osd: profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images
      key: ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ab==
      mode: '0600'
      name: client.openstack


Additional info:

Comment 1 John Fulton 2021-04-05 15:28:57 UTC
(In reply to Darin Sorrentino from comment #0)

Would you please send me a copy of /var/lib/mistral from your undercloud so that I can be sure the export script can deal with this scenario? 

I am unable to reproduce this in my environment. I deployed without network isolation, so my Ceph services are listening on the provisioning network (as you describe above), but storage_ip is still set in that case.

$ grep storage_ip inventory.yml 
      storage_ip: 192.168.24.8
      storage_ip: 192.168.24.23
      storage_ip: 192.168.24.12
      storage_ip: 192.168.24.11
$ 

My experience is that the inventory gets built with a storage_ip entry either way, even if it's on the provisioning network (defaulting to 192.168.24.0/24).
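
The grep above only shows the entries that exist. To find the hosts that lack storage_ip, something like the following sketch could be run against an already loaded inventory. It assumes the role -> 'hosts' -> hostname layout seen in the traceback; the function name hosts_missing_key, the role name, and the host names are hypothetical. In practice the dict would come from parsing inventory.yml with a YAML loader.

```python
def hosts_missing_key(inventory_data, key="storage_ip"):
    """Return sorted (role, hostname) pairs whose host vars lack `key`."""
    missing = []
    for role, role_data in inventory_data.items():
        if not isinstance(role_data, dict):
            continue
        hosts = role_data.get("hosts")
        if not isinstance(hosts, dict):
            # Skip groups that have no per-host variable mapping.
            continue
        for hostname, host_vars in hosts.items():
            if key not in (host_vars or {}):
                missing.append((role, hostname))
    return sorted(missing)


# Made-up inventory: one host has storage_ip, the other does not.
inventory_data = {
    "Controller": {
        "hosts": {
            "ctrl-0": {"storage_ip": "192.168.24.8"},
            "ctrl-1": {"ctlplane_ip": "192.168.24.23"},
        }
    },
}

for role, hostname in hosts_missing_key(inventory_data):
    print(f"{role}/{hostname} has no storage_ip")
```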

Comment 2 Darin Sorrentino 2021-04-06 13:32:26 UTC
Created attachment 1769573 [details]
TGZ of /var/lib/mistral on undercloud as requested by John Fulton

Comment 4 John Fulton 2021-04-06 14:15:39 UTC
Created attachment 1769581 [details]
central inventory without storage_ip

Comment 28 errata-xmlrpc 2021-12-09 20:18:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.7 (Train) bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3762

