Description of problem:

Exporting Ceph configuration data for a DCN deployment fails when using a collapsed network topology.

A spine/leaf deployment does not require isolated networks; all networks can be collapsed into the provisioning network. With this deployment topology, the command to export the Ceph data from the central/leaf0 location fails:

Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 32, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_export_ceph.py", line 105, in take_action
    config_download_dir))
  File "/usr/lib/python3.6/site-packages/tripleoclient/export.py", line 171, in export_ceph
    mon_ips = export_storage_ips(stack, config_download_dir)
  File "/usr/lib/python3.6/site-packages/tripleoclient/export.py", line 158, in export_storage_ips
    ip = inventory_data[mon_role]['hosts'][hostname]['storage_ip']
KeyError: 'storage_ip'
'storage_ip'

Version-Release number of selected component (if applicable):
16.1

How reproducible:
Every time.

Steps to Reproduce:
1. Deploy DCN/Spine Leaf with Ceph in central & edge without isolated networks.
2. Execute command:

   sudo -E openstack overcloud export ceph \
     --stack central \
     --config-download-dir /var/lib/mistral \
     --output-file ~/dcn-common/central_ceph_external.yaml

   This is from section 5.3, step 2 here:

   https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/distributed_compute_node_and_storage_deployment/assembly_deploying-storage-at-the-edge#deploying_edge_sites_with_storage

Actual results:
Traceback shown above.

Expected results:
Creation of the ~/dcn-common/central_ceph_external.yaml file, which contains the Ceph credential information. Example:

parameter_defaults:
  CephExternalMultiConfig:
  - ceph_conf_overrides:
      client:
        keyring: /etc/ceph/central.client.openstack.keyring
    cluster: central
    dashboard_enabled: false
    external_cluster_mon_ips: 10.20.0.10,10.20.0.11,10.20.0.12
    fsid: 12345678-1234-1234-1234-1234567890ab
    keys:
    - caps:
        mgr: allow *
        mon: profile rbd
        osd: profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images
      key: ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ab==
      mode: '0600'
      name: client.openstack

Additional info:
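
The last frame of the traceback indexes 'storage_ip' directly, so any host entry without that key raises KeyError. Below is a minimal sketch of a more tolerant lookup. It is an illustration only, not the code in tripleoclient and not a confirmed fix: the function name collect_mon_ips, the fallback to a 'ctlplane_ip' host variable, and the sample hostnames are all assumptions. Only the role -> 'hosts' -> hostname -> 'storage_ip' layout comes from the traceback.

```python
def collect_mon_ips(inventory_data, mon_role, fallback_key="ctlplane_ip"):
    """Return one IP per host in mon_role, preferring storage_ip.

    fallback_key is an assumed host variable used only when
    storage_ip is absent from a host entry.
    """
    ips = []
    hosts = inventory_data[mon_role]["hosts"]
    for hostname, hostvars in hosts.items():
        # .get() avoids the KeyError seen in the traceback.
        ip = hostvars.get("storage_ip") or hostvars.get(fallback_key)
        if ip is None:
            raise KeyError(
                f"{hostname}: neither storage_ip nor {fallback_key} is set"
            )
        ips.append(ip)
    return ips


# Hypothetical inventory fragment: one host with storage_ip, one without.
inventory = {
    "Controller": {
        "hosts": {
            "central-controller-0": {"storage_ip": "10.20.0.10"},
            "central-controller-1": {"ctlplane_ip": "192.168.24.23"},
        }
    }
}

mon_ips = collect_mon_ips(inventory, "Controller")
print(",".join(mon_ips))
```

Whether falling back to another address is the right behaviour depends on which network the Ceph monitors actually listen on, so the explicit error for a host with neither key is kept.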
(In reply to Darin Sorrentino from comment #0)

Would you please send me a copy of /var/lib/mistral from your undercloud so that I can be sure the export script can deal with this scenario?

I am unable to reproduce this in my environment. I deployed without network isolation, so my Ceph services are listening on the provisioning network (as you describe above), but storage_ip is still set in that case.

$ grep storage_ip inventory.yml
storage_ip: 192.168.24.8
storage_ip: 192.168.24.23
storage_ip: 192.168.24.12
storage_ip: 192.168.24.11
$

In my experience, the inventory is built with a storage_ip entry either way, even when it is on the provisioning network (defaulting to 192.168.24.0/24).
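
The grep above shows that storage_ip appears somewhere in the inventory, but not which hosts lack it. A rough per-host check could look like the sketch below. It assumes the inventory has already been loaded into a dict with the role -> 'hosts' -> hostname layout implied by the traceback; the function name hosts_missing_key and the sample data are hypothetical.

```python
def hosts_missing_key(inventory_data, key="storage_ip"):
    """Map each role to the hosts whose variables lack the given key."""
    missing = {}
    for role, role_data in inventory_data.items():
        if not isinstance(role_data, dict):
            continue
        hosts = role_data.get("hosts")
        # Skip groups that have no per-host variable mapping.
        if not isinstance(hosts, dict):
            continue
        for hostname, hostvars in hosts.items():
            if key not in (hostvars or {}):
                missing.setdefault(role, []).append(hostname)
    return missing


# Hypothetical inventory: the compute host has no storage_ip.
sample_inventory = {
    "Controller": {"hosts": {"ctrl-0": {"storage_ip": "192.168.24.8"}}},
    "Compute": {"hosts": {"comp-0": {"ctlplane_ip": "192.168.24.12"}}},
}

print(hosts_missing_key(sample_inventory))
```

An empty result would match the behaviour described above; any role listed in the result is a candidate for the KeyError.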
Created attachment 1769573
TGZ of /var/lib/mistral on undercloud as requested by John Fulton

Created attachment 1769581
central inventory without storage_ip
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.7 (Train) bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3762