Bug 2193223

Summary: OSP Director Deployed Ceph with Dashboard on external network; Grafana causes 500 Internal Server Error messages
Product: Red Hat OpenStack Reporter: Brenda McLaren <bmclaren>
Component: tripleo-ansible Assignee: Francesco Pantano <fpantano>
Status: ON_QA --- QA Contact: Alfredo <alfrgarc>
Severity: medium Docs Contact:
Priority: high    
Version: 16.2 (Train) CC: alfrgarc, jdurgin, lhh, mhicks, mkatari
Target Milestone: z6 Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4)   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: tripleo-ansible-0.8.1-2.20230613005013.cc5950e.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Screenshot of dashboard with error. none

Description Brenda McLaren 2023-05-04 18:34:23 UTC
Created attachment 1962320
Screenshot of dashboard with error.

Description of problem:
When the Ceph Dashboard is deployed on the external network in an OSP Director deployed Ceph environment, following the workaround in BZ2082361, the dashboard returns a 500 Internal Server Error on any screen that has the Overall Performance link available.

Version-Release number of selected component (if applicable):
ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)


How reproducible:
Deploy an OSP 16.2 environment with Director Deployed Ceph and set the following parameters in the ceph-dashboard-network-override.yaml file:

parameter_defaults:
  # Have ceph dashboard listen on same IP as Horizon dashboard
  ServiceNetMap:
    CephDashboardNetwork: external


Actual results:
The Ceph dashboard gets deployed on the external network but Grafana gets deployed on the ctlplane network.  All screens that pull data from Grafana throw a 500 Internal Server Error message and Overall Performance graphs are not rendered.

Expected results:
No error received and the graphs are rendered in the Ceph dashboard.

Additional info:
The data served by the Ceph dashboard itself (e.g. Pools, Pool Lists tab) is displayed correctly.  Only the data coming from Grafana (e.g. Pools, Overall Performance) fails to display, and it appears to be causing the error.  When accessing a screen that doesn't pull from Grafana (e.g. Block --> Mirroring), no error is received.

From one of the controller nodes:
# sudo grep dashboard /etc/puppet/hieradata/vip_data.json
    "ceph_dashboard_vip": "10.1.0.95",


From inside one of the mon containers:
# ceph config dump | grep -i grafana_api_url
  mgr                            advanced mgr/dashboard/GRAFANA_API_URL                  http://10.10.0.119:3100                  *
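
The two outputs above can be compared mechanically. The following is a minimal sketch, not part of the original report, that extracts the Grafana host from the `ceph config dump` line and checks whether it sits in the same network as the dashboard VIP. The /24 prefix length is an assumption (the report does not state the subnet masks), and the helper names `grafana_host` and `same_network` are hypothetical.

```python
import ipaddress
from urllib.parse import urlparse

# Values copied from the command output in this report.
DASHBOARD_VIP = "10.1.0.95"
CONFIG_DUMP_LINE = (
    "  mgr  advanced mgr/dashboard/GRAFANA_API_URL  http://10.10.0.119:3100  *"
)

# Assumption: both networks use a /24 prefix. The report does not say so.
ASSUMED_PREFIX = 24


def grafana_host(config_dump_line: str) -> str:
    """Return the host of the first URL found in a `ceph config dump` line."""
    for token in config_dump_line.split():
        if token.startswith(("http://", "https://")):
            return urlparse(token).hostname
    raise ValueError("no URL found in line")


def same_network(ip_a: str, ip_b: str, prefix: int = ASSUMED_PREFIX) -> bool:
    """Return True if ip_b falls inside the network that contains ip_a."""
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network


host = grafana_host(CONFIG_DUMP_LINE)
print(f"dashboard VIP: {DASHBOARD_VIP}, Grafana host: {host}")
print("same network:", same_network(DASHBOARD_VIP, host))
```

Under the /24 assumption the check reports that the two addresses are on different networks, which matches the observation that the dashboard is on the external network while Grafana is on ctlplane.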