Bug 2303338 - Ceph Cluster 8.0 connection fails in the vSphere plugin when adding a storage system via Add Storage System without mTLS configured
Summary: Ceph Cluster 8.0 connection fails in the vSphere plugin when adding a storage system via Add Storage System without mTLS configured
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: vSphere plugin
Version: 8.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 8.0
Assignee: Ernesto Puerta
QA Contact: Krishna Ramaswamy
Docs Contact: ceph-docs@redhat.com
URL:
Whiteboard:
Depends On: 2303116 2306778
Blocks:
 
Reported: 2024-08-07 05:14 UTC by Krishna Ramaswamy
Modified: 2024-09-10 10:44 UTC
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2306778
Environment:
Last Closed:
Embargoed:
hberrisf: needinfo+


Attachments: None


Links
Red Hat Issue Tracker RHCEPH-9493 (last updated 2024-08-21 16:15:12 UTC)

Description Krishna Ramaswamy 2024-08-07 05:14:57 UTC
Description of problem:

Ceph Cluster 8.0 connection fails in the vSphere plugin when adding a storage system via Add Storage System without mTLS configured.

Version-Release number of selected component (if applicable):


 cp.stg.icr.io/cp/ibm-ceph/ceph-8-rhel9:8-13
 cp.stg.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:1.2.17-8
 cp.stg.icr.io/cp/ibm-ceph/nvmeof-rhel9:1.2.17-6


Prerequisites configured:

[root@cephqe-node1 ~]# ceph auth ls | grep nvmeof
client.nvmeof.rbd.cephqe-node2.ycbwfr
client.nvmeof.rbd.cephqe-node3.unadxm
client.nvmeof.rbd.cephqe-node5.yagutr
client.nvmeof.rbd.cephqe-node7.jiunhe
[root@cephqe-node1 ~]# ceph osd get-require-min-compat-client
mimic
[root@cephqe-node1 ~]# ceph dashboard nvmeof-gateway-list
{"gateways": {"cephqe-node2": {"service_url": "10.70.39.49:5500"}, "cephqe-node3": {"service_url": "10.70.39.50:5500"}, "cephqe-node5": {"service_url": "10.70.39.52:5500"}, "cephqe-node7": {"service_url": "10.70.39.54:5500"}}}
[root@cephqe-node1 ~]# 
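
As a quick sanity check (not part of the original report), the gateway gRPC endpoints registered above can be probed to confirm that the failure shown below is a certificate-lookup problem rather than a connectivity problem. A minimal bash sketch, assuming the service URLs from the nvmeof-gateway-list output above and that nc (nmap-ncat) is available on the admin node:

# Probe each NVMe-oF gateway gRPC endpoint taken from the
# `ceph dashboard nvmeof-gateway-list` output above.
for ep in 10.70.39.49:5500 10.70.39.50:5500 10.70.39.52:5500 10.70.39.54:5500; do
    host=${ep%%:*}; port=${ep##*:}
    if nc -z -w 3 "$host" "$port"; then
        echo "reachable:   $ep"
    else
        echo "unreachable: $ep"
    fi
done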


Plugin Error Log:

2024-08-07 04:54:52,469 - endpoints.py[line:177] - vsphere-plugin.endpoints - INFO : GET /api/cephclusters/9
2024-08-07 04:54:52,471 - ceph_manager.py[line:508] - vsphere-plugin.ceph_manager - INFO : Sending command: https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/summary
2024-08-07 04:54:52,490 - ceph_manager.py[line:515] - vsphere-plugin.ceph_manager - INFO : Storage system bf73c41c-541a-11ef-a88e-4c5262033c3d response for command https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/summary
2024-08-07 04:54:52,490 - ceph_manager.py[line:508] - vsphere-plugin.ceph_manager - INFO : Sending command: https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/health/get_cluster_capacity
2024-08-07 04:54:52,501 - ceph_manager.py[line:515] - vsphere-plugin.ceph_manager - INFO : Storage system bf73c41c-541a-11ef-a88e-4c5262033c3d response for command https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/health/get_cluster_capacity
2024-08-07 04:54:52,502 - ceph_manager.py[line:508] - vsphere-plugin.ceph_manager - INFO : Sending command: https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/pool/rbd?stats=true
2024-08-07 04:54:52,518 - ceph_manager.py[line:515] - vsphere-plugin.ceph_manager - INFO : Storage system bf73c41c-541a-11ef-a88e-4c5262033c3d response for command https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/pool/rbd?stats=true
2024-08-07 04:54:52,519 - ceph_manager.py[line:508] - vsphere-plugin.ceph_manager - INFO : Sending command: https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/nvmeof/gateway
2024-08-07 04:54:52,539 - ceph_manager.py[line:515] - vsphere-plugin.ceph_manager - INFO : Storage system bf73c41c-541a-11ef-a88e-4c5262033c3d response for command https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/nvmeof/gateway
2024-08-07 04:54:52,540 - ceph_manager.py[line:531] - vsphere-plugin.ceph_manager - ERROR : Caught HTTPStatusError with status_code 400 and detail {"detail": "Failed to get nvmeof_server_cert for cephqe-node2: No secret found for entity nvmeof_server_cert with service name cephqe-node2", "component": null}
2024-08-07 04:54:52,540 - ceph_exception_manager.py[line:57] - vsphere-plugin.ceph_exception_manager - ERROR : Status code: 400, detail: {"detail": "Failed to get nvmeof_server_cert for cephqe-node2: No secret found for entity nvmeof_server_cert with service name cephqe-node2", "component": null}
Traceback (most recent call last):
  File "/app/ceph_manager.py", line 520, in _make_get_request
    response.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/httpx/_models.py", line 758, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '400 Bad Request' for url 'https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/nvmeof/gateway'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/ceph_exception_manager.py", line 53, in wrapper
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/endpoints.py", line 58, in make_basic_request
    return await fs.make_basic_request(command)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ceph_manager.py", line 367, in make_basic_request
    response = await self._make_get_request(request, headers)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ceph_manager.py", line 534, in _make_get_request
    raise ceph_exception.ConnectionErrorException(status_code, detail) from err
ceph_exception_manager.ConnectionErrorException: (400, '{"detail": "Failed to get nvmeof_server_cert for cephqe-node2: No secret found for entity nvmeof_server_cert with service name cephqe-node2", "component": null}')
^C
root@ibm-storage-ceph-plugin-for-vsphere-1 [ /opt/persistent ]#
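
To confirm that the 400 above is returned by the Ceph Dashboard REST API itself rather than introduced by the plugin, the same endpoint can be queried directly with curl. A minimal sketch using the standard Dashboard token login (POST /api/auth); the admin credentials and the use of jq are illustrative assumptions, not taken from this report:

# 1) Log in to the Dashboard REST API and extract a bearer token
#    (username/password are placeholders).
TOKEN=$(curl -sk -X POST "https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/auth" \
  -H 'Accept: application/vnd.ceph.api.v1.0+json' \
  -H 'Content-Type: application/json' \
  -d '{"username": "admin", "password": "<dashboard-password>"}' | jq -r '.token')

# 2) Call the endpoint the plugin fails on; without mTLS configured this is
#    expected to return the same 400 body:
#    "Failed to get nvmeof_server_cert for cephqe-node2: No secret found ..."
curl -sk -i "https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/nvmeof/gateway" \
  -H 'Accept: application/vnd.ceph.api.v1.0+json' \
  -H "Authorization: Bearer $TOKEN"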

Comment 1 Hannah Berrisford 2024-09-05 14:22:32 UTC
Need some more info - do we need to do any work alongside the Dashboard work?
Will the Dashboard team be working on this fix?

