Bug 2306778

Summary: Ceph Cluster 8.0 connection fails in the vSphere plugin during the Add Storage System workflow when mTLS is not configured
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Ernesto Puerta <epuertat>
Component: Ceph-Dashboard
Assignee: Nizamudeen <nia>
Status: CLOSED DUPLICATE
QA Contact: Vinayak Papnoi <vpapnoi>
Severity: urgent
Docs Contact: Anjana Suparna Sriram <asriram>
Priority: unspecified
Version: 8.0
CC: afrahman, ceph-docs, ceph-eng-bugs, cephqe-warriors, kramaswa
Target Milestone: ---
Target Release: 8.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2303338
Environment:
Last Closed: 2024-09-10 08:55:41 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2303338    

Description Ernesto Puerta 2024-08-21 16:13:23 UTC
+++ This bug was initially created as a clone of Bug #2303338 +++

Description of problem:

Ceph Cluster 8.0 connection fails in the vSphere plugin during the Add Storage System workflow when mTLS is not configured.

Version-Release number of selected component (if applicable):


 cp.stg.icr.io/cp/ibm-ceph/ceph-8-rhel9:8-13
 cp.stg.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:1.2.17-8
 cp.stg.icr.io/cp/ibm-ceph/nvmeof-rhel9:1.2.17-6


Pre-Req Configured:

[root@cephqe-node1 ~]# ceph auth ls | grep nvmeof
client.nvmeof.rbd.cephqe-node2.ycbwfr
client.nvmeof.rbd.cephqe-node3.unadxm
client.nvmeof.rbd.cephqe-node5.yagutr
client.nvmeof.rbd.cephqe-node7.jiunhe
[root@cephqe-node1 ~]# ceph osd get-require-min-compat-client
mimic
[root@cephqe-node1 ~]# ceph dashboard nvmeof-gateway-list
{"gateways": {"cephqe-node2": {"service_url": "10.70.39.49:5500"}, "cephqe-node3": {"service_url": "10.70.39.50:5500"}, "cephqe-node5": {"service_url": "10.70.39.52:5500"}, "cephqe-node7": {"service_url": "10.70.39.54:5500"}}}
[root@cephqe-node1 ~]# 
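
The failing dashboard call can be reproduced outside the plugin with a short httpx script. This is only a sketch: the endpoint URLs are taken from the plugin log below, while the /api/auth login step, the DASH_USER/DASH_PASS credentials, and the disabled TLS verification (self-signed dashboard certificate assumed) are assumptions for illustration.

import httpx

BASE = "https://cephqe-node1.lab.eng.blr.redhat.com:8443"
HEADERS = {"Accept": "application/vnd.ceph.api.v1.0+json"}

with httpx.Client(verify=False) as client:
    # Obtain a dashboard API token first (credentials are placeholders).
    token = client.post(
        f"{BASE}/api/auth",
        json={"username": "DASH_USER", "password": "DASH_PASS"},
        headers=HEADERS,
    ).json()["token"]
    auth = {**HEADERS, "Authorization": f"Bearer {token}"}

    # /api/summary succeeds in the plugin log; /api/nvmeof/gateway returns 400
    # ("No secret found for entity nvmeof_server_cert ...") when mTLS is not configured.
    for path in ("/api/summary", "/api/nvmeof/gateway"):
        resp = client.get(f"{BASE}{path}", headers=auth)
        print(path, resp.status_code)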


Plugin Error Log:

2024-08-07 04:54:52,469 - endpoints.py[line:177] - vsphere-plugin.endpoints - INFO : GET /api/cephclusters/9
2024-08-07 04:54:52,471 - ceph_manager.py[line:508] - vsphere-plugin.ceph_manager - INFO : Sending command: https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/summary
2024-08-07 04:54:52,490 - ceph_manager.py[line:515] - vsphere-plugin.ceph_manager - INFO : Storage system bf73c41c-541a-11ef-a88e-4c5262033c3d response for command https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/summary
2024-08-07 04:54:52,490 - ceph_manager.py[line:508] - vsphere-plugin.ceph_manager - INFO : Sending command: https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/health/get_cluster_capacity
2024-08-07 04:54:52,501 - ceph_manager.py[line:515] - vsphere-plugin.ceph_manager - INFO : Storage system bf73c41c-541a-11ef-a88e-4c5262033c3d response for command https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/health/get_cluster_capacity
2024-08-07 04:54:52,502 - ceph_manager.py[line:508] - vsphere-plugin.ceph_manager - INFO : Sending command: https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/pool/rbd?stats=true
2024-08-07 04:54:52,518 - ceph_manager.py[line:515] - vsphere-plugin.ceph_manager - INFO : Storage system bf73c41c-541a-11ef-a88e-4c5262033c3d response for command https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/pool/rbd?stats=true
2024-08-07 04:54:52,519 - ceph_manager.py[line:508] - vsphere-plugin.ceph_manager - INFO : Sending command: https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/nvmeof/gateway
2024-08-07 04:54:52,539 - ceph_manager.py[line:515] - vsphere-plugin.ceph_manager - INFO : Storage system bf73c41c-541a-11ef-a88e-4c5262033c3d response for command https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/nvmeof/gateway
2024-08-07 04:54:52,540 - ceph_manager.py[line:531] - vsphere-plugin.ceph_manager - ERROR : Caught HTTPStatusError with status_code 400 and detail {"detail": "Failed to get nvmeof_server_cert for cephqe-node2: No secret found for entity nvmeof_server_cert with service name cephqe-node2", "component": null}
2024-08-07 04:54:52,540 - ceph_exception_manager.py[line:57] - vsphere-plugin.ceph_exception_manager - ERROR : Status code: 400, detail: {"detail": "Failed to get nvmeof_server_cert for cephqe-node2: No secret found for entity nvmeof_server_cert with service name cephqe-node2", "component": null}
Traceback (most recent call last):
  File "/app/ceph_manager.py", line 520, in _make_get_request
    response.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/httpx/_models.py", line 758, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '400 Bad Request' for url 'https://cephqe-node1.lab.eng.blr.redhat.com:8443/api/nvmeof/gateway'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/ceph_exception_manager.py", line 53, in wrapper
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/endpoints.py", line 58, in make_basic_request
    return await fs.make_basic_request(command)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ceph_manager.py", line 367, in make_basic_request
    response = await self._make_get_request(request, headers)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ceph_manager.py", line 534, in _make_get_request
    raise ceph_exception.ConnectionErrorException(status_code, detail) from err
ceph_exception_manager.ConnectionErrorException: (400, '{"detail": "Failed to get nvmeof_server_cert for cephqe-node2: No secret found for entity nvmeof_server_cert with service name cephqe-node2", "component": null}')
^C
root@ibm-storage-ceph-plugin-for-vsphere-1 [ /opt/persistent ]#
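
For illustration only (the actual change is tracked in the original bug 2303338): the error above comes from the gateway-cert lookup failing when no nvmeof_server_cert secret is stored. A lookup of that shape could fall back to a plain gRPC channel when mTLS is not configured instead of surfacing a 400 to the plugin. The sketch below uses hypothetical names and is not the dashboard's implementation.

import grpc

def build_gateway_channel(service_url: str, server_cert: bytes | None) -> grpc.Channel:
    """Connect to an NVMe-oF gateway (e.g. 10.70.39.49:5500).

    Uses mTLS credentials only when a server certificate is actually stored;
    otherwise falls back to an insecure channel rather than raising.
    """
    if server_cert:
        creds = grpc.ssl_channel_credentials(root_certificates=server_cert)
        return grpc.secure_channel(service_url, creds)
    # No nvmeof_server_cert secret found: the gateway runs without mTLS.
    return grpc.insecure_channel(service_url)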