Bug 2134769

Summary: create-external-cluster-resources.py didn't properly handle --rgw-tls-cert-path parameter
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Daniel Horák <dahorak>
Component: rook
Assignee: Parth Arora <paarora>
Status: CLOSED CURRENTRELEASE
QA Contact: Daniel Horák <dahorak>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.12
CC: ebenahar, muagarwa, ocs-bugs, odf-bz-bot, paarora, tnielsen
Target Milestone: ---   
Target Release: ODF 4.12.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.12.0-100
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-02-08 14:06:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Daniel Horák 2022-10-14 08:46:57 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
  Calling the `create-external-cluster-resources.py` script with the
  `--rgw-endpoint` and `--rgw-tls-cert-path` parameters (the first pointing to
  the RGW endpoint, the second to the certificate file) fails with a vague
  error:
    Execution Failed: unable to connect to endpoint: <IP>:443


Version of all relevant components (if applicable):
  Rook version (use rook version inside of a Rook Pod): v4.12.0-0.ffcae8e019e3e67f76c70c9badde72646034ec79
  Storage backend version (e.g. for ceph do ceph -v): ceph version 16.2.7-126.el8cp (fe0af61d104d48cb9d116cde6e593b5fc8c197e4) pacific (stable)


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
  I'm not able to use the script for RGW with SSL/TLS enabled.


Is there any workaround available to the best of your knowledge?
  No


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
  3


Is this issue reproducible?
  100%


Can this issue be reproduced from the UI?
  N/A


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Prepare an external Ceph cluster with SSL enabled for RGW.
2. Run the `create-external-cluster-resources.py` script with the
    `--rgw-endpoint` and `--rgw-tls-cert-path` parameters.


Actual results:
  When the create-external-cluster-resources.py script is called with the
  --rgw-endpoint and --rgw-tls-cert-path parameters (the first pointing to the
  RGW endpoint, the second to the certificate file), it fails with the
  following vague error:

    Execution Failed: unable to connect to endpoint: <IP>:443

  The problem is that the path from the --rgw-tls-cert-path parameter is used
  in validate_rgw_endpoint_tls_cert, which returns the content of the cert
  file. The endpoint_dial method then passes that content to
  requests.head(...) as the verify parameter:

    r = requests.head(ep, timeout=timeout, verify=cert)

  But verify expects the path to the cert file, not the content of the cert.
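
  A minimal sketch of the corrected behaviour, assuming the validation step
  hands back the path rather than the file content. The helper name
  validate_tls_cert_path and the simplified endpoint_dial signature are
  hypothetical and are not taken from the script or from the upstream fix:

```python
import os


def validate_tls_cert_path(cert_path):
    """Hypothetical helper: check the file exists and return the PATH, not its content."""
    if not os.path.isfile(cert_path):
        raise FileNotFoundError(f"TLS certificate not found: {cert_path}")
    return cert_path


def endpoint_dial(ep, cert_path=None, timeout=5):
    """Simplified stand-in for the script's endpoint_dial (assumed signature)."""
    import requests  # third-party library, imported lazily so the helper above has no dependency

    # requests accepts either a boolean or a path to a CA bundle for `verify`.
    verify = validate_tls_cert_path(cert_path) if cert_path else True
    response = requests.head(ep, timeout=timeout, verify=verify)
    return response.status_code
```

  With this shape, a missing certificate file is reported by name instead of
  surfacing as a generic connection failure.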


Expected results:
  The create-external-cluster-resources.py script will properly handle the
  certificate provided via --rgw-tls-cert-path argument.

Additional info:
  There is a similar problem in the get_rgw_fsid method, in the following code:

  ...   
          if self._arg_parser.rgw_tls_cert_path and not self._arg_parser.rgw_skip_tls:
              cert = self.validate_rgw_endpoint_tls_cert()
              verify = True
  ...
              r = requests.get(
                  request_url,
                  auth=S3Auth(access_key, secret_key, rgw_endpoint),
                  cert=cert,
                  verify=verify,
              )
  ...

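  It uses both the cert and verify arguments of the get method, but with the
  wrong meaning: the value from --rgw-tls-cert-path is passed as cert, which
  is meant for a client certificate, while verify is set to True instead of
  the path to the CA bundle [1]: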

  >  cert – (optional) if String, path to ssl client cert file (.pem). If
  >     Tuple, (‘cert’, ‘key’) pair.
  >  verify – (optional) Either a boolean, in which case it controls whether we
  >     verify the server’s TLS certificate, or a string, in which case it must be
  >     a path to a CA bundle to use. Defaults to True.
  [1] https://requests.readthedocs.io/en/latest/api/
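
  Based on the documentation quoted above, the TLS options could be mapped to
  request arguments roughly as follows. This is only a sketch: the helper
  names rgw_tls_kwargs and get_with_tls are hypothetical, and treating
  --rgw-skip-tls as verify=False is an assumption, not something confirmed by
  the script:

```python
def rgw_tls_kwargs(rgw_tls_cert_path=None, rgw_skip_tls=False):
    """Hypothetical helper: build the TLS-related keyword arguments for requests."""
    if rgw_skip_tls:
        # Assumption: skipping TLS checks means disabling server verification.
        return {"verify": False}
    if rgw_tls_cert_path:
        # The CA certificate path goes into `verify`; `cert` is left unset
        # because it is reserved for a client certificate.
        return {"verify": rgw_tls_cert_path}
    # No custom certificate: fall back to the default verification.
    return {"verify": True}


def get_with_tls(request_url, auth=None, rgw_tls_cert_path=None, rgw_skip_tls=False):
    """Illustrative wrapper showing how the kwargs would be passed to requests.get."""
    import requests  # third-party library, imported lazily

    return requests.get(
        request_url,
        auth=auth,
        **rgw_tls_kwargs(rgw_tls_cert_path, rgw_skip_tls),
    )
```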

  This is required for https://issues.redhat.com/browse/RHSTOR-2537
  Upstream issue: https://github.com/rook/rook/issues/11060

Comment 1 Parth Arora 2022-10-14 09:06:08 UTC
PR: https://github.com/rook/rook/pull/11090

Comment 2 Travis Nielsen 2022-10-14 14:17:11 UTC
Moving back to POST until it's merged downstream.

Comment 10 Daniel Horák 2023-01-10 08:21:04 UTC
Tested and verified on:
  OCP: 4.12.0-0.nightly-2022-12-13-205407
  ODF: 4.12.0-140

Execution of the create-external-cluster-resources.py script
(named /tmp/external-cluster-details-exporter-h7qnoepc.py) - output truncated:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Executing cmd: python3 /tmp/external-cluster-details-exporter-h7qnoepc.py --rbd-data-pool-name rbd --rgw-endpoint <IP1>:443 --rgw-tls-cert-path /tmp/cephqe-ca.pem on <IP2>
retcode: 0
stdout: [{"name": "rook-ceph-mon-endpoints", "kind": "ConfigMap", "data": {"data": "ceph-ci-j-045vu1ce33-d-exagl-fpe19m-node1-installer=<IP2>:6789", "maxMonId": "0", "mapping": "{}"}}, {"name": "rook-ceph-mon", "kind": "Secret", "data": {"admin-secret": "admin-secret", "fsid": "934f475c-7b7d-11ed-a37b-fa163ed4a18a", "mon-secret": "mon-secret"}}, {"name": "rook-ceph-operator-creds", "kind": "Secret", "data": {"userID": "client.healthchecker", "userKey": "AQC0gZljRv9OMhAAJ7vQ4YEez8IRnkHw7/6/pw=="}}, {"name": "monitoring-endpoint", "kind": "CephCluster", "data": {"MonitoringEndpoint": "<IP2>", "MonitoringPort": "9283"}}, {"name": "rook-csi-rbd-node", "kind": "Secret", "data": {"userID": "csi-rbd-node", "userKey": "..."}}, {"name": "rook-csi-rbd-provisioner", "kind": "Secret", "data": {"userID": "csi-rbd-provisioner", "userKey": "..."}}, {"name": "rook-csi-cephfs-provisioner", "kind": "Secret", "data": {"adminID": "csi-cephfs-provisioner", "adminKey": "..."}}, {"name": "rook-csi-cephfs-node", "kind": "Secret", "data": {"adminID": "csi-cephfs-node", "adminKey": "..."}}, {"name": "rook-ceph-dashboard-link", "kind": "Secret", "data": {"userID": "ceph-dashboard-link", "userKey": "https://<IP2>:8443/"}}, {"name": "ceph-rbd", "kind": "StorageClass", "data": {"pool": "rbd", "csi.storage.k8s.io/provisioner-secret-name": "rook-csi-rbd-provisioner", "csi.storage.k8s.io/controller-expand-secret-name": "rook-csi-rbd-provisioner", "csi.storage.k8s.io/node-stage-secret-name": "rook-csi-rbd-node"}}, {"name": "cephfs", "kind": "StorageClass", "data": {"fsName": "fsvol001", "pool": "cephfs.fsvol001.data", "csi.storage.k8s.io/provisioner-secret-name": "rook-csi-cephfs-provisioner", "csi.storage.k8s.io/controller-expand-secret-name": "rook-csi-cephfs-provisioner", "csi.storage.k8s.io/node-stage-secret-name": "rook-csi-cephfs-node"}}, {"name": "ceph-rgw", "kind": "StorageClass", "data": {"endpoint": "<IP1>:443", "poolPrefix": "default"}}, {"name": "rgw-admin-ops-user", "kind": "Secret", 
"data": {"accessKey": "...", "secretKey": "..."}}, {"name": "ceph-rgw-tls-cert", "kind": "Secret", "data": {"cert": "-----BEGIN CERTIFICATE-----\nMIID/zCCAu ... \n-----END CERTIFICATE-----"}}]
stderr: 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Full log output: https://url.corp.redhat.com/e71e8a3

The create-external-cluster-resources.py script properly handled the
certificate provided via the --rgw-tls-cert-path parameter.

>> VERIFIED