Bug 2134769 - create-external-cluster-resources.py didn't properly handle --rgw-tls-cert-path parameter
Summary: create-external-cluster-resources.py didn't properly handle --rgw-tls-cert-pa...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ODF 4.12.0
Assignee: Parth Arora
QA Contact: Daniel Horák
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-10-14 08:46 UTC by Daniel Horák
Modified: 2023-08-09 17:03 UTC (History)

Fixed In Version: 4.12.0-100
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-02-08 14:06:28 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github red-hat-storage rook pull 424 0 None open Bug 2134769: external: fix endpoint_dial check for rgw endpoint 2022-10-24 17:25:04 UTC
Github rook rook issues 11060 0 None open create-external-cluster-resources.py didn't properly handle --rgw-tls-cert-path parameter 2022-10-14 08:46:56 UTC
Github rook rook pull 11090 0 None open external: fix endpoint_dial check for rgw endpoint 2022-10-14 09:06:07 UTC

Description Daniel Horák 2022-10-14 08:46:57 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
  Calling the `create-external-cluster-resources.py` script with the --rgw-endpoint
  and --rgw-tls-cert-path parameters (the first pointing to the RGW endpoint,
  the second to the path of the certificate file) fails with a vague error:
    Execution Failed: unable to connect to endpoint: <IP>:443


Version of all relevant components (if applicable):
  Rook version (use rook version inside of a Rook Pod): v4.12.0-0.ffcae8e019e3e67f76c70c9badde72646034ec79
  Storage backend version (e.g. for ceph do ceph -v): ceph version 16.2.7-126.el8cp (fe0af61d104d48cb9d116cde6e593b5fc8c197e4) pacific (stable)


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
  I'm not able to use the script for RGW with SSL/TLS enabled.


Is there any workaround available to the best of your knowledge?
  No


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
  3


Is this issue reproducible?
  100%


Can this issue reproduce from the UI?
  N/A


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Prepare an external Ceph cluster with SSL for RGW.
2. Run the `create-external-cluster-resources.py` script (with the `--rgw-endpoint`
    and `--rgw-tls-cert-path` parameters).


Actual results:
  When the create-external-cluster-resources.py script is called with the
  --rgw-endpoint and --rgw-tls-cert-path parameters (the first pointing to the RGW
  endpoint, the second to the path of the certificate file), it fails with the
  following vague error:

    Execution Failed: unable to connect to endpoint: <IP>:443

  The problem is that the path from the --rgw-tls-cert-path parameter is used in
  validate_rgw_endpoint_tls_cert, which returns the content of the cert file.
  Then, in the endpoint_dial method, the content of the cert file is passed
  to requests.head(...) as the verify parameter:

    r = requests.head(ep, timeout=timeout, verify=cert)

  But verify expects the path to the cert file, not the content of the cert.
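
  To illustrate the point above, here is a minimal sketch of how a value for
  the verify parameter could be chosen so that a path, not the file content,
  reaches requests. The helper name verify_value_for and its parameters are
  hypothetical and are not taken from the script; the existence check is only
  one possible way to get a clearer error than "unable to connect to endpoint".

```python
import os


def verify_value_for(cert_path, skip_tls=False):
    """Return a value suitable for the `verify` argument of requests calls.

    Hypothetical helper for illustration; not the script's actual code.
    """
    if skip_tls:
        # verify=False turns off server certificate verification.
        return False
    if not cert_path:
        # verify=True falls back to the default CA bundle.
        return True
    if not os.path.isfile(cert_path):
        # Fail early with a specific message instead of a vague connect error.
        raise FileNotFoundError(f"TLS certificate file not found: {cert_path}")
    # Hand over the path itself; requests loads the CA bundle from this file.
    return cert_path
```

  Under these assumptions, the call from endpoint_dial would look like
  requests.head(ep, timeout=timeout, verify=verify_value_for(path)), where
  path is the value given via --rgw-tls-cert-path.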


Expected results:
  The create-external-cluster-resources.py script will properly handle the
  certificate provided via --rgw-tls-cert-path argument.

Additional info:
  There is also a similar problem in the get_rgw_fsid method, in the following code:

  ...   
          if self._arg_parser.rgw_tls_cert_path and not self._arg_parser.rgw_skip_tls:
              cert = self.validate_rgw_endpoint_tls_cert()
              verify = True
  ...
              r = requests.get(
                  request_url,
                  auth=S3Auth(access_key, secret_key, rgw_endpoint),
                  cert=cert,
                  verify=verify,
              )
  ...

  It uses both the cert and verify arguments of the get method, but with the
  wrong meaning [1]:

  >  cert – (optional) if String, path to ssl client cert file (.pem). If
  >     Tuple, (‘cert’, ‘key’) pair.
  >  verify – (optional) Either a boolean, in which case it controls whether we
  >     verify the server’s TLS certificate, or a string, in which case it must be
  >     a path to a CA bundle to use. Defaults to True.
  [1] https://requests.readthedocs.io/en/latest/api/
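
  Following the documented meaning quoted above, the TLS arguments for the
  request could be assembled roughly as in the sketch below. The function
  rgw_request_tls_kwargs and its parameter names are assumptions made for
  illustration; in particular, client_cert is shown only to make clear where
  a client certificate would belong, and the bug report does not say that the
  script needs one.

```python
def rgw_request_tls_kwargs(ca_bundle_path=None, skip_tls=False, client_cert=None):
    """Build TLS keyword arguments for requests.get() or requests.head().

    Hypothetical helper; names and defaults are assumptions, not script code.
    """
    kwargs = {}
    if skip_tls:
        # Do not verify the server certificate at all.
        kwargs["verify"] = False
    elif ca_bundle_path:
        # `verify` as a string: path to the CA bundle used to check the server.
        kwargs["verify"] = ca_bundle_path
    else:
        # Default behaviour: verify against the default CA bundle.
        kwargs["verify"] = True
    if client_cert:
        # `cert` is for a client-side certificate (path, or (cert, key) tuple).
        kwargs["cert"] = client_cert
    return kwargs
```

  With such a helper, the call in get_rgw_fsid could pass
  **rgw_request_tls_kwargs(ca_bundle_path=path) instead of setting cert to the
  file content and verify to True.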

  This is required for https://issues.redhat.com/browse/RHSTOR-2537
  Upstream issue: https://github.com/rook/rook/issues/11060

Comment 1 Parth Arora 2022-10-14 09:06:08 UTC
PR: https://github.com/rook/rook/pull/11090

Comment 2 Travis Nielsen 2022-10-14 14:17:11 UTC
Moving back to post until it's merged downstream

Comment 10 Daniel Horák 2023-01-10 08:21:04 UTC
Tested and verified on:
  OCP: 4.12.0-0.nightly-2022-12-13-205407
  ODF: 4.12.0-140

Execution of the create-external-cluster-resources.py script
(named /tmp/external-cluster-details-exporter-h7qnoepc.py) - output truncated:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Executing cmd: python3 /tmp/external-cluster-details-exporter-h7qnoepc.py --rbd-data-pool-name rbd --rgw-endpoint <IP1>:443 --rgw-tls-cert-path /tmp/cephqe-ca.pem on <IP2>
retcode: 0
stdout: [{"name": "rook-ceph-mon-endpoints", "kind": "ConfigMap", "data": {"data": "ceph-ci-j-045vu1ce33-d-exagl-fpe19m-node1-installer=<IP2>:6789", "maxMonId": "0", "mapping": "{}"}}, {"name": "rook-ceph-mon", "kind": "Secret", "data": {"admin-secret": "admin-secret", "fsid": "934f475c-7b7d-11ed-a37b-fa163ed4a18a", "mon-secret": "mon-secret"}}, {"name": "rook-ceph-operator-creds", "kind": "Secret", "data": {"userID": "client.healthchecker", "userKey": "AQC0gZljRv9OMhAAJ7vQ4YEez8IRnkHw7/6/pw=="}}, {"name": "monitoring-endpoint", "kind": "CephCluster", "data": {"MonitoringEndpoint": "<IP2>", "MonitoringPort": "9283"}}, {"name": "rook-csi-rbd-node", "kind": "Secret", "data": {"userID": "csi-rbd-node", "userKey": "..."}}, {"name": "rook-csi-rbd-provisioner", "kind": "Secret", "data": {"userID": "csi-rbd-provisioner", "userKey": "..."}}, {"name": "rook-csi-cephfs-provisioner", "kind": "Secret", "data": {"adminID": "csi-cephfs-provisioner", "adminKey": "..."}}, {"name": "rook-csi-cephfs-node", "kind": "Secret", "data": {"adminID": "csi-cephfs-node", "adminKey": "..."}}, {"name": "rook-ceph-dashboard-link", "kind": "Secret", "data": {"userID": "ceph-dashboard-link", "userKey": "https://<IP2>:8443/"}}, {"name": "ceph-rbd", "kind": "StorageClass", "data": {"pool": "rbd", "csi.storage.k8s.io/provisioner-secret-name": "rook-csi-rbd-provisioner", "csi.storage.k8s.io/controller-expand-secret-name": "rook-csi-rbd-provisioner", "csi.storage.k8s.io/node-stage-secret-name": "rook-csi-rbd-node"}}, {"name": "cephfs", "kind": "StorageClass", "data": {"fsName": "fsvol001", "pool": "cephfs.fsvol001.data", "csi.storage.k8s.io/provisioner-secret-name": "rook-csi-cephfs-provisioner", "csi.storage.k8s.io/controller-expand-secret-name": "rook-csi-cephfs-provisioner", "csi.storage.k8s.io/node-stage-secret-name": "rook-csi-cephfs-node"}}, {"name": "ceph-rgw", "kind": "StorageClass", "data": {"endpoint": "<IP1>:443", "poolPrefix": "default"}}, {"name": "rgw-admin-ops-user", "kind": "Secret", 
"data": {"accessKey": "...", "secretKey": "..."}}, {"name": "ceph-rgw-tls-cert", "kind": "Secret", "data": {"cert": "-----BEGIN CERTIFICATE-----\nMIID/zCCAu ... \n-----END CERTIFICATE-----"}}]
stderr: 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Full log output: https://url.corp.redhat.com/e71e8a3

The create-external-cluster-resources.py script properly handled the certificate
provided via the --rgw-tls-cert-path parameter.

>> VERIFIED

