Bug 2271773 - KMSServerConnectionAlert not raised when KMIP KMS connection is unavailable
Summary: KMSServerConnectionAlert not raised when KMIP KMS connection is unavailable
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph-monitoring
Version: 4.15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ODF 4.17.0
Assignee: arun kumar mohan
QA Contact: Filip Balák
URL:
Whiteboard:
Depends On:
Blocks: 2281703
Reported: 2024-03-27 10:10 UTC by Filip Balák
Modified: 2024-10-30 14:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
.Invalid KMIP configurations now treated as errors
Previously, Thales Enterprise Key Management (KMIP) was not included among the recognized KMS services, so an invalid KMIP configuration was not treated as an error. With this fix, the Thales KMIP service has been added as a valid KMS service, which allows KMIP configuration statuses to be propagated correctly. As a result, any misconfiguration is treated as an error.
Clone Of:
Environment:
Last Closed: 2024-10-30 14:27:10 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2024:8676 | 0 | None | None | None | 2024-10-30 14:27:19 UTC

Description Filip Balák 2024-03-27 10:10:06 UTC
Description of problem (please be as detailed as possible and provide log snippets):
KMSServerConnectionAlert is raised correctly for Vault KMS, but if the cluster uses Thales enterprise key management (KMIP) and the connection configured in ocs-kms-connection-details is invalid, the alert is not raised.

Version of all relevant components (if applicable):
ODF 4.15.0-158
OCP 4.15

Is this issue reproducible?
yes

Can this issue be reproduced from the UI?
yes - the config can be changed in the UI

Steps to Reproduce:
1. Install a cluster with Thales enterprise key management
2. Edit ocs-kms-connection-details and set an invalid value, e.g.:
oc -n openshift-storage patch ConfigMap ocs-kms-connection-details -p '{"data": {"KMIP_ENDPOINT": "<some_invalid_endpoint>"}}' --type merge
3. Wait a few minutes and check whether the alert is raised
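
The patch payload in step 2 has to be valid JSON, which means the endpoint value must be a quoted string. A minimal Python sketch of building that payload and the matching `oc` command line is shown here. The helper names and the example endpoint `invalid.example.com:5696` are hypothetical and chosen only for illustration; the namespace, ConfigMap name and `KMIP_ENDPOINT` key are taken from the step above.

```python
import json
import shlex


def build_kmip_patch(endpoint: str) -> str:
    """Return a JSON merge patch that sets KMIP_ENDPOINT (hypothetical helper)."""
    # json.dumps quotes the endpoint, so the payload is always valid JSON.
    return json.dumps({"data": {"KMIP_ENDPOINT": endpoint}})


def build_patch_command(endpoint: str) -> str:
    """Return a shell-quoted `oc patch` command line (hypothetical helper)."""
    args = [
        "oc", "-n", "openshift-storage",
        "patch", "ConfigMap", "ocs-kms-connection-details",
        "-p", build_kmip_patch(endpoint),
        "--type", "merge",
    ]
    # shlex.join quotes each argument so the JSON survives the shell intact.
    return shlex.join(args)


if __name__ == "__main__":
    # The endpoint is a made-up, intentionally unreachable value.
    print(build_patch_command("invalid.example.com:5696"))
```

Generating the payload with `json.dumps` avoids hand-written quoting mistakes in the `-p` argument.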

Actual results:
Alert is not raised.

Expected results:
The KMSServerConnectionAlert alert is raised and the user is notified that the connection to the KMS is unavailable.

Additional info:
Test run with failed test case - https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/35425/
The configuration in ocs-kms-connection-details is slightly different for KMIP than for Vault, and different parameters are used (e.g. VAULT_ADDR is missing but there is KMIP_ENDPOINT).
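
To illustrate why the difference in parameters matters, here is a small Python sketch. It assumes a checker that decides which KMS a ConfigMap describes by looking at the endpoint key it carries. This is not the ocs-operator code: the function name and the `"vault"`/`"kmip"` labels are hypothetical, and only the `VAULT_ADDR` and `KMIP_ENDPOINT` keys come from this report.

```python
from typing import Dict, Optional

# Endpoint keys named in this report; the labels on the left are illustrative.
ENDPOINT_KEYS = {"vault": "VAULT_ADDR", "kmip": "KMIP_ENDPOINT"}


def detect_kms_type(data: Dict[str, str]) -> Optional[str]:
    """Return "vault" or "kmip" depending on which endpoint key is present."""
    for kms_type, key in ENDPOINT_KEYS.items():
        if key in data:
            return kms_type
    # No known endpoint key: the configuration is not recognized.
    return None


if __name__ == "__main__":
    kmip_config = {"KMIP_ENDPOINT": "invalid.example.com:5696"}  # made-up value
    # A check that only looks for VAULT_ADDR finds nothing to probe here.
    print("VAULT_ADDR" in kmip_config)
    print(detect_kms_type(kmip_config))
```

A check written only against `VAULT_ADDR` would have no endpoint to test for a KMIP configuration, which is one plausible way an invalid KMIP endpoint could go unreported.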

Comment 3 arun kumar mohan 2024-04-05 10:01:59 UTC
Hi Filip,
Can you please provide the value of the following metric (from the setup that has the issue):
`ocs_storagecluster_kms_connection_status`

Some explanation:
According to the query, the alert "KMSServerConnectionAlert" will only be triggered under the following condition:
`ocs_storagecluster_kms_connection_status{job="ocs-metrics-exporter"} == 1`

From the implementation (ocs-operator/metrics/internal/collectors/storage-cluster.go#33), we understand
`KMS Connection Status; 0: Connected, 1: Not Connected, 2: KMS not enabled`

So we should check what value KMS status is providing during the misconfiguration.
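
The relationship between the metric values and the alert condition quoted above can be sketched in Python as follows. The enum and function names are hypothetical; the three values and the `== 1` condition are taken from this comment.

```python
from enum import IntEnum


class KMSConnectionStatus(IntEnum):
    """Values of ocs_storagecluster_kms_connection_status, as quoted above."""
    CONNECTED = 0
    NOT_CONNECTED = 1
    NOT_ENABLED = 2


def alert_fires(metric_value: int) -> bool:
    """Mirror the alert condition: the metric must equal 1 (Not Connected)."""
    return metric_value == KMSConnectionStatus.NOT_CONNECTED


if __name__ == "__main__":
    for status in KMSConnectionStatus:
        print(f"{status.name} ({int(status)}): alert fires = {alert_fires(status)}")
```

Under this condition, a cluster whose exporter reports 0 would never trigger the alert, regardless of whether the KMS is actually reachable.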

Comment 4 arun kumar mohan 2024-05-08 08:09:31 UTC
Filip has shared the needed info. Thanks Filip.

The value of ocs_storagecluster_kms_connection_status stays at 0 (in a cluster set up with an invalid KMS configuration, see comment#1).

According to the above comment, the value 0 means that the KMS is connected. I need some more time to check this; meanwhile, I am reducing the severity, as this will happen only on a misconfigured cluster.
We can move this out of 4.16.

Comment 7 Sunil Kumar Acharya 2024-08-26 11:22:42 UTC
Are there any blockers to provide devel ack for this bz? If not, please provide the devel ack.

Comment 8 Sunil Kumar Acharya 2024-08-28 12:37:53 UTC
Are we blocked on anything to provide devel ack on this bz?

Comment 13 Sunil Kumar Acharya 2024-09-18 12:06:54 UTC
Please update the RDT flag/text appropriately.

Comment 18 errata-xmlrpc 2024-10-30 14:27:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.17.0 Security, Enhancement, & Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:8676

