Bug 2268412 - [Backport to 4.12.z] Noobaa fails to use the new internal cert after rotation
Summary: [Backport to 4.12.z] Noobaa fails to use the new internal cert after rotation
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ODF 4.12.13
Assignee: Nimrod Becker
QA Contact: krishnaram Karthick
URL:
Whiteboard:
Depends On: 2237903 2268410
Blocks: 2259839
Reported: 2024-03-07 11:00 UTC by Nimrod Becker
Modified: 2024-06-19 10:00 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2268410
Environment:
Last Closed: 2024-06-13 08:33:17 UTC
Embargoed:



Description Nimrod Becker 2024-03-07 11:00:58 UTC
+++ This bug was initially created as a clone of Bug #2268410 +++

+++ This bug was initially created as a clone of Bug #2237903 +++

Description of problem (please be as detailed as possible and provide log snippets):

I have been working with my customer and Noobaa has an issue when the internal certificates are rotated:

- Certificates rotated internally on this cluster on the 28th August
- From the Noobaa endpoint:

Doing the pre check on the noobaa certificate now.

sh-4.4$ openssl s_client -connect s3.openshift-storage.svc.cluster.local:443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
-----BEGIN CERTIFICATE-----
MIID1jCCAr6gAwIBAgIIHHJvZg8H90wwDQYJKoZIhvcNAQELBQAwNjE0MDIGA1UE
Awwrb3BlbnNoaWZ0LXNlcnZpY2Utc2VydmluZy1zaWduZXJAMTY1OTExMTQzMDAe
Fw0yMjA5MTMxNTI5NDlaFw0yNDA5MTIxNTI5NTBaMCMxITAfBgNVBAMTGHMzLm9w
ZW5zaGlmdC1zdG9yYWdlLnN2YzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoC
ggEBAKyt3new40UT+bzHG39SLSm5XhcYt+WkrjItgn+cYIzXwRmZVehYS0g2VVCL
zuCqCqOYkl80FQk0VXIN+on5yuBAEhm4Iu51KvXb8LQL+Gd+jgCzxagv1ar45izq
f9YqmPpfXDHwtVQKeYt9qUxgcZJJ3u+a0hpqlw36kVRc8lNOtLlnDo6c4fJj6mZT
HIfHUIpZp05eXcHYPMiEGXUrV4IfzbJ8aMMT8E00rILsqlQITB3m3HMDox4f2Sns
asP5nhTVx5boJsbFhoD1Btc2nXxr5h4rwH2cGGRxkeJ5yWJGIo+n9CdF/olD5MBW
FIZLHZ9owXXbdza7H3zOcYchwLkCAwEAAaOB+jCB9zAOBgNVHQ8BAf8EBAMCBaAw
EwYDVR0lBAwwCgYIKwYBBQUHAwEwDAYDVR0TAQH/BAIwADAdBgNVHQ4EFgQU60OA
IvfY0a/LLlTt9t6bNmmpZs4wHwYDVR0jBBgwFoAUHDbfLlYxj2u6FCpI2wtBNswR
KT0wSwYDVR0RBEQwQoIYczMub3BlbnNoaWZ0LXN0b3JhZ2Uuc3ZjgiZzMy5vcGVu
c2hpZnQtc3RvcmFnZS5zdmMuY2x1c3Rlci5sb2NhbDA1BgsrBgEEAZIIEWQCAQQm
EyQxYWViODlhZi1iMzdmLTRmNGYtYTllNS0xNWFhODk5MjJkMjEwDQYJKoZIhvcN
AQELBQADggEBAIvwEySqdjuwXjx+RDFLelDgtUkwtR9j20CYrWTSM0qE2qrAa1VR
0/cViY0/jmp/8xwuYl+3pvNSpntECz4MXgp+YNebXfewJnnlDQKAtYVpCJnahrfC
AFNFitqU+ZwABnbs7Awb8gjvlHbgYDC4G8UR3tUV+v2nnWvNt4gikGKwKpz7YgwW
rua3PrEdFZJF2TA/LbKFaPUZF6oManlX2b4gC7SmhAszeuQCvnY05GhluqRwrtBH
pmLUaUY8DmzIopq44CZcM8850JNL/p+Ds0MxHdoJqsYePjo4m2W2JJCebuEEcovq
diXJ62eEIQDkxRUZvz82M1sOrJ8CX7S/shY=
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIDUTCCAjmgAwIBAgIIVzTOuAujXdgwDQYJKoZIhvcNAQELBQAwNjE0MDIGA1UE
Awwrb3BlbnNoaWZ0LXNlcnZpY2Utc2VydmluZy1zaWduZXJAMTY1OTExMTQzMDAe
Fw0yMjA3MjkxNjE3MDlaFw0yNDA5MjYxNjE3MTBaMDYxNDAyBgNVBAMMK29wZW5z
aGlmdC1zZXJ2aWNlLXNlcnZpbmctc2lnbmVyQDE2NTkxMTE0MzAwggEiMA0GCSqG
SIb3DQEBAQUAA4IBDwAwggEKAoIBAQC0Xz8aawepXoSeYjhzK9Bg0yDeI1t2QnrR
+JoZQt/PKV/URwazCdHZQRiKH6k5n+M99uUxTh7Uw4qNRoX6xzp5xddYspmDaKtp
8YKDPWH2VJ9GKDLqCBEbH3FDZTCTgz3Vhp0iYkfCNbxN0w6eOqf3thrJ6SqSwevd
UngDAHufVJjntBmoJJ+30+htMGK79Ix9RZSxvV8nWmS1EosmAhYtcLMCTJD8VnqY
eAi5lJ8SE4XKayW1ISM+SR69DNIj+WgKFACmGx826nGkr84b2WOjkPH51bPyEFkx
jrmDltuIAzAhtByY1csZ5/lUN9A1LBXsYUA/HwDA1aG0IwWn7B1jAgMBAAGjYzBh
MA4GA1UdDwEB/wQEAwICpDAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBQcNt8u
VjGPa7oUKkjbC0E2zBEpPTAfBgNVHSMEGDAWgBQcNt8uVjGPa7oUKkjbC0E2zBEp
PTANBgkqhkiG9w0BAQsFAAOCAQEAioqIEBfNBn1DGqogGjIQIZv5oc9MR3bgWOx1
6ilOBX/p0CzM6qQnaMDZYFbsF2up3oD1vlMmE/P0IEyTbryDbQsHRaDhrR4pVivB
3NkuFPPP3RbWtes9BBVuE4VnK9/gqT08U+FOOVd6h8vp6DgC8k438RNo0U12CPQF
xUkTkW+ZExR2pSi/fghGcQ3z8oZcQMsfO9W1sco6i0uyzjD0mt9UeWzHWQ/v3hSf
DVizES+B/5fB8jynBiEqoSq9CcGKnVOeCtOGN7e6nmebcP1sIg3NF5G72rf4SBFJ
WRfHjx1CseLINDhrIeRnogiMCYx2o+D8Vb++9CeglweyP6wqpw==
-----END CERTIFICATE-----

This is the old, pre-rotation certificate.

After the noobaa endpoint pod restarts, the certificate actually served from RAM changes to match what is recorded in the secret (tmpfs). See the output below from connecting to the noobaa service (noobaa endpoint pod).

sh-4.4$ openssl s_client -connect s3.openshift-storage.svc.cluster.local:443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
-----BEGIN CERTIFICATE-----
MIID1jCCAr6gAwIBAgIIc9b+K9v7+2YwDQYJKoZIhvcNAQELBQAwNjE0MDIGA1UE
Awwrb3BlbnNoaWZ0LXNlcnZpY2Utc2VydmluZy1zaWduZXJAMTY1OTExMTQzMDAe
Fw0yMzA4MjgxNjIxMTBaFw0yNTA4MjcxNjIxMTFaMCMxITAfBgNVBAMTGHMzLm9w
ZW5zaGlmdC1zdG9yYWdlLnN2YzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoC
ggEBAMDeoLwb1rRidBn/kkzyG8An+08flbkVJEp4FMFSFRnFhJD6C8wiKU1+FELb
cZo5L+4oS757SaP5VO7qgNDSRgaq3spyp0LJwgAxe4IjJmAHxgqFauciVVK1qUhw
R1z/5mbACNvL1bX9E+tWlDfaj7oc6oUfDqt8ni2BV/t3rZJX0lOMyq7FxLIhpotf
22ZEadf5tcWq8ZvNqsRzo6q/DwRdu5mkFhVX7SyqcTi4gyLKPwr2IvAbd1RttTLj
9DRjyEjqy2HVnv8omVEwH0wzi0AgfekG/uEmbJtcIJCMwZ+utzmyIjllgmUvM0wB
Tzd4PCLCtUVwpz9tftmuB9gPX8ECAwEAAaOB+jCB9zAOBgNVHQ8BAf8EBAMCBaAw
EwYDVR0lBAwwCgYIKwYBBQUHAwEwDAYDVR0TAQH/BAIwADAdBgNVHQ4EFgQUyJqf
MCBgzjjeN6qlsayKTQX/kIUwHwYDVR0jBBgwFoAUHlBdO0N76T6tlxhXNb/cRKj2
agAwSwYDVR0RBEQwQoIYczMub3BlbnNoaWZ0LXN0b3JhZ2Uuc3ZjgiZzMy5vcGVu
c2hpZnQtc3RvcmFnZS5zdmMuY2x1c3Rlci5sb2NhbDA1BgsrBgEEAZIIEWQCAQQm
EyQxYWViODlhZi1iMzdmLTRmNGYtYTllNS0xNWFhODk5MjJkMjEwDQYJKoZIhvcN
AQELBQADggEBABzklTqvnlw4i04V0y8OTKiVnjxuJs5zVO+EeeBmuhKb5f/O+KW9
o66WB9r4158sJpLVfVU+bGoSyhWtNGkYpHDHCFCoDqT4QdzSpVQbKi32tbACRlJe
4NFwViVUrZU0IeTjBbX6hoWBRMb6fPlEHSi9mAKYpV1PfZpTHoDHZTFhEFi+CndA
+fmrFwFAfE0KYdrnFGfFf/kZXkM0h+0+vcwIcjxGTidp6GpIUV5dGDR6kZHQ738v
C3S78HAXbLQkPylaWsrTpiUUKpLXMESEP7VpkV/E3RZ+4kuQSVyg9422jg7xaMJA
vJMvYFVZ+rAxghiJGti3XQq+/QehWM5o8+I=
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIDUTCCAjmgAwIBAgIIWMEl+2yguMYwDQYJKoZIhvcNAQELBQAwNjE0MDIGA1UE
Awwrb3BlbnNoaWZ0LXNlcnZpY2Utc2VydmluZy1zaWduZXJAMTY1OTExMTQzMDAe
Fw0yMzA4MjgxNjE3MjdaFw0yNTEwMjYxNjE3MjhaMDYxNDAyBgNVBAMMK29wZW5z
aGlmdC1zZXJ2aWNlLXNlcnZpbmctc2lnbmVyQDE2NTkxMTE0MzAwggEiMA0GCSqG
SIb3DQEBAQUAA4IBDwAwggEKAoIBAQDEzvQ+VySQK/k/0sKVdwN7J4E4OJ8h+9GC
rDS38cLnYD3q6I/iC3ZoIZkkCkcbnHSc0/4Q/AKecXsb4pwI+9WPE5w2YQmtY6ey
2VB6Bg1BYTLw65WsWmm0CjszjMFSxyn3spesKFlYuT8mepC9ynsSofUQFUrEHZk3
YSq6sz24+KXIzCZS3k7ECGqKSyNZg30jBZmqa8cPAaws/zl9/U/rXP994qsNFruQ
DcLO1IVHYl650oOT6zswNhlzZ311fNIbf0S8VzgVxiC+TQgQJ1NQar2NmpROMSgX
Ybw6dFRxodkFfcNQAGcrqWlPCQTxlGGrl5GW5IKjkIYanw5szD9HAgMBAAGjYzBh
MA4GA1UdDwEB/wQEAwICpDAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBQeUF07
Q3vpPq2XGFc1v9xEqPZqADAfBgNVHSMEGDAWgBQeUF07Q3vpPq2XGFc1v9xEqPZq
ADANBgkqhkiG9w0BAQsFAAOCAQEAFgsXg4gciulG51Ls8W4mln4HDmYmrFLxwhZQ
qhYr0pK8p+/WHJ6wjQueMuUK2DRBX1IKnOcz3FbLgTssHp11tBxadQotVCzvaD+g
AV6njgdxIv4J0KIrONzMnlU31NkO9xRfXzyJHa6frZLxzIZ8glSiUY6U4q2Q6E9P
/eUQeVxoDthTV4iYzWBS/R3rnNBloB+2PAKUDNyNfnDwcA6f+Q4k818eI8cnbyaz
iumM/yE8V3pJfDdb1slZHEhEbR6T2DDDP7G0DOoCQ3sSbRwXQwSA2TRG/eVBBenZ
SDQgReolRpbl5pntsGPmNfmnJv7Wqwaqi3yWZQuvz0wVaH8Ilg==
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIDUTCCAjmgAwIBAgIIVapBs0FkjS0wDQYJKoZIhvcNAQELBQAwNjE0MDIGA1UE
Awwrb3BlbnNoaWZ0LXNlcnZpY2Utc2VydmluZy1zaWduZXJAMTY1OTExMTQzMDAe
Fw0yMzA4MjgxNjE3MjdaFw0yNTEwMjYxNjE3MjhaMDYxNDAyBgNVBAMMK29wZW5z
aGlmdC1zZXJ2aWNlLXNlcnZpbmctc2lnbmVyQDE2NTkxMTE0MzAwggEiMA0GCSqG
SIb3DQEBAQUAA4IBDwAwggEKAoIBAQDEzvQ+VySQK/k/0sKVdwN7J4E4OJ8h+9GC
rDS38cLnYD3q6I/iC3ZoIZkkCkcbnHSc0/4Q/AKecXsb4pwI+9WPE5w2YQmtY6ey
2VB6Bg1BYTLw65WsWmm0CjszjMFSxyn3spesKFlYuT8mepC9ynsSofUQFUrEHZk3
YSq6sz24+KXIzCZS3k7ECGqKSyNZg30jBZmqa8cPAaws/zl9/U/rXP994qsNFruQ
DcLO1IVHYl650oOT6zswNhlzZ311fNIbf0S8VzgVxiC+TQgQJ1NQar2NmpROMSgX
Ybw6dFRxodkFfcNQAGcrqWlPCQTxlGGrl5GW5IKjkIYanw5szD9HAgMBAAGjYzBh
MA4GA1UdDwEB/wQEAwICpDAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBQeUF07
Q3vpPq2XGFc1v9xEqPZqADAfBgNVHSMEGDAWgBQcNt8uVjGPa7oUKkjbC0E2zBEp
PTANBgkqhkiG9w0BAQsFAAOCAQEAmnzSOU/eJbeYNFnrfuiEARnBm/vAsRU9yCgE
MgiGCVOHO//iOhCd2w0uQJCSUhk9yz0tvwofFfz5vqaSEk7IgY+BErlsA24/j6VY
6R7GshiCEZj/XeDIjwVhGRXnzOL+QadzVBbJVGFrr82LC5iw+x2bVmLoNs9VjKDN
MM8GaQgJ1PbpPm23GwRcpsdPRuvOFwkdVU+9hxMFqZsHEz1AqILbFqog7Z9Fh88O
s8FbK3nVYImTBQLCdjUlXZY7oNdex6NkB1v8UJgRgTUkXAr+y3j7yYfMqL0yrPLf
lsbsbfGJPClVboj28qZf7lkiaLlt3Ae+2bwvSZAvJKlfJJ33NQ==
-----END CERTIFICATE-----

In summary, when an internal certificate rotation takes place, Noobaa has the new certificate in its secret but continues to serve the old certificate (presumably from RAM) until it is restarted. Noobaa should be able to detect when the new certificate arrives and restart itself in a storage-safe way that incurs no downtime for S3 (bearing in mind that this is S3 and is used by many applications). For reference, Quay has the same issue and has had to fix it as well.
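
One way to spot the stale-certificate condition described above is to compare the leaf certificate the endpoint serves with the one mounted from the secret. The sketch below is only an illustration: the function names are made up for this note, the path /etc/s3-secret/tls.crt is the mount location reported later in this bug, and the comparison is a textual match on the base64 body of the first PEM block in each input rather than a real certificate parse.

```shell
# Hypothetical helpers (not part of NooBaa or OpenShift).

# Print the base64 body of the first PEM certificate found in the
# given file (or stdin), joined onto a single line.
first_pem_body() {
  awk '
    /-----BEGIN CERTIFICATE-----/ { inside = 1; next }
    /-----END CERTIFICATE-----/   { if (inside) exit }
    inside { gsub(/[[:space:]]/, ""); printf "%s", $0 }
  ' "$@"
}

# Succeed only if both inputs contain a certificate and the first
# certificate in each has an identical base64 body.
same_leaf_cert() {
  local a b
  a=$(first_pem_body "$1") || return 2
  b=$(first_pem_body "$2") || return 2
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, from inside the endpoint pod, save the output of the `openssl s_client ... -showcerts` command used above to served.txt and run `same_leaf_cert served.txt /etc/s3-secret/tls.crt`. A non-zero exit status suggests that the process is still serving a certificate that differs from the mounted one, assuming the leaf certificate comes first in both inputs.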

--- Additional comment from RHEL Program Management on 2023-09-07 15:46:17 UTC ---

This bug having no release flag set previously, is now set with release flag 'odf‑4.14.0' to '?', and so is being proposed to be fixed at the ODF 4.14.0 release. Note that the 3 Acks (pm_ack, devel_ack, qa_ack), if any previously set while release flag was missing, have now been reset since the Acks are to be set against a release flag.

--- Additional comment from Sunil Kumar Acharya on 2023-09-12 06:21:42 UTC ---

ODF-4.14 has entered 'blocker only' phase on 12-SEP-2023. Hence, moving the non-blocker BZs to ODF-4.15. If you think this BZ needs to be evaluated for ODF-4.14, please feel free to propose the BZ as a blocker/exception to ODF-4.14 with a justification note.

--- Additional comment from Danny on 2023-09-12 06:37:23 UTC ---

Hi Andy. 

Thanks. We will look into this issue.
What is the status of the customer's system? Is everything working as expected after the endpoint restarts?

Thanks,
Danny

--- Additional comment from Amit Prinz Setter on 2023-09-13 10:44:14 UTC ---

Hi Andy,

Can you please share how you rotated the certificate?
It will help with my tests.

Also the must-gather (or logs from an endpoint) will be helpful to understand what went wrong.

thanks,
Amit

--- Additional comment from Sean Burke on 2023-09-14 10:48:40 UTC ---

Hi Amit,

Unfortunately, Andy has had to take a leave of absence, so I'm picking this up on his behalf.

I put your question to the customer, and here is his response:

=======

We did not manually rotate the cluster certificates. To do this, as documented by Red Hat, is very dangerous as every single POD has to be restarted on the cluster.

The OpenShift code/process rotated them. It did this because the 13 month point had been reached. It should make new certificates with 26 months to go at this point and allow the old certificates to work. This is what is documented.

The actual rotation, from the certificates' point of view alone, only half worked. A new certificate was issued with a 26-month expiration date, BUT the old certificate (now with 13 months left) was no longer available. Some Red Hat products / services seemed to get the new 26-month certificate. Some did not. The 13-month certificate seems to disappear completely. This meant things were broken.

We then had to react to all of this as described in the other tickets referenced in my first update on this ticket, here -> https://access.redhat.com/support/cases/#/case/03607506/discussion?commentId=a0a6R00000VMgWyQAL

I would strongly suggest Engineering do the following to re-produce this.

1. set-up a cluster. Keep the version consistent with versions described in previous tickets.
2. Check when the internal OpenShift managed certificates will expire. The self signer etc. See first entry in this ticket for details of what to check and how.
3. Install ODF + Noobaa S3. Keep the version consistent with versions described in previous tickets.
4. Install Loki. Make sure it uses Noobaa S3 as its storage backend. Keep the version consistent with versions described in previous tickets.
5. Install Quay. Make sure it uses Noobaa S3 as its storage backend. Keep the version consistent with versions described in previous tickets.
6. Check all components are functioning correctly. Do some detailed checks, not just log into the console.
7. Move the clock on the cluster forward to 1 (maybe 2) days before the self signer expires.
8. Allow time to pass so that self signer etc has to get rotated.
9. Check all OpenShift Services that depend on internally managed certs (self signer etc).
10. Check Loki. Can it write to Noobaa S3 storage? Can it read?
11. Check Quay. Can it write to Noobaa S3 storage? Can it read?
12. Fully review the two tickets mentioned when I opened this ticket to see the details of the problem. Is the problem happening now?
13. Make new test processes to cover the scenarios described in tickets referenced when opening this ticket.

At this stage DAFM feel very much like beta testers for a lot of this. There seems to be little / no testing of the interoperation of the various Red Hat products running on the cluster. This is frankly an anti-DevOps pattern, as each product gets thrown over the wall without testing how it interacts with its dependencies above or below it. For something as key to the cluster as internally managed certificates to not work with other products like NooBaa and/or Loki is not good. OpenShift is dubbed Kubernetes for the Enterprise, but these bugs are not consistent with that message.

If engineering have any more questions, please share them in this ticket, BUT I would strongly suggest they very carefully read the tickets referenced when I opened this ticket. The full detail of the problems occurring and their ultimate workaround are described in the previously referenced tickets. To be clear, having to deal with this bug and use the workaround has cost us a lot of downtime of parts of the cluster and time/effort in writing tickets.

==========

Obviously there is a lot of history here, hence the tone of the response. I've found a MG under case https://access.redhat.com/support/cases/#/case/03586273
and the other case referenced is:
https://access.redhat.com/support/cases/#/case/03595850

Is there anything else you need from the customer at this stage??

Cheers

Sean

--- Additional comment from Amit Prinz Setter on 2023-09-14 11:51:40 UTC ---

Hi,

> Is there anything else you need from the customer at this stage??
No, I'm good.

thanks,
Amit

--- Additional comment from Eran Tamir on 2023-10-26 09:52:07 UTC ---

@dzaken Any ETA for that?

--- Additional comment from Danny on 2023-10-29 19:26:33 UTC ---

@etamir, update the BZ. It is merged to master and will be fixed in 4.15.

--- Additional comment from Eran Tamir on 2023-11-05 05:36:12 UTC ---

ACK

--- Additional comment from RHEL Program Management on 2024-01-09 07:24:32 UTC ---

This BZ is being approved for ODF 4.15.0 release, upon receipt of the 3 ACKs (PM,Devel,QA) for the release flag 'odf‑4.15.0'.

--- Additional comment from RHEL Program Management on 2024-01-09 07:24:32 UTC ---

Since this bug has been approved for ODF 4.15.0 release, through release flag 'odf-4.15.0+', the Target Release is being set to 'ODF 4.15.0'.

--- Additional comment from Sunil Kumar Acharya on 2024-01-11 08:23:21 UTC ---

Please update the requires_doc_text(RDT) flag/text appropriately.

--- Additional comment from Andy Bartlett on 2024-01-19 16:57:25 UTC ---

Will this fix be backported into older versions of ODF?

Many thanks,

Andy

--- Additional comment from Nimrod Becker on 2024-01-23 09:10:38 UTC ---

@eran

--- Additional comment from Eran Tamir on 2024-01-23 09:52:23 UTC ---

Yes, we will backport it to 4.14 which is EUS.

--- Additional comment from Tiffany Nguyen on 2024-02-05 23:49:33 UTC ---

I followed the steps below to create a certificate and check whether the new certificate is rotated in with the current build:

1. Create new certificate:
$ openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 3650 -nodes -subj "/C=XX/ST=StateName/L=CityName/O=CompanyName/OU=CompanySectionName/CN=CommonNameOrHostname"

2. Create a new secret:
$ kubectl create secret generic noobaa-s3-serving-cert --from-file=cert.pem --from-file=key.pem

3. Check the certificate currently in use:
$ openssl s_client -connect localhost:10443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout

I am seeing an issue reading the certificate on the cluster:
$ openssl s_client -connect localhost:10443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout
Warning: Reading certificate from stdin since no -in or -new option is given
Could not find certificate from <stdin>
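
The "Could not find certificate from <stdin>" message is what openssl x509 prints when it finds no certificate in its input. Because 2>/dev/null discards the errors from s_client itself, a failed connection (for example, nothing listening on the chosen host and port) would leave the pipeline empty. That is one plausible explanation here, not a confirmed one. A small guard, sketched below with a made-up function name, makes that case visible instead of passing empty input along:

```shell
# Hypothetical helper: pass stdin through unchanged if it contains at
# least one PEM certificate header; otherwise report why the pipeline
# is empty and fail.
require_pem() {
  local data
  data=$(cat)
  case "$data" in
    *"-----BEGIN CERTIFICATE-----"*)
      printf '%s\n' "$data"
      ;;
    *)
      echo "no certificate received; check the host, the port, and where the command is run from" >&2
      return 1
      ;;
  esac
}
```

It can be placed between the s_client stage and the decoder, for example `openssl s_client -connect HOST:PORT -showcerts 2>/dev/null </dev/null | require_pem | openssl x509 -text -noout`, where HOST:PORT is a placeholder for the address being tested.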

--- Additional comment from Sunil Kumar Acharya on 2024-02-08 06:29:19 UTC ---

As we are approaching RC [on 26-Feb-2024], We need to get this release approved issue fixed before 15-Feb-2024 otherwise it poses risk to the ODF-4.15 release. If you foresee risk in fixing and moving this bz to MODIFIED state before 15-Feb-2024, please let me know.

--- Additional comment from Jacky Albo on 2024-02-08 08:29:56 UTC ---

Hi Tiffany,

Can you explain which addresses you are using, and maybe share a must-gather? (I'm not sure what is listening on port 10443 or what localhost is in your case.)
We need to validate that the new certificate was updated in the pod, so you can check the certificate under /etc/s3-secret in the endpoint pod.

In short, you can follow the steps here:
https://github.com/noobaa/noobaa-core/pull/7502, under Testing Instructions

Thanks,
Jacky

--- Additional comment from Tiffany Nguyen on 2024-02-08 17:40:45 UTC ---

Hi Danny,

Here is the link to must-gather log: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2237903/

I can see tls.crt and tls.key in /etc/s3-secret/ in the endpoint pod. But when I execute the command to query the certificate currently in use, it reports that it can't find a certificate:

$ openssl s_client -connect localhost:10443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout
Warning: Reading certificate from stdin since no -in or -new option is given
Could not find certificate from <stdin>

--- Additional comment from Jacky Albo on 2024-02-13 11:02:33 UTC ---

I don't understand the addresses or where you ran the command from. Would you be able to share a live cluster for examination?
Thanks

--- Additional comment from Tiffany Nguyen on 2024-02-13 23:16:19 UTC ---


Following the steps provided from Danny: https://github.com/noobaa/noobaa-core/pull/7502

1. Read the certificate currently in use by connecting from a remote host:
$ openssl s_client -connect s3-openshift-storage.apps.tunguyen-i212.ibmcloud2.qe.rh-ocs.com:443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout

2. Create new certificate:
$ openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 3650 -nodes -subj "/C=XX/ST=StateName/L=CityName/O=CompanyName/OU=CompanySectionName/CN=CommonNameOrHostname"

3. Delete current secret "noobaa-s3-serving-cert"
$ oc delete secret noobaa-s3-serving-cert

*** The certificate is automatically recreated right after it is deleted, so I had to scale noobaa-operator down to 0 and also delete the S3 service before creating a new secret.

4. Create new secret:
$ oc create secret generic noobaa-s3-serving-cert --from-file=tls.crt --from-file=tls.key 

*** After the secret is created, scale noobaa-operator back up to 1 and the S3 service will be started.
*** During this time, the noobaa-endpoint pod is restarting...

5. Check for the new secret in the noobaa-endpoint pod: new files are created in the endpoint pod's /etc/s3-secret.
6. Run openssl -showcerts inside the noobaa-endpoint pod: it has the new certificate.

The question here is whether scaling noobaa-operator down and up, to ensure that the secret "noobaa-s3-serving-cert" gets created with the new cert, is valid. It also triggers the endpoint pod to restart, which doesn't seem to be the same scenario as the bug description.
Please advise.

--- Additional comment from Tiffany Nguyen on 2024-02-13 23:19:11 UTC ---

Hi Jacky,

A live cluster is available here: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/33676/

--- Additional comment from Jacky Albo on 2024-02-15 08:58:23 UTC ---

Hi Tiffany,

Something is not OK with the cluster: I don't see the secret noobaa-s3-serving-cert, and for some reason it is not being recreated by OCP.
From what I read here, you don't need to kill the operator or delete the service. Just follow this: https://docs.openshift.com/container-platform/4.14/security/certificates/service-serving-certificate.html#rotate-service-serving_service-serving-certificate
Can you try just deleting the certificate and checking that a new one is created to replace it, as stated in the doc?
Then go ahead and validate the certs, I ran this from the endpoint pod:
openssl s_client -connect localhost:6443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout

let me know if that works for you,
Jacky
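
Because the replacement secret is created asynchronously and the files mounted in the pod are refreshed with some delay, a check run immediately after deleting the secret may still see the old certificate. A polling helper along the following lines can wait for an observed value to change. The function name, the one-second interval, and the idea of passing in a fingerprint command are assumptions made for illustration only:

```shell
# wait_for_change TIMEOUT_SECONDS BASELINE COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND about once per second until its
# output is non-empty and differs from BASELINE, or until roughly
# TIMEOUT_SECONDS attempts have been made. Prints the new value on
# success and returns non-zero on timeout.
wait_for_change() {
  local remaining="$1" baseline="$2" current
  shift 2
  while [ "$remaining" -gt 0 ]; do
    current=$("$@" 2>/dev/null) || current=""
    if [ -n "$current" ] && [ "$current" != "$baseline" ]; then
      printf '%s\n' "$current"
      return 0
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
  return 1
}
```

A possible use is `wait_for_change 300 "$old_fingerprint" served_fingerprint`, where served_fingerprint is a user-defined function (not shown) that wraps the s_client pipeline above and ends in `openssl x509 -noout -fingerprint -sha256`.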

--- Additional comment from Jacky Albo on 2024-02-18 09:00:19 UTC ---

Hi Tiffany,

I ran the procedure on your cluster and it works for me. Before I had a certificate with expiry of 12.2. 
After deleting the secret, a new one was created and when running the command I sent you before, I see I get a new date:

>openssl s_client -connect localhost:6443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout
>Version: 3 (0x2)
 ...
> Validity
>     Not Before: Feb 18 08:45:47 2024 GMT
>     Not After : Feb 17 08:45:48 2026 GMT

let me know what you think.

--- Additional comment from Nimrod Becker on 2024-02-20 07:29:41 UTC ---

Based on the latest comments and a DM with Karthick, moving back to ON_QE.

--- Additional comment from Tiffany Nguyen on 2024-02-20 22:56:20 UTC ---

Verified with build 4.15.0-144. After deleting the secret "noobaa-s3-serving-cert", a new secret is created and the certificate is rotated and updated in the noobaa-endpoint pod. The update was checked using the command below:

$ openssl s_client -connect localhost:6443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout

...
<             Not Before: Feb 20 03:09:18 2024 GMT
<             Not After : Feb 19 03:09:19 2026 GMT
---
>             Not Before: Feb 20 22:43:16 2024 GMT
>             Not After : Feb 19 22:43:17 2026 GMT
...
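
The diff above shows the validity window moving forward after the rotation. The same comparison can be scripted by converting the Not Before values to epoch seconds. This is a sketch that assumes GNU date (the -d option used here is not available in BSD date), and the function name is made up for this note:

```shell
# cert_field_epoch FIELD
# Hypothetical helper: reads `openssl x509 -text` output (or a diff of
# it) on stdin and prints the first "Not Before" or "Not After" value
# as seconds since the Unix epoch. Requires GNU date.
cert_field_epoch() {
  local field="$1" value
  # Strip leading whitespace and diff markers, then the field label.
  value=$(sed -n "s/^[[:space:]<>]*${field}[[:space:]]*:[[:space:]]*//p" | head -n 1)
  [ -n "$value" ] && date -u -d "$value" +%s
}
```

For example, saving the decoded certificate before and after the rotation to old.txt and new.txt, the check `[ "$(cert_field_epoch 'Not Before' < new.txt)" -gt "$(cert_field_epoch 'Not Before' < old.txt)" ]` succeeds only if the served certificate was issued later than the previous one.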

--- Additional comment from Giovanni Luca Izzi on 2024-02-28 11:05:45 UTC ---

Hi team, 

This issue is critical for a customer of mine. Please, we need it backported down to OCP 4.12.

--- Additional comment from errata-xmlrpc on 2024-03-06 05:09:49 UTC ---

This bug has been added to advisory RHSA-2023:118688 by Deepshikha Khandelwal (dkhandel)

--- Additional comment from RHEL Program Management on 2024-03-07 10:59:46 UTC ---

This bug having no release flag set previously, is now set with release flag 'odf‑4.15.0' to '?', and so is being proposed to be fixed at the ODF 4.15.0 release. Note that the 3 Acks (pm_ack, devel_ack, qa_ack), if any previously set while release flag was missing, have now been reset since the Acks are to be set against a release flag.

--- Additional comment from RHEL Program Management on 2024-03-07 10:59:46 UTC ---

The 'Target Release' is not to be set manually at the Red Hat OpenShift Data Foundation product.

The 'Target Release' will be auto set appropriately, after the 3 Acks (pm,devel,qa) are set to "+" for a specific release flag and that release flag gets auto set to "+".

Comment 10 Sunil Kumar Acharya 2024-06-12 14:03:46 UTC
Please backport the fix to ODF-4.12 and update the RDT flag/text appropriately.

