This is fixed in 4.8, so I'm setting the target to 4.7.z and removing the blocker flag. Randy, Mudit, are you good with that?
(In reply to Sébastien Han from comment #4)
> This is fixed in 4.8, so I'm setting the target to 4.7.z and removing the
> blocker flag.
> Randy, Mudit, are you good with that?

Moving to POST since this is fixed in 4.8 / upstream.
The upstream/4.8 fix is this PR: https://github.com/rook/rook/pull/7374. I do not see any other related changes to this issue. I've created the downstream PR for 4.7, ready to merge when the BZ is fully acked. https://github.com/openshift/rook/pull/254
Please add doc text
LGTM.
The OSDs first failed to initialize, and only after 35 minutes did they come up with encryption working. Since the failed OSD pods were replaced, I don't have logs from them. I will deploy a new cluster soon and capture logs from the failing pods.
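As a side note, if a failing pod is still around, the init container log can usually be pulled before the pod is replaced. A rough sketch, assuming the openshift-storage namespace and the encryption-kms-get-kek init container name seen on this cluster (substitute the actual OSD pod name):

oc -n openshift-storage logs <osd-pod-name> -c encryption-kms-get-kek --previous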
Events:
  Type     Reason                 Age                  From               Message
  ----     ------                 ----                 ----               -------
  Normal   Scheduled              2m45s                default-scheduler  Successfully assigned openshift-storage/rook-ceph-osd-0-749f8ddbc8-s7sfq to ip-10-0-135-29.us-east-2.compute.internal
  Normal   SuccessfulMountVolume  2m44s                kubelet            MapVolume.MapPodDevice succeeded for volume "pvc-a57a35c5-3758-4001-a3cf-2fb122a3b4bc" globalMapPath "/var/lib/kubelet/plugins/kubernetes.io/aws-ebs/volumeDevices/aws:/us-east-2a/vol-0f7e552d0a9953032"
  Normal   SuccessfulMountVolume  2m44s                kubelet            MapVolume.MapPodDevice succeeded for volume "pvc-a57a35c5-3758-4001-a3cf-2fb122a3b4bc" volumeMapPath "/var/lib/kubelet/pods/19692ab7-841b-454b-8e8d-a930a39116bf/volumeDevices/kubernetes.io~aws-ebs"
  Normal   AddedInterface         2m43s                multus             Add eth0 [10.131.0.76/23]
  Normal   Pulled                 2m42s                kubelet            Container image "quay.io/rhceph-dev/rhceph@sha256:725f93133acc0fb1ca845bd12e77f20d8629cad0e22d46457b2736578698eb6c" already present on machine
  Normal   Created                2m42s                kubelet            Created container blkdevmapper
  Normal   Started                2m42s                kubelet            Started container blkdevmapper
  Normal   Pulled                 2m (x4 over 2m41s)   kubelet            Container image "quay.io/rhceph-dev/rhceph@sha256:725f93133acc0fb1ca845bd12e77f20d8629cad0e22d46457b2736578698eb6c" already present on machine
  Normal   Created                2m (x4 over 2m41s)   kubelet            Created container encryption-kms-get-kek
  Normal   Started                2m (x4 over 2m41s)   kubelet            Started container encryption-kms-get-kek
  Warning  BackOff                80s (x8 over 2m39s)  kubelet            Back-off restarting failed container

oc logs rook-ceph-osd-0-749f8ddbc8-s7sfq -c encryption-kms-get-kek
no encryption key rook-ceph-osd-encryption-key-ocs-deviceset-gp2-0-data-0q6whj present in vault ["Invalid path for a versioned K/V secrets engine. See the API docs for the appropriate API endpoints to use. If using the Vault CLI, use 'vault kv get' for this operation."]

rook-ceph-osd-0-749f8ddbc8-s7sfq   0/2   Init:CrashLoopBackOff   4   2m25s
rook-ceph-osd-1-55b97c64fc-fb2g6   0/2   Init:CrashLoopBackOff   4   2m18s
rook-ceph-osd-2-6b8c95f8d5-tr2g8   0/2   Init:CrashLoopBackOff   4   2m15s
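The Vault error above ("Invalid path for a versioned K/V secrets engine ... use 'vault kv get'") is the message Vault typically returns when a KV v2 (versioned) mount is read through a v1-style path. A quick manual check against the Vault instance, assuming the default "secret" mount and the key name from the log (adjust both to the actual setup):

# v1-style read; against a KV v2 mount this fails with the error seen above
vault read secret/rook-ceph-osd-encryption-key-ocs-deviceset-gp2-0-data-0q6whj

# v2-style read, as the error message itself suggests
vault kv get secret/rook-ceph-osd-encryption-key-ocs-deviceset-gp2-0-data-0q6whj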
Tested another install, and it took 10 minutes before the cluster was up. Attached the rook log above as "rook log 10 minutes".
Just to clarify, the reason we don't see this in 4.8 is that in 4.8 Rook cancels any ongoing orchestration on a CR update. This is not the case in 4.7, so the operator runs the provisioning sequence once, times out, and then retries successfully. Hence the 10-minute wait.
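For anyone reproducing this on 4.7, the retry can be observed by tailing the operator log and watching the OSD pods come up on the second orchestration pass. The namespace and label below are the usual OCS/Rook defaults, so adjust if the deployment differs:

oc -n openshift-storage logs deploy/rook-ceph-operator -f
oc -n openshift-storage get pods -l app=rook-ceph-osd -w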
https://bugzilla.redhat.com/show_bug.cgi?id=1977609 was raised to handle 4.7.3. See also https://access.redhat.com/solutions/6150022.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.7.2 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2632