Bug 2017485

Summary: [DR] x509: certificate signed by unknown authority when using MCG S3 store
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Sidhant Agrawal <sagrawal>
Component: odf-dr Assignee: Shyamsundar <srangana>
odf-dr sub component: ramen QA Contact: Aviad Polak <apolak>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: high    
Priority: unspecified CC: aclewett, apolak, bniver, ebenahar, jelopez, madam, mmuench, muagarwa, nberry, ocs-bugs, odf-bz-bot, olakra, sostapov, srangana, tmuthami
Version: 4.9   
Target Milestone: ---   
Target Release: ODF 4.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-01-07 17:46:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Sidhant Agrawal 2021-10-26 15:21:15 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
In a DR cluster configured with an MCG S3 store that uses a self-signed certificate, attempts to use the S3 store fail with "x509: certificate signed by unknown authority".

Error messages from the ramen-dr-cluster-operator pod:
...
2021-10-26T11:05:11.882Z	INFO	controllers.VolumeReplicationGroup.vrginstance	controllers/volumereplicationgroup_controller.go:841	Restoring PVs to this managed cluster. ProfileList: [s3-on-east s3-on-west]	{"VolumeReplicationGroup": "busybox-sagrawal-c1-simple/busybox-drpc", "State": "primary"}
time="2021-10-26T11:05:11Z" level=info msg="loading Ramen config file name/config/ramen_manager_config.yaml" source="ramenconfig.go:76"
2021-10-26T11:05:12.262Z	ERROR	controllers.VolumeReplicationGroup.vrginstance	controllers/volumereplicationgroup_controller.go:841	error fetching PV cluster data from S3 profile s3-on-east	{"VolumeReplicationGroup": "busybox-sagrawal-c1-simple/busybox-drpc", "State": "primary", "error": "unable to download: busybox-sagrawal-c1-simple-busybox-drpc, unable to ListKeys of type v1.PersistentVolume from endpoint https://s3-openshift-storage.apps.sagrawal-c1.qe.rh-ocs.com bucket busybox-sagrawal-c1-simple-busybox-drpc keyPrefix v1.PersistentVolume/, failed to list objects in bucket busybox-sagrawal-c1-simple-busybox-drpc:v1.PersistentVolume/, RequestError: send request failed\ncaused by: Get \"https://s3-openshift-storage.apps.sagrawal-c1.qe.rh-ocs.com/busybox-sagrawal-c1-simple-busybox-drpc?list-type=2&prefix=v1.PersistentVolume%2F\": x509: certificate signed by unknown authority"}
github.com/ramendr/ramen/controllers.(*VRGInstance).processAsPrimary
	/remote-source/app/controllers/volumereplicationgroup_controller.go:841
github.com/ramendr/ramen/controllers.(*VRGInstance).processVRGActions
	/remote-source/app/controllers/volumereplicationgroup_controller.go:388
github.com/ramendr/ramen/controllers.(*VRGInstance).processVRG
	/remote-source/app/controllers/volumereplicationgroup_controller.go:376
github.com/ramendr/ramen/controllers.(*VolumeReplicationGroupReconciler).Reconcile
	/remote-source/app/controllers/volumereplicationgroup_controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:214
time="2021-10-26T11:05:12Z" level=info msg="loading Ramen config file name/config/ramen_manager_config.yaml" source="ramenconfig.go:76"
2021-10-26T11:05:12.614Z	ERROR	controllers.VolumeReplicationGroup.vrginstance	controllers/volumereplicationgroup_controller.go:841	error fetching PV cluster data from S3 profile s3-on-west	{"VolumeReplicationGroup": "busybox-sagrawal-c1-simple/busybox-drpc", "State": "primary", "error": "unable to download: busybox-sagrawal-c1-simple-busybox-drpc, unable to ListKeys of type v1.PersistentVolume from endpoint https://s3-openshift-storage.apps.sagrawal-c2.qe.rh-ocs.com bucket busybox-sagrawal-c1-simple-busybox-drpc keyPrefix v1.PersistentVolume/, failed to list objects in bucket busybox-sagrawal-c1-simple-busybox-drpc:v1.PersistentVolume/, RequestError: send request failed\ncaused by: Get \"https://s3-openshift-storage.apps.sagrawal-c2.qe.rh-ocs.com/busybox-sagrawal-c1-simple-busybox-drpc?list-type=2&prefix=v1.PersistentVolume%2F\": x509: certificate signed by unknown authority"}
...


Version of all relevant components (if applicable):
ODF 4.9.0-202.ci

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes. The DR feature is affected by this issue: DR protection does not work.

Is there any workaround available to the best of your knowledge?
Yes
As a workaround, add the CA certificates of the managed clusters to all the clusters (hub and managed clusters).
More details are in Additional info.

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:
NA

Steps to Reproduce:
1. Configure a test environment for ODR using MCG S3 stores.
2. Create a sample busybox application and observe that the buckets are not created.
3. Observe the ramen-dr-cluster-operator logs in the managed cluster.
"x509: certificate signed by unknown authority" messages are present in the logs.


Actual results:
S3 requests fail with "x509: certificate signed by unknown authority" when ODR is configured with MCG S3 stores.

Expected results:
ODR should work with MCG S3 stores that use self-signed certificates.
There should be an option to ignore self-signed certificates.
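
To illustrate the failure and the two possible remedies (trusting the CA versus skipping verification), here is a minimal, self-contained Go sketch. It is not Ramen's code: a local httptest TLS server stands in for the MCG S3 endpoint, and the helper names fetch, isUnknownAuthority and runDemo are hypothetical. Note that Go's InsecureSkipVerify disables all certificate verification, which is broader than only tolerating self-signed certificates.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetch issues a GET with the given trust settings and returns the HTTP status.
func fetch(url string, roots *x509.CertPool, skipVerify bool) (int, error) {
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: roots, InsecureSkipVerify: skipVerify},
	}}
	defer client.CloseIdleConnections()

	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// isUnknownAuthority reports whether err wraps x509.UnknownAuthorityError.
func isUnknownAuthority(err error) bool {
	var uae x509.UnknownAuthorityError
	return errors.As(err, &uae)
}

// runDemo starts a TLS server with a self-signed certificate (a stand-in for
// the S3 endpoint) and calls it with three different trust configurations.
func runDemo() (untrustedErr error, trustedStatus, insecureStatus int, err error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	// 1. Empty trust store: the certificate cannot be chained to a known CA.
	_, untrustedErr = fetch(srv.URL, x509.NewCertPool(), false)

	// 2. Trust store that contains the server's certificate (the workaround).
	roots := x509.NewCertPool()
	roots.AddCert(srv.Certificate())
	if trustedStatus, err = fetch(srv.URL, roots, false); err != nil {
		return
	}

	// 3. Verification skipped entirely (the requested "ignore" option).
	insecureStatus, err = fetch(srv.URL, nil, true)
	return
}

func main() {
	untrustedErr, trusted, insecure, err := runDemo()
	if err != nil {
		fmt.Println("unexpected failure:", err)
		return
	}
	fmt.Println("empty trust store -> unknown authority:", isUnknownAuthority(untrustedErr))
	fmt.Println("CA added to trust store -> status:", trusted)
	fmt.Println("verification skipped -> status:", insecure)
}
```

The test server may log the rejected handshake from case 1 to stderr; that is expected.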


Additional info:

Workaround details:
1) Extract the CA certificates from the managed clusters.
I extracted the ca-bundle.crt contents from the default-ingress-cert ConfigMap on both managed clusters:
oc get cm default-ingress-cert -n openshift-config-managed -o jsonpath="{['data']['ca-bundle\.crt']}"
2) Follow the instructions given in [1].
3) Follow the instructions given in [2].
In this step, edit the "odr-cluster-operator.v4.9.0" CSV instead of the pod deployment.
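
Before injecting the extracted bundles, it can help to confirm that the combined ca-bundle.crt contents from both managed clusters parse cleanly. The Go sketch below shows one way to do that, under stated assumptions: throwaway certificates generated in-process stand in for the real cluster bundles, and the names parseBundle, selfSignedPEM, ingress-c1 and ingress-c2 are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// parseBundle decodes every CERTIFICATE block found in a PEM bundle.
func parseBundle(bundle []byte) ([]*x509.Certificate, error) {
	var certs []*x509.Certificate
	for {
		var block *pem.Block
		block, bundle = pem.Decode(bundle)
		if block == nil {
			break // no further PEM blocks
		}
		if block.Type != "CERTIFICATE" {
			continue // skip keys or other PEM content
		}
		cert, err := x509.ParseCertificate(block.Bytes)
		if err != nil {
			return nil, fmt.Errorf("parsing certificate %d: %w", len(certs)+1, err)
		}
		certs = append(certs, cert)
	}
	if len(certs) == 0 {
		return nil, errors.New("no CERTIFICATE blocks found")
	}
	return certs, nil
}

// selfSignedPEM creates a throwaway self-signed CA certificate (demo input only).
func selfSignedPEM(commonName string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: commonName},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	// Stand-ins for the ca-bundle.crt contents of the two managed clusters.
	c1, err := selfSignedPEM("ingress-c1")
	if err != nil {
		fmt.Println("generate:", err)
		return
	}
	c2, err := selfSignedPEM("ingress-c2")
	if err != nil {
		fmt.Println("generate:", err)
		return
	}

	merged := bytes.Join([][]byte{c1, c2}, []byte("\n"))
	certs, err := parseBundle(merged)
	if err != nil {
		fmt.Println("bundle is not usable:", err)
		return
	}
	for _, cert := range certs {
		fmt.Printf("subject=%s notAfter=%s\n",
			cert.Subject.CommonName, cert.NotAfter.Format(time.RFC3339))
	}
}
```

With real data, the generated certificates would be replaced by the output of the oc command from step 1, run against each managed cluster.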


[1] https://docs.openshift.com/container-platform/4.9/networking/configuring-a-custom-pki.html#nw-proxy-configure-object_configuring-a-custom-pki
[2] https://docs.openshift.com/container-platform/4.9/networking/configuring-a-custom-pki.html#certificate-injection-using-operators_configuring-a-custom-pki