Description
Sidhant Agrawal
2021-10-26 15:21:15 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
In a DR cluster configured with an MCG S3 store that uses a self-signed certificate, attempts to use the S3 store fail with "x509: certificate signed by unknown authority".
Error messages from the ramen-dr-cluster-operator pod:
...
2021-10-26T11:05:11.882Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/volumereplicationgroup_controller.go:841 Restoring PVs to this managed cluster. ProfileList: [s3-on-east s3-on-west] {"VolumeReplicationGroup": "busybox-sagrawal-c1-simple/busybox-drpc", "State": "primary"}
time="2021-10-26T11:05:11Z" level=info msg="loading Ramen config file name/config/ramen_manager_config.yaml" source="ramenconfig.go:76"
2021-10-26T11:05:12.262Z ERROR controllers.VolumeReplicationGroup.vrginstance controllers/volumereplicationgroup_controller.go:841 error fetching PV cluster data from S3 profile s3-on-east {"VolumeReplicationGroup": "busybox-sagrawal-c1-simple/busybox-drpc", "State": "primary", "error": "unable to download: busybox-sagrawal-c1-simple-busybox-drpc, unable to ListKeys of type v1.PersistentVolume from endpoint https://s3-openshift-storage.apps.sagrawal-c1.qe.rh-ocs.com bucket busybox-sagrawal-c1-simple-busybox-drpc keyPrefix v1.PersistentVolume/, failed to list objects in bucket busybox-sagrawal-c1-simple-busybox-drpc:v1.PersistentVolume/, RequestError: send request failed\ncaused by: Get \"https://s3-openshift-storage.apps.sagrawal-c1.qe.rh-ocs.com/busybox-sagrawal-c1-simple-busybox-drpc?list-type=2&prefix=v1.PersistentVolume%2F\": x509: certificate signed by unknown authority"}
github.com/ramendr/ramen/controllers.(*VRGInstance).processAsPrimary
/remote-source/app/controllers/volumereplicationgroup_controller.go:841
github.com/ramendr/ramen/controllers.(*VRGInstance).processVRGActions
/remote-source/app/controllers/volumereplicationgroup_controller.go:388
github.com/ramendr/ramen/controllers.(*VRGInstance).processVRG
/remote-source/app/controllers/volumereplicationgroup_controller.go:376
github.com/ramendr/ramen/controllers.(*VolumeReplicationGroupReconciler).Reconcile
/remote-source/app/controllers/volumereplicationgroup_controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:214
time="2021-10-26T11:05:12Z" level=info msg="loading Ramen config file name/config/ramen_manager_config.yaml" source="ramenconfig.go:76"
2021-10-26T11:05:12.614Z ERROR controllers.VolumeReplicationGroup.vrginstance controllers/volumereplicationgroup_controller.go:841 error fetching PV cluster data from S3 profile s3-on-west {"VolumeReplicationGroup": "busybox-sagrawal-c1-simple/busybox-drpc", "State": "primary", "error": "unable to download: busybox-sagrawal-c1-simple-busybox-drpc, unable to ListKeys of type v1.PersistentVolume from endpoint https://s3-openshift-storage.apps.sagrawal-c2.qe.rh-ocs.com bucket busybox-sagrawal-c1-simple-busybox-drpc keyPrefix v1.PersistentVolume/, failed to list objects in bucket busybox-sagrawal-c1-simple-busybox-drpc:v1.PersistentVolume/, RequestError: send request failed\ncaused by: Get \"https://s3-openshift-storage.apps.sagrawal-c2.qe.rh-ocs.com/busybox-sagrawal-c1-simple-busybox-drpc?list-type=2&prefix=v1.PersistentVolume%2F\": x509: certificate signed by unknown authority"}
...
Version of all relevant components (if applicable):
ODF 4.9.0-202.ci
Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes, the DR feature is affected by this.
DR protection will not work.
Is there any workaround available to the best of your knowledge?
Yes
As a workaround, we can add the CA certificates of the managed clusters to all the clusters (hub and managed clusters).
More details are in Additional info.
Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3
Is this issue reproducible?
Yes
Can this issue be reproduced from the UI?
If this is a regression, please provide more details to justify this:
NA
Steps to Reproduce:
1. Configure a test environment for ODR using MCG S3 stores.
2. Create a sample busybox application and observe that the buckets are not created.
3. Observe the ramen-dr-cluster-operator logs in the managed cluster.
"x509: certificate signed by unknown authority" messages will be present in the logs.
Actual results:
S3 requests fail with "x509: certificate signed by unknown authority" when ODR is configured with MCG S3 stores.
Expected results:
ODR should work with MCG S3 stores that use self-signed certificates.
There should be an option to ignore self-signed certificates.
Additional info:
Workaround details:
1) Extract the CA certificates from the managed clusters.
I extracted the ca-bundle.crt data from the default-ingress-cert ConfigMap on both managed clusters:
oc get cm default-ingress-cert -n openshift-config-managed -o jsonpath="{['data']['ca-bundle\.crt']}"
2) Follow the instructions given in [1].
3) Follow the instructions given in [2].
In this step, we need to edit the "odr-cluster-operator.v4.9.0" CSV instead of the pod deployment.
[1] https://docs.openshift.com/container-platform/4.9/networking/configuring-a-custom-pki.html#nw-proxy-configure-object_configuring-a-custom-pki
[2] https://docs.openshift.com/container-platform/4.9/networking/configuring-a-custom-pki.html#certificate-injection-using-operators_configuring-a-custom-pki