Description of problem (please be as detailed as possible and provide log snippets):

When trying to sync objects as part of the `test_multiregion_mirror` test, the sync command *sometimes* fails with the following errors:

E Error is download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/random1.txt to temp/random1.txt An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/danny2.webm to temp/danny2.webm An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/danny.webm to temp/danny.webm An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/airbus.jpg to temp/airbus.jpg An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/random3.txt to temp/random3.txt An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/random6.txt to temp/random6.txt An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/random2.txt to temp/random2.txt An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/goldman.webm to temp/goldman.webm An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/random4.txt to temp/random4.txt An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/random7.txt to temp/random7.txt An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/random5.txt to temp/random5.txt An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/rome.jpg to temp/rome.jpg An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/random8.txt to temp/random8.txt An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/steve.webm to temp/steve.webm An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway
E download failed: s3://oc-bucket-583d3a79d59242aaaa24cacefa4d18ec/random9.txt to temp/random9.txt An error occurred (502) when calling the GetObject operation (reached max retries: 2): Bad Gateway

All occurrences seem to have happened in the same part of the test. One bucket with data on it uses two backingstores: one is blocked, the other is healthy. We unblock the blocked one, then block the healthy one, and then try to download data from the bucket. That is when the issue happens.

Version of all relevant components (if applicable):
OCP 4.5.0-0.nightly-2020-06-11-183238
OCS 4.5.0-448.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Partially. In all cases so far, an additional attempt to run the sync command was successful. However, this disrupts the user flow.

Is there any workaround available to the best of your knowledge?
Yes - retry the sync.
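Since the failure is transient and a second sync attempt succeeds, the workaround can be automated with a simple retry wrapper. This is a minimal sketch only; the `retry` helper and the simulated `flaky_sync` downloader are illustrative, not part of the test suite or the NooBaa tooling:

```python
import time


def retry(operation, attempts=3, delay=0.0):
    """Call `operation`, retrying on failure up to `attempts` times.

    Re-raises the last error if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except RuntimeError:
            if attempt == attempts:
                raise
            time.sleep(delay)


# Simulated sync: fails twice with a 502-style error, then succeeds,
# mimicking the behavior observed in this bug.
calls = {"n": 0}


def flaky_sync():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("502 Bad Gateway")
    return "synced"


result = retry(flaky_sync, attempts=3)
print(result)  # prints "synced"
```

In practice the wrapped callable would invoke the actual `aws s3 sync` command against the OBC's endpoint.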
Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
5

Is this issue reproducible?
Yes. I'd say it happens in about 1 of 3 runs. The easiest way to reproduce is by running `test_multiregion_mirror`.

Can this issue reproduce from the UI?
No

If this is a regression, please provide more details to justify this:
It is - we did not run into Bad Gateway errors in OCS 4.4.

Steps to Reproduce:
1. Create two AWS backingstores
2. Create a bucketclass that uses them as mirrors
3. Create an OBC that uses the bucketclass
4. Upload all objects in the AWS bucket `ocsci-test-files` to the bucket
5. Block all IO to one of the backingstores by applying this bucket policy on its target bucket:

{
  "Version": "2012-10-17",
  "Id": "DenyReadWrite",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": { "AWS": "*" },
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::BUCKETNAME/*",
        "arn:aws:s3:::BUCKETNAME"
      ]
    }
  ]
}

6. Download all objects from the NooBaa bucket and compare their hashes to the files from `ocsci-test-files`
7. Remove the bucket policy from the backingstore you blocked and place it on the other backingstore's target bucket
8. Download all objects from the NooBaa bucket

Actual results:
A Bad Gateway error is shown

Expected results:
Download is successful

Additional info:
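The hash comparison in step 6 can be sketched as below. This is a minimal illustration using the standard library; the helper names and the assumption that originals and downloads sit in two flat directories with matching file names are mine, not taken from the test code:

```python
import hashlib
from pathlib import Path


def md5_of(path):
    """Stream a file through MD5 so large objects are not read into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(original_dir, downloaded_dir):
    """Return names of files whose downloaded copy is missing or differs from the original."""
    mismatches = []
    for original in sorted(Path(original_dir).iterdir()):
        downloaded = Path(downloaded_dir) / original.name
        if not downloaded.exists() or md5_of(original) != md5_of(downloaded):
            mismatches.append(original.name)
    return mismatches
```

An empty return value means every downloaded object matched its source; any names returned indicate corrupted or missing downloads.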
The attached logs predate the extended log collection that was added; can we have the new logs?
We have not yet managed to reproduce this. The bug seems to be exclusive to 4.5, so we'll have to wait until 4.5 is deployable and widely tested again.
(In reply to Ben Eli from comment #7)
> We did not yet run into a reproduction, this bug seems to be a 4.5
> exclusive, so we'll have to wait until 4.5 is deployable and widely tested
> again.

NEEDINFO on reporter to reproduce.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3754