The bug blocks a lot of automation test runs and needs to be backported to all z streams. This bug is to track the backport for 4.12.z

+++ This bug was initially created as a clone of Bug #2246333 +++

The bug blocks a lot of automation test runs and needs to be backported to all z streams. This bug is to track the backport for 4.13.z

+++ This bug was initially created as a clone of Bug #2227835 +++

Description of problem (please be detailed as possible and provide log snippets):
----------------------------------------------------------------------------
In certain OCS-CI tests we still use the RPC API to list the objects in a bucket. Recent 4.13 regression analysis showed that these tests fail with the same error I got when trying to reproduce the issue on 4.14:

$ ~/ocs-ci/data/mcg-cli api object_api list_objects '{"bucket": "first.bucket"}' -ojson -n openshift-storage
INFO[0001] ✅ Exists: NooBaa "noobaa"
INFO[0001] ✅ Exists: Service "noobaa-mgmt"
INFO[0002] ✅ Exists: Secret "noobaa-operator"
INFO[0002] ✅ Exists: Secret "noobaa-admin"
INFO[0002] ✈️ RPC: object.list_objects() Request: map[bucket:first.bucket]
WARN[0002] RPC: GetConnection creating connection to wss://localhost:0/rpc/ 0xc000f37b60
INFO[0002] RPC: Connecting websocket (0xc000f37b60) &{RPC:0xc0002f86e0 Address:wss://localhost:0/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:0s cancelPings:<nil>}
ERRO[0002] RPC: closing connection (0xc000f37b60) &{RPC:0xc0002f86e0 Address:wss://localhost:0/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:0s cancelPings:<nil>}
WARN[0002] RPC: RemoveConnection wss://localhost:0/rpc/ current=0xc000f37b60 conn=0xc000f37b60
ERRO[0002] ⚠️ RPC: object.list_objects() Call failed: failed to WebSocket dial: failed to send handshake request: Get "https://localhost:0/rpc/": dial tcp [::1]:0: connect: can't assign requested address
FATA[0002] ❌ failed to WebSocket dial: failed to send handshake request: Get "https://localhost:0/rpc/": dial tcp [::1]:0: connect: can't assign requested address

Other RPC queries such as create_auth and read_system still work as expected in both 4.13 and 4.14.

Version of all relevant components (if applicable):
----------------------------------------------------------------------------
OC version:
Client Version: 4.12.0-ec.5
Kustomize Version: v4.5.7
Server Version: 4.14.0-0.nightly-2023-07-30-191504
Kubernetes Version: v1.27.3+4aaeaec

OCS version:
ocs-operator.v4.14.0-90.stable   OpenShift Container Storage   4.14.0-90.stable   Succeeded

Cluster version
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.14.0-0.nightly-2023-07-30-191504   True        False         5h49m   Cluster version is 4.14.0-0.nightly-2023-07-30-191504

Rook version:
rook: v4.14.0-0.a2658b13fd55bc922f3e2c00eb45fc03735ce8c2
go: go1.20.5

Ceph version:
ceph version 17.2.6-100.el9cp (ea4e3ef8df2cf26540aae06479df031dcfc80343) quincy (stable)

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
----------------------------------------------------------------------------
Yes, it's failing OCS-CI tests.

Is there any workaround available to the best of your knowledge?
----------------------------------------------------------------------------
Other methods of listing the bucket, such as "aws s3 ls s3://first.bucket"

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
----------------------------------------------------------------------------
1

Is this issue reproducible?
----------------------------------------------------------------------------
Yes

Can this issue reproduce from the UI?
----------------------------------------------------------------------------
No

If this is a regression, please provide more details to justify this:
----------------------------------------------------------------------------
Yes, the same error was not present in 4.12 regression runs.

Steps to Reproduce:
1. Run the following command via the MCG-CLI:
$ ~/ocs-ci/data/mcg-cli api object_api list_objects '{"bucket": "first.bucket"}' -ojson -n openshift-storage

Actual results:
----------------------------------------------------------------------------
The query fails and the objects are not listed

Expected results:
----------------------------------------------------------------------------
The bucket's list of objects

Additional info:
----------------------------------------------------------------------------
One of the TCs that shows this issue (specifically in check_if_mirroring_is_done):
https://github.com/red-hat-storage/ocs-ci/blob/master/tests/manage/mcg/test_multi_region.py

Example RP links:
https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/465/13108/599033/599034/599035/log
https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/465/13085/598040/598041/598042/log
https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/465/12808/587239/587252/587253/log

Example ocs-must-gather logs:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-001vu1cms33-t4a/j-001vu1cms33-t4a_20230702T223827/logs/failed_testcase_ocs_logs_1688341290/test_multiregion_mirror_ocs_logs/j-001vu1cms33-t4a/
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-027aikt1c33-t4a/j-027aikt1c33-t4a_20230628T101648/logs/failed_testcase_ocs_logs_1687950522/test_multiregion_mirror_ocs_logs/j-027aikt1c33-t4a/
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-223ai3c33-uo/j-223ai3c33-uo_20230706T071042/logs/failed_testcase_ocs_logs_1688630684/test_fill_bucket_ocs_logs/j-223ai3c33-uo/

--- Additional comment from RHEL Program Management on 2023-07-31 15:52:34 UTC ---

This bug having no release flag set previously, is now set with release flag 'odf-4.14.0' to '?', and so is being proposed to be fixed at the ODF 4.14.0 release. Note that the 3 Acks (pm_ack, devel_ack, qa_ack), if any previously set while release flag was missing, have now been reset since the Acks are to be set against a release flag.

--- Additional comment from RHEL Program Management on 2023-07-31 15:53:32 UTC ---

This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.

--- Additional comment from Red Hat Bugzilla on 2023-08-03 08:30:40 UTC ---

Account disabled by LDAP Audit

--- Additional comment from Sagi Hirshfeld on 2023-08-14 13:22:15 UTC ---

Apparently, this is also failing a couple of tier1 tests that cover NooBaa caching scenarios:
- https://github.com/red-hat-storage/ocs-ci/blob/5ef38b97a6c2f594ea7f07b64414c2f44eb83491/tests/manage/mcg/test_namespace_crd.py#L425-L498
- https://github.com/red-hat-storage/ocs-ci/blob/5ef38b97a6c2f594ea7f07b64414c2f44eb83491/tests/manage/mcg/test_namespace_crd.py#L500-L584

Raising the priority to "high" since this is blocking the coverage of these scenarios.

--- Additional comment from Elad on 2023-08-21 09:00:18 UTC ---

Multiple test scenarios are currently blocked and lack coverage. Therefore, setting the TestBlocker keyword.

--- Additional comment from Danny on 2023-08-21 09:55:42 UTC ---

hi @

--- Additional comment from Danny on 2023-08-21 09:59:48 UTC ---

Hi @shirshfe, what is the API that is failing with the same error? We have currently identified a few APIs (like list_objects) that are not supported by the "noobaa api" CLI. We are trying to resolve it, but this is not a regression. Existing tests should not fail with the same issue.
--- Additional comment from RHEL Program Management on 2023-08-22 07:40:37 UTC ---

This BZ is being approved for ODF 4.14.0 release, upon receipt of the 3 ACKs (PM,Devel,QA) for the release flag 'odf-4.14.0'

--- Additional comment from RHEL Program Management on 2023-08-22 07:40:37 UTC ---

Since this bug has been approved for ODF 4.14.0 release, through release flag 'odf-4.14.0+', the Target Release is being set to 'ODF 4.14.0'

--- Additional comment from errata-xmlrpc on 2023-08-23 06:20:16 UTC ---

This bug has been added to advisory RHBA-2023:115514 by ceph-build service account (ceph-build.COM)

--- Additional comment from Sagi Hirshfeld on 2023-08-23 09:06:32 UTC ---

Hi dzaken, the API that is failing in the additional TCs that I added in my previous comment is the same: object_api::list_objects.

At the start of this quarter we had to change the way we make RPC queries so that they use the MCG-CLI instead of HTTP calls, due to the deprecation of the old route. That would explain why existing tests have failed.

I falsely assumed that both methods ultimately interact with the underlying API in the same manner, hence the Regression keyword. I'll remove it now that I better understand the difference.

--- Additional comment from Sagi Hirshfeld on 2023-08-23 17:05:00 UTC ---

Verified on 4.14.0-114: all the above TCs passed when run locally.

--- Additional comment from Sunil Kumar Acharya on 2023-09-21 05:54:14 UTC ---

Please update the requires_doc_text(RDT) flag/text appropriately.
--- Additional comment from Mahesh Shetty on 2023-10-04 13:14:23 UTC ---

As discussed, this issue still exists in the ODF 4.14-139 build:

$ noobaa api object_api list_objects '{"bucket": "oc-bucket-25c5ab874d9043aea4fc2a41a5fb16"}' -ojson -n openshift-storage
INFO[0003] ✅ Exists: NooBaa "noobaa"
INFO[0003] ✅ Exists: Service "noobaa-mgmt"
INFO[0004] ✅ Exists: Secret "noobaa-operator"
INFO[0004] ✅ Exists: Secret "noobaa-admin"
INFO[0006] ✈️ RPC: object.list_objects() Request: map[bucket:oc-bucket-25c5ab874d9043aea4fc2a41a5fb16]
WARN[0006] RPC: GetConnection creating connection to wss://localhost:0/rpc/ 0xc00067efc0
INFO[0006] RPC: Connecting websocket (0xc00067efc0) &{RPC:0xc0000b1950 Address:wss://localhost:0/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:0s cancelPings:<nil>}
ERRO[0006] RPC: closing connection (0xc00067efc0) &{RPC:0xc0000b1950 Address:wss://localhost:0/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:0s cancelPings:<nil>}
WARN[0006] RPC: RemoveConnection wss://localhost:0/rpc/ current=0xc00067efc0 conn=0xc00067efc0
ERRO[0006] ⚠️ RPC: object.list_objects() Call failed: failed to websocket dial: failed to send handshake request: Get "https://localhost:0/rpc/": dial tcp [::1]:0: connect: connection refused
FATA[0006] ❌ failed to websocket dial: failed to send handshake request: Get "https://localhost:0/rpc/": dial tcp [::1]:0: connect: connection refused

--- Additional comment from Mahesh Shetty on 2023-10-04 14:20:57 UTC ---

Logs here: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz2227835/

--- Additional comment from Aayush Chouhan on 2023-10-11 07:51:04 UTC ---

The above issue (raised by Mahesh) occurred because of a CLI version mismatch. Closing the bug now.
Thanks

--- Additional comment from RHEL Program Management on 2023-10-26 10:15:49 UTC ---

This bug having no release flag set previously, is now set with release flag 'odf-4.14.0' to '?', and so is being proposed to be fixed at the ODF 4.14.0 release. Note that the 3 Acks (pm_ack, devel_ack, qa_ack), if any previously set while release flag was missing, have now been reset since the Acks are to be set against a release flag.

--- Additional comment from RHEL Program Management on 2023-10-26 10:15:49 UTC ---

The 'Target Release' is not to be set manually at the Red Hat OpenShift Data Foundation product. The 'Target Release' will be auto set appropriately, after the 3 Acks (pm,devel,qa) are set to "+" for a specific release flag and that release flag gets auto set to "+".

--- Additional comment from RHEL Program Management on 2023-10-26 10:15:49 UTC ---

This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.
*** Bug 2238933 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.12.10 Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:7820
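
For anyone triaging similar failures on a build that predates the fix, here is a rough sketch of how a test helper might recognise this failure and fall back to the S3 workaround mentioned in the description. Both log excerpts in this report show the CLI dialing wss://localhost:0/rpc/, so the sketch keys on that string. The function names (rpc_address_unresolved, s3_ls_command) and the endpoint URL in the usage line are hypothetical and are not taken from ocs-ci or the noobaa CLI; only the "aws s3 ls" command and its --endpoint-url option are real.

```python
import re
from typing import List, Optional

# Signature common to both log excerpts in this report: the CLI dials a
# placeholder address (port 0 on localhost) rather than a real RPC endpoint.
UNRESOLVED_RPC_ADDRESS = re.compile(r"wss://localhost:0/rpc/")


def rpc_address_unresolved(cli_output: str) -> bool:
    """Return True if the CLI output shows the localhost:0 dial (hypothetical helper)."""
    return bool(UNRESOLVED_RPC_ADDRESS.search(cli_output))


def s3_ls_command(bucket: str, endpoint_url: Optional[str] = None) -> List[str]:
    """Build the argv for the S3 listing workaround (hypothetical helper)."""
    cmd = ["aws", "s3", "ls", f"s3://{bucket}"]
    if endpoint_url:
        # --endpoint-url points the AWS CLI at a non-AWS S3 service.
        cmd += ["--endpoint-url", endpoint_url]
    return cmd
```

A caller could run the RPC query first, check its combined output with rpc_address_unresolved(), and only then run s3_ls_command("first.bucket", "https://s3.example") through subprocess. Whether that fallback is acceptable depends on the test, since the S3 listing does not return the same fields as object_api list_objects.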