Bug 2209616

Summary: test_rgw_kafka_notifications test fails with "RGW bucket notification is not working as expected." Error
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Abdul Kandathil (IBM) <akandath>
Component: ceph Assignee: Yuval Lifshitz <ylifshit>
ceph sub component: RGW QA Contact: Vishakha Kathole <vkathole>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: unspecified CC: bniver, hnallurv, jthottan, mbenjamin, muagarwa, odf-bz-bot, sheggodu, sostapov, vkathole, ylifshit
Version: 4.13   
Target Milestone: ---   
Target Release: ODF 4.15.0   
Hardware: s390x   
OS: Linux   
Whiteboard:
Fixed In Version: 4.14.0-110 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-03-19 15:21:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Abdul Kandathil (IBM) 2023-05-24 09:02:43 UTC
Description of problem (please be detailed as possible and provide log
snippets):
Tier1 test "tests/e2e/workloads/app/amq/test_rgw_kafka_notifications.py::TestRGWAndKafkaNotifications::test_rgw_kafka_notifications" fails with the error below.

>           raise Exception(
                "Error: Messages are not recieved from Kafka side."
                "RGW bucket notification is not working as expected."
            )
E           Exception: Error: Messages are not recieved from Kafka side.RGW bucket notification is not working as expected.
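
For readers unfamiliar with the test, the failing check can be pictured roughly as below. This is only a minimal sketch, not the ocs-ci test code: `verify_notifications` and its parameters are hypothetical names, and it assumes the test compares the object keys written to the bucket against the messages consumed from Kafka. The exception text is copied verbatim from the traceback above.

```python
def verify_notifications(uploaded_keys, consumed_messages):
    """Raise if any uploaded object key is absent from the consumed messages.

    Hypothetical helper: both arguments are plain collections of strings.
    """
    missing = [
        key
        for key in uploaded_keys
        if not any(key in message for message in consumed_messages)
    ]
    if missing:
        # Message text copied verbatim from the report, including its spelling.
        raise Exception(
            "Error: Messages are not recieved from Kafka side."
            "RGW bucket notification is not working as expected."
        )
    return True
```

With an empty list of consumed messages, the sketch raises the same exception text seen in this report.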



Version of all relevant components (if applicable):
OCP 4.13.0-rc.6
ODF 4.13.0-203


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?
yes

Can this issue reproduce from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install OCP and ODF 4.13
2. Execute tier1 test "tests/e2e/workloads/app/amq/test_rgw_kafka_notifications.py::TestRGWAndKafkaNotifications::test_rgw_kafka_notifications"
3.


Actual results:


Expected results:


Additional info:
Logs are available on Google Drive: https://drive.google.com/file/d/180nzo7xi8cWU_wy3aVb1MteOPYI1AiNp/view?usp=sharing

Comment 2 Mudit Agarwal 2023-05-25 08:30:02 UTC
Jiffin, PTAL

Comment 4 Abdul Kandathil (IBM) 2023-05-30 11:57:40 UTC
Reran the test after setting the debug level as mentioned.

sh-5.1$ ceph config set client.rgw.ocs.storagecluster.cephobjectstore.a debug_rgw 20
sh-5.1$ ceph config get client.rgw.ocs.storagecluster.cephobjectstore.a debug_rgw
20/20
sh-5.1$
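
The value reported back, 20/20, is a pair: the log level followed by the in-memory level. A small sketch of reading that output, assuming (as the session above suggests) that a single number applies to both levels; `parse_debug_level` is a hypothetical helper, not part of Ceph:

```python
def parse_debug_level(value):
    """Split a Ceph debug setting such as '20/20' into (log_level, memory_level)."""
    log_level, _, memory_level = value.strip().partition("/")
    # Assumption: a single number (e.g. '20') applies to both levels,
    # matching the set 20 -> get 20/20 behaviour shown in this comment.
    if not memory_level:
        memory_level = log_level
    return int(log_level), int(memory_level)
```

This can be used to confirm in a script that the setting took effect before rerunning the test.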


Please find the logs here: https://drive.google.com/file/d/1em1jGjkYh3Gax6-JMP_4dGpMwbcNdu_n/view?usp=sharing

Comment 21 Vishakha Kathole 2023-09-19 09:37:13 UTC
Marking as failed QA since "tests/e2e/workloads/app/amq/test_rgw_kafka_notifications.py::TestRGWAndKafkaNotifications::test_rgw_kafka_notifications" is still failing in our runs. 

Versions tested on:
OCP: 4.14.0-0.nightly-2023-09-15-055234
ODF: 4.14.0-134
ceph version 17.2.6-138.el9cp (b488c8dad42b2ecffcd96f3d76eeeecce48b8590) quincy (stable)
Logs:
 http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/mashetty-svm15/mashetty-svm15_20230915T122502/logs/ocs-ci-logs-1695113189/by_outcome/failed/tests/e2e/workloads/app/amq/test_rgw_kafka_notifications.py/


Versions tested on:
OCP: 4.14.0-0.nightly-2023-08-11-055332
ODF: 4.14.0-115
Logs: 
 http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-001vii1cs30-t1/j-001vii1cs30-t1_20230824T043853/logs/ocs-ci-logs-1692855470/by_outcome/failed/tests/e2e/workloads/app/amq/test_rgw_kafka_notifications.py/
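
Both ODF builds listed above are later than the "Fixed In Version" recorded in the bug header (4.14.0-110). A rough sketch of that comparison, assuming build strings order numerically by release and then build number; `parse_build` is a hypothetical helper:

```python
def parse_build(version):
    """Turn a build string like '4.14.0-134' into a tuple of ints (4, 14, 0, 134)."""
    release, _, build = version.partition("-")
    return tuple(int(part) for part in release.split(".")) + (int(build),)


# Values taken from this report: the header's "Fixed In Version" and
# the two ODF builds on which the test was rerun.
FIXED_IN = parse_build("4.14.0-110")
TESTED = ["4.14.0-134", "4.14.0-115"]

for tested in TESTED:
    # True means the tested build is at or after the "Fixed In Version" build.
    print(tested, parse_build(tested) >= FIXED_IN)
```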

Comment 22 Mudit Agarwal 2023-09-19 10:01:37 UTC
Moving the non-blocker out of 4.14

Comment 24 Harish NV Rao 2023-09-20 07:32:44 UTC
(In reply to Mudit Agarwal from comment #22)
> Moving the non-blocker out of 4.14

The test that found this failure was added based on an earlier customer BZ (https://bugzilla.redhat.com/show_bug.cgi?id=1937100) where notifications were not working.
Requesting that this BZ be fixed in 4.14, as it breaks previously working functionality.

Comment 25 Harish NV Rao 2023-09-28 13:58:27 UTC
(In reply to Harish NV Rao from comment #24)
> (In reply to Mudit Agarwal from comment #22)
> > Moving the non-blocker out of 4.14
> 
> The test that found this failure was added based on a customer
> BZ(https://bugzilla.redhat.com/show_bug.cgi?id=1937100) earlier where
> notifications were not working. 
> Requesting this bz to be fixed in 4.14 as it breaks the earlier working
> functionality.

Requesting this to be fixed in 4.14.

Comment 31 Mudit Agarwal 2023-10-10 12:02:05 UTC
The Ceph fix is pushed to 7.0; I don't see how we can fix this in 4.14.
Keeping the needinfo on me to create a Ceph tracker.

Comment 37 Mudit Agarwal 2023-10-10 13:11:51 UTC
Hi Matt,

>> Does this issue only affect s390x?
Harish, can you please reply to this question?

>> Also, you state that the fix is "pushed to 7.0"--but it's actually fixed on current rhcs-6.1.
My bad, I read the comments wrong. Thanks for checking.

This is actually a failed QA, so we should debug https://bugzilla.redhat.com/show_bug.cgi?id=2209616#c21

Can someone please take a look?

Thanks
Mudit

I am still keeping it at 4.15 given that the fix will be in Ceph and we don't have a release vehicle.

Comment 54 errata-xmlrpc 2024-03-19 15:21:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383