Bug 1932396 - [TRACKER for BZ #1943619] - RGW does not handle "Expect: 100-continue" answers from http requests not needing it
Summary: [TRACKER for BZ #1943619] - RGW does not handle "Expect: 100-continue" answer...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph
Version: 4.6
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: ODF 4.9.0
Assignee: Yuval Lifshitz
QA Contact: Tiffany Nguyen
URL:
Whiteboard:
Depends On:
Blocks: 1943619
Reported: 2021-02-24 14:57 UTC by Guillaume Moutier
Modified: 2023-08-09 16:37 UTC (History)

Fixed In Version: v4.9.0-164.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 1943619 (view as bug list)
Environment:
Last Closed: 2021-12-13 17:44:30 UTC
Embargoed:
tunguyen: needinfo-


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:5086 0 None None None 2021-12-13 17:44:43 UTC

Description Guillaume Moutier 2021-02-24 14:57:54 UTC
Description of problem:
When trying to send bucket notifications to an Elasticsearch HTTP endpoint (http://elasticsearch:9200/index/_doc/), the following error appears in the RGW logs:
---
debug 2021-02-24 14:34:34.180 7f6185119700  1 ====== starting new request req=0x7f629e30a680 =====
debug 2021-02-24 14:34:34.190 7f6183916700  1 push to endpoint HTTP/S Endpoint
URI: http://elasticsearch-sample-es-http.rgw-es.svc:9200/s3_index/_doc/
Ack Level: 
don't verify SSL failed, with error: -5
debug 2021-02-24 14:34:34.190 7f6183916700  1 ====== req done req=0x7f629e30a680 op status=0 http_status=200 latency=0.0100002s ======
debug 2021-02-24 14:34:34.190 7f6183916700  1 beast: 0x7f629e30a680: 10.131.2.41 - - [2021-02-24 14:34:34.0.190742s] "PUT /std-user-bucket1/Pipfile HTTP/1.1" 200 180 - "Boto3/1.17.14 Python/3.6.8 Linux/4.18.0-193.41.1.el8_2.x86_64 Botocore/1.20.14" -
---
Notes:
- The error is not related to SSL; the message is generic. The endpoint is served over plain HTTP, and SSL verification is disabled in the bucket notification configuration.
- A direct curl command against the endpoint works flawlessly.
- After discussion with Yuval Lifshitz: "It is because the client sends "Expect: 100-continue". If a server respects that field and actually answers with a 100 Continue result code, RGW treats that as -EIO. -EIO = -5"
- This bug seems to have been fixed upstream and in the 4.2z1 branch by this pull request: https://github.com/ceph/ceph/pull/34414

So apparently the fix only needs to make its way into OCS.
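
To illustrate the behavior described in the notes above, here is a minimal Python sketch of how an HTTP client can skip an interim "100 Continue" response and act only on the final status. It is not RGW code and not the upstream fix (which, per the commit message, avoids sending "Expect: 100-continue" when it is not needed); the function name final_status and the sample byte strings are hypothetical and chosen only for this example.

```python
import io


def final_status(raw: bytes) -> int:
    """Return the status code of the final response found in `raw`.

    Interim 1xx responses (such as "100 Continue") are skipped instead of
    being treated as errors. 101 is returned as-is because it ends the
    HTTP exchange. Hypothetical helper, for illustration only.
    """
    stream = io.BytesIO(raw)
    while True:
        status_line = stream.readline()
        if not status_line:
            raise ValueError("stream ended before a final response")
        parts = status_line.decode("iso-8859-1").split(maxsplit=2)
        if len(parts) < 2 or not parts[0].startswith("HTTP/"):
            raise ValueError(f"malformed status line: {status_line!r}")
        code = int(parts[1])
        # Consume this response's header block, up to the blank line.
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
        if 100 <= code < 200 and code != 101:
            # Interim response: the final response follows on the same stream.
            continue
        return code


# A server that honors "Expect: 100-continue" answers twice:
# first with the interim 100, then with the final status.
sample = (
    b"HTTP/1.1 100 Continue\r\n\r\n"
    b"HTTP/1.1 201 Created\r\nContent-Length: 0\r\n\r\n"
)
print(final_status(sample))
```

A client that stops at the first status line would see 100 here and could report a failure even though the endpoint accepted the request, which matches the symptom reported in this bug.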


Version-Release number of selected component (if applicable): OCS 4.6.2


How reproducible:
Send an RGW bucket notification to an ElasticSearch endpoint.


Actual results:
- Notification sending error.


Expected results:
- Notification sent and logged into ElasticSearch.


Additional info:
This problem should occur with any HTTP endpoint that honors the "Expect: 100-continue" header in the request.

Comment 2 Yuval Lifshitz 2021-02-24 15:01:55 UTC
The issue was fixed by the following commit:

commit 75b17dd1193d63f60eb677d7523321717626299c
Author: Yuval Lifshitz <yuvalif>
Date:   Mon Apr 6 12:50:37 2020 +0300

    rgw/http: add timeout to http client
    
    also, prevent "Expect: 100-continue" from being sent
    when not needed
    
    Signed-off-by: Yuval Lifshitz <yuvalif>
    (cherry picked from commit dd49cc83078c7e268ce3de7ab0bfbf3035ed5d50)

Comment 3 Mudit Agarwal 2021-03-02 09:27:41 UTC
Confirmed with Yuval: the above commit is already present in RHCS 4.2z1. Moving this BZ to MODIFIED.

Comment 11 Tiffany Nguyen 2021-03-23 19:50:20 UTC
Not able to verify this BZ due to a put object error. See https://bugzilla.redhat.com/show_bug.cgi?id=1932396#c10 for more detail.

Comment 12 Mudit Agarwal 2021-03-24 06:16:57 UTC
Hi Yuval,

PTAL. Do you know exactly which Ceph version this fix went into?

Thanks

Comment 14 Mudit Agarwal 2021-03-26 12:39:45 UTC
Hi Yuval,

I have tracked down the commit; it is here: https://gitlab.cee.redhat.com/ceph/ceph/-/commit/75b17dd1193d63f60eb677d7523321717626299c
This means the build we are testing with already has the patch, so if we are still hitting the issue we may have to investigate further.

Let me know if I have to open a Ceph BZ for this.

Thanks

Comment 16 Scott Ostapovicz 2021-03-26 15:57:27 UTC
This seems to be a tracker for an RHCS issue that does not have a BZ.  Please create an RHCS BZ for this issue and link it here.

Comment 17 Mudit Agarwal 2021-03-26 16:04:34 UTC
Created a tracker and moving out of 4.7 as we need a fix in Ceph.
Will set the acks accordingly.

Comment 23 Mudit Agarwal 2021-09-22 09:19:42 UTC
Fix should be available in the latest ODF builds

Comment 32 errata-xmlrpc 2021-12-13 17:44:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:5086

