Bug 2254678 - [4.14.z tracker for https://bugzilla.redhat.com/show_bug.cgi?id=2254303][RGW] s3cmd get is failing for multipart objects with "KeyError: 'etag'"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.14.3
Assignee: Matt Benjamin (redhat)
QA Contact: Uday kurundwade
URL:
Whiteboard:
Depends On: 2254303
Blocks: 2254685
Reported: 2023-12-15 05:37 UTC by krishnaram Karthick
Modified: 2023-12-15 17:48 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 2254303
Clones: 2254685
Environment:
Last Closed: 2023-12-15 17:48:44 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2023:7869 0 None None None 2023-12-15 17:48:51 UTC

Description krishnaram Karthick 2023-12-15 05:37:52 UTC
Cloning the bug to include the fix in 4.14.3 z stream 

+++ This bug was initially created as a clone of Bug #2254303 +++

Description of problem:
s3cmd get fails with "KeyError: 'etag'" when downloading multipart objects.


Version-Release number of selected component (if applicable):
ceph version 17.2.6-167.el9cp
S3cmd:   2.4.0 and 2.3.0
python:   3.9.18


How reproducible:
3/3


Steps to Reproduce:
1. Create user and configure s3cmd
2. create bucket and upload small and multipart objects
3. download all objects in the buckets


Actual results:
s3cmd get succeeds for objects of size <= 15m, but fails with KeyError: 'etag' for objects of size >= 16m.


Expected results:
s3cmd get should succeed for all objects.


Additional info:
log: 
http://magna002.ceph.redhat.com/ceph-qe-logs/anuchaithra/s3cmd/s3cmd_get_issue.log
http://magna002.ceph.redhat.com/cephci-jenkins/test-runs/17.2.6-167/Regression/rgw/21/tier-2_rgw_test-using-s3cmd/S3CMD_large_object_download_with_GC_0.log
http://magna002.ceph.redhat.com/cephci-jenkins/test-runs/17.2.6-167/Regression/rgw/21/tier-2_rgw_test-using-s3cmd/S3CMD_object_download_0.log
RGW log: http://magna002.ceph.redhat.com/ceph-qe-logs/anuchaithra/s3cmd/rgw.log


snippet of downloading multipart object "obj-20m-1" from bucket "test-bucket-1" through s3cmd

(venv) [root@ceph-testing-s3cmd-829g77-node6 cephuser]# s3cmd get s3://test-bucket-1/obj-20m-1 obj-20m-1
......................

Traceback (most recent call last):
  File "/home/cephuser/venv/bin/s3cmd", line 3627, in <module>
    rc = main()
  File "/home/cephuser/venv/bin/s3cmd", line 3524, in main
    rc = cmd_func(args)
  File "/home/cephuser/venv/bin/s3cmd", line 548, in cmd_object_get
    remote_list, exclude_list, remote_total_size = fetch_remote_list(
  File "/home/cephuser/venv/lib64/python3.9/site-packages/S3/FileLists.py", line 552, in fetch_remote_list
    _get_remote_attribs(uri, remote_item)
  File "/home/cephuser/venv/lib64/python3.9/site-packages/S3/FileLists.py", line 411, in _get_remote_attribs
    'md5': response['headers']['etag'].strip('"\''),
KeyError: 'etag'
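
The last frame of the traceback above is a plain dictionary lookup, so it fails whenever the response carries no ETag header. A minimal Python sketch of that failure mode follows. The `response` dict shape mirrors the expression shown in the traceback; `extract_md5`, its `None` fallback, and the placeholder ETag value are hypothetical and are not s3cmd's code or the fix shipped for this bug.

```python
from typing import Optional


def extract_md5(response: dict) -> Optional[str]:
    """Return the ETag stripped of quotes, or None if the header is absent.

    Hypothetical helper. The traceback shows s3cmd indexing
    response['headers']['etag'] directly, which raises KeyError
    when the server omits the header.
    """
    etag = response.get("headers", {}).get("etag")
    if etag is None:
        return None
    return etag.strip('"\'')


# Placeholder responses: one with an ETag header, one without.
with_etag = {"headers": {"etag": '"placeholder-etag"'}}
without_etag = {"headers": {"content-length": "20971520"}}

print(extract_md5(with_etag))
print(extract_md5(without_etag))
```

A client-side fallback like this would only hide the symptom; the report tracks the missing header as a server-side regression.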

...............



Note: the multipart object can be downloaded through awscli.

snippet of downloading multipart object "obj-20m-1" from bucket "test-bucket-1" through awscli
(venv) [root@ceph-testing-s3cmd-829g77-node6 cephuser]# /usr/local/bin/aws s3api get-object --bucket test-bucket-1 --key obj-20m-1 object-20m --endpoint http://10.0.205.205:80
{
    "AcceptRanges": "bytes",
    "LastModified": "Wed, 13 Dec 2023 07:03:44 GMT",
    "ContentLength": 20971520,
    "ContentType": "application/octet-stream",
    "Metadata": {
        "s3cmd-attrs": "atime:1702446799/ctime:1702448154/gid:0/gname:root/md5:8f4e33f3dc3e414ff94e5fb6905cba8c/mode:33188/mtime:1702448154/uid:0/uname:root"
    }
}
(venv) [root@ceph-testing-s3cmd-829g77-node6 cephuser]#
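
The awscli output above has no ETag field, but the `s3cmd-attrs` metadata still carries the md5 that s3cmd recorded at upload time. A small sketch of pulling that value out, assuming the `key:value/key:value` layout seen in this report holds in general (`parse_s3cmd_attrs` is a hypothetical helper, not part of s3cmd or awscli):

```python
def parse_s3cmd_attrs(value: str) -> dict:
    """Split the 'key:value/key:value' pairs of an s3cmd-attrs string.

    Assumes values contain neither '/' nor ':', which holds for the
    sample in this report but is not guaranteed in general.
    """
    attrs = {}
    for pair in value.split("/"):
        key, _, val = pair.partition(":")
        attrs[key] = val
    return attrs


# Metadata value copied from the get-object output above.
attrs = parse_s3cmd_attrs(
    "atime:1702446799/ctime:1702448154/gid:0/gname:root/"
    "md5:8f4e33f3dc3e414ff94e5fb6905cba8c/mode:33188/"
    "mtime:1702448154/uid:0/uname:root"
)
print(attrs["md5"])
```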

--- Additional comment from RHEL Program Management on 2023-12-13 07:25:14 UTC ---

This bug report has Keywords: Regression or TestBlocker.

Since no regressions or test blockers are allowed between releases, it is being proposed as a blocker for this release.

Please resolve/triage ASAP.

--- Additional comment from Madhavi Kasturi on 2023-12-13 11:33:45 UTC ---

Adding an update:

On ceph version 17.2.6-166.el9cp, downloading a multipart object is successful.
s3cmd : 2.3.0

snippet:
[root@ceph-node1 ~]# ceph -v
ceph version 17.2.6-166.el9cp (6c669fc65a39a8e924fdf5e3f24a1539c6de8753) quincy (stable)
[root@ceph-node1 ~]# s3cmd --version
s3cmd version 2.3.0
[root@ceph-node1 ~]# fallocate -l 50m obj50m
[root@ceph-node1 ~]# 
[root@ceph-node1 ~]# s3cmd put obj50m s3://kvm-bkt1/mp-obj1
upload: 'obj50m' -> 's3://kvm-bkt1/mp-obj1'  [part 1 of 4, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    56.74 MB/s  done
upload: 'obj50m' -> 's3://kvm-bkt1/mp-obj1'  [part 2 of 4, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    61.01 MB/s  done
upload: 'obj50m' -> 's3://kvm-bkt1/mp-obj1'  [part 3 of 4, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    62.53 MB/s  done
upload: 'obj50m' -> 's3://kvm-bkt1/mp-obj1'  [part 4 of 4, 5MB] [1 of 1]
 5242880 of 5242880   100% in    0s    42.97 MB/s  done
[root@ceph-node1 ~]# s3cmd get s3://kvm-bkt1/mp-obj1
download: 's3://kvm-bkt1/mp-obj1' -> './mp-obj1'  [1 of 1]
 52428800 of 52428800   100% in    0s   276.94 MB/s  done
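
The part layout in the upload output above follows from simple arithmetic. The sketch below assumes a 15 MiB part size, taken from the "[part 1 of 4, 15MB]" lines; linking that size to the <= 15m / >= 16m boundary in the original description is an inference, and `split_into_parts` is a hypothetical helper.

```python
MIB = 1024 * 1024


def split_into_parts(total_bytes: int, chunk_bytes: int = 15 * MIB) -> list:
    """Return the byte size of each part for an object of total_bytes."""
    if total_bytes <= 0:
        return []
    full, rest = divmod(total_bytes, chunk_bytes)
    return [chunk_bytes] * full + ([rest] if rest else [])


print(split_into_parts(50 * MIB))  # [15728640, 15728640, 15728640, 5242880]
print(split_into_parts(20 * MIB))  # [15728640, 5242880]
print(split_into_parts(15 * MIB))  # [15728640]
```

Under this assumption an object of up to 15 MiB fits in a single part, while anything larger is split, which matches the sizes at which the failure was observed.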

--- Additional comment from Anuchaithra on 2023-12-13 11:41:31 UTC ---

Note: the multipart object can also be downloaded through s5cmd and curl.


snippet of downloading multipart object "obj-20m-1" from bucket "test-bucket-1" through s5cmd

(venv) [root@ceph-testing-s3cmd-829g77-node6 cephuser]# s5cmd --endpoint-url http://10.0.205.205:80 --credentials-file ~/.aws/credentials cp s3://test-bucket-1/obj-20m-1 .
cp s3://test-bucket-1/obj-20m-1 obj-20m-1
(venv) [root@ceph-testing-s3cmd-829g77-node6 cephuser]#




snippet of downloading multipart object "obj-20m-1" from bucket "test-bucket-1" through curl

(venv) [root@ceph-testing-s3cmd-829g77-node6 curl-8.1.2]# curl --ipv4 --http1.1 --aws-sigv4 aws:amz:us-east-1:s3 -u '30SW2EAEAZGW0Y675GRZ:2RpQ3IRrjq63F9vTtRF4A8WiSpwuP7jgQ5tdzkYV' -H x-amz-content-sha256:UNSIGNED-PAYLOAD -o object-290m-curl http://10.0.205.205:80/test-bucket-1/obj-20m-1 -v
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 10.0.205.205:80...
* Connected to 10.0.205.205 (10.0.205.205) port 80 (#0)
* Server auth using AWS_SIGV4 with user '30SW2EAEAZGW0Y675GRZ'
> GET /test-bucket-1/obj-20m-1 HTTP/1.1
> Host: 10.0.205.205
> Authorization: AWS4-HMAC-SHA256 Credential=30SW2EAEAZGW0Y675GRZ/20231213/0/10/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=1a67f0211813c04e6d088a0999c1e800bafb643ed323856accefcd5b7bdcdb5f
> X-Amz-Date: 20231213T113733Z
> User-Agent: curl/8.1.2
> Accept: */*
> x-amz-content-sha256:UNSIGNED-PAYLOAD
>
< HTTP/1.1 200 OK
< Content-Length: 20971520
< Accept-Ranges: bytes
< Last-Modified: Wed, 13 Dec 2023 07:03:44 GMT
< x-rgw-object-type: Normal
< x-amz-meta-s3cmd-attrs: atime:1702446799/ctime:1702448154/gid:0/gname:root/md5:8f4e33f3dc3e414ff94e5fb6905cba8c/mode:33188/mtime:1702448154/uid:0/uname:root
< x-amz-request-id: tx000002ac63f494fb15545-006579977d-d3a3-default
< Content-Type: application/octet-stream
< Date: Wed, 13 Dec 2023 11:37:33 GMT
< Connection: Keep-Alive
<
{ [15928 bytes data]
100 20.0M  100 20.0M    0     0   575M      0 --:--:-- --:--:-- --:--:--  588M
* Connection #0 to host 10.0.205.205 left intact
(venv) [root@ceph-testing-s3cmd-829g77-node6 curl-8.1.2]#
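
The response headers in the curl trace above contain no ETag line, which is consistent with the KeyError seen in s3cmd. A sketch for checking this mechanically on saved `curl -v` output; `parse_curl_response_headers` is a hypothetical helper, and the sample is a subset of the header lines from the trace:

```python
def parse_curl_response_headers(verbose_output: str) -> dict:
    """Collect '< Name: value' lines from `curl -v` output.

    Returns a dict keyed by lower-cased header name. The status line
    and the bare '<' separator are skipped.
    """
    headers = {}
    for line in verbose_output.splitlines():
        if not line.startswith("< ") or ":" not in line:
            continue
        name, _, value = line[2:].partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


sample = """\
< HTTP/1.1 200 OK
< Content-Length: 20971520
< Accept-Ranges: bytes
< Last-Modified: Wed, 13 Dec 2023 07:03:44 GMT
< x-rgw-object-type: Normal
< Content-Type: application/octet-stream
<
"""
headers = parse_curl_response_headers(sample)
print("etag" in headers)  # False
```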

--- Additional comment from errata-xmlrpc on 2023-12-13 23:14:44 UTC ---

This bug has been added to advisory RHBA-2023:125487 by Thomas Serlin (tserlin)

--- Additional comment from errata-xmlrpc on 2023-12-13 23:14:45 UTC ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2023:125487-01
https://errata.devel.redhat.com/advisory/125487

--- Additional comment from errata-xmlrpc on 2023-12-13 23:14:55 UTC ---

This bug has been added to advisory RHBA-2023:125487 by Thomas Serlin (tserlin)

--- Additional comment from Anuchaithra on 2023-12-14 10:30:54 UTC ---

Tested with ceph version 17.2.6-169.el9cp; the issue reproduced.

log: http://magna002.ceph.redhat.com/ceph-qe-logs/anuchaithra/cephci-run-FO6K6B/S3CMD_large_object_download_with_GC_0.log

moving back to Assigned state.

--- Additional comment from Anuchaithra on 2023-12-14 11:49:46 UTC ---

Upgraded the cluster from ceph version 17.2.6-167.el9cp to 17.2.6-169.el9cp and verified manually that it is working fine.

[root@ceph-testing-s3cmd-829g77-node6 ~]# ceph versions
{
    "mon": {
        "ceph version 17.2.6-169.el9cp (b44674040916179786357afe4f17ad407051029c) quincy (stable)": 3
    },
    "mgr": {
        "ceph version 17.2.6-169.el9cp (b44674040916179786357afe4f17ad407051029c) quincy (stable)": 2
    },
    "osd": {
        "ceph version 17.2.6-169.el9cp (b44674040916179786357afe4f17ad407051029c) quincy (stable)": 16
    },
    "mds": {},
    "rgw": {
        "ceph version 17.2.6-169.el9cp (b44674040916179786357afe4f17ad407051029c) quincy (stable)": 1
    },
    "overall": {
        "ceph version 17.2.6-169.el9cp (b44674040916179786357afe4f17ad407051029c) quincy (stable)": 22
    }
}
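
When confirming an upgrade like the one above, it helps to check that every daemon reports the same build. A minimal sketch that reads `ceph versions` JSON, assuming the layout shown above (one map per daemon type plus "overall"); `single_version` and `daemon_count` are hypothetical helpers and the sample reproduces the counts from this comment:

```python
import json


def single_version(ceph_versions_json: str):
    """Return the version string shared by all daemons, or None if mixed."""
    overall = json.loads(ceph_versions_json).get("overall", {})
    return next(iter(overall)) if len(overall) == 1 else None


def daemon_count(ceph_versions_json: str) -> int:
    """Sum the per-type daemon counts, ignoring the 'overall' summary."""
    data = json.loads(ceph_versions_json)
    return sum(
        count
        for kind, versions in data.items()
        if kind != "overall"
        for count in versions.values()
    )


V = ("ceph version 17.2.6-169.el9cp "
     "(b44674040916179786357afe4f17ad407051029c) quincy (stable)")
sample = json.dumps({
    "mon": {V: 3}, "mgr": {V: 2}, "osd": {V: 16}, "mds": {},
    "rgw": {V: 1}, "overall": {V: 22},
})
print(single_version(sample))
print(daemon_count(sample))  # 22
```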

[root@ceph-testing-s3cmd-829g77-node6 ~]# s3cmd put obj-20m s3://test-bucket-1/obj-20m
upload: 'obj-20m' -> 's3://test-bucket-1/obj-20m'  [part 1 of 2, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    75.32 MB/s  done
upload: 'obj-20m' -> 's3://test-bucket-1/obj-20m'  [part 2 of 2, 5MB] [1 of 1]
 5242880 of 5242880   100% in    0s    36.17 MB/s  done

[root@ceph-testing-s3cmd-829g77-node6 ~]# s3cmd get s3://test-bucket-1/obj-20m s3cmd-20m-obj
download: 's3://test-bucket-1/obj-20m' -> 's3cmd-20m-obj'  [1 of 1]
 20971520 of 20971520   100% in    0s   251.10 MB/s  done
[root@ceph-testing-s3cmd-829g77-node6 ~]#


Sorry for the wrong information provided in comment#7; the build fetched there was the older one, i.e. 17.2.6-167.el9cp.

Hence, requesting you to kindly move the BZ back to ON_QA.

--- Additional comment from Anuchaithra on 2023-12-14 12:56:51 UTC ---

Verified on ceph version: 17.2.6-169.el9cp, Issue did not reproduce.

log: http://magna002.ceph.redhat.com/ceph-qe-logs/anuchaithra/cephci-run-LTH65K/

moving BZ to verified.

--- Additional comment from Ken Dreyer (Red Hat) on 2023-12-14 16:27:41 UTC ---

QE verified this on 17.2.6-169.el9cp. That build also contained changes from another bug, rhbz#2167318.

We want to ship an update for this etag issue quickly, so Thomas and I edited the branch to remove 2167318's changes.

Today we are building 17.2.6-170.el9cp. It will only have this single bug's changes on top of what we shipped earlier this week for RH Ceph Storage 6.1z3.

--- Additional comment from  on 2023-12-14 20:14:33 UTC ---

We should re-verify this bug on the expected build for the 6.1 z3 async release: ceph-17.2.6-170.el9cp

Moving back to ON_QA.

Thomas

Comment 11 errata-xmlrpc 2023-12-15 17:48:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.14.3 Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7869

