Bug 2310487 - [rgw] post upgrade from 7.1 to 8.0, old objects download failed with http_status 500
Summary: [rgw] post upgrade from 7.1 to 8.0, old objects download failed with http_status 500
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RGW
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 8.0
Assignee: Matt Benjamin (redhat)
QA Contact: Hemanth Sai
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-09-06 18:35 UTC by Hemanth Sai
Modified: 2024-11-25 09:09 UTC
CC: 5 users

Fixed In Version: ceph-19.1.1-26.el9cp
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-11-25 09:09:13 UTC
Embargoed:




Links
Red Hat Issue Tracker RHCEPH-9721 (2024-09-06 18:37:57 UTC)
Red Hat Product Errata RHBA-2024:10216 (2024-11-25 09:09:15 UTC)

Description Hemanth Sai 2024-09-06 18:35:58 UTC
Description of problem:
After upgrading from 7.1 (ceph version 18.2.1-229.el9cp) to 8.0 (ceph version 19.1.0-71.el9cp), downloads of objects written before the upgrade fail with http_status 500.


[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3 cp s3://bkt1/obj4KB obj4KB.download
fatal error: An error occurred (500) when calling the HeadObject operation (reached max retries: 4): Internal Server Error
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ 
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3api head-object --bucket bkt1 --key obj20MB

An error occurred (500) when calling the HeadObject operation (reached max retries: 4): Internal Server Error
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ 
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3 cp s3://versioned-bkt2/obj3MB obj3MB_versioned-bkt2.download
fatal error: An error occurred (500) when calling the HeadObject operation (reached max retries: 4): Internal Server Error
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$

I can see this error in the rgw log at debug_level 20:

2024-09-06T15:28:47.368+0000 7fd97b714640 15 req 966458567729292545 0.000000000s s3:get_obj server signature=4943204ef65ed22b90486930629958c025cb07bee0053e64a4068cc3303b28e2
2024-09-06T15:28:47.368+0000 7fd97b714640 15 req 966458567729292545 0.000000000s s3:get_obj client signature=4943204ef65ed22b90486930629958c025cb07bee0053e64a4068cc3303b28e2
2024-09-06T15:28:47.368+0000 7fd97b714640 15 req 966458567729292545 0.000000000s s3:get_obj compare=0
2024-09-06T15:28:47.368+0000 7fd97b714640 20 req 966458567729292545 0.000000000s s3:get_obj rgw::auth::s3::LocalEngine granted access
2024-09-06T15:28:47.368+0000 7fd97b714640 20 req 966458567729292545 0.000000000s s3:get_obj rgw::auth::s3::AWSAuthStrategy granted access
2024-09-06T15:28:47.368+0000 7fd97b714640  2 req 966458567729292545 0.000000000s s3:get_obj normalizing buckets and tenants
2024-09-06T15:28:47.368+0000 7fd97b714640 10 req 966458567729292545 0.000000000s s->object=obj5GB s->bucket=bkt1
2024-09-06T15:28:47.368+0000 7fd97b714640  2 req 966458567729292545 0.000000000s s3:get_obj init permissions
2024-09-06T15:28:47.368+0000 7fd97b714640 10 req 966458567729292545 0.000000000s s3:get_obj cache get: name=default.rgw.meta+root+bkt1 : hit (requested=0x11, cached=0x11)
2024-09-06T15:28:47.368+0000 7fd97b714640 15 req 966458567729292545 0.000000000s s3:get_obj decode_policy Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>hsm</ID><DisplayName>hsm</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>hsm</ID><DisplayName>hsm</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
2024-09-06T15:28:47.368+0000 7fd97b714640  2 req 966458567729292545 0.000000000s s3:get_obj recalculating target
2024-09-06T15:28:47.368+0000 7fd97b714640  2 req 966458567729292545 0.000000000s s3:get_obj reading permissions
2024-09-06T15:28:47.368+0000 7fd97b714640 20 req 966458567729292545 0.000000000s s3:get_obj get_obj_state: octx=0x561f2edf5b20 obj=bkt1:obj5GB state=0x561f2eae6de8 s->prefetch_data=0
2024-09-06T15:28:47.369+0000 7fd987f2d640  0 req 966458567729292545 0.001000006s s3:get_obj ERROR: couldn't decode manifest
2024-09-06T15:28:47.369+0000 7fd987f2d640 10 req 966458567729292545 0.001000006s s3:get_obj read_permissions on :bkt1[0b347b55-60cb-4fcc-b442-9b23509505b3.15129.3])::bkt1[0b347b55-60cb-4fcc-b442-9b23509505b3.15129.3]):obj5GB only_bucket=0 ret=-5
2024-09-06T15:28:47.369+0000 7fd987f2d640 20 req 966458567729292545 0.001000006s op->ERRORHANDLER: err_no=-5 new_err_no=-5
2024-09-06T15:28:47.369+0000 7fd987f2d640  0 WARNING: set_req_state_err err_no=5 resorting to 500
2024-09-06T15:28:47.369+0000 7fd987f2d640 10 req 966458567729292545 0.001000006s cache get: name=default.rgw.log++script.postrequest. : hit (negative entry)
2024-09-06T15:28:47.369+0000 7fd987f2d640  2 req 966458567729292545 0.001000006s s3:get_obj op status=0
2024-09-06T15:28:47.369+0000 7fd987f2d640  2 req 966458567729292545 0.001000006s s3:get_obj http status=500
2024-09-06T15:28:47.369+0000 7fd987f2d640  1 ====== req done req=0x7fd8d03994a0 op status=0 http_status=500 latency=0.001000006s ======
2024-09-06T15:28:47.370+0000 7fd987f2d640  1 beast: 0x7fd8d03994a0: 10.0.67.37 - hsm [06/Sep/2024:15:28:47.368 +0000] "HEAD /bkt1/obj5GB HTTP/1.1" 500 0 - "aws-cli/1.34.13 md/Botocore#1.35.13 ua/2.0 os/linux#5.14.0-427.33.1.el9_4.x86_64 md/arch#x86_64 lang/python#3.9.18 md/pyimpl#CPython cfg/retry-mode#legacy botocore/1.35.13" - latency=0.001000006s
2024-09-06T15:28:47.371+0000 7fd985f29640 20 failed to read header: end of stream
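
The err_no=-5 seen in read_permissions is the manifest decode failure (-EIO) bubbling up, and set_req_state_err has no specific mapping for it, "resorting to 500" as the WARNING line shows. One way to confirm the manifest itself is unreadable, independent of the S3 request path, is to stat an affected object with radosgw-admin, which decodes the manifest server-side. A minimal sketch, assuming admin keyring access on the rgw node and the bucket/object names from the transcripts above:

# prints object metadata, including the decoded manifest, for a healthy object;
# for an affected pre-upgrade object the same decode failure would be expected
radosgw-admin object stat --bucket=bkt1 --object=obj5GB
# compare against an object written after the upgrade
radosgw-admin object stat --bucket=bkt1 --object=obj3MB_after_upgrade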





However, upload and download of new objects succeed after the upgrade:

[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3 cp obj3MB s3://bkt1/obj3MB_after_upgrade
upload: ./obj3MB to s3://bkt1/obj3MB_after_upgrade               
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ 
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3api head-object --bucket bkt1 --key obj3MB_after_upgrade
{
    "AcceptRanges": "bytes",
    "LastModified": "Fri, 06 Sep 2024 15:13:53 GMT",
    "ContentLength": 3000000,
    "ETag": "\"c9fc2d3dd83ab67a129ac10b09c9ebbb\"",
    "ContentType": "binary/octet-stream",
    "Metadata": {}
}
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ 
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3 cp s3://bkt1/obj3MB_after_upgrade obj3MB_after_upgrade.download
download: s3://bkt1/obj3MB_after_upgrade to ./obj3MB_after_upgrade.download
[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ 



Version-Release number of selected component (if applicable):
ceph version 19.1.0-71.el9cp

How reproducible:
1/1

Steps to Reproduce:
1. Deploy a cluster on 7.1 with an rgw daemon.
2. Create a bucket and upload objects.
3. Upgrade the cluster to 8.0.
4. Download the old objects after the upgrade; they fail with http_status 500 (a scripted check is sketched below).
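
To check every pre-upgrade object in one pass, a head-object loop like the sketch below can be used (the endpoint, bucket, and key names are taken from this report; adjust for the cluster under test):

# aws s3api head-object exits non-zero on the 500, so report pass/fail per key
for key in obj4KB obj20MB obj5GB; do
    if aws --endpoint-url http://10.0.65.167:80 s3api head-object \
           --bucket bkt1 --key "$key" >/dev/null 2>&1; then
        echo "$key: OK"
    else
        echo "$key: FAILED (rgw log shows \"couldn't decode manifest\")"
    fi
done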

Actual results:
HeadObject/GetObject on old objects fails with 500 Internal Server Error after the upgrade to 8.0.

Expected results:
Downloads of old objects succeed after the upgrade to 8.0.


Additional info:

cluster details:
client node: 10.0.67.37
rgw node: 10.0.65.167
user/pass: root/passwd, cephuser/cephuser

rgw logs at debug_level 20: http://magna002.ceph.redhat.com/cephci-jenkins/hsm/squid_upgrade_object_get_fail/ceph-client.rgw.rgw.1.ceph-hsm-upgrade-1cb6kq-node5.bwmesq.log
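
For reference, a sketch of how the debug-level-20 logging above can be enabled and reverted at runtime (this assumes a config-db setting applying to all rgw daemons; a specific client.rgw.<name> section can be targeted instead):

# raise rgw verbosity cluster-wide, then drop the override when done
ceph config set client.rgw debug_rgw 20
ceph config rm client.rgw debug_rgw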

Comment 8 errata-xmlrpc 2024-11-25 09:09:13 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:10216

