Description of problem:
After upgrading from 7.1 (ceph version 18.2.1-229.el9cp) to 8.0 (ceph version 19.1.0-71.el9cp), downloads of objects created before the upgrade fail with http_status 500:

[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3 cp s3://bkt1/obj4KB obj4KB.download
fatal error: An error occurred (500) when calling the HeadObject operation (reached max retries: 4): Internal Server Error

[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3api head-object --bucket bkt1 --key obj20MB
An error occurred (500) when calling the HeadObject operation (reached max retries: 4): Internal Server Error

[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3 cp s3://versioned-bkt2/obj3MB obj3MB_versioned-bkt2.download
fatal error: An error occurred (500) when calling the HeadObject operation (reached max retries: 4): Internal Server Error

I can see this error in the rgw log at debug_level 20 ("ERROR: couldn't decode manifest", leading to ret=-5 and http_status 500):

2024-09-06T15:28:47.368+0000 7fd97b714640 15 req 966458567729292545 0.000000000s s3:get_obj server signature=4943204ef65ed22b90486930629958c025cb07bee0053e64a4068cc3303b28e2
2024-09-06T15:28:47.368+0000 7fd97b714640 15 req 966458567729292545 0.000000000s s3:get_obj client signature=4943204ef65ed22b90486930629958c025cb07bee0053e64a4068cc3303b28e2
2024-09-06T15:28:47.368+0000 7fd97b714640 15 req 966458567729292545 0.000000000s s3:get_obj compare=0
2024-09-06T15:28:47.368+0000 7fd97b714640 20 req 966458567729292545 0.000000000s s3:get_obj rgw::auth::s3::LocalEngine granted access
2024-09-06T15:28:47.368+0000 7fd97b714640 20 req 966458567729292545 0.000000000s s3:get_obj rgw::auth::s3::AWSAuthStrategy granted access
2024-09-06T15:28:47.368+0000 7fd97b714640 2 req 966458567729292545 0.000000000s s3:get_obj normalizing buckets and tenants
2024-09-06T15:28:47.368+0000 7fd97b714640 10 req 966458567729292545 0.000000000s s->object=obj5GB s->bucket=bkt1
2024-09-06T15:28:47.368+0000 7fd97b714640 2 req 966458567729292545 0.000000000s s3:get_obj init permissions
2024-09-06T15:28:47.368+0000 7fd97b714640 10 req 966458567729292545 0.000000000s s3:get_obj cache get: name=default.rgw.meta+root+bkt1 : hit (requested=0x11, cached=0x11)
2024-09-06T15:28:47.368+0000 7fd97b714640 15 req 966458567729292545 0.000000000s s3:get_obj decode_policy Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>hsm</ID><DisplayName>hsm</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>hsm</ID><DisplayName>hsm</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
2024-09-06T15:28:47.368+0000 7fd97b714640 2 req 966458567729292545 0.000000000s s3:get_obj recalculating target
2024-09-06T15:28:47.368+0000 7fd97b714640 2 req 966458567729292545 0.000000000s s3:get_obj reading permissions
2024-09-06T15:28:47.368+0000 7fd97b714640 20 req 966458567729292545 0.000000000s s3:get_obj get_obj_state: octx=0x561f2edf5b20 obj=bkt1:obj5GB state=0x561f2eae6de8 s->prefetch_data=0
2024-09-06T15:28:47.369+0000 7fd987f2d640 0 req 966458567729292545 0.001000006s s3:get_obj ERROR: couldn't decode manifest
2024-09-06T15:28:47.369+0000 7fd987f2d640 10 req 966458567729292545 0.001000006s s3:get_obj read_permissions on :bkt1[0b347b55-60cb-4fcc-b442-9b23509505b3.15129.3])::bkt1[0b347b55-60cb-4fcc-b442-9b23509505b3.15129.3]):obj5GB only_bucket=0 ret=-5
2024-09-06T15:28:47.369+0000 7fd987f2d640 20 req 966458567729292545 0.001000006s op->ERRORHANDLER: err_no=-5 new_err_no=-5
2024-09-06T15:28:47.369+0000 7fd987f2d640 0 WARNING: set_req_state_err err_no=5 resorting to 500
2024-09-06T15:28:47.369+0000 7fd987f2d640 10 req 966458567729292545 0.001000006s cache get: name=default.rgw.log++script.postrequest. : hit (negative entry)
2024-09-06T15:28:47.369+0000 7fd987f2d640 2 req 966458567729292545 0.001000006s s3:get_obj op status=0
2024-09-06T15:28:47.369+0000 7fd987f2d640 2 req 966458567729292545 0.001000006s s3:get_obj http status=500
2024-09-06T15:28:47.369+0000 7fd987f2d640 1 ====== req done req=0x7fd8d03994a0 op status=0 http_status=500 latency=0.001000006s ======
2024-09-06T15:28:47.370+0000 7fd987f2d640 1 beast: 0x7fd8d03994a0: 10.0.67.37 - hsm [06/Sep/2024:15:28:47.368 +0000] "HEAD /bkt1/obj5GB HTTP/1.1" 500 0 - "aws-cli/1.34.13 md/Botocore#1.35.13 ua/2.0 os/linux#5.14.0-427.33.1.el9_4.x86_64 md/arch#x86_64 lang/python#3.9.18 md/pyimpl#CPython cfg/retry-mode#legacy botocore/1.35.13" - latency=0.001000006s
2024-09-06T15:28:47.371+0000 7fd985f29640 20 failed to read header: end of stream

However, objects uploaded after the upgrade can be uploaded and downloaded successfully:

[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3 cp obj3MB s3://bkt1/obj3MB_after_upgrade
upload: ./obj3MB to s3://bkt1/obj3MB_after_upgrade

[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3api head-object --bucket bkt1 --key obj3MB_after_upgrade
{
    "AcceptRanges": "bytes",
    "LastModified": "Fri, 06 Sep 2024 15:13:53 GMT",
    "ContentLength": 3000000,
    "ETag": "\"c9fc2d3dd83ab67a129ac10b09c9ebbb\"",
    "ContentType": "binary/octet-stream",
    "Metadata": {}
}

[cephuser@ceph-hsm-upgrade-1cb6kq-node6 ~]$ aws --endpoint-url http://10.0.65.167:80 s3 cp s3://bkt1/obj3MB_after_upgrade obj3MB_after_upgrade.download
download: s3://bkt1/obj3MB_after_upgrade to ./obj3MB_after_upgrade.download

Version-Release number of selected component (if applicable):
ceph version 19.1.0-71.el9cp

How reproducible:
1/1

Steps to Reproduce:
1. Deploy a cluster on 7.1 with an rgw daemon.
2. Create a bucket and upload objects.
3. Upgrade the cluster to 8.0.
4. Download the old objects after the upgrade; they fail with http_status 500.

Actual results:
HeadObject/GetObject of old objects fails with 500 Internal Server Error after the upgrade to 8.0.

Expected results:
Downloads of old objects succeed after the upgrade to 8.0.

Additional info:
Cluster details:
client node: 10.0.67.37
rgw node: 10.0.65.167
user/pass: root/passwd, cephuser/cephuser
rgw logs at debug_level 20: http://magna002.ceph.redhat.com/cephci-jenkins/hsm/squid_upgrade_object_get_fail/ceph-client.rgw.rgw.1.ceph-hsm-upgrade-1cb6kq-node5.bwmesq.log
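The affected requests can be pulled out of the debug_level 20 rgw log mechanically, since each one logs "ERROR: couldn't decode manifest" with its request id. A minimal sketch of such a scan (the function name and regex are illustrative, not part of any Ceph tooling; it only assumes the log line format shown above):

```python
import re

# Matches rgw debug lines of the form seen in the attached log, e.g.:
#   ... 0 req 966458567729292545 0.001000006s s3:get_obj ERROR: couldn't decode manifest
MANIFEST_ERR = re.compile(r"req (\d+) \S+ (\S+) ERROR: couldn't decode manifest")

def find_manifest_decode_failures(log_text):
    """Return (request_id, op) tuples for requests that failed to decode
    an object manifest, per the rgw debug log text given."""
    return [(m.group(1), m.group(2))
            for line in log_text.splitlines()
            if (m := MANIFEST_ERR.search(line))]
```

Running this over the linked log should list every old-object HEAD/GET that hit err_no=5 and resorted to 500.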
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2024:10216