+++ This bug was initially created as a clone of Bug #2271595 +++

+++ This bug was initially created as a clone of Bug #2271399 +++

Description of problem:

In a multisite environment, creating a role and attaching a role policy on the primary site succeeds, and assume role API calls on the secondary site also succeed. However, S3 operations such as bucket creation that use the session token returned by the assume role API call on the secondary site fail with http_status = 403.

Version-Release number of selected component (if applicable):
ceph version 18.2.1-77.el9cp

How reproducible:
2/2

Steps to Reproduce:

1. Set up a multisite environment, and perform the below steps on the primary site
   - create 2 users 'lynna.271' and 'annief.469'
   - add the role capability to the user 'lynna.271'
   - create a role 'S3RoleOf.lynna.271' and attach a role policy

(venv) [root@pluto003 rgw]# radosgw-admin role get --role-name S3RoleOf.lynna.271
{
    "RoleId": "fe7ddcc3-e2d8-41dc-a80c-2f4cecfb4313",
    "RoleName": "S3RoleOf.lynna.271",
    "Path": "/",
    "Arn": "arn:aws:iam:::role/S3RoleOf.lynna.271",
    "CreateDate": "2024-03-25T03:53:45.404Z",
    "MaxSessionDuration": 3600,
    "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/annief.469\"]},\"Action\":[\"sts:AssumeRole\"]}]}",
    "PermissionPolicies": [
        {
            "PolicyName": "policy.lynna.271",
            "PolicyValue": "{\"Version\":\"2012-10-17\",\"Statement\":{\"Effect\":\"Allow\",\"Action\":\"s3:*\",\"Resource\":\"arn:aws:s3:::*\"}}"
        }
    ]
}

(venv) [root@pluto003 rgw]# radosgw-admin role-policy get --role-name S3RoleOf.lynna.271 --policy-name policy.lynna.271
{
    "Permission policy": "{\"Version\":\"2012-10-17\",\"Statement\":{\"Effect\":\"Allow\",\"Action\":\"s3:*\",\"Resource\":\"arn:aws:s3:::*\"}}"
}

2. Ensure the role and the role policy are synced to the secondary site.

3. Perform the below steps on the secondary site
   - Perform the assume role API call.
   - Obtain the session token generated from the assume role API call.
   - Attempt to perform s3 operations such as creating a bucket using the obtained session token on the secondary site.

------------------ Use the below boto script on the secondary site ----------

import boto3
import json
import logging
from botocore.exceptions import ClientError

logging.basicConfig(filename="boto.log", level=logging.DEBUG)
from botocore.handlers import validate_bucket_name

sts_client = boto3.client('sts',
    aws_access_key_id='user_annief.469_access_key',
    aws_secret_access_key='user_annief.469_secret_key',
    endpoint_url='http://secondary_site_endpoint',
    region_name='shared',
)

response = sts_client.assume_role(
    RoleArn='arn:aws:iam:::role/S3RoleOf.lynna.271',
    RoleSessionName='Bob',
    DurationSeconds=3600
)
print(f"print the assume role response {response}")

s3client = boto3.client('s3',
    aws_access_key_id = response['Credentials']['AccessKeyId'],
    aws_secret_access_key = response['Credentials']['SecretAccessKey'],
    aws_session_token = response['Credentials']['SessionToken'],
    endpoint_url='http://secondary_site_endpoint',
    region_name='shared',)

bucket_name = 'sec-my-bucket'
s3bucket = s3client.create_bucket(Bucket=bucket_name)
resp = s3client.list_buckets()
print(resp)

Actual results:

Observed an http_status 403 failure for s3 create_bucket, despite a successful assume role API call on the secondary site.

Expected results:

s3 operations should pass based on the role policy on both sites.

Additional info:

On the primary site, the same script (with the primary zone endpoint) runs successfully without any issues.

--- Additional comment from Matt Benjamin (redhat) on 2024-03-25 12:58:48 UTC ---

This looks like it could be a configuration problem, to be honest, but we need to triage quickly.

thanks!
Matt

--- Additional comment from Matt Benjamin (redhat) on 2024-03-25 13:02:15 UTC ---

vidushi,

It's possible we will want to debug the secondary cluster before it's torn down. Can you help with that?

thanks,

Matt

--- Additional comment from Vidushi Mishra on 2024-03-25 13:13:11 UTC ---

(In reply to Matt Benjamin (redhat) from comment #2)
> vidushi,
>
> It's possible we will want to debug the secondary cluster before it's torn
> down. Can you help with that?
>
> thanks,
>
> Matt

Hi Matt,

Yes, we have collected the debug logs on the secondary site.

setup details:
- primary site:
    client node/haproxy node: 10.8.129.103 on port 5000
    rgw nodes: 10.8.129.102 and 10.8.129.103 on port 80
- secondary site:
    client node/haproxy node: 10.8.129.106 on port 5000
    rgw nodes: 10.8.129.105 and 10.8.129.106 on port 80

secondary site debug rgw logs location:
10.8.129.106:/var/log/ceph/d20b30e4-abad-11ee-b6fa-0cc47a6ee050/ceph-client.rgw.india.sec.80.pluto006.ysecdl.log

creds for all the nodes are (root/r)

Vidushi

--- Additional comment from Pritha Srivastava on 2024-03-25 16:28:01 UTC ---

I see the create bucket getting forwarded to the master zonegroup from the logs on secondary site.
I am pasting the log snippet below:

2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket user=annief.469 bucket=:sec-my-bucket[])
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket cache get: name=secondary.rgw.meta+root+sec-my-bucket : hit (negative entry)
2024-03-25T05:54:13.478+0000 7f3fbe286640 0 req 11924054439652643273 0.001000004s s3:create_bucket sending request to master zonegroup <------------------------------------
2024-03-25T05:54:13.478+0000 7f3fbe286640 20 get_url picked endpoint=http://pluto003:5000 <----------------------------------------
2024-03-25T05:54:13.478+0000 7f3fbe286640 20 req 11924054439652643273 0.001000004s s3:create_bucket sign_request_v4():> HTTP_DATE -> Mon Mar 25 05:54:13 2024
2024-03-25T05:54:13.478+0000 7f3fbe286640 20 req 11924054439652643273 0.001000004s s3:create_bucket sign_request_v4():> HTTP_X_AMZ_CONTENT_SHA256 -> e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket canonical headers format = date:Mon Mar 25 05:54:13 2024 x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 x-amz-date:20240325T055413Z
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket payload request hash = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket canonical request = PUT /sec-my-bucket rgwx-uid=annief.469&rgwx-zonegroup=c4089c7f-b087-4dbc-9478-bfb16d4789e9 date:Mon Mar 25 05:54:13 2024 x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 x-amz-date:20240325T055413Z date;x-amz-content-sha256;x-amz-date e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket canonical request hash = dde979eaed8a169f88bd26b7c91de2b461744de580835dacdeafac5992f3f462
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket string to sign = AWS4-HMAC-SHA256 20240325T055413Z 20240325/shared/s3/aws4_request dde979eaed8a169f88bd26b7c91de2b461744de580835dacdeafac5992f3f462
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket date_k = 685bc0e82cefda6f4119596ac36bbec935498b218516f4689515358c8a5358ec
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket region_k = c90ac74c8bbc53e40414986134ced75410b57964ad9b8958fc77e829f78b8b3b
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket service_k = 4c9bc646b43adb49540c1b5fa9be19d52b8d28f4656b9b7c2d8227485ab81dbb
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket signing_k = f04eeb37fa5a02e614d7cc494b9e21580342a28bd29a4f051010512a4f3861e1
2024-03-25T05:54:13.478+0000 7f3fbe286640 10 req 11924054439652643273 0.001000004s s3:create_bucket generated signature = 6c1c5763ebf1d038296db79acfbf2f95b195cefe360a41f53e5937ccfc9eced4
2024-03-25T05:54:13.478+0000 7f3fbe286640 20 req 11924054439652643273 0.001000004s s3:create_bucket sign_request_v4(): sigv4 header: Authorization: AWS4-HMAC-SHA256 Credential=a123/20240325/shared/s3/aws4_request,SignedHeaders=date;x-amz-content-sha256;x-amz-date,Signature=6c1c5763ebf1d038296db79acfbf2f95b195cefe360a41f53e5937ccfc9eced4
2024-03-25T05:54:13.478+0000 7f3fbe286640 20 req 11924054439652643273 0.001000004s s3:create_bucket sign_request_v4(): sigv4 header: x-amz-content-sha256: UNSIGNED-PAYLOAD
2024-03-25T05:54:13.478+0000 7f3fbe286640 20 req 11924054439652643273 0.001000004s s3:create_bucket sign_request_v4(): sigv4 header: x-amz-date: 20240325T055413Z
2024-03-25T05:54:13.478+0000 7f3fbe286640 20 sending request to http://pluto003:5000/sec-my-bucket?rgwx-uid=annief.469&rgwx-zonegroup=c4089c7f-b087-4dbc-9478-bfb16d4789e9
2024-03-25T05:54:13.478+0000 7f3fbe286640 20 register_request mgr=0x55cee0fb4b40 req_data->id=3, curl_handle=0x55cee5c34940
2024-03-25T05:54:13.478+0000 7f40d1d16640 20 link_request req_data=0x55cee32a9680 req_data->id=3, curl_handle=0x55cee5c34940
2024-03-25T05:54:13.481+0000 7f3fe62d6640 20 req 11924054439652643273 0.004000016s s3:create_bucket rgw_create_bucket returned ret=-13 bucket=<NULL>
2024-03-25T05:54:13.481+0000 7f3fe62d6640 2 req 11924054439652643273 0.004000016s s3:create_bucket completing
2024-03-25T05:54:13.481+0000 7f3fe62d6640 10 req 11924054439652643273 0.004000016s cache get: name=secondary.rgw.log++script.postrequest. : hit (negative entry)
2024-03-25T05:54:13.481+0000 7f3fe62d6640 2 req 11924054439652643273 0.004000016s s3:create_bucket op status=-13
2024-03-25T05:54:13.481+0000 7f3fe62d6640 2 req 11924054439652643273 0.004000016s s3:create_bucket http status=403

I am not sure why this happens. Shilpa, is this expected behaviour? Do create bucket requests on the secondary get forwarded to the master?

--- Additional comment from Pritha Srivastava on 2024-03-25 16:30:14 UTC ---

Vidushi, has this been tested on any earlier downstream version? Also, are the primary logs intact anywhere, in case we need to inspect them?

Thanks,
Pritha

--- Additional comment from Vidushi Mishra on 2024-03-25 18:39:28 UTC ---

(In reply to Pritha Srivastava from comment #5)
> Vidushi, has this been tested on any earlier downstream version? Also are
> the primary logs intact anywhere, in case we need to inspect them?
>
> Thanks,
> Pritha

Hi Pritha,

This should be present in the previous version as well, we will confirm shortly.

Thanks
Vidushi

--- Additional comment from shilpa on 2024-03-25 18:49:55 UTC ---

It seems like we are trying to authenticate the create bucket request forwarded to the primary against the multisite system user, and failing with 403.
log snippet from primary matching the create bucket call on secondary:

2024-03-25T18:02:50.252+0000 7efc55087640 20 req 7724498153786276837 0.000000000s s3:create_bucket rgw::auth::s3::AWSAuthStrategy: trying rgw::auth::s3::STSAuthStrategy
2024-03-25T18:02:50.252+0000 7efc55087640 20 req 7724498153786276837 0.000000000s s3:create_bucket rgw::auth::s3::STSAuthStrategy: trying rgw::auth::s3::STSEngine
2024-03-25T18:02:50.252+0000 7efc55087640 10 req 7724498153786276837 0.000000000s v4 signature format = e28e0e1d86b0a3cac47843f0af6e9a474a33ff51fcbb11e7960aae7bc359c14e
2024-03-25T18:02:50.252+0000 7efc55087640 10 req 7724498153786276837 0.000000000s v4 credential format = a123/20240325/shared/s3/aws4_request
2024-03-25T18:02:50.252+0000 7efc55087640 10 req 7724498153786276837 0.000000000s access key id = a123
2024-03-25T18:02:50.252+0000 7efc55087640 0 req 7724498153786276837 0.000000000s s3:create_bucket Invalid access key
2024-03-25T18:02:50.252+0000 7efc55087640 20 req 7724498153786276837 0.000000000s s3:create_bucket rgw::auth::s3::STSEngine rejected with reason=-1
2024-03-25T18:02:50.252+0000 7efc55087640 20 req 7724498153786276837 0.000000000s s3:create_bucket rgw::auth::s3::STSAuthStrategy rejected with reason=-1
2024-03-25T18:02:50.252+0000 7efc55087640 20 req 7724498153786276837 0.000000000s s3:create_bucket rgw::auth::s3::AWSAuthStrategy rejected with reason=-1
2024-03-25T18:02:50.252+0000 7efc55087640 5 req 7724498153786276837 0.000000000s s3:create_bucket Failed the auth strategy, reason=-1
2024-03-25T18:02:50.252+0000 7efc55087640 10 failed to authorize request
2024-03-25T18:02:50.252+0000 7efc55087640 1 req 7724498153786276837 0.000000000s op->ERRORHANDLER: err_no=-1 new_err_no=-1
2024-03-25T18:02:50.252+0000 7efc55087640 10 req 7724498153786276837 0.000000000s cache get: name=primary.rgw.log++script.postrequest. : hit (negative entry)
2024-03-25T18:02:50.252+0000 7efc55087640 2 req 7724498153786276837 0.000000000s s3:create_bucket op status=0
2024-03-25T18:02:50.252+0000 7efc55087640 2 req 7724498153786276837 0.000000000s s3:create_bucket http status=403

the access_key 'a123' is that of the system user called 'repuser'

--- Additional comment from Pritha Srivastava on 2024-03-26 04:52:58 UTC ---

Shilpa, did you try this out using the same script that Vidushi has, or did you create a bucket on primary with some other credentials (those of the system user a123)?

--- Additional comment from Pritha Srivastava on 2024-03-26 07:07:51 UTC ---

Vidushi ran the test script again with debug logs enabled for both primary and secondary (Thank You Vidushi) and I see the following:

On secondary, while forwarding the request to primary, the user is set to annief.469. This happens because during AssumeRole, the user is set to the user trying to assume the role.

2024-03-26T06:30:25.109+0000 7f3fb226e640 20 sending request to http://pluto003:5000/sec-new-bucket?rgwx-uid=annief.469&rgwx-zonegroup=c4089c7f-b087-4dbc-9478-bfb16d4789e9 <----------- uid is of user annief.469
2024-03-26T06:30:25.109+0000 7f3fb226e640 20 register_request mgr=0x55cee0fb4b40 req_data->id=630, curl_handle=0x55cee6de14e0
2024-03-26T06:30:25.109+0000 7f40d1d16640 20 link_request req_data=0x55cee769be00 req_data->id=630, curl_handle=0x55cee6de14e0
2024-03-26T06:30:25.112+0000 7f408f428640 20 req 13335054384456997502 0.005000020s s3:create_bucket rgw_create_bucket returned ret=-13 bucket=<NULL>

On primary, the incoming request is treated as an STS request because of the presence of the HTTP_X_AMZ_SECURITY_TOKEN header in the request, but the credentials are those of 'repuser' with access key id 'a123', as pointed out by Shilpa above. This is confusing because the credentials should have been either those of the user annief.469 or the temporary credentials.
Ideally the temporary credentials should have been used to sign the request while forwarding it to the master. STSEngine checks whether the access key id used to sign the request is the same as the one in the session token. Since there is a mismatch here, we get an "Invalid access key" error. The logs are below for reference:

2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s meta>> HTTP_X_AMZ_SECURITY_TOKEN <------------------------------- STS session token is present in the request
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s x>> x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s x>> x-amz-date:20240326T063025Z
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s x>> x-amz-security-token:S64iy9e4HOPYyyl3b50BME/aG4X7GUnTWiNlOnW/4MUaSyHyme5rqIyYhVd1uXET3u6UGWXK5xU7khL2DoBffiettCjvm8gESPTpUomDtCaUif7ibVYKMffUm6yUrjmpVLO9URg9ctFmEIVwUDWNmub5mO18idhJYMMtHAiXAYxgBpjKdJths9gLJXDk0vOKCosiwVM7dmQoOgp8FrJZbjFKM5+SRvZtFwtPZskFhiU/UUTGmzegx4q2b5ECMLpqpQWQcSEukms8xFhBbo3pvWYIwzV3WokXV1soYlMpsL3UwGlVNOfrhIbPfeTezRVnR75TEiF6J5/G+nyRr/DYaQ==
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s handler=25RGWHandler_REST_Bucket_S3
2024-03-26T06:30:25.086+0000 7f05cd7b4640 2 req 17534750427505361531 0.000000000s getting op 1
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s cache get: name=primary.rgw.log++script.prerequest. : hit (negative entry)
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s s3:create_bucket scheduling with throttler client=3 cost=1
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s s3:create_bucket op=27RGWCreateBucket_ObjStore_S3
2024-03-26T06:30:25.086+0000 7f05cd7b4640 2 req 17534750427505361531 0.000000000s s3:create_bucket verifying requester
2024-03-26T06:30:25.086+0000 7f05cd7b4640 20 req 17534750427505361531 0.000000000s s3:create_bucket rgw::auth::StrategyRegistry::s3_main_strategy_t: trying rgw::auth::s3::AWSAuthStrategy
2024-03-26T06:30:25.086+0000 7f05cd7b4640 20 req 17534750427505361531 0.000000000s s3:create_bucket rgw::auth::s3::AWSAuthStrategy: trying rgw::auth::s3::S3AnonymousEngine
2024-03-26T06:30:25.086+0000 7f05cd7b4640 20 req 17534750427505361531 0.000000000s s3:create_bucket rgw::auth::s3::S3AnonymousEngine denied with reason=-1
2024-03-26T06:30:25.086+0000 7f05cd7b4640 20 req 17534750427505361531 0.000000000s s3:create_bucket rgw::auth::s3::AWSAuthStrategy: trying rgw::auth::s3::STSAuthStrategy
2024-03-26T06:30:25.086+0000 7f05cd7b4640 20 req 17534750427505361531 0.000000000s s3:create_bucket rgw::auth::s3::STSAuthStrategy: trying rgw::auth::s3::STSEngine
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s v4 signature format = 2db8a1db16eb8b759824af1fd942ba1e54075d6e5e4f4bd9ba0c38b28f6200bb
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s v4 credential format = a123/20240326/shared/s3/aws4_request
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s access key id = a123 <------------------------------- credentials used to sign the request are those of 'repuser'
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s credential scope = 20240326/shared/s3/aws4_request
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s canonical headers format = date:Tue Mar 26 06:30:25 2024 x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 x-amz-date:20240326T063025Z
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s payload request hash = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s canonical request = PUT /sec-new-bucket rgwx-uid=annief.469&rgwx-zonegroup=c4089c7f-b087-4dbc-9478-bfb16d4789e9 date:Tue Mar 26 06:30:25 2024 x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 x-amz-date:20240326T063025Z date;x-amz-content-sha256;x-amz-date e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s canonical request hash = 5ad29d1d17600c71885fb686ab66bc347cf49ad16f8f1496e32493b5dc42b975
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s string to sign = AWS4-HMAC-SHA256 20240326T063025Z 20240326/shared/s3/aws4_request 5ad29d1d17600c71885fb686ab66bc347cf49ad16f8f1496e32493b5dc42b975
2024-03-26T06:30:25.086+0000 7f05cd7b4640 10 req 17534750427505361531 0.000000000s get_auth_data_v4: UNSIGNED-PAYLOAD or other v4 no-completer case
2024-03-26T06:30:25.086+0000 7f05cd7b4640 0 req 17534750427505361531 0.000000000s s3:create_bucket Invalid access key <---------------- Invalid access key error returned by STS engine.

The solution here is to correct the credentials used to sign the forwarded request. The session token also contains the roleId, which will be used to fetch the role policy/policies from the backend store, so they will be automatically added to the permissions while authenticating with STSEngine.
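
For readers following the analysis above, the mismatch can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the RGW implementation: the real STSEngine is C++ and decrypts the session token, whereas here `SessionToken`, `credential_access_key`, `token_matches_signer`, and the temporary key `TEMPKEYEXAMPLE` are hypothetical names invented for illustration. The forwarded `Authorization` value and the role id are copied from the logs and role output in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionToken:
    """Hypothetical, already-decrypted view of an STS session token."""
    access_key_id: str  # temporary access key issued by AssumeRole
    role_id: str        # role whose policies would be looked up


def credential_access_key(authorization: str) -> str:
    """Extract the access key id from a SigV4 Authorization header value."""
    # Drop the leading algorithm name ("AWS4-HMAC-SHA256").
    _, _, params = authorization.partition(" ")
    for part in params.split(","):
        name, _, value = part.strip().partition("=")
        if name == "Credential":
            # Credential=<access key>/<date>/<region>/<service>/aws4_request
            return value.split("/", 1)[0]
    raise ValueError("no Credential field in Authorization header")


def token_matches_signer(authorization: str, token: SessionToken) -> bool:
    """Toy version of the check: signer key must equal the token's key."""
    return credential_access_key(authorization) == token.access_key_id


# Authorization header as logged by the secondary when forwarding the request.
FORWARDED_AUTH = (
    "AWS4-HMAC-SHA256 "
    "Credential=a123/20240325/shared/s3/aws4_request,"
    "SignedHeaders=date;x-amz-content-sha256;x-amz-date,"
    "Signature=6c1c5763ebf1d038296db79acfbf2f95b195cefe360a41f53e5937ccfc9eced4"
)

# Hypothetical token contents; only the role id comes from this report.
token = SessionToken(
    access_key_id="TEMPKEYEXAMPLE",
    role_id="fe7ddcc3-e2d8-41dc-a80c-2f4cecfb4313",
)

# Signed with the system user's key 'a123': the model rejects the request.
print(token_matches_signer(FORWARDED_AUTH, token))

# Signed with the temporary key instead: the model accepts the request.
resigned = FORWARDED_AUTH.replace("Credential=a123/", "Credential=TEMPKEYEXAMPLE/")
print(token_matches_signer(resigned, token))
```

In this model the first call corresponds to the "Invalid access key" case seen on the primary, and the second to a forwarded request signed with the temporary credentials, which is the direction the proposed solution describes.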
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2024:2743
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days