Bug 2362258 - RGW crash during S3 CopyObject of encrypted multipart objects in thread notif-worker0 [NEEDINFO]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RGW
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 7.1z4
Assignee: Matt Benjamin (redhat)
QA Contact: Vidushi Mishra
URL:
Whiteboard:
Depends On:
Blocks: 2362274
Reported: 2025-04-25 07:24 UTC by Vidushi Mishra
Modified: 2025-05-07 12:49 UTC (History)

Fixed In Version: ceph-18.2.1-328.el9cp
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2362274
Environment:
Last Closed: 2025-05-07 12:48:55 UTC
Embargoed:
vimishra: needinfo? (mbenjamin)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHCEPH-11258 0 None None None 2025-04-25 07:25:48 UTC
Red Hat Product Errata RHSA-2025:4664 0 None None None 2025-05-07 12:49:03 UTC

Description Vidushi Mishra 2025-04-25 07:24:27 UTC
Description of problem:


During an S3 CopyObject request involving encrypted multipart objects, RGW crashes with a SIGABRT in the notif-worker0 thread.

The crash does not occur with unencrypted objects or with objects uploaded in a single part, which suggests the issue is specific to the combination of multipart upload and encryption during copy operations.

Version-Release number of selected component (if applicable):

ceph version 18.2.1-327.el9cp 

How reproducible:

Always 

Steps to Reproduce:
1. Create a bucket with SSE-S3 bucket encryption enabled.
# s3cmd mb s3://testenc1

# aws --endpoint-url http://ceph-pri-vim-single-site-71-nf1bqi-node5:80 s3api put-bucket-encryption --bucket testenc1 --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'

# aws --endpoint-url http://ceph-pri-vim-single-site-71-nf1bqi-node5:80 s3api get-bucket-encryption --bucket testenc1 
{
    "ServerSideEncryptionConfiguration": {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "AES256"
                }
            }
        ]
    }
}

2. Upload a multipart object

# s3cmd put obj20m s3://testenc1
upload: 'obj20m' -> 's3://testenc1/obj20m'  [part 1 of 2, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    57.90 MB/s  done
upload: 'obj20m' -> 's3://testenc1/obj20m'  [part 2 of 2, 5MB] [1 of 1]
 5242880 of 5242880   100% in    0s    38.52 MB/s  done

# s3cmd info  s3://testenc1/obj20m
s3://testenc1/obj20m (object):
   File size: 20971520
   Last mod:  Fri, 25 Apr 2025 06:34:34 GMT
   MIME type: application/octet-stream
   Storage:   STANDARD
   MD5 sum:   8f4e33f3dc3e414ff94e5fb6905cba8c
   SSE:       AES256
   Policy:    none
   CORS:      none
   ACL:       winona hein: FULL_CONTROL
   x-amz-meta-s3cmd-attrs: atime:1745562289/ctime:1745562280/gid:0/gname:root/md5:8f4e33f3dc3e414ff94e5fb6905cba8c/mode:33188/mtime:1745562280/uid:0/uname:root


3. Perform S3 CopyObject

# s3cmd cp s3://testenc1/obj20m s3://testenc1/obj20m_copy
WARNING: Retrying failed request: /obj20m_copy (Remote end closed connection without response)
WARNING: Waiting 3 sec...


4. Observe the RGW crash

2025-04-25T06:04:01.541+0000 7f1b64491640 20 req 479854084246211760 0.009000130s s3:copy_obj max_chunk_size=4194304
2025-04-25T06:04:01.541+0000 7f1b64491640 20 req 479854084246211760 0.009000130s s3:copy_obj get_obj_state: rctx=0x55be905db490 obj=dolliel.691-bucky-3206-0:prefix1key_dolliel.691-bucky-3206-0_24 state=0x55be8ec445e8 s->prefetch_data=1
2025-04-25T06:04:01.541+0000 7f1b64491640 20 req 479854084246211760 0.009000130s s3:copy_obj max_chunk_size=4194304
2025-04-25T06:04:01.541+0000 7f1b64491640 20 req 479854084246211760 0.009000130s s3:copy_obj rados->read obj-ofs=0 read_ofs=0 read_len=4194304
2025-04-25T06:04:01.548+0000 7f1b64491640 20 req 479854084246211760 0.016000232s s3:copy_obj rados->read r=0 bl.length=4194304
2025-04-25T06:04:01.557+0000 7f1b64491640 20 req 479854084246211760 0.025000360s s3:copy_obj get_obj_state: rctx=0x55be905db490 obj=dolliel.691-bucky-3206-0:prefix1key_dolliel.691-bucky-3206-0_24 state=0x55be8ec445e8 s->prefetch_data=1
2025-04-25T06:04:01.557+0000 7f1b64491640 20 req 479854084246211760 0.025000360s s3:copy_obj max_chunk_size=4194304
2025-04-25T06:04:01.557+0000 7f1b64491640 20 req 479854084246211760 0.025000360s s3:copy_obj rados->read obj-ofs=4194304 read_ofs=0 read_len=1048576
2025-04-25T06:04:01.560+0000 7f1b64491640 20 req 479854084246211760 0.028000403s s3:copy_obj rados->read r=0 bl.length=1048576
2025-04-25T06:04:01.563+0000 7f1b64491640 -1 *** Caught signal (Aborted) **
 in thread 7f1b64491640 thread_name:notif-worker0

 ceph version 18.2.1-327.el9cp (8231b23b5fc431238517dc3adc21a3b5a4a7ce71) reef (stable)
 1: /lib64/libc.so.6(+0x3e730) [0x7f1c69a19730]
 2: /lib64/libc.so.6(+0x8b52c) [0x7f1c69a6652c]
 3: raise()
 4: abort()
 5: /lib64/libstdc++.so.6(+0xa1b21) [0x7f1c69d7ab21]
 6: /lib64/libstdc++.so.6(+0xad52c) [0x7f1c69d8652c]
 7: /lib64/libstdc++.so.6(+0xac4f9) [0x7f1c69d854f9]
 8: __gxx_personality_v0()
 9: /lib64/libgcc_s.so.1(+0x112d4) [0x7f1c69bf52d4]
 10: _Unwind_Resume()
 11: /usr/bin/radosgw(+0x2eb4a5) [0x55be88b0f4a5]
 12: /usr/bin/radosgw(+0x3e25e0) [0x55be88c065e0]
 13: /lib64/libstdc++.so.6(+0xdbad4) [0x7f1c69db4ad4]
 14: /lib64/libc.so.6(+0x897e2) [0x7f1c69a647e2]
 15: /lib64/libc.so.6(+0x10e800) [0x7f1c69ae9800]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
 -9999> 2025-04-25T06:04:00.290+0000 7f1b179f4640 10 req 5723659324235957503 0.000000000s x>> x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
 -9998> 2025-04-25T06:04:00.290+0000 7f1b179f4640 10 req 5723659324235957503 0.000000000s x>> x-amz-date:20250425T060400Z
 -9997> 2025-04-25T06:04:00.290+0000 7f1b179f4640 10 req 5723659324235957503 0.000000000s handler=14RGWHandler_Log
 -9996> 2025-04-25T06:04:00.290+0000 7f1b179f4640  2 req 5723659324235957503 0.000000000s getting op 0
 -9995> 2025-04-25T06:04:00.290+0000 7f1b179f4640 10 req 5723659324235957503 0.000000000s cache get: name=primary.rgw.log++script.prerequest. : hit (negative entry)
 -9994> 2025-04-25T06:04:00.290+0000 7f1b179f4640 10 req 5723659324235957503 0.000000000s :list_data_changes_log scheduling with throttler client=0 cost=1
 -9993> 2025-04-25T06:04:00.290+0000 7f1b179f4640 10 req 5723659324235957503 0.000000000s :list_data_changes_log op=18RGWOp_DATALog_List
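
For triage, the log excerpt above can be reduced to two facts: which ranges the copy read before the abort, and which thread aborted. The following is a minimal Python sketch and not part of the original report. The name `summarize` and the regular expressions are illustrative assumptions derived only from the line formats visible in this log; other RGW versions or debug levels may format these lines differently.

```python
import re

# Verbatim lines copied from the log excerpt in this report.
LOG = """\
2025-04-25T06:04:01.541+0000 7f1b64491640 20 req 479854084246211760 0.009000130s s3:copy_obj rados->read obj-ofs=0 read_ofs=0 read_len=4194304
2025-04-25T06:04:01.548+0000 7f1b64491640 20 req 479854084246211760 0.016000232s s3:copy_obj rados->read r=0 bl.length=4194304
2025-04-25T06:04:01.557+0000 7f1b64491640 20 req 479854084246211760 0.025000360s s3:copy_obj rados->read obj-ofs=4194304 read_ofs=0 read_len=1048576
2025-04-25T06:04:01.560+0000 7f1b64491640 20 req 479854084246211760 0.028000403s s3:copy_obj rados->read r=0 bl.length=1048576
2025-04-25T06:04:01.563+0000 7f1b64491640 -1 *** Caught signal (Aborted) **
 in thread 7f1b64491640 thread_name:notif-worker0
"""

# Assumed pattern for the read-request lines (obj-ofs / read_ofs / read_len).
READ_RE = re.compile(r"rados->read obj-ofs=(\d+) read_ofs=(\d+) read_len=(\d+)")
# Assumed pattern for the crash header line that names the thread.
THREAD_RE = re.compile(r"in thread ([0-9a-f]+) thread_name:(\S+)")


def summarize(log_text):
    """Collect the read requests, the logging thread IDs, and the crash thread."""
    reads = [tuple(int(g) for g in m.groups()) for m in READ_RE.finditer(log_text)]
    # The second whitespace-separated field of each log line is the thread ID.
    copy_threads = {
        line.split()[1] for line in log_text.splitlines() if "s3:copy_obj" in line
    }
    crash = THREAD_RE.search(log_text)
    return {
        "reads": reads,
        "copy_threads": copy_threads,
        "crash_thread": crash.groups() if crash else None,
    }


result = summarize(LOG)
```

In this excerpt the thread ID in the crash header is the same ID that logged the preceding s3:copy_obj lines, so the abort happened on the thread that was servicing the copy request.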


Actual results:

RGW crashes with SIGABRT. The crash occurs in the notif-worker0 thread.
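
The backtrace in this report leaves the radosgw frames unsymbolized, as its NOTE line points out. A hedged sketch for pulling out the module-relative offsets of those frames follows. The function name `unresolved_frames` and the frame regular expression are assumptions for illustration, and the embedded frames are copied verbatim from the backtrace. Resolving the offsets to function names would additionally require the matching binary and debuginfo, which this sketch does not attempt.

```python
import re

# Frames 10-13 copied verbatim from the backtrace in this report.
BACKTRACE = """\
 10: _Unwind_Resume()
 11: /usr/bin/radosgw(+0x2eb4a5) [0x55be88b0f4a5]
 12: /usr/bin/radosgw(+0x3e25e0) [0x55be88c065e0]
 13: /lib64/libstdc++.so.6(+0xdbad4) [0x7f1c69db4ad4]
"""

# Assumed shape of an unsymbolized frame: "N: module(+0xOFFSET) [0xADDRESS]".
FRAME_RE = re.compile(r"^\s*(\d+): (\S+)\(\+(0x[0-9a-f]+)\) \[0x[0-9a-f]+\]$")


def unresolved_frames(backtrace, module):
    """Return (frame_number, offset) pairs for unsymbolized frames in module."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match and match.group(2) == module:
            frames.append((int(match.group(1)), match.group(3)))
    return frames


rgw_frames = unresolved_frames(BACKTRACE, "/usr/bin/radosgw")
```

Frames that already carry a symbol name, such as `_Unwind_Resume()`, do not match the pattern and are skipped.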

Expected results:

CopyObject should complete successfully, regardless of encryption or multipart status.
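
The reproduction steps and this expectation can be sketched as a single script. This is an illustration under stated assumptions, not the reporter's automation: it assumes the boto3 package is installed, that credentials are available from the environment, and that the caller supplies the RGW endpoint URL. The helper names `sse_s3_config`, `plan_parts`, and `reproduce` are hypothetical. The 15 MiB part size mirrors the split shown in the s3cmd output, and random bytes stand in for the contents of obj20m.

```python
import os

MIB = 1024 * 1024


def sse_s3_config():
    """Bucket-encryption document equivalent to the JSON used in step 1."""
    return {
        "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
    }


def plan_parts(total_size, part_size):
    """Return (part_number, offset, length) tuples that cover total_size bytes."""
    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


def reproduce(endpoint_url, bucket="testenc1", key="obj20m",
              size=20 * MIB, part_size=15 * MIB):
    """Run steps 1-3 against endpoint_url and return the CopyObject HTTP status."""
    import boto3  # assumption: boto3 is installed; imported lazily on purpose

    s3 = boto3.client("s3", endpoint_url=endpoint_url)

    # Step 1: bucket with SSE-S3 default encryption.
    s3.create_bucket(Bucket=bucket)
    s3.put_bucket_encryption(
        Bucket=bucket, ServerSideEncryptionConfiguration=sse_s3_config()
    )

    # Step 2: multipart upload; random data stands in for the real file.
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    completed = []
    for number, _offset, length in plan_parts(size, part_size):
        response = s3.upload_part(
            Bucket=bucket, Key=key, PartNumber=number,
            UploadId=upload_id, Body=os.urandom(length),
        )
        completed.append({"PartNumber": number, "ETag": response["ETag"]})
    s3.complete_multipart_upload(
        Bucket=bucket, Key=key, UploadId=upload_id,
        MultipartUpload={"Parts": completed},
    )

    # Step 3: the CopyObject request that the report says triggers the crash.
    copy = s3.copy_object(
        Bucket=bucket, Key=key + "_copy",
        CopySource={"Bucket": bucket, "Key": key},
    )
    return copy["ResponseMetadata"]["HTTPStatusCode"]
```

Calling `reproduce("http://<rgw-host>:80")` against a test cluster would be expected to return 200 when the copy succeeds. On an affected build the call would instead be expected to raise a connection error, matching the "Remote end closed connection without response" warning from s3cmd.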

Additional info:

According to our automation results, the last build on which S3 copy of encrypted multipart objects succeeded was 7.1 (18.2.1-298).
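
To check whether a given build carries the fix, the version string can be compared against the "Fixed In Version" field of this bug (ceph-18.2.1-328.el9cp). The sketch below is an assumption-laden illustration: `parse_build` and `contains_fix` are hypothetical helpers, and the comparison assumes that release numbers within 18.2.1 increase monotonically and that every later 18.2.1 build keeps the fix.

```python
import re

# Taken from the "Fixed In Version" field: ceph-18.2.1-328.el9cp
FIXED_BUILD = (18, 2, 1, 328)


def parse_build(text):
    """Extract (major, minor, patch, release) from a string such as
    'ceph version 18.2.1-327.el9cp (...)' or 'ceph-18.2.1-328.el9cp'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)-(\d+)", text)
    if match is None:
        raise ValueError(f"no version-release found in {text!r}")
    return tuple(int(part) for part in match.groups())


def contains_fix(text):
    """True if an 18.2.1 build is at or after the fixed release number."""
    build = parse_build(text)
    if build[:3] != FIXED_BUILD[:3]:
        # Other upstream versions are outside what this bug report states.
        raise ValueError("comparison is only meaningful for 18.2.1 builds")
    return build[3] >= FIXED_BUILD[3]
```

Note that a `False` result only means the build predates the fix. It does not by itself mean the build crashes: 18.2.1-298 predates the fix but is reported here as passing.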

Comment 18 errata-xmlrpc 2025-05-07 12:48:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 7.1 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2025:4664

