Bug 2262907

Summary: Segmentation fault encountered during object manipulation using ceph-objectstore-tool
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Harsh Kumar <hakumar>
Component: RADOS
Assignee: Adam Kupczyk <akupczyk>
Status: POST ---
QA Contact: Harsh Kumar <hakumar>
Severity: medium
Priority: unspecified
Version: 7.1
CC: bhubbard, ceph-eng-bugs, cephqe-warriors, ngangadh, nojha, rzarzyns, vumrao
Target Milestone: ---   
Target Release: 7.2   
Hardware: x86_64   
OS: Linux   
Doc Type: If docs needed, set a value
Story Points: ---
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---

Description Harsh Kumar 2024-02-06 03:58:53 UTC
Description of problem:
A segmentation fault occurs intermittently when manipulating an object's content with ceph-objectstore-tool.
Reference from upstream documentation: https://docs.ceph.com/en/latest/man/8/ceph-objectstore-tool/#manipulating-an-object-s-content

The segmentation fault was observed only during the get-bytes and set-bytes operations.

Get bytes:
	ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID $OBJECT get-bytes > $OBJECT_FILE_NAME

Set bytes:
	ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID $OBJECT set-bytes < $OBJECT_FILE_NAME
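
The two operations above can be chained into a round-trip check: read the object, write the same bytes back, read it again, and compare. The following is only a sketch. `COT` and `roundtrip_object` are hypothetical names introduced here, not part of Ceph, and the sketch assumes the OSD is already stopped and that `COT` points at however the tool is reached in the environment (for example, a wrapper script around `cephadm shell`).

```shell
# Sketch only: COT and roundtrip_object are hypothetical names, not part of Ceph.
# COT is the command used to reach the tool; override it to use a wrapper.
COT="${COT:-ceph-objectstore-tool}"

# Usage: roundtrip_object DATA_PATH PGID OBJECT WORKDIR
# WORKDIR must exist; the files "before" and "after" are left there for inspection.
roundtrip_object() {
    local data_path="$1" pgid="$2" object="$3" workdir="$4"

    # 1. Save the current object content.
    $COT --data-path "$data_path" --pgid "$pgid" "$object" get-bytes \
        > "$workdir/before" || return $?

    # 2. Write the same bytes back unchanged.
    $COT --data-path "$data_path" --pgid "$pgid" "$object" set-bytes \
        < "$workdir/before" || return $?

    # 3. Read the object again.
    $COT --data-path "$data_path" --pgid "$pgid" "$object" get-bytes \
        > "$workdir/after" || return $?

    # 4. Succeed only if the content is byte-for-byte identical.
    cmp -s "$workdir/before" "$workdir/after"
}
```

With the placeholders used above, a call would look like `roundtrip_object "$PATH_TO_OSD" "$PG_ID" "$OBJECT" /tmp/cot-work`. A non-zero return comes either from the tool itself (including a crash) or from a content mismatch.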


Version-Release number of selected component (if applicable):
- 18.2.1-10.el9cp


How reproducible: 2/5


Steps to Reproduce:
1. Create a pool and write data to it, using rados bench or any other tool.
2. Stop the service of one OSD on any OSD node.
3. From inside the OSD container, use ceph-objectstore-tool to list the objects:
	# cephadm shell --name osd.1 -- ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --op list
4. Choose an object from the list obtained in the previous step.
5. Obtain the content of the object using the get-bytes operation:
	# cephadm shell --name osd.1 --mount /tmp/ -- ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --pgid 29.1d '{"oid": "benchmark_last_metadata", "key": "", "snapid": -2, "hash": 2390394397, "max": 0, "pool": 29, "namespace": ""}' get-bytes > /mnt/obj_work
6. Modify the file that the object data was redirected to, or leave it unchanged.
7. Push the content back to the object using the set-bytes operation:
	# cephadm shell --name osd.1 --mount /tmp/ -- ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --pgid 29.1d '{"oid": "benchmark_last_metadata", "key": "", "snapid": -2, "hash": 2390394397, "max": 0, "pool": 29, "namespace": ""}' set-bytes < /mnt/obj_work
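
Because the crash is intermittent, steps 5 and 7 may need to be repeated several times before it shows up. The helper below is a minimal sketch of such a loop. `count_segfaults` is a hypothetical name, not a Ceph or cephadm command. It counts a run as a crash if the exit status is 139 (128 + SIGSEGV) or if stderr contains the banner seen in the logs of this report; whether `cephadm shell` propagates status 139 from the container is an assumption that has not been verified here, which is why the banner is checked as well.

```shell
# Sketch only: count_segfaults is a hypothetical helper, not part of Ceph.
# Usage: count_segfaults RUNS COMMAND [ARGS...]
# Prints "<crashes>/<runs>".
count_segfaults() {
    local runs="$1" crashes=0 i=0 status errfile
    shift
    errfile="$(mktemp)" || return 1

    while [ "$i" -lt "$runs" ]; do
        i=$((i + 1))
        # Keep stderr so the backtrace banner can be detected; discard stdout.
        if "$@" > /dev/null 2> "$errfile"; then
            status=0
        else
            status=$?
        fi
        # 139 = 128 + 11: how a shell reports a child killed by SIGSEGV.
        if [ "$status" -eq 139 ] ||
           grep -q 'Caught signal (Segmentation fault)' "$errfile"; then
            crashes=$((crashes + 1))
        fi
    done

    rm -f "$errfile"
    echo "$crashes/$runs"
}
```

To avoid nested quoting around the JSON object spec, the command from step 5 or step 7 can be saved in a small script (for example a hypothetical `get_bytes_step.sh`, including its own redirection) and run as `count_segfaults 5 sh ./get_bytes_step.sh`.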

Actual results:
A segmentation fault was observed in some executions (about 2 out of 5) of the get-bytes and set-bytes operations above.


Expected results:
The tool should not hit a segmentation fault, even when an operation fails.

Additional info:
Logs:

	get-bytes
--------------------------------------------------------------------------------------------------------------------------------------------

	http://magna002.ceph.redhat.com/ceph-qe-logs/harsh_magna/cot-seg-fault/cephci-run-EGRKQ9/ceph-objectstore-tool_utility_0.log
	
	Exception hit while command execution. cephadm shell --name osd.6 --mount /tmp/ -- ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-6 --pgid 24.17 '{"oid": "benchmark_data_ceph-hakumar-h01h6v-node1-ins_3_object88", "key": "", "snapid": -2, "hash": 4154654743, "max": 0, "pool": 24, "namespace": ""}' get-bytes > /mnt/obj_backup Error:  Inferring fsid 628d2de6-c0d7-11ee-af8c-fa163e4240c7
	Inferring config /var/lib/ceph/628d2de6-c0d7-11ee-af8c-fa163e4240c7/osd.6/config
	Using ceph image with id '18a49f4e73b3' and tag '<none>' created on 2024-01-31 00:23:56 +0000 UTC
	registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:5e19702546ffe42b24b5c05936fae05045083a2103a54fb9400a37fabdcd2e50
	*** Caught signal (Segmentation fault) **
	 in thread 7ff6642ba580 thread_name:ceph-objectstor
	 ceph version 18.2.1-10.el9cp (ccf42acecc9e7ec19c8994e4d2ca0180b612ad1e) reef (stable)
	 1: /lib64/libc.so.6(+0x54db0) [0x7ff6648c2db0]
	 2: __pthread_rwlock_rdlock()
	 3: (BlueStore::collection_bits(boost::intrusive_ptr<ObjectStore::CollectionImpl>&)+0x48) [0x558623e3ff38]
	 4: main()
	 5: /lib64/libc.so.6(+0x3feb0) [0x7ff6648adeb0]
	 6: __libc_start_main()
	 7: _start()


	http://magna002.ceph.redhat.com/ceph-qe-logs/harsh_magna/cot-seg-fault/cephci-run-F819AS/ceph-objectstore-tool_utility_0.log
	
	Exception hit while command execution. cephadm shell --name osd.11 --mount /tmp/ -- ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-11 --pgid 21.12 "{'oid': 'benchmark_data_ceph-hakumar-h01h6v-node1-ins_3_object131', 'key': '', 'snapid': -2, 'hash': 685328530, 'max': 0, 'pool': 21, 'namespace': ''}" get-bytes > /mnt/obj_backup Error:  Inferring fsid 628d2de6-c0d7-11ee-af8c-fa163e4240c7
	Inferring config /var/lib/ceph/628d2de6-c0d7-11ee-af8c-fa163e4240c7/osd.11/config
	Using ceph image with id '18a49f4e73b3' and tag '<none>' created on 2024-01-31 00:23:56 +0000 UTC
	registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:5e19702546ffe42b24b5c05936fae05045083a2103a54fb9400a37fabdcd2e50
	*** Caught signal (Segmentation fault) **
	 in thread 7f93cf8de580 thread_name:ceph-objectstor
	 ceph version 18.2.1-10.el9cp (ccf42acecc9e7ec19c8994e4d2ca0180b612ad1e) reef (stable)
	 1: /lib64/libc.so.6(+0x54db0) [0x7f93cfee6db0]
	 2: (BlueStore::collection_list(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, ghobject_t const&, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x4b) [0x55c4cc5d6e5b]
	 3: (_action_on_all_objects_in_pg(ObjectStore*, coll_t, action_on_object_t&, bool)+0x4cc) [0x55c4cc1217dc]
	 4: (action_on_all_objects_in_exact_pg(ObjectStore*, coll_t, action_on_object_t&, bool)+0x64) [0x55c4cc1226c4]
	 5: main()
	 6: /lib64/libc.so.6(+0x3feb0) [0x7f93cfed1eb0]
	 7: __libc_start_main()
	 8: _start()
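
Note that the second get-bytes command above passes the object as a single-quoted, Python-style dict (`{'oid': ...}`), while the other invocations in this report use double-quoted JSON. Whether this contributed to the different backtrace is not established here. A cheap pre-flight check could at least flag such specs before the tool is invoked. The sketch below uses a hypothetical helper name and a plain string match; it is a heuristic, not a JSON parser.

```shell
# Sketch only: is_json_object_spec is a hypothetical helper, not part of Ceph.
# Returns 0 if the argument starts with "{", ends with "}", and contains a
# double-quoted "oid" key; returns 1 otherwise (e.g. for {'oid': ...}).
is_json_object_spec() {
    case "$1" in
        \{*\"oid\"*\}) return 0 ;;
        *) return 1 ;;
    esac
}
```

A test harness could call this on each object spec and refuse to run get-bytes or set-bytes when it returns non-zero.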



	set-bytes
--------------------------------------------------------------------------------------------------------------------------------------------

	http://magna002.ceph.redhat.com/ceph-qe-logs/harsh_magna/cot-seg-fault/cephci-run-O7JK37/ceph-objectstore-tool_utility_0.log
	
	Exception hit while command execution. cephadm shell --name osd.3 --mount /tmp/ -- ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 --pgid 30.16 '{"oid": "benchmark_data_ceph-hakumar-h01h6v-node1-ins_3_object96", "key": "", "snapid": -2, "hash": 2879658006, "max": 0, "pool": 30, "namespace": ""}' set-bytes < /mnt/obj_work Error:  Inferring fsid 628d2de6-c0d7-11ee-af8c-fa163e4240c7
	Inferring config /var/lib/ceph/628d2de6-c0d7-11ee-af8c-fa163e4240c7/osd.3/config
	Using ceph image with id '18a49f4e73b3' and tag '<none>' created on 2024-01-31 00:23:56 +0000 UTC
	registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:5e19702546ffe42b24b5c05936fae05045083a2103a54fb9400a37fabdcd2e50
	*** Caught signal (Segmentation fault) **
	 in thread 7f9452740580 thread_name:ceph-objectstor
	 ceph version 18.2.1-10.el9cp (ccf42acecc9e7ec19c8994e4d2ca0180b612ad1e) reef (stable)
	 1: /lib64/libc.so.6(+0x54db0) [0x7f9452d48db0]
	 2: __pthread_rwlock_rdlock()
	 3: (BlueStore::collection_bits(boost::intrusive_ptr<ObjectStore::CollectionImpl>&)+0x48) [0x55a6e489ff38]
	 4: main()
	 5: /lib64/libc.so.6(+0x3feb0) [0x7f9452d33eb0]
	 6: __libc_start_main()
	 7: _start()
	
	http://magna002.ceph.redhat.com/ceph-qe-logs/harsh_magna/cot-seg-fault/cephci-run-WSITZT/ceph-objectstore-tool_utility_0.log
	
	Exception hit while command execution. cephadm shell --name osd.1 --mount /tmp/ -- ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --pgid 29.1d '{"oid": "benchmark_last_metadata", "key": "", "snapid": -2, "hash": 2390394397, "max": 0, "pool": 29, "namespace": ""}' set-bytes < /mnt/obj_work Error:  Inferring fsid 628d2de6-c0d7-11ee-af8c-fa163e4240c7
	Inferring config /var/lib/ceph/628d2de6-c0d7-11ee-af8c-fa163e4240c7/osd.1/config
	Using ceph image with id '18a49f4e73b3' and tag '<none>' created on 2024-01-31 00:23:56 +0000 UTC
	registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:5e19702546ffe42b24b5c05936fae05045083a2103a54fb9400a37fabdcd2e50
	*** Caught signal (Segmentation fault) **
	 in thread 7fed4fc40580 thread_name:ceph-objectstor
	 ceph version 18.2.1-10.el9cp (ccf42acecc9e7ec19c8994e4d2ca0180b612ad1e) reef (stable)
	 1: /lib64/libc.so.6(+0x54db0) [0x7fed50248db0]
	 2: __pthread_rwlock_rdlock()
	 3: (BlueStore::collection_bits(boost::intrusive_ptr<ObjectStore::CollectionImpl>&)+0x48) [0x5644a49bbf38]
	 4: main()
	 5: /lib64/libc.so.6(+0x3feb0) [0x7fed50233eb0]
	 6: __libc_start_main()
	 7: _start()
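
To compare the backtraces across runs, the first BlueStore frame after each crash banner can be extracted and tallied. The following is a minimal sketch that assumes the log files have been saved locally; `top_bluestore_frame` is a hypothetical helper name.

```shell
# Sketch only: top_bluestore_frame is a hypothetical helper, not part of Ceph.
# Usage: top_bluestore_frame LOGFILE...
# For each "Caught signal" banner, prints the first BlueStore:: function
# that appears in the backtrace following it.
top_bluestore_frame() {
    awk '
        /Caught signal/ { in_trace = 1; next }
        in_trace && match($0, /BlueStore::[A-Za-z_]+/) {
            print substr($0, RSTART, RLENGTH)
            in_trace = 0
        }
    ' "$@"
}
```

Piping the result through `sort | uniq -c` gives a count per function. Applied to the four backtraces quoted in this report, it would list BlueStore::collection_bits three times and BlueStore::collection_list once.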