2193399 – ceph: Corruption: unknown checksum type 4 (ceph-osd fails to start)

Summary: ceph: Corruption: unknown checksum type 4 (ceph-osd fails to start)

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	ceph
Sub Component:
Version:	rawhide
Hardware:	Unspecified
OS:	Linux
Priority:	unspecified
Severity:	urgent
Target Milestone:	---
Assignee:	Kaleb KEITHLEY
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2023-05-05 14:13 UTC by Tomasz Torcz
Modified:	2023-06-19 09:41 UTC
CC List:	8 users
Fixed In Version:	ceph-18.1.0-0.1
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2023-06-19 09:41:21 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
ceph-osd.2.log (104.21 KB, text/plain) 2023-05-05 14:14 UTC, Tomasz Torcz

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Ceph Project Bug Tracker	59660	0	None	None	None	2023-05-05 15:16:58 UTC

Description Tomasz Torcz 2023-05-05 14:13:43 UTC

After updating ceph packages to ceph-osd-17.2.6-5.fc39.x86_64, OSD no longer starts.
Previously working version was ceph-osd-2:17.2.5-13.fc39.x86_64.

Relevant ceph-osd logs (full logfile attached):
#v+
…
2023-05-05T15:49:55.044+0200 7fabe775a6c0  2 rocksdb: [table/block_based/block_based_table_reader.cc:1161] Encountered error while reading data from properties block Corruption: unknown checksum type 4 from footer of db/002868.sst, while checking block at offset 68813363 size 86
2023-05-05T15:49:55.044+0200 7fabf99d92c0  4 rocksdb: [db/db_impl/db_impl.cc:446] Shutdown: canceling all background work
2023-05-05T15:49:55.044+0200 7fabf99d92c0  4 rocksdb: [db/db_impl/db_impl.cc:625] Shutdown complete
2023-05-05T15:49:55.044+0200 7fabf99d92c0 -1 rocksdb: Corruption: unknown checksum type 4 from footer of db/002878.sst, while checking block at offset 17055611 size 86
2023-05-05T15:49:55.044+0200 7fabf99d92c0 -1 bluestore(/var/lib/ceph/osd/ceph-2) _open_db erroring opening db: 
2023-05-05T15:49:55.044+0200 7fabf99d92c0  1 bluefs umount
2023-05-05T15:49:55.044+0200 7fabf99d92c0  1 bdev(0x55ac9fce9800 /var/lib/ceph/osd/ceph-2/block) close
2023-05-05T15:49:55.055+0200 7fabf99d92c0  1 bdev(0x55ac9fce8000 /var/lib/ceph/osd/ceph-2/block) close
2023-05-05T15:49:55.316+0200 7fabf99d92c0 -1 osd.2 0 OSD:init: unable to mount object store
2023-05-05T15:49:55.316+0200 7fabf99d92c0 -1  ** ERROR: osd init failed: (5) Input/output error
#v-

Reproducible: Always

Steps to Reproduce:
1. Start ceph-osd
Actual Results:  
Ceph-OSD fails to start.

Expected Results:  
Ceph-OSD running.

Comment 1 Tomasz Torcz 2023-05-05 14:14:31 UTC

Created attachment 1962578
ceph-osd.2.log

Comment 2 Tomasz Torcz 2023-05-05 14:38:58 UTC

I see that rocksdb was updated, too: rocksdb-7.8.3-2.fc39.x86_64 → rocksdb-8.1.1-1.fc39.x86_64. 
Adding rocksdb maintainer.

Comment 3 Tomasz Torcz 2023-05-05 15:11:03 UTC

Checksum type 4 is kXXH3. It was added to RocksDB in 6.27.0 (2021-11-19) and was later made the default checksum type.
When you switched the Ceph build to the bundled RocksDB, it resulted in using RocksDB version 6.15.5, which does not know type 4. Thus Ceph clusters broke.
What can be done at the distribution level? Either Ceph gets ported to RocksDB 8.1, or the bundled RocksDB gets updated to at least 6.27.0. Or maybe the new checksum type could be backported to the bundled 6.15.5?
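
For anyone who wants to confirm which checksum type their SST files carry, here is a minimal Python sketch. It is not part of Ceph or RocksDB; the function names are hypothetical. It assumes the 53-byte block-based table footer used by format_version 1 and later (one checksum-type byte, 40 bytes of block handles or padding, a 4-byte format version, an 8-byte magic number) and does not handle the older 48-byte legacy footer. It also assumes the SST file is a regular file on disk; with BlueStore the files live inside BlueFS, so they would first have to be exported, for example with ceph-bluestore-tool bluefs-export.

```python
import os
import struct

# Values of RocksDB's ChecksumType enum.
CHECKSUM_TYPES = {
    0: "kNoChecksum",
    1: "kCRC32c",
    2: "kxxHash",
    3: "kxxHash64",
    4: "kXXH3",
}

# Assumed footer layout (format_version >= 1):
# 1 byte checksum type + 40 bytes handles/padding + 4 bytes version + 8 bytes magic.
FOOTER_LEN = 53
BLOCK_BASED_MAGIC = 0x88E241B785F4CFF7


def parse_footer(footer: bytes):
    """Return (checksum_type, checksum_name, format_version) from a raw footer."""
    if len(footer) != FOOTER_LEN:
        raise ValueError(f"expected {FOOTER_LEN} footer bytes, got {len(footer)}")
    # The last 12 bytes are a little-endian uint32 version and uint64 magic.
    format_version, magic = struct.unpack("<IQ", footer[-12:])
    if magic != BLOCK_BASED_MAGIC:
        raise ValueError(f"unexpected magic number {magic:#x} (legacy or non-SST file?)")
    checksum_type = footer[0]
    name = CHECKSUM_TYPES.get(checksum_type, "unknown")
    return checksum_type, name, format_version


def read_sst_footer(path: str):
    """Read and parse the footer of an SST file stored as a regular file."""
    with open(path, "rb") as f:
        f.seek(-FOOTER_LEN, os.SEEK_END)
        return parse_footer(f.read(FOOTER_LEN))
```

Calling read_sst_footer() on an exported copy of one of the files named in the log (such as 002878.sst) would be expected to report type 4 if the diagnosis above is right.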

Comment 4 Tomasz Torcz 2023-05-05 15:13:59 UTC

Patch adding XXH3: https://github.com/facebook/rocksdb/pull/9069/files

Comment 5 Kaleb KEITHLEY 2023-05-05 15:16:59 UTC

Well, ceph doesn't use the system rocksdb (as noted above). See https://src.fedoraproject.org/rpms/ceph/blob/rawhide/f/ceph.spec#_1389 It uses its own bundled rocksdb because it doesn't build with the latest system rocksdb, which was updated in rawhide recently.

(Yes, it has a BR for rocksdb-devel, and it could be argued that that's a bug. It's a bit misleading at best.)

It's unknown when ceph will refresh the bundled rocksdb or update ceph to work with newer rocksdb.

Comment 6 Tomasz Torcz 2023-05-05 15:25:16 UTC

Just to be clear, ceph is not using system rocksdb NOW. But you switched to using bundled only two weeks ago (https://src.fedoraproject.org/rpms/ceph/c/e5f159485648a6d52f19d47e799c924eeb787fe8?branch=rawhide). Before it was using system RocksDB.
This means ceph clusters which were created more than 2 weeks ago are all broken.
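
The compatibility argument can be written down as a small Python sketch. The only facts it encodes are the ones from this bug: kXXH3 appeared in RocksDB 6.27.0, the bundled copy is 6.15.5, and the system packages were 7.8.3 and 8.1.1. The helper names are made up for illustration.

```python
# First RocksDB release that understands checksum type 4 (kXXH3), per comment 3.
KXXH3_INTRODUCED = (6, 27, 0)


def parse_version(text: str):
    """Turn a dotted version such as '6.15.5' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def can_read_kxxh3(reader_version: str) -> bool:
    """True if a RocksDB of this version can open SST files that use kXXH3."""
    return parse_version(reader_version) >= KXXH3_INTRODUCED


if __name__ == "__main__":
    for version in ("6.15.5", "7.8.3", "8.1.1"):
        print(version, can_read_kxxh3(version))
```

Only the bundled 6.15.5 fails the check, so any OSD whose SST files were written with kXXH3 by a newer system RocksDB becomes unreadable after the switch, regardless of when the cluster was created.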

Comment 7 Kaleb KEITHLEY 2023-05-05 15:48:28 UTC

(In reply to Tomasz Torcz from comment #6)
> Just to be clear, ceph is not using system rocksdb NOW. But you switched to
> using bundled only two weeks ago

Correct. That's when rocksdb was updated in rawhide to rocksdb-8.x.

> (https://src.fedoraproject.org/rpms/ceph/c/
> e5f159485648a6d52f19d47e799c924eeb787fe8?branch=rawhide). Before it was
> using system RocksDB.

That's correct. rocksdb was updated in rawhide and ceph 17.2.5 and 17.2.6 do not build with rocksdb-8.x.

> This means ceph clusters which were created more than 2 weeks ago are all
> broken.

ceph clusters created on Fedora Rawhide in the last two weeks!

You're certainly welcome to escalate with the ceph developers. The ceph tracker corresponding to this BZ is above. Anyone using rawhide, and who has built a new ceph cluster in the last two weeks will just have to rebuild. AFAIC that's just how it is on rawhide. Nobody in their right mind should deploy a production environment on rawhide.

Comment 8 Tomasz Torcz 2023-05-05 19:19:36 UTC

> ceph clusters created on Fedora Rawhide in the last two weeks!

No, that's not correct. My cluster was created in 2018 and it has stopped working because of this change. While using system-rocksdb, months ago, the checksum used was updated to kXXH3. Now, when Fedora CEPH is using bundled rocksdb which is old and does not know kXXH3, OSD doesn't start.

My cluster is not critical (I fire it up once a month to store some backups) and it's kept on Rawhide to notice such compatibility problems early. I understand that fixing this in Fedora is a lot of work and may not be worth it. I'll wait until upstream Ceph updates the bundled rocksdb or becomes compatible with 8.1. No worries.

I appreciate the work you do with keeping Ceph running in Rawhide. Let's leave this bug as an explanation for other people who may stumble on this checksum issue.

Comment 9 Kaleb KEITHLEY 2023-06-19 09:41:21 UTC

Should be fixed in ceph-18.1.0-0.1.fc39 (RC1).