Bug 2208738

Summary: MariaDB 10.3.28 crashes on page corruption during restoration process
Product: Red Hat OpenStack
Reporter: Robin Cernin <rcernin>
Component: mariadb
Assignee: Damien Ciabrini <dciabrin>
Status: CLOSED NOTABUG
QA Contact: dabarzil
Severity: high
Priority: medium
Version: 16.1 (Train)
CC: bshephar, lmiccini, mbayer, spapa
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2023-08-11 12:36:00 UTC
Type: Bug

Description Robin Cernin 2023-05-20 05:59:35 UTC
Description of problem:

mysqld crashes with signal 11 during startup (storage engine initialization) while recovering from a corrupted InnoDB page.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
/usr/libexec/mysqld(my_print_stacktrace+0x41)[0x558df4087201]
/usr/libexec/mysqld(handle_fatal_signal+0x4e5)[0x558df3bb3c45]
sigaction.c:0(__restore_rt)[0x7feabb61fdd0]
/usr/libexec/mysqld(+0xa62a57)[0x558df3e65a57]
/usr/libexec/mysqld(+0xa523a4)[0x558df3e553a4]
/usr/libexec/mysqld(+0xa5d46a)[0x558df3e6046a]
/usr/libexec/mysqld(+0x4e17bb)[0x558df38e47bb]
/usr/libexec/mysqld(+0x93e217)[0x558df3d41217]
/usr/libexec/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x6c)[0x558df3bb656c]
/usr/libexec/mysqld(+0x5e3fba)[0x558df39e6fba]
/usr/libexec/mysqld(_Z11plugin_initPiPPci+0x9d2)[0x558df39e81b2]
/usr/libexec/mysqld(+0x51ee05)[0x558df3921e05]
/usr/libexec/mysqld(_Z11mysqld_mainiPPc+0x401)[0x558df3928c21]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7feab93cb6a3]
/usr/libexec/mysqld(_start+0x2e)[0x558df391ba6e]
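
The unnamed frames above (for example +0xa62a57) are offsets into /usr/libexec/mysqld. They could probably be resolved to function names if debuginfo matching this exact mysqld build is installed. A minimal sketch, assuming binutils' addr2line is available and the backtrace has been saved to a file; the helper names frame_offsets and resolve_frames are hypothetical:

```shell
# Print the module offsets of frames that lack a symbol name, e.g.
# "/usr/libexec/mysqld(+0xa62a57)[0x558df3e65a57]" -> "0xa62a57".
# Frames that already carry a symbol (name+0x..) are skipped.
frame_offsets() {
    sed -n 's/^.*mysqld(+\(0x[0-9a-f]*\))\[.*$/\1/p'
}

# Resolve each offset to a function name and source line.
# $1 = file containing the backtrace. Without matching debuginfo,
# addr2line prints "??" instead of a name.
resolve_frames() {
    frame_offsets < "$1" | while read -r off; do
        addr2line -f -C -e /usr/libexec/mysqld "$off"
    done
}
```

Usage would be something like `resolve_frames backtrace.txt`, where backtrace.txt is a hypothetical file holding the lines above.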

Version-Release number of selected component (if applicable):

MariaDB 10.3.28

How reproducible:

We copied the corrupted database from /var/lib/mysql, changed its ownership to mysql, and set SELinux to permissive mode to rule out permission issues.

cp -r /var/lib/mysql /var/lib/mysql-save
chown mysql:mysql -R /var/lib/mysql-save
setenforce 0
mysqld_safe --datadir=/var/lib/mysql-save --socket=/var/lib/mysql-save/mysql.sock --wsrep-recover --tc-heuristic-recover=ROLLBACK &


2023-05-20 12:14:32 0 [Note] InnoDB: Uncompressed page, stored checksum in field1 2214280803, calculated checksums for field1: crc32 2214280803, innodb 3080421157,  page type 2 == UNDO LOG.none 3735928559, stored checksum in field2 2872062493, calculated checksums for field2: crc32 2214280803, innodb 2227542473, none 3735928559,  page LSN 18 727911671, low 4 bytes of LSN at page end 727968105, page number (if stored to page already) 776, space id (if created with >= MySQL-4.1.1 and stored already) 0
InnoDB: Page may be an undo log page
2023-05-20 12:14:32 0 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error. If the corrupt page is an index page. You can also try to fix the corruption by dumping, dropping, and reimporting the corrupt table. You can use CHECK TABLE to scan your table for corruption. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
230520 12:14:32 [ERROR] mysqld got signal 11 ;
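
Reading the "Uncompressed page" line field by field: the checksum stored in field1 equals the calculated crc32 value, while the checksum stored in field2 and the low 4 bytes of the LSN at the page end do not match their counterparts. That would be consistent with the end of page 776 coming from a different write than its header, although the log alone does not prove it. A minimal sketch that pulls these values out of a saved log line and compares them; the helper names page_field, compare and check_page_line are hypothetical:

```shell
# Extract one number from an InnoDB "Uncompressed page" log line.
# $1 = text (basic regex) immediately preceding the number; line on stdin.
page_field() {
    sed -n "s/^.*$1 \([0-9][0-9]*\).*$/\1/p"
}

# $1 = label, $2 and $3 = values expected to be equal.
compare() {
    if [ "$2" = "$3" ]; then
        echo "$1: match"
    else
        echo "$1: mismatch ($2 vs $3)"
    fi
}

# $1 = the full log line. Compares stored checksums with the calculated
# crc32 values, and the low 32 bits of the page LSN with the page trailer.
check_page_line() {
    sum1=$(printf '%s\n' "$1" | page_field 'stored checksum in field1')
    crc1=$(printf '%s\n' "$1" | page_field 'calculated checksums for field1: crc32')
    sum2=$(printf '%s\n' "$1" | page_field 'stored checksum in field2')
    crc2=$(printf '%s\n' "$1" | page_field 'calculated checksums for field2: crc32')
    lsn_low=$(printf '%s\n' "$1" | page_field 'page LSN [0-9][0-9]*')
    lsn_end=$(printf '%s\n' "$1" | page_field 'LSN at page end')
    compare 'field1 checksum' "$sum1" "$crc1"
    compare 'field2 checksum' "$sum2" "$crc2"
    compare 'LSN' "$lsn_low" "$lsn_end"
}
```

The comparison against crc32 is an assumption based on field1 matching the crc32 value in this log; a page written with a different checksum algorithm would need the corresponding calculated value instead.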


The log repeatedly reports corruption of page 776. We also tried starting MariaDB with:

mysqld_safe --datadir=/var/lib/mysql-save --socket=/var/lib/mysql-save/mysql.sock --innodb_force_recovery 3 --innodb_purge_threads 0