Created attachment 1838803
dmesg-5.14.15.txt (low utility)

1. Please describe the problem:

While using the kernel io_uring interface, a write request is lost, resulting in MariaDB asserting after 10 minutes because the completion is never received.

2. What is the Version-Release number of the kernel:

5.14.15-200.fc34.x86_64 and previously 5.14.14-200.fc34.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue *first* appear? Old kernels are available for download at https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

First noticed as faulty in 5.13.16-200.fc34.x86_64 (MDEV-26555).

This was working at some stage, as I was building MariaDB-10.6 and testing frequently on fc33 and fc34 without incident.

Generally repeatable on other distros. 5.11 appears unaffected, so the regression arrived sometime after that.

On a 5.15-rc kernel it can sometimes be reproduced, though much less reliably.

4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below:

Yes.

https://jira.mariadb.org/browse/MDEV-26555
https://jira.mariadb.org/browse/MDEV-26674

Marko from MariaDB has validated the user space track traces are missing write requests.

In these MDEVs I've tested against a variety of distro and locally built liburing versions without differing test results.

I have started engagement upstream - https://marc.info/?l=linux-block&m=163489378723217&w=2

A recent build set from our CI:

https://ci.mariadb.org/19583/amd64-fedora-34-rpm-autobake/rpms/

Install MariaDB-server and MariaDB-test (might need client, common and shared).

Validate that ldd /usr/sbin/mariadbd includes liburing.

To test run:

cd /usr/share/mysql/mysql-test
./mtr --vardir=/tmp/var --parallel=4 encryption.innochecksum{,,,,,}
./mtr --vardir=/tmp/var --parallel=4 stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb

A test failure after the timeout (10 min) results in the MariaDB error:

2021-10-21 9:08:43 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/

More complete example: https://marc.info/?l=linux-block&m=163516012119400&w=2

MariaDB-10.6.5+ (due out soon) has a kernel check that disables native_aio by default and issues a warning if forced. Unit test with it forced on:

$ mysql-test/mtr --mysqld=--innodb_use_native_aio=1 --nowarnings --parallel=4 --force encryption.innochecksum{,,,,,}

5. Does this problem occur with the latest Rawhide kernel? To install the Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by ``sudo dnf update --enablerepo=rawhide kernel``:

Did a cursory test on 5.15.0-0.rc7.20211028git1fc596a56b33.56.fc36.x86_64 without incident, however I will test again.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.
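The liburing validation in the reproduction steps above can be scripted before running mtr. This is a minimal sketch, assuming the RPM install path /usr/sbin/mariadbd from this report; the MARIADBD override variable and the links_liburing helper are hypothetical names introduced here for illustration only.

```shell
#!/bin/sh
# links_liburing: succeeds if the ldd output supplied on stdin lists liburing.
# (Hypothetical helper name, not part of MariaDB or liburing.)
links_liburing() {
    grep -q 'liburing\.so'
}

# MARIADBD is an assumed override; the default is the path used in this report.
bin="${MARIADBD:-/usr/sbin/mariadbd}"

if [ -x "$bin" ]; then
    if ldd "$bin" | links_liburing; then
        echo "$bin is linked against liburing"
    else
        echo "$bin is NOT linked against liburing; the io_uring path will not be exercised"
    fi
else
    echo "$bin not found or not executable" >&2
fi
```

If the binary is not linked against liburing, the mtr runs above would not be expected to exercise the io_uring path at all, so a passing run would say nothing about this bug.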
Tested the unit tests and sysbench oltp_update_index, and was unable to reproduce on 5.15.0-0.rc7.20211028git1fc596a56b33.56.fc36.x86_64.
Reproducer test case (as container): https://lore.kernel.org/linux-block/627629af-d8ed-416a-cbef-4d74bdeee031@kernel.dk/T/#m18b89d93af61f55be2fc4ea696e755844a237bf9

5.14 fixes: https://www.spinics.net/lists/stable/msg511279.html

5.15/16 fix: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=io_uring-5.16
Still producing this:

$ uname -a
Linux localhost.localdomain 5.14.20-200.fc34.x86_64 #1 SMP Thu Nov 18 22:03:20 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Name         : kernel
Version      : 5.14.20
Release      : 200.fc34
Architecture : x86_64
Size         : 0.0
Source       : kernel-5.14.20-200.fc34.src.rpm

Thought it was a 5.14.19 fix https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.14.19

Will follow up with upstream.

$ mysql-test/mtr --mysqld=--innodb_use_native_aio=1 --nowarnings --parallel=4 --force encryption.innochecksum{,,,,,}
Logging: /home/dan/repos/mariadb-server-10.6/mysql-test/mariadb-test-run.pl --mysqld=--innodb_use_native_aio=1 --nowarnings --parallel=4 --force encryption.innochecksum encryption.innochecksum encryption.innochecksum encryption.innochecksum encryption.innochecksum encryption.innochecksum
VS config:
vardir: /home/dan/repos/build-mariadb-server-10.6/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/dan/repos/build-mariadb-server-10.6/mysql-test/var'...
Checking supported features...
MariaDB Version 10.6.6-MariaDB
 - SSL connections supported
 - binaries built with wsrep patch
Collecting tests...
Installing system database...

==============================================================================

TEST                                  WORKER RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[2] Using MTR_BUILD_THREAD 301, with reserved ports 16020..16039
worker[3] Using MTR_BUILD_THREAD 302, with reserved ports 16040..16059
worker[4] Using MTR_BUILD_THREAD 303, with reserved ports 16060..16079
encryption.innochecksum '16k,cbc,innodb,strict_crc32' w1 [ pass ]   5797
encryption.innochecksum '16k,cbc,innodb,strict_crc32' w2 [ pass ]   5812
....
encryption.innochecksum '16k,ctr,innodb,strict_crc32' w4 [ fail ]
        Test ended at 2021-11-24 12:53:17

CURRENT_TEST: encryption.innochecksum
mysqltest: At line 40: query 'INSERT INTO t2 SELECT * FROM t1' failed: <Unknown> (2013): Lost connection to server during query

The result from queries just before the failure was:
SET GLOBAL innodb_file_per_table = ON;
set global innodb_compression_algorithm = 1;
# Create and populate a tables
CREATE TABLE t1 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT) ENGINE=InnoDB ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
CREATE TABLE t2 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT) ENGINE=InnoDB ROW_FORMAT=COMPRESSED ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
CREATE TABLE t3 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT) ENGINE=InnoDB ROW_FORMAT=COMPRESSED ENCRYPTED=NO;
CREATE TABLE t4 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT) ENGINE=InnoDB PAGE_COMPRESSED=1;
CREATE TABLE t5 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT) ENGINE=InnoDB PAGE_COMPRESSED=1 ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
CREATE TABLE t6 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT) ENGINE=InnoDB;

Server [mysqld.1 - pid: 79580, winpid: 79580, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
2021-11-24 12:53:16 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
211124 12:53:16 [ERROR] mysqld got signal 6 ;
One final patch is needed to fix this for 5.14 - https://lore.kernel.org/linux-block/CABVffEOXe=mhyW_-Ynz4Z9g_UxvVAms662vQjN9UBfF9NhWu8g@mail.gmail.com/T/#m480893c8e4f5f007f03f8505b404c701c0e90d2d

With the stable series for 5.14.20 finished, if further 5.14 kernels are coming then including this patch is recommended.

Otherwise a 5.15.3+ kernel is also sufficient to close this.
(In reply to Daniel Black from comment #4)
> With the stable series for 5.14.20 finished, if further 5.14 kernels are
> coming then including this patch is recommended.
>
> Otherwise a 5.15.3+ kernel is also sufficient to close this.

All Fedora releases are on 5.15.4+ at this point if someone would like to verify the fix and close this.
Yes, sorry for the delay. All kernel fixes in 5.15.4+ are good.
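For anyone triaging a host against the versions discussed in this bug, a rough classification of `uname -r` output might look like the sketch below. It is not MariaDB's actual kernel check. The function names are hypothetical, the 5.15.3 upper bound is taken from the comments above, and the 5.12.0 lower bound is an assumption: the report only says that 5.11 appeared unaffected and that 5.13.16 was faulty.

```shell
#!/bin/sh
# version_ge A B: succeeds if version A >= version B (uses GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# uring_kernel_status RELEASE: classify a kernel release string such as
# "5.14.20-200.fc34.x86_64". Hypothetical helper written for this bug report.
uring_kernel_status() {
    ver="${1%%-*}"      # drop the distro suffix, e.g. "-200.fc34.x86_64"
    first_fixed="5.15.3"    # from this thread: 5.15.3+ is sufficient
    first_suspect="5.12.0"  # ASSUMPTION: only "sometime after 5.11" is known
    if version_ge "$ver" "$first_fixed"; then
        echo "fixed"
    elif version_ge "$ver" "$first_suspect"; then
        echo "suspect"
    else
        echo "not-reported"
    fi
}

echo "running kernel $(uname -r): $(uring_kernel_status "$(uname -r)")"
```

A "suspect" result only means the kernel falls in the range where the lost write was seen in this report. A 5.14 stable kernel carrying all of the backported patches would still be classified as suspect by this version-only check.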
This message is a reminder that Fedora Linux 34 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '34'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 34 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 34 entered end-of-life (EOL) status on 2022-06-07. Fedora Linux 34 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. Thank you for reporting this bug and we are sorry it could not be fixed.