Bug 2242391 - Kernel worker thread on 100% CPU core utilisation and one btrfs file system completely unusable
Summary: Kernel worker thread on 100% CPU core utilisation and one btrfs file system completely unusable
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 38
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-10-05 20:25 UTC by Joshua Noeske
Modified: 2023-10-11 09:13 UTC
CC List: 17 users

Fixed In Version:
Doc Type: ---
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: ---
Embargoed:


Attachments
journalctl -k output of the boot with the problem. (88.09 KB, text/plain)
2023-10-05 20:26 UTC, Joshua Noeske

Description Joshua Noeske 2023-10-05 20:25:24 UTC
1. Please describe the problem:
Hello, on kernel 6.5.5, I have now encountered the following problem twice: a kernel worker thread named `kworker/u8:12+flush-btrfs-2` pins one CPU core at 100%, and subsequent reads/writes to one of the attached btrfs filesystems become completely impossible. Both times this happened after more than one day of operation. The machine on which I observed this is used as a server.
All attached drives report no SMART errors, and after the first occurrence I ran `btrfs check` on both filesystems, which did not report any errors.
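For reference, a minimal sketch of how the spinning thread and its kernel stack can be inspected (the PID below is a placeholder, not from my system):

ps -eLo pid,pcpu,comm --sort=-pcpu | head -n 5   # thread with the highest CPU usage on top
cat /proc/12345/stack                            # as root: kernel stack of the spinning kworker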

2. What is the Version-Release number of the kernel:
6.5.5

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
I have never encountered this problem before kernel 6.5.5. I am now running kernel 6.4.15 again and will report whether the problem occurs there as well.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
I am not completely sure whether it is related, but I suspect that the error was triggered by btrbk transferring snapshots from one drive to another, which involves btrfs send and receive operations. I assume this because, on later invocations, btrbk always stated that the subvolume of the snapshot existed but that the received UUID was not yet set.
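For context, the transfer btrbk performs boils down to something like the following (a sketch; the snapshot and mount point paths are placeholders):

btrfs send /mnt/src/.snapshots/home.20231005 | btrfs receive /mnt/dst/.snapshots/
btrfs subvolume show /mnt/dst/.snapshots/home.20231005   # "Received UUID" is only set after a completed receive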

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
To be quite honest, I don't really want to run a Rawhide kernel on my server.


6. Are you running any modules that are not shipped directly with Fedora's kernel?:
No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.
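For example, to capture the kernel log of the previous boot (the one where the hang occurred):

journalctl --no-hostname -k -b -1 > dmesg.txt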

Reproducible: Always

Comment 1 Joshua Noeske 2023-10-05 20:26:15 UTC
Created attachment 1992294 [details]
journalctl -k output of the boot with the problem.

Comment 2 Joshua Noeske 2023-10-05 20:43:40 UTC
I forgot to mention that unmounting the affected filesystem fails since the device is busy. Even after lazily unmounting the filesystem, the machine did not shut down correctly and I had to perform a hard reset.
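A sketch of how one can check what keeps the mount busy before resorting to a lazy unmount (/mnt/data is a placeholder; since the holder here is a kernel thread, both checks may well come up empty):

fuser -vm /mnt/data    # processes with open files on the filesystem
lsof +f -- /mnt/data   # same information via lsof
umount -l /mnt/data    # lazy unmount: detaches the mount point but doesn't stop in-flight kernel I/O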

Comment 3 Joshua Noeske 2023-10-05 20:52:20 UTC
The CPU usage graph of Cockpit supports my theory regarding btrbk and the snapshot transfer. Both times, the error apparently occurred during the transfer of the snapshots to another drive.

Comment 4 Eduard Kohler 2023-10-08 10:30:20 UTC
This also happens on an F37 system, on an EXT4 partition on top of an mdadm RAID1 array (4 TB HDDs).

The whole 6.5 kernel line (tested 6.5.4, 6.5.5, and 6.5.6, which are available on Koji) displays this behaviour, while the previous 6.4 line (tested 6.4.4 and 6.4.15) does not.

What triggers this behaviour in my case is creating small files on the RAID array's partition, e.g. (as root):

# for i in {0001..0200}; do echo "some text" > "file_${i}.txt"; done

After a few seconds, the kworker/flush thread kicks in for a variable amount of time depending on the number of created files. While the kworker/flush thread is at 100% CPU, trying to delete these files is more or less impossible.
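One way to see whether writeback is making any progress while the thread spins (a sketch; the one-second interval is arbitrary):

watch -n1 'grep -E "Dirty|Writeback" /proc/meminfo'   # Dirty should shrink and Writeback should drain over time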

Removing these files (once the kworker/flush thread goes away) is fast and doesn't trigger this behaviour.

Writing one huge file (dd if=/dev/zero of=/raid/file) doesn't seem to trigger this behaviour.

I also experienced the behaviour described in comment 2, which led to a reconstruction of the RAID array, yippee.

On the same system, a small SSD (16 GB) with an EXT4 partition and no RAID hosts the operating system. Writing small files to this SSD partition doesn't trigger the kworker/flush thread to eat 100% CPU.

I am willing to test kernels as long as they work on F37 (for now) and I don't have to build them. Building Fedora kernels is not an option for me: last time I tried, it took several hours just to fail after filling the remaining 16 GB of disk space on an i7 laptop (OK, not the latest generation, but still).

Comment 5 Joshua Noeske 2023-10-11 09:13:51 UTC
I can confirm that 6.4.15 does not show this behaviour. The error has not occurred with this kernel version yet.

