Bug 1759879 - System hang up when memory swapping (kswapd deadlock)
Summary: System hang up when memory swapping (kswapd deadlock)
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 31
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-10-09 10:12 UTC by Mirek Svoboda
Modified: 2019-11-03 13:51 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:


Attachments
kernel 5.3.5 log (87.43 KB, text/plain)
2019-10-09 11:58 UTC, Mirek Svoboda
kernel log 5.4.0-0.rc1.git1.1.fc32.x86_64 (177.88 KB, text/plain)
2019-10-09 12:40 UTC, Mirek Svoboda

Description Mirek Svoboda 2019-10-09 10:12:24 UTC
1. Please describe the problem:
I run Fedora 31 with KDE Plasma DE.
Swapping with 5.2.17 works fine, even when using 4GB of swap for several days. Swapping with 5.3.2 causes a complete freeze, usually soon after swapping starts: the screen freezes, the mouse pointer does not move, and there is no TTY access. I did not try the SysRq keys.

System under test:
HP Elitebook 850 G4
CPU: Intel i5-7200U with embedded GPU
RAM: 4GB unbuffered, memtest OK
Disk: SSD Samsung PM961 (256GB), LVM+LUKS, root volume with XFS filesystem
Swap: swapping to file of size 20GB at path /swapfile

2. What is the Version-Release number of the kernel:
kernel-5.3.2-300.fc30.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
It works with 5.2.17

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
- enable swap
- allocate enough RAM that the system starts swapping, e.g. open multiple web browser tabs with memory-consuming web pages
- after a while, usually in less than two hours, the system freezes
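
The steps above rely on browser tabs to create memory pressure. As a more repeatable alternative, the following is a minimal sketch (not the method used in this report) that allocates memory in chunks and touches every page so the kernel has to back it with real memory and eventually swap. The function name `allocate_chunks`, the 64 MiB chunk size, and the 4 KiB page size are assumptions made for illustration.

```python
import time

CHUNK_MIB = 64      # size of each allocation step (assumed value)
PAGE_BYTES = 4096   # assumed page size on x86_64


def allocate_chunks(total_mib, chunk_mib=CHUNK_MIB, pause_s=0.0):
    """Allocate about total_mib MiB and touch each page so it is really backed."""
    chunks = []
    for _ in range(total_mib // chunk_mib):
        buf = bytearray(chunk_mib * 1024 * 1024)
        # Write one byte per page; untouched zero pages may never be allocated.
        for offset in range(0, len(buf), PAGE_BYTES):
            buf[offset] = 1
        chunks.append(buf)
        time.sleep(pause_s)  # optional pause so the pressure builds gradually
    return chunks
```

On a machine with 4GB of RAM, calling `allocate_chunks(6144, pause_s=0.1)` and keeping the returned list alive should push the system into swap. The total of 6144 MiB is only a guess and may need tuning. Run it only on a test machine, since an affected kernel may hang.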

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
I have not tried yet.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
Yes, kmod-VirtualBox, in both 5.2.17 and 5.3.2.
The issue happens in 5.3.2 even when VirtualBox is never used.
The issue does not happen in 5.2.17 even when VirtualBox is used and causes a lot of swapping.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.
There is no message in the kernel log at the time of the freeze.
Will try to reproduce with 5.3.5 and attach the log.

Comment 1 Mirek Svoboda 2019-10-09 11:03:51 UTC
The issue also happens on the non-tainted kernel-5.3.5-300.fc31.x86_64.

Comment 2 Mirek Svoboda 2019-10-09 11:57:26 UTC
I/O Scheduler: BFQ
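
For anyone comparing setups, the active I/O scheduler can be read from sysfs, where the kernel lists the available schedulers and marks the active one with square brackets (for example `mq-deadline kyber [bfq] none`). The helper names below are hypothetical; this is a small sketch, not part of the original report.

```python
import os


def active_scheduler(text):
    """Return the scheduler name enclosed in [brackets], or None if none is marked."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def schedulers_by_device(sys_block="/sys/block"):
    """Map each block device name to its active scheduler, skipping unreadable entries."""
    result = {}
    try:
        devices = sorted(os.listdir(sys_block))
    except OSError:
        return result  # sysfs not available (e.g. not running on Linux)
    for dev in devices:
        path = os.path.join(sys_block, dev, "queue", "scheduler")
        try:
            with open(path) as fh:
                result[dev] = active_scheduler(fh.read())
        except OSError:
            continue
    return result
```

Calling `schedulers_by_device()` returns a dictionary keyed by device name, so it is easy to confirm whether the disk backing the swap file is using BFQ.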

While the issue is happening, the disk LED indicates no disk activity, although the LED works otherwise.

Attaching the kernel log dmesg-5.3.5.txt from an affected run of the 5.3.5 kernel.

Comment 3 Mirek Svoboda 2019-10-09 11:58:19 UTC
Created attachment 1623777 [details]
kernel 5.3.5 log

Comment 4 Mirek Svoboda 2019-10-09 12:39:59 UTC
I tried the Rawhide kernel 5.4.0-0.rc1.git1.1.fc32.x86_64. It is also affected by this bug. I see the following warning in dmesg:

<truncated>
Oct 09 13:47:08 kernel: WARNING: possible circular locking dependency detected
Oct 09 13:47:08 kernel: 5.4.0-0.rc1.git1.1.fc32.x86_64 #1 Not tainted

<truncated>

*** DEADLOCK ***
Oct 09 13:47:08 kernel: 4 locks held by kswapd0/157:
Oct 09 13:47:08 kernel:  #0: ffffffff83781540 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
Oct 09 13:47:08 kernel:  #1: ffffffff837743d8 (shrinker_rwsem){++++}, at: shrink_slab+0x134/0x2b0
Oct 09 13:47:08 kernel:  #2: ffff8e2e0dd920e8 (&type->s_umount_key#56){++++}, at: trylock_super+0x16/0x50
Oct 09 13:47:08 kernel:  #3: ffff8e2e0dd57a58 (&pag->pag_ici_reclaim_lock){+.+.}, at: xfs_reclaim_inodes_ag+0x95/0x450 [xfs]

Full dmesg is attached.
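
To compare the lock chain across logs from different kernels, the "locks held" lines of a lockdep report can be extracted with a regular expression. The sketch below assumes the line layout shown in the excerpt above (index, address, lock class in parentheses, usage flags in braces, then the acquiring function); the names `HELD_LOCK` and `held_locks` are hypothetical.

```python
import re

# Matches e.g. "#0: ffffffff83781540 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30"
HELD_LOCK = re.compile(
    r"#(\d+):\s+([0-9a-f]+)\s+\((.+?)\)\{([^}]*)\},\s+at:\s+([^\s+]+)"
)


def held_locks(log_text):
    """Return (index, lock_class, function) for each held-lock line in log_text."""
    locks = []
    for line in log_text.splitlines():
        match = HELD_LOCK.search(line)
        if match:
            index, _address, lock_class, _flags, function = match.groups()
            locks.append((int(index), lock_class, function))
    return locks
```

Applied to the four lines above, this yields the chain fs_reclaim, shrinker_rwsem, &type->s_umount_key#56, &pag->pag_ici_reclaim_lock, with xfs_reclaim_inodes_ag as the function that took the last lock.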

Comment 5 Mirek Svoboda 2019-10-09 12:40:47 UTC
Created attachment 1623793 [details]
kernel log 5.4.0-0.rc1.git1.1.fc32.x86_64

Comment 6 Mirek Svoboda 2019-10-09 12:52:20 UTC
I have opened an upstream bug https://bugzilla.kernel.org/show_bug.cgi?id=205135

Comment 7 Mirek Svoboda 2019-10-10 08:57:32 UTC
The only known recovery from the freeze is a hard reset. This is not a temporary freeze but a permanent system hang.

Comment 8 samoht0 2019-10-10 19:17:18 UTC
This unreproduced bot crash sounds related to me:

https://lore.kernel.org/linux-mm/20190910071804.2944-1-hdanton@sina.com/

Something for the maintainers to look into.

Comment 9 Mirek Svoboda 2019-10-11 07:40:33 UTC
kernel-5.4.0-0.rc2.git1.1.fc32.x86_64 is also affected.

Comment 10 Mirek Svoboda 2019-10-22 08:59:24 UTC
Everyone who uses a swap file on an XFS filesystem seems to be affected by this hang.

