Bug 1696087

Summary: BUG: scheduling while atomic in zswap
Product: Red Hat Enterprise Linux 7 Reporter: Ping Fang <pifang>
Component: kernel-rt Assignee: Luis Claudio R. Goncalves <lgoncalv>
kernel-rt sub component: Scheduler QA Contact: Mike Stowell <mstowell>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: high CC: bhu, chuhu, cye, daolivei, dhoward, liwan, mstowell, qzhao, rt-maint, williams
Version: 7.7 Keywords: ZStream
Target Milestone: rc   
Target Release: 7.8   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-rt-3.10.0-1063.rt56.1023.el7 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1724657 1737371 1737372 Environment:
Last Closed: 2020-03-31 19:48:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1693411, 1724657, 1726362, 1737371, 1737372    

Description Ping Fang 2019-04-04 06:55:21 UTC
Description of problem:
While testing zswap on kernel-rt-3.10.0-957.12.1.rt56.925.el7.x86_64,
kswapd0 triggers "BUG: scheduling while atomic".
From vmcore-dmesg.txt:

[77085.661306] BUG: scheduling while atomic: kswapd0/94/0x00000002
[77085.661331] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc iTCO_wdt gpio_ich iTCO_vendor_support intel_powerclamp coretemp intel_rapl ipmi_ssif cdc_ether iosf_mbi crc32_pclmul usbnet mii ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg ipmi_si lpc_ich ipmi_devintf ipmi_msghandler i2c_i801 ie31200_edac pcspkr ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul sysimgblt crct10dif_common fb_sys_fops crc32c_intel ttm drm drm_panel_orientation_quirks mpt2sas e1000e raid_class ata_piix scsi_transport_sas ptp libata pps_core dm_mirror dm_region_hash dm_log dm_mod
[77085.661333] CPU: 5 PID: 94 Comm: kswapd0 Kdump: loaded Not tainted 3.10.0-957.12.1.rt56.925.el7.x86_64 #1
[77085.661333] Hardware name: IBM IBM System X3250 M4 -[2583I22]-/00D3729, BIOS -[JQE142CUS-1.01]- 05/14/2012
[77085.661334] Call Trace:
[77085.661340]  [<ffffffff8db547cb>] dump_stack+0x19/0x1b
[77085.661343]  [<ffffffff8db4f018>] __schedule_bug+0x64/0x72
[77085.661344]  [<ffffffff8db59c88>] __schedule+0x798/0x920
[77085.661345]  [<ffffffff8db59e40>] schedule+0x30/0x96
[77085.661347]  [<ffffffff8db5ab55>] rt_spin_lock_slowlock_locked+0xf5/0x2d0
[77085.661348]  [<ffffffff8db5ad87>] rt_spin_lock_slowlock+0x57/0x90
[77085.661350]  [<ffffffff8db5c665>] rt_spin_lock+0x25/0x30
[77085.661352]  [<ffffffff8d5b9beb>] get_page_from_freelist+0x66b/0xb70
[77085.661354]  [<ffffffff8d5ba703>] __alloc_pages_nodemask+0x613/0xab0
[77085.661356]  [<ffffffff8d605008>] alloc_pages_current+0x98/0x110
[77085.661358]  [<ffffffff8d6279ee>] zbud_alloc+0xae/0x220
[77085.661359]  [<ffffffff8d627b6e>] zbud_zpool_malloc+0xe/0x10
[77085.661361]  [<ffffffff8d6276da>] zpool_malloc+0x1a/0x20
[77085.661363]  [<ffffffff8d5fcbb3>] zswap_frontswap_store+0x1d3/0x380
[77085.661364]  [<ffffffff8d5fbd00>] __frontswap_store+0x80/0x100
[77085.661366]  [<ffffffff8d5f5b13>] swap_writepage+0x23/0x80
[77085.661368]  [<ffffffff8d5c4bb8>] shrink_page_list+0x888/0xc80
[77085.661370]  [<ffffffff8d5c557e>] shrink_inactive_list+0x1be/0x5e0
[77085.661372]  [<ffffffff8d5c6095>] shrink_lruvec+0x385/0x730
[77085.661374]  [<ffffffff8d5c64b6>] shrink_zone+0x76/0x100
[77085.661375]  [<ffffffff8d5c7654>] balance_pgdat+0x3a4/0x5a0
[77085.661377]  [<ffffffff8d5c79cf>] kswapd+0x17f/0x4b0
[77085.661380]  [<ffffffff8d4b4f60>] ? wake_up_atomic_t+0x30/0x30
[77085.661381]  [<ffffffff8d5c7850>] ? balance_pgdat+0x5a0/0x5a0
[77085.661383]  [<ffffffff8d4b4141>] kthread+0xd1/0xe0
[77085.661384]  [<ffffffff8d4b4070>] ? kthread_worker_fn+0x170/0x170
[77085.661386]  [<ffffffff8db65eb7>] ret_from_fork_nospec_begin+0x21/0x21
[77085.661387]  [<ffffffff8d4b4070>] ? kthread_worker_fn+0x170/0x170
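
For reading the first line of the trace: the kernel prints it as comm/pid/preempt_count. The helper below is only a sketch for splitting those fields; `parse_sched_bug` is a hypothetical name, and treating the low 8 bits of preempt_count as the preempt-disable nesting depth is an assumption about this kernel's preempt_count layout.

```shell
# Hypothetical helper: split "BUG: scheduling while atomic: <comm>/<pid>/<preempt_count>".
parse_sched_bug() {
    # $1 is one dmesg line; any timestamp prefix is dropped.
    fields=${1##*scheduling while atomic: }
    count=${fields##*/}        # last field: preempt_count in hex
    rest=${fields%/*}
    pid=${rest##*/}            # middle field: PID
    comm=${rest%/*}            # remainder: task name (may itself contain '/')
    # Assumption: the low 8 bits hold the preempt-disable nesting depth.
    depth=$(( $count & 0xff ))
    printf 'comm=%s pid=%s preempt_count=%s preempt_depth=%d\n' \
        "$comm" "$pid" "$count" "$depth"
}
```

Applied to the line above, this reports kswapd0, PID 94, and a nonzero preempt depth, which is consistent with rt_spin_lock being called while preemption is disabled.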



Version-Release number of selected component (if applicable):
kernel-rt-3.10.0-957.12.1.rt56.925.el7.x86_64

Also reproduced with 3.10.0-1034.rt56.993.el7.x86_64.

How reproducible:


Steps to Reproduce:
1. # grubby --args "zswap.enabled=1" --update-kernel DEFAULT
2. # reboot
3. Install the stress package.
4. # stress --vm 1 --vm-bytes $MemAvailable --timeout 240s     ($MemAvailable is the MemAvailable value from /proc/meminfo)
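
Step 4 can be scripted roughly as follows. This is a sketch, not the exact test that was run: `mem_available_kb` and `stress_cmd` are hypothetical helper names, and passing the value with a `K` suffix assumes the kB figure from /proc/meminfo is what was intended for --vm-bytes.

```shell
# Hypothetical helper: read MemAvailable (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo when no file is given.
mem_available_kb() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: print the stress command line from step 4.
# Assumption: the kB value is passed with stress's K size suffix.
stress_cmd() {
    printf 'stress --vm 1 --vm-bytes %sK --timeout 240s\n' "$(mem_available_kb "$1")"
}
```

To run the generated command after rebooting with zswap.enabled=1: `sh -c "$(stress_cmd)"`.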

Actual results:
Kernel panic.

Expected results:
The zswap test finishes without a kernel BUG.

Additional info:

3.10.0-957.12.1.rt56.925.el7.x86_64 vmcore:
http://ibm-x3250m4-03.rhts.eng.pek2.redhat.com/vmcore/pifang/3.10.0-957.12.1.rt56.925.el7.x86_64/3451665/ibm-x3250m4-04.rhts.eng.pek2.redhat.com/10.73.4.221-2019-04-04-01:51:51/
3.10.0-1034.rt56.993.el7.x86_64 vmcore:
http://ibm-x3250m4-03.rhts.eng.pek2.redhat.com/vmcore/pifang/3.10.0-1034.rt56.993.el7.x86_64/3452071/dell-pr7610-01.khw.lab.eng.bos.redhat.com/10.16.185.172-2019-04-03-06:35:30/

Comment 21 errata-xmlrpc 2020-03-31 19:48:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1070