Bug 2015755 - zram: zram leak with warning when running zram02.sh in ltp
Summary: zram: zram leak with warning when running zram02.sh in ltp
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: 8.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Ming Lei
QA Contact: ChanghuiZhong
URL:
Whiteboard:
Depends On: 2015754
Blocks:
 
Reported: 2021-10-20 02:45 UTC by Ming Lei
Modified: 2022-05-10 15:53 UTC

Fixed In Version: kernel-4.18.0-353.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2015754
Environment:
Last Closed: 2022-05-10 15:02:54 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Links
System                  ID                                                 Private  Priority  Status  Summary  Last Updated
Gitlab                  redhat/rhel/src/kernel rhel-8 merge_requests 1662  0        None      None    None     2021-11-15 13:33:20 UTC
Red Hat Issue Tracker   RHELPLAN-100303                                    0        None      None    None     2021-10-20 02:49:37 UTC
Red Hat Product Errata  RHSA-2022:1988                                     0        None      None    None     2022-05-10 15:03:35 UTC

Description Ming Lei 2021-10-20 02:45:46 UTC
+++ This bug was initially created as a clone of Bug #2015754 +++

Description of problem:

https://lore.kernel.org/linux-block/20210927163805.808907-1-mcgrof@kernel.org/T/#m12965e4b1c6ef5ae19f5fc019493a37b1993c2f6

Running the following script triggers the kernel warning 'Error: Removing
state 63 which has instances left.' and the RHEL 9 test VM reboots:

cd testcases/kernel/device-drivers/zram

while true; do
        PATH=$PATH:$PWD:$PWD/../../../lib/ ./zram02.sh;
done &

while true; do
        PATH=$PATH:$PWD:$PWD/../../../lib/ ./zram02.sh;
done
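
The two endless loops above can also be written as a bounded run, which is easier to use in automated testing. This is a minimal sketch only: the helper name run_two_loops and the iteration count are assumptions for illustration and are not part of LTP or of the original reproducer.

```shell
# Hypothetical helper (not part of LTP): run the same command in two
# concurrent loops, each for a fixed number of iterations, instead of
# the two "while true" loops used in the report.
run_two_loops() {
    rtl_iterations=$1
    shift
    (
        # Background loop, equivalent to the first "while true ... &".
        i=0
        while [ "$i" -lt "$rtl_iterations" ]; do
            "$@" || true          # keep looping even if one run fails
            i=$((i + 1))
        done
    ) &
    rtl_bg=$!
    # Foreground loop, racing against the background one.
    rtl_i=0
    while [ "$rtl_i" -lt "$rtl_iterations" ]; do
        "$@" || true
        rtl_i=$((rtl_i + 1))
    done
    wait "$rtl_bg"
}

# Usage, from testcases/kernel/device-drivers/zram in an LTP tree
# (the count of 100 is an arbitrary choice):
#   export PATH="$PATH:$PWD:$PWD/../../../lib/"
#   run_two_loops 100 ./zram02.sh
```

Failures of individual runs are ignored on purpose, because the two instances are expected to interfere with each other while loading and unloading the module.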


[   38.765210] ------------[ cut here ]------------
[   38.766161] Error: Removing state 63 which has instances left.
[   38.767287] WARNING: CPU: 15 PID: 1602 at kernel/cpu.c:2127 __cpuhp_remove_state_cpuslocked+0xea/0xf0
[   38.769042] Modules linked in: zram(-) rfkill nls_utf8 isofs vfat fat intel_rapl_msr intel_rapl_common isst_if_common nfit libnvdimm kvm_intel bochs_drm drm_vram_helper drm_ttm_helper kvm ttm drm_kms_helper irqbypass rapl ppdev syscopyarea iTCO_wdt sysfillrect sysimgblt iTCO_vendor_support fb_sys_fops i2c_i801 cec parport_pc i2c_smbus lpc_ich joydev parport pcspkr drm fuse xfs libcrc32c sr_mod sd_mod cdrom sg ahci libahci crct10dif_pclmul crc32_pclmul nvme uas crc32c_intel libata virtio_net nvme_core usb_storage ghash_clmulni_intel serio_raw virtio_scsi net_failover virtio_blk t10_pi failover sunrpc dm_mirror dm_region_hash dm_log dm_mod
[   38.779575] CPU: 15 PID: 1602 Comm: rmmod Kdump: loaded Not tainted 5.14.0-1.el9.x86_64 #1
[   38.781691] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
[   38.783771] RIP: 0010:__cpuhp_remove_state_cpuslocked+0xea/0xf0
[   38.785266] Code: c6 43 21 00 48 c7 43 18 00 00 00 00 5b 5d 41 5c 41 5d 41 5e e9 87 fd 95 00 0f 0b 44 89 e6 48 c7 c7 d8 62 52 b8 e8 d9 b4 90 00 <0f> 0b eb b0 66 90 0f 1f 44 00 00 55 89 fd 53 89 f3 e8 20 f1 95 00
[   38.789385] RSP: 0018:ffffaac300efbe98 EFLAGS: 00010282



Version-Release number of selected component (if applicable):


How reproducible:

100%


Steps to Reproduce:

See above.


Actual results:

kernel warning and panic

Expected results:

no kernel warning and panic


Additional info:

The issue can be reproduced on the upstream v5.15-rc kernel.

Comment 2 Ming Lei 2021-11-16 00:21:43 UTC
Hi Yi,

Can you help to review and ack this BZ since Changhui is on PTO this week?

BTW, this one blocks another MR(!1678) too.

Thanks,

Comment 3 ChanghuiZhong 2021-11-16 01:06:08 UTC
(In reply to Ming Lei from comment #2)
> Hi Yi,
> 
> Can you help to review and ack this BZ since Changhui is on PTO this week?
> 
> BTW, this one blocks another MR(!1678) too.
> 
> Thanks,

Thanks, Ming and Yi. Yi is also on PTO this week.
I will report the test results later.

thanks

Comment 4 Ming Lei 2021-11-16 01:29:54 UTC
(In reply to ChanghuiZhong from comment #3)
> (In reply to Ming Lei from comment #2)
> > Hi Yi,
> > 
> > Can you help to review and ack this BZ since Changhui is on PTO this week?
> > 
> > BTW, this one blocks another MR(!1678) too.
> > 
> > Thanks,
> 
> Thanks, Ming and Yi. Yi is also on PTO this week.
> I will report the test results later.

Sorry for disturbing you guys, and thanks for handling this. Have a
nice PTO!

Comment 5 ChanghuiZhong 2021-11-16 14:36:48 UTC
Reproduced this issue on 4.18.0-348.6.el8.x86_64:

[ 1576.883529] ------------[ cut here ]------------ 
[ 1576.907691] Error: Removing state 61 which has instances left. 
[ 1576.937921] WARNING: CPU: 11 PID: 31190 at kernel/cpu.c:1905 __cpuhp_remove_state_cpuslocked+0xaf/0x100 
[ 1576.980289] Modules linked in: zram(-) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc vfat fat dm_multipath intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif iTCO_wdt irqbypass iTCO_vendor_support crct10dif_pclmul crc32_pclmul mgag200 i2c_algo_bit ghash_clmulni_intel rapl drm_kms_helper intel_cstate syscopyarea sysfillrect intel_uncore pcspkr sysimgblt fb_sys_fops i2c_i801 drm lpc_ich acpi_ipmi hpilo hpwdt ioatdma ipmi_si dca wmi ipmi_devintf ipmi_msghandler acpi_tad acpi_power_meter xfs libcrc32c sd_mod sg ahci libahci crc32c_intel tg3 libata nvme hpsa nvme_core scsi_transport_sas t10_pi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: zram] 
[ 1577.279325] CPU: 11 PID: 31190 Comm: rmmod Kdump: loaded Not tainted 4.18.0-348.6.el8.x86_64 #1 
[ 1577.318439] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 05/21/2018 
[ 1577.355698] RIP: 0010:__cpuhp_remove_state_cpuslocked+0xaf/0x100 
[ 1577.382727] Code: c9 31 d2 44 89 e6 89 df e8 1e f9 ff ff eb c4 48 8b 85 b8 ce c4 85 48 85 c0 74 11 44 89 e6 48 c7 c7 d8 f7 6c 85 e8 fa df ff ff <0f> 0b 5b 48 c7 85 a8 ce c4 85 00 00 00 00 48 c7 c7 60 ee c4 85 48 
[ 1577.473309] RSP: 0018:ffffacfd4303bea8 EFLAGS: 00010286 
[ 1577.496420] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 
[ 1577.528556] RDX: ffff9ba7afce7320 RSI: ffff9ba7afcd6858 RDI: ffff9ba7afcd6858 
[ 1577.560641] RBP: 0000000000000988 R08: 0000000000000000 R09: c0000000ffff7fff 
[ 1577.592656] R10: 0000000000000001 R11: ffffacfd4303bcc0 R12: 000000000000003d 
[ 1577.625195] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 
[ 1577.657508] FS:  00007f864ae27740(0000) GS:ffff9ba7afcc0000(0000) knlGS:0000000000000000 
[ 1577.693998] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[ 1577.719819] CR2: 00007f8649d40b70 CR3: 0000000139f46006 CR4: 00000000001706e0 
[ 1577.751920] Call Trace: 
[ 1577.762890]  __cpuhp_remove_state+0x2e/0x80 
[ 1577.781667]  __x64_sys_delete_module+0x139/0x280 
[ 1577.802372]  do_syscall_64+0x5b/0x1a0 
[ 1577.818791]  entry_SYSCALL_64_after_hwframe+0x65/0xca 
[ 1577.841651] RIP: 0033:0x7f8649e0283b 
[ 1577.858165] Code: 73 01 c3 48 8b 0d 4d 16 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1d 16 2c 00 f7 d8 64 89 01 48 
[ 1577.950045] RSP: 002b:00007ffc29a54e18 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 
[ 1577.985434] RAX: ffffffffffffffda RBX: 000055fea108e800 RCX: 00007f8649e0283b 
[ 1578.018076] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055fea108e868 
[ 1578.050432] RBP: 0000000000000000 R08: 00007ffc29a53d91 R09: 0000000000000000 
[ 1578.083148] R10: 00007f8649e763a0 R11: 0000000000000206 R12: 00007ffc29a55040 
[ 1578.116481] R13: 00007ffc29a56ef4 R14: 000055fea108e2a0 R15: 000055fea108e800 
[ 1578.148649] ---[ end trace bd3570ed5228c4b7 ]--- 



Confirmed that this issue cannot be reproduced on 4.18.0-349.el8.mr1662_211115_1341.x86_64;
there is no kernel warning or panic.
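
When checking a captured console log for this warning, it helps to match on the fixed message text rather than on the state number, which differs between the two traces in this report (63 on 5.14.0-1.el9, 61 on 4.18.0-348.6.el8). A minimal sketch of such a check follows; the function name cpuhp_leaked_states is hypothetical and not part of any existing tool.

```shell
# Hypothetical helper: read kernel log text on stdin and print the
# CPU hotplug state number from every
# "Error: Removing state N which has instances left." line.
# Prints nothing if the warning does not occur in the input.
cpuhp_leaked_states() {
    sed -n 's/.*Error: Removing state \([0-9][0-9]*\) which has instances left\..*/\1/p'
}

# Usage against the kernel ring buffer or a saved console log
# (console.log is a placeholder file name):
#   dmesg | cpuhp_leaked_states
#   cpuhp_leaked_states < console.log
```

Empty output after a test run means the warning text was not found in the log that was scanned; it says nothing about other failures.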

Comment 6 ChanghuiZhong 2021-11-16 14:42:57 UTC
(In reply to Ming Lei from comment #2)
> Hi Yi,
> 
> Can you help to review and ack this BZ since Changhui is on PTO this week?
> 
> BTW, this one blocks another MR(!1678) too.
> 
> Thanks,

Hello, Ming,

I cannot find MR !1678 in https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests.
Which BZ is this about?

Thanks

Comment 7 Ming Lei 2021-11-16 15:24:28 UTC
(In reply to ChanghuiZhong from comment #6)
> (In reply to Ming Lei from comment #2)
> > Hi Yi,
> > 
> > Can you help to review and ack this BZ since Changhui is on PTO this week?
> > 
> > BTW, this one blocks another MR(!1678) too.
> > 
> > Thanks,
> 
> Hello, Ming,
> 
> I cannot find MR !1678 in
> https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests.
> Which BZ is this about?
> 
> Thanks

https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests/1648

Comment 8 ChanghuiZhong 2021-11-17 01:58:22 UTC
(In reply to Ming Lei from comment #7)
> (In reply to ChanghuiZhong from comment #6)
> > (In reply to Ming Lei from comment #2)
> > > Hi Yi,
> > > 
> > > Can you help to review and ack this BZ since Changhui is on PTO this week?
> > > 
> > > BTW, this one blocks another MR(!1678) too.
> > > 
> > > Thanks,
> > 
> > Hello, Ming,
> > 
> > I cannot find MR !1678 in
> > https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests.
> > Which BZ is this about?
> > 
> > Thanks
> 
> https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests/1648

Thanks, Ming.
This is a memory management issue, so it looks like there is nothing I can do;
another team member will handle it.

Comment 12 ChanghuiZhong 2021-12-01 08:49:06 UTC
Verified that this issue cannot be reproduced on kernel-4.18.0-353.el8;
there is no kernel warning or panic.

The fix patches have been included in the kernel tree:
$ git log kernel-4.18.0-353.el8 --oneline --grep=2015755
530d462bef59 Merge: zram: several bug fixes
2951396a37b4 zram: replace fsync_bdev with sync_blockdev
91f0ad82123b zram: avoid race between zram_remove and disksize_store
2278cffd63b9 zram: don't fail to remove zram during unloading module
4cc586d5e7d3 zram: fix race between zram_reset_device() and disksize_store()
3eb7b38aa261 zram: register default groups with device_add_disk()


Moving to VERIFIED.
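
On a test machine, a quick way to tell whether a given RHEL 8 kernel is at or past the "Fixed In Version" build (kernel-4.18.0-353.el8) is to compare build numbers. The sketch below is a rough check under stated assumptions: the function names are hypothetical, it only understands 4.18.0-based release strings, and it assumes that later builds in the same stream keep the fix. It knows nothing about z-stream backports or scratch builds, so it would report the 4.18.0-349.el8.mr1662_211115_1341 test kernel from comment 5 as unfixed even though that build carried the patches.

```shell
# Hypothetical helper: print the build number of an EL8 kernel release
# string, e.g. "353" for "kernel-4.18.0-353.el8" or "348" for
# "4.18.0-348.6.el8.x86_64". Returns 1 for non-4.18.0 kernels.
el8_build_number() {
    case ${1#kernel-} in
        4.18.0-*) ;;
        *) return 1 ;;
    esac
    ebn_rest=${1#kernel-}
    ebn_rest=${ebn_rest#4.18.0-}
    # Keep only the leading digits of what follows "4.18.0-".
    printf '%s\n' "${ebn_rest%%[!0-9]*}"
}

# Hypothetical helper: succeed if the build number is >= 353, the
# "Fixed In Version" of this bug. Returns 2 if the string is not an
# EL8 4.18.0 release.
has_fixed_in_version() {
    hfv_build=$(el8_build_number "$1") || return 2
    [ -n "$hfv_build" ] || return 2
    [ "$hfv_build" -ge 353 ]
}

# Usage:
#   has_fixed_in_version "$(uname -r)" && echo "at or past 4.18.0-353"
```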

Comment 14 errata-xmlrpc 2022-05-10 15:02:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1988

