Bug 1599324 - vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2d0 negative objects to delete
Summary: vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2d0 negative objects to delete
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Robert Peterson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-09 13:56 UTC by Andrew Price
Modified: 2023-12-07 00:31 UTC (History)
CC List: 19 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-10-16 15:44:57 UTC
Type: Bug
Embargoed:


Attachments
Proposed upstream and rhel8 patch (2.66 KB, patch)
2019-10-16 12:36 UTC, Robert Peterson

Description Andrew Price 2018-07-09 13:56:54 UTC
Description of problem:

Running fsstress (from the xfstests tree) for a while, unmounting the gfs2 file system, and then running fsck.gfs2 causes vmscan warnings about "negative objects to delete" to be printed.

Version-Release number of selected component (if applicable):

Current mainline kernel (v4.18-rc4)

How reproducible:
Not 100% reproducible, but the chances increase the longer fsstress runs.

Steps to Reproduce:
1. mkfs.gfs2 -p lock_nolock /dev/foo
2. mount /dev/foo /mnt/test
3. ./fsstress -d /mnt/test/ -p3 -l0 -n 10000000 -v -c
4. (Wait about an hour)
5. ^C
6. umount /mnt/test
7. fsck.gfs2 /dev/foo

Actual results:

At some point starting in pass 1, a lot of these warnings hit the console:

[10322.608787] vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2d0 negative objects to delete nr=-9223372036854775718
[10322.611004] vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2d0 negative objects to delete nr=-9223372036854775718
[10322.615502] vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2d0 negative objects to delete nr=-9223372036854775718
[10322.619220] vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2d0 negative objects to delete nr=-9223372036854775718
...

Expected results:

No warnings

Additional info:

https://www.redhat.com/archives/cluster-devel/2018-April/msg00019.html

Comment 1 Robert Peterson 2019-10-15 17:09:12 UTC
I'm guessing this is a simple overflow problem. GFS2 uses an atomic_t to keep track of
the number of items on the LRU list, and that value is read as an int (4 bytes) for the
calculation. I bet this goes away if we change it to an atomic64_t.

I'll whip up a quick patch and maybe Andy can test it.
Reassigning to myself.

Comment 4 Robert Peterson 2019-10-16 12:36:06 UTC
Created attachment 1626473
Proposed upstream and rhel8 patch

Andy, here's the full patch.

Comment 5 Andrew Price 2019-10-16 15:44:57 UTC
I can't reproduce this at all any more. Scanning through the commits, it seems very likely that this one fixed the issue:

  commit 7881ef3f33bb80f459ea6020d1e021fc524a6348
  Author: Ross Lagerwall <ross.lagerwall>
  Date:   Wed Mar 27 17:09:17 2019 +0000
  
      gfs2: Fix lru_count going negative

So I'm going to close this on that basis.

