Bug 1411029 - Possibly XFS related memory leak leaves machine with most memory in "inactive (file)" that is not reclaimed
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: kernel
Version: 25
Hardware: x86_64
OS: Linux
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2017-01-07 16:33 UTC by Petr Tuma
Modified: 2019-01-09 12:54 UTC

Last Closed: 2017-01-17 06:45:01 UTC
Type: Bug

Description Petr Tuma 2017-01-07 16:33:08 UTC
Description of problem:

A default (headless server) install of Fedora 25 on a single-CPU Xeon E5-2620 V4 machine with 64GB RAM, running benchmarks from the ScalaBench suite (http://www.scalabench.org), ends up in a state where most memory is listed as "inactive (file)" in /proc/meminfo and programs fail to start (in our case, an OpenJDK JVM with a 12GB heap hits an mmap failure when trying to allocate the heap block). After a reboot, things work again for a while until memory is exhausted once more.

Some observations:

- This happens in the same way on 8 equivalent machines and is likely related to XFS, which is used for the root filesystem. Reinstalling the same machine with EXT4 makes the problem disappear.

- Although /proc/meminfo suggests most memory is "available" (total 64G, free 400M, available 64G), it is not freed by operations such as dropping caches.
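
The drop-caches observation above could be reproduced with something like the following minimal sketch. It assumes the standard /proc/sys/vm/drop_caches interface and root privileges; the function names are made up for illustration and are not part of any tool used in this report.

```python
import os

def read_meminfo(path="/proc/meminfo"):
    """Parse /proc/meminfo into {field: int}; most values are in kB,
    the HugePages_* fields are plain counts."""
    fields = {}
    with open(path) as f:
        for line in f:
            name, _, rest = line.partition(":")
            parts = rest.split()
            if parts:
                fields[name.strip()] = int(parts[0])
    return fields

def drop_caches_delta(meminfo_path="/proc/meminfo",
                      drop_path="/proc/sys/vm/drop_caches"):
    """Drop clean caches and return the change (after - before) of a few
    fields. Needs root when pointed at the real /proc files."""
    before = read_meminfo(meminfo_path)
    os.sync()  # write back dirty data first so more pages become droppable
    with open(drop_path, "w") as f:
        f.write("3\n")  # 1 = page cache, 2 = dentries and inodes, 3 = both
    after = read_meminfo(meminfo_path)
    keys = ("MemFree", "Inactive(file)", "SReclaimable")
    return {k: after[k] - before[k] for k in keys}
```

On a healthy machine, calling `drop_caches_delta()` as root would be expected to show a large negative change for "Inactive(file)" and a matching increase in "MemFree"; on the affected machines the values barely move.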

More information (meminfo, slabinfo etc.) attached below.

Version-Release number of selected component (if applicable):

kernel 4.8.15-300.fc25.x86_64

Additional info:

> cat /proc/meminfo 
MemTotal:       65862388 kB
MemFree:          558900 kB
MemAvailable:   64710356 kB
Buffers:            1272 kB
Cached:          3536380 kB
SwapCached:            0 kB
Active:          4015596 kB
Inactive:       59286660 kB
Active(anon):      98588 kB
Inactive(anon):    19560 kB
Active(file):    3917008 kB
Inactive(file): 59267100 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:        511996 kB
SwapFree:         511996 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        116340 kB
Mapped:           157400 kB
Shmem:              1808 kB
Slab:            1785552 kB
SReclaimable:    1699036 kB
SUnreclaim:        86516 kB
KernelStack:        3584 kB
PageTables:         7452 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    33443188 kB
Committed_AS:     448404 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       93644 kB
DirectMap2M:     6082560 kB
DirectMap1G:    62914560 kB
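
For orientation, the arithmetic behind the dump above can be sketched as follows. The values are copied from the dump; the comparison of the file LRU lists against Buffers + Cached - Shmem is only a rule of thumb assumed here, not something the kernel guarantees.

```python
# Values in kB, copied from the /proc/meminfo dump above.
meminfo = {
    "MemTotal": 65862388,
    "MemFree": 558900,
    "MemAvailable": 64710356,
    "Buffers": 1272,
    "Cached": 3536380,
    "Active(file)": 3917008,
    "Inactive(file)": 59267100,
    "Shmem": 1808,
}

def file_lru_kb(m):
    """Memory sitting on the file-backed LRU lists."""
    return m["Active(file)"] + m["Inactive(file)"]

def page_cache_kb(m):
    """Assumed approximation of the page cache that normally backs those lists."""
    return m["Buffers"] + m["Cached"] - m["Shmem"]

inactive_share = meminfo["Inactive(file)"] / meminfo["MemTotal"]
unaccounted_kb = file_lru_kb(meminfo) - page_cache_kb(meminfo)

print(f"Inactive(file) share of MemTotal: {inactive_share:.1%}")
print(f"File LRU memory not covered by Buffers+Cached-Shmem: {unaccounted_kb} kB")
```

With these numbers, about 90% of MemTotal is "Inactive(file)", and roughly 59.6 million kB on the file LRU lists has no counterpart in Buffers or Cached, which is what makes the state look like a leak rather than ordinary page cache.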

> slabtop
    OBJS   ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
15184728 15184728 100%    0.10K 389352       39   1557408K buffer_head
  327285   321440  98%    0.19K  15585       21     62340K dentry
  146304   143780  98%    0.03K   1143      128      4572K kmalloc-32
  130050   130050 100%    0.02K    765      170      3060K scsi_data_buffer
  125120   125120 100%    0.06K   1955       64      7820K kmalloc-64
   92160    92160 100%    0.01K    180      512       720K kmalloc-8
   74752    74752 100%    0.02K    292      256      1168K kmalloc-16
   72063    72063 100%    0.08K   1413       51      5652K Acpi-State
   43622    43622 100%    0.12K   1283       34      5132K kernfs_node_cache
   42432    41238  97%    0.94K   1248       34     39936K xfs_inode
   28084    22504  80%    0.57K   1003       28     16048K radix_tree_node
   24472    24472 100%    0.07K    437       56      1748K Acpi-Operand
   22820    22746  99%    0.57K    815       28     13040K inode_cache
   22300    22300 100%    0.16K    892       25      3568K sigqueue
   22050    21315  96%    0.09K    525       42      2100K kmalloc-96
   19520    19308  98%    0.06K    305       64      1220K anon_vma_chain
   16064    14760  91%    1.00K    502       32     16064K kmalloc-1024
   12831    12166  94%    0.19K    611       21      2444K cred_jar
   12750    12571  98%    0.08K    250       51      1000K anon_vma
   10506    10506 100%    0.04K    103      102       412K Acpi-Namespace
    7648     6127  80%    0.25K    239       32      1912K kmalloc-256
    6528     4625  70%    0.50K    204       32      3264K kmalloc-512
    5921     5921 100%    1.01K    191       31      6112K nfs_inode_cache
    5355     5231  97%    0.19K    255       21      1020K kmalloc-192
    4875     4875 100%    0.62K    195       25      3120K proc_inode_cache
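
The buffer_head row dominates the slab usage. A small worked check of that, using only figures from the dumps in this report, might look like the sketch below. The comparison against file LRU pages assumes one buffer_head per 4 KiB page (4 KiB filesystem blocks), which the report does not state.

```python
# Figures copied from the slabtop output above and from the /proc/meminfo
# and /proc/zoneinfo dumps elsewhere in this report.
buffer_head_objs = 15184728
buffer_head_slabs = 389352
buffer_head_objs_per_slab = 39
buffer_head_cache_kb = 1557408
slab_total_kb = 1785552              # "Slab" in /proc/meminfo
file_lru_pages = 14816805 + 979289   # nr_inactive_file + nr_active_file

# Internal consistency of the slabtop row: SLABS * OBJ/SLAB equals OBJS.
row_is_consistent = (
    buffer_head_slabs * buffer_head_objs_per_slab == buffer_head_objs
)

# Share of all slab memory taken by buffer_head objects.
slab_share = buffer_head_cache_kb / slab_total_kb

# Assumption: one buffer_head per file page (4 KiB blocks on 4 KiB pages).
heads_per_lru_page = buffer_head_objs / file_lru_pages

print(f"slabtop row consistent: {row_is_consistent}")
print(f"buffer_head share of Slab: {slab_share:.1%}")
print(f"buffer_head objects per file LRU page: {heads_per_lru_page:.2f}")
```

Under that assumption, nearly every page on the file LRU lists would still have a buffer_head attached. That would fit pages that cannot be released while buffers remain attached, though the numbers alone do not prove it.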

> cat /proc/zoneinfo 
Node 0, zone      DMA
  per-node stats
      nr_inactive_anon 4889
      nr_active_anon 24649
      nr_inactive_file 14816805
      nr_active_file 979289
      nr_unevictable 0
      nr_isolated_anon 0
      nr_isolated_file 0
      nr_pages_scanned 0
      workingset_refault 2495
      workingset_activate 2218
      workingset_nodereclaim 99
      nr_anon_pages 29086
      nr_mapped    39406
      nr_file_pages 884480
      nr_dirty     24
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem     452
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_anon_transparent_hugepages 0
      nr_unstable  0
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied   23357107
      nr_written   7998186
  pages free     3971
        min      4
        low      7
        high     10
   node_scanned  0
        spanned  4095
        present  3993
        managed  3972
      nr_free_pages 3971
      nr_zone_inactive_anon 0
      nr_zone_active_anon 0
      nr_zone_inactive_file 0
      nr_zone_active_file 0
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_slab_reclaimable 0
      nr_slab_unreclaimable 1
      nr_page_table_pages 0
      nr_kernel_stack 0
      nr_bounce    0
      nr_zspages   0
      numa_hit     1
      numa_miss    0
      numa_foreign 0
      numa_interleave 0
      numa_local   1
      numa_other   0
      nr_free_cma  0
        protection: (0, 1836, 64280, 64280, 64280)
  pagesets
    cpu: 0
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 8
    cpu: 1
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 8
    cpu: 2
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 8
    cpu: 3
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 8
    cpu: 4
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 8
    cpu: 5
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 8
    cpu: 6
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 8
    cpu: 7
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 8
  node_unreclaimable:  0
  start_pfn:           1
  node_inactive_ratio: 0
Node 0, zone    DMA32
  pages free     63365
        min      482
        low      952
        high     1422
   node_scanned  0
        spanned  1044480
        present  491379
        managed  474987
      nr_free_pages 63365
      nr_zone_inactive_anon 0
      nr_zone_active_anon 0
      nr_zone_inactive_file 393462
      nr_zone_active_file 6931
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_slab_reclaimable 10273
      nr_slab_unreclaimable 6
      nr_page_table_pages 0
      nr_kernel_stack 0
      nr_bounce    0
      nr_zspages   0
      numa_hit     470243
      numa_miss    0
      numa_foreign 0
      numa_interleave 0
      numa_local   470243
      numa_other   0
      nr_free_cma  0
        protection: (0, 0, 62443, 62443, 62443)
  pagesets
    cpu: 0
              count: 172
              high:  186
              batch: 31
  vm stats threshold: 40
    cpu: 1
              count: 177
              high:  186
              batch: 31
  vm stats threshold: 40
    cpu: 2
              count: 166
              high:  186
              batch: 31
  vm stats threshold: 40
    cpu: 3
              count: 70
              high:  186
              batch: 31
  vm stats threshold: 40
    cpu: 4
              count: 174
              high:  186
              batch: 31
  vm stats threshold: 40
    cpu: 5
              count: 68
              high:  186
              batch: 31
  vm stats threshold: 40
    cpu: 6
              count: 56
              high:  186
              batch: 31
  vm stats threshold: 40
    cpu: 7
              count: 60
              high:  186
              batch: 31
  vm stats threshold: 40
  node_unreclaimable:  0
  start_pfn:           4096
  node_inactive_ratio: 0
Node 0, zone   Normal
  pages free     72132
        min      16409
        low      32394
        high     48379
   node_scanned  0
        spanned  16252928
        present  16252928
        managed  15986638
      nr_free_pages 72132
      nr_zone_inactive_anon 4889
      nr_zone_active_anon 24649
      nr_zone_inactive_file 14423343
      nr_zone_active_file 972358
      nr_zone_unevictable 0
      nr_zone_write_pending 24
      nr_mlock     0
      nr_slab_reclaimable 414520
      nr_slab_unreclaimable 21659
      nr_page_table_pages 1863
      nr_kernel_stack 3568
      nr_bounce    0
      nr_zspages   0
      numa_hit     177613519
      numa_miss    0
      numa_foreign 0
      numa_interleave 40826
      numa_local   177613519
      numa_other   0
      nr_free_cma  0
        protection: (0, 0, 0, 0, 0)
  pagesets
    cpu: 0
              count: 178
              high:  186
              batch: 31
  vm stats threshold: 80
    cpu: 1
              count: 181
              high:  186
              batch: 31
  vm stats threshold: 80
    cpu: 2
              count: 176
              high:  186
              batch: 31
  vm stats threshold: 80
    cpu: 3
              count: 130
              high:  186
              batch: 31
  vm stats threshold: 80
    cpu: 4
              count: 165
              high:  186
              batch: 31
  vm stats threshold: 80
    cpu: 5
              count: 76
              high:  186
              batch: 31
  vm stats threshold: 80
    cpu: 6
              count: 173
              high:  186
              batch: 31
  vm stats threshold: 80
    cpu: 7
              count: 134
              high:  186
              batch: 31
  vm stats threshold: 80
  node_unreclaimable:  0
  start_pfn:           1048576
  node_inactive_ratio: 0
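
As a cross-check between the dumps, the per-zone file LRU counters above can be summed and converted to kB. This is a sketch only: the excerpt is trimmed to the relevant counters, the parser is a hypothetical helper, and the 4 KiB page size is the x86_64 default assumed here.

```python
import re

# Excerpt of the /proc/zoneinfo dump above: only the per-zone file LRU counters.
ZONEINFO_EXCERPT = """\
Node 0, zone      DMA
      nr_zone_inactive_file 0
      nr_zone_active_file 0
Node 0, zone    DMA32
      nr_zone_inactive_file 393462
      nr_zone_active_file 6931
Node 0, zone   Normal
      nr_zone_inactive_file 14423343
      nr_zone_active_file 972358
"""

PAGE_KB = 4  # assumption: 4 KiB base pages, the x86_64 default

def per_zone_counters(text):
    """Return {zone_name: {counter: value}} for /proc/zoneinfo-style text.

    Keyed by zone name only, so it suits single-node machines like this one.
    """
    zones = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"Node \d+, zone\s+(\S+)", line)
        if header:
            current = zones.setdefault(header.group(1), {})
            continue
        parts = line.split()
        if current is not None and len(parts) == 2 and parts[1].isdigit():
            current[parts[0]] = int(parts[1])
    return zones

zones = per_zone_counters(ZONEINFO_EXCERPT)
inactive_pages = sum(z["nr_zone_inactive_file"] for z in zones.values())
active_pages = sum(z["nr_zone_active_file"] for z in zones.values())

print(f"inactive file: {inactive_pages} pages = {inactive_pages * PAGE_KB} kB")
print(f"active file:   {active_pages} pages = {active_pages * PAGE_KB} kB")
```

The sums match the per-node nr_inactive_file (14816805) and nr_active_file (979289) values exactly. Converted to kB they differ from the /proc/meminfo figures by about 100 kB, presumably because the two files were read at slightly different moments.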

Comment 1 Petr Tuma 2017-01-11 09:12:52 UTC
Jan Kara and Vlastimil Babka of SUSE tracked this down to kernel commit 99579ccec4e2 ("xfs: skip dirty pages in ->releasepage()"); a patch was proposed for stable.

Comment 2 Laura Abbott 2017-01-17 01:23:14 UTC
*********** MASS BUG UPDATE **************
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 25 kernel bugs.
 
Fedora 25 has now been rebased to 4.9.3-200.fc25. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 26, and are still experiencing this issue, please change the version to Fedora 26.
 
If you experience different issues, please open a new bug report for those.

Comment 3 Petr Tuma 2017-01-17 06:45:01 UTC
Fixed by commit 0a417b8dc1f10b03e8f558b8a831f07ec4c23795 in mainline for 4.10.
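
A rough way to tell whether a given kernel release string is at or past 4.10 is sketched below. The helper names are hypothetical, and the check looks only at the version number, so it cannot detect the fix being backported to an older stable or distribution kernel.

```python
import re

def kernel_version_tuple(release):
    """Extract (major, minor, patch) from a release string such as
    '4.8.15-300.fc25.x86_64'; a missing patch level counts as 0."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(g) if g else 0 for g in m.groups())

def at_least(release, minimum):
    """True if the release's version number is >= the minimum tuple."""
    return kernel_version_tuple(release) >= minimum

FIXED_IN_MAINLINE = (4, 10, 0)

# Kernel versions mentioned in this report.
print(at_least("4.8.15-300.fc25.x86_64", FIXED_IN_MAINLINE))
print(at_least("4.9.3-200.fc25", FIXED_IN_MAINLINE))
```

On a live system, the string returned by `platform.release()` (or `uname -r`) can be passed to `at_least` in the same way.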

