Description of problem:
A default (headless server) install of Fedora 25 on a single-CPU Xeon E5-2620 v4 machine with 64 GB RAM, running benchmarks from the ScalaBench suite (http://www.scalabench.org), ends up in a state where most memory is listed as "inactive (file)" in /proc/meminfo and programs fail to start (in our case the OpenJDK JVM with a 12 GB heap hits an mmap failure while trying to allocate the heap block). After a reboot things work again for a while, until memory is exhausted once more.

Some observations:
- This happens in the same way on 8 equivalent machines and is likely related to XFS, which is used for the root filesystem. Reinstalling the same machine with EXT4 makes the problem disappear.
- Although /proc/meminfo suggests most memory is "available" (total 64G, free 400M, available 64G), it is not freed by operations such as dropping caches (a probe sketch follows the zoneinfo dump below).

More information (meminfo, slabinfo, etc.) is attached below.

Version-Release number of selected component (if applicable):
kernel 4.8.15-300.fc25.x86_64

Additional info:

> cat /proc/meminfo
MemTotal:       65862388 kB
MemFree:          558900 kB
MemAvailable:   64710356 kB
Buffers:            1272 kB
Cached:          3536380 kB
SwapCached:            0 kB
Active:          4015596 kB
Inactive:       59286660 kB
Active(anon):      98588 kB
Inactive(anon):    19560 kB
Active(file):    3917008 kB
Inactive(file): 59267100 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:        511996 kB
SwapFree:         511996 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        116340 kB
Mapped:           157400 kB
Shmem:              1808 kB
Slab:            1785552 kB
SReclaimable:    1699036 kB
SUnreclaim:        86516 kB
KernelStack:        3584 kB
PageTables:         7452 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    33443188 kB
Committed_AS:     448404 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       93644 kB
DirectMap2M:     6082560 kB
DirectMap1G:    62914560 kB

> slabtop
    OBJS   ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
15184728 15184728 100%    0.10K 389352       39   1557408K buffer_head
  327285   321440  98%    0.19K  15585       21     62340K dentry
  146304   143780  98%    0.03K   1143      128      4572K kmalloc-32
  130050   130050 100%    0.02K    765      170      3060K scsi_data_buffer
  125120   125120 100%    0.06K   1955       64      7820K kmalloc-64
   92160    92160 100%    0.01K    180      512       720K kmalloc-8
   74752    74752 100%    0.02K    292      256      1168K kmalloc-16
   72063    72063 100%    0.08K   1413       51      5652K Acpi-State
   43622    43622 100%    0.12K   1283       34      5132K kernfs_node_cache
   42432    41238  97%    0.94K   1248       34     39936K xfs_inode
   28084    22504  80%    0.57K   1003       28     16048K radix_tree_node
   24472    24472 100%    0.07K    437       56      1748K Acpi-Operand
   22820    22746  99%    0.57K    815       28     13040K inode_cache
   22300    22300 100%    0.16K    892       25      3568K sigqueue
   22050    21315  96%    0.09K    525       42      2100K kmalloc-96
   19520    19308  98%    0.06K    305       64      1220K anon_vma_chain
   16064    14760  91%    1.00K    502       32     16064K kmalloc-1024
   12831    12166  94%    0.19K    611       21      2444K cred_jar
   12750    12571  98%    0.08K    250       51      1000K anon_vma
   10506    10506 100%    0.04K    103      102       412K Acpi-Namespace
    7648     6127  80%    0.25K    239       32      1912K kmalloc-256
    6528     4625  70%    0.50K    204       32      3264K kmalloc-512
    5921     5921 100%    1.01K    191       31      6112K nfs_inode_cache
    5355     5231  97%    0.19K    255       21      1020K kmalloc-192
    4875     4875 100%    0.62K    195       25      3120K proc_inode_cache

> cat /proc/zoneinfo
Node 0, zone      DMA
  per-node stats
      nr_inactive_anon 4889
      nr_active_anon 24649
      nr_inactive_file 14816805
      nr_active_file 979289
      nr_unevictable 0
      nr_isolated_anon 0
      nr_isolated_file 0
      nr_pages_scanned 0
      workingset_refault 2495
      workingset_activate 2218
      workingset_nodereclaim 99
      nr_anon_pages 29086
      nr_mapped 39406
      nr_file_pages 884480
      nr_dirty 24
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem 452
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_anon_transparent_hugepages 0
      nr_unstable 0
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied 23357107
      nr_written 7998186
  pages free     3971
        min      4
        low      7
        high     10
        node_scanned  0
        spanned  4095
        present  3993
        managed  3972
        nr_free_pages 3971
        nr_zone_inactive_anon 0
        nr_zone_active_anon 0
        nr_zone_inactive_file 0
        nr_zone_active_file 0
        nr_zone_unevictable 0
        nr_zone_write_pending 0
        nr_mlock 0
        nr_slab_reclaimable 0
        nr_slab_unreclaimable 1
        nr_page_table_pages 0
        nr_kernel_stack 0
        nr_bounce 0
        nr_zspages 0
        numa_hit 1
        numa_miss 0
        numa_foreign 0
        numa_interleave 0
        numa_local 1
        numa_other 0
        nr_free_cma 0
        protection: (0, 1836, 64280, 64280, 64280)
  pagesets
    cpu: 0  count: 0  high: 0  batch: 1  vm stats threshold: 8
    cpu: 1  count: 0  high: 0  batch: 1  vm stats threshold: 8
    cpu: 2  count: 0  high: 0  batch: 1  vm stats threshold: 8
    cpu: 3  count: 0  high: 0  batch: 1  vm stats threshold: 8
    cpu: 4  count: 0  high: 0  batch: 1  vm stats threshold: 8
    cpu: 5  count: 0  high: 0  batch: 1  vm stats threshold: 8
    cpu: 6  count: 0  high: 0  batch: 1  vm stats threshold: 8
    cpu: 7  count: 0  high: 0  batch: 1  vm stats threshold: 8
  node_unreclaimable: 0
  start_pfn: 1
  node_inactive_ratio: 0
Node 0, zone    DMA32
  pages free     63365
        min      482
        low      952
        high     1422
        node_scanned  0
        spanned  1044480
        present  491379
        managed  474987
        nr_free_pages 63365
        nr_zone_inactive_anon 0
        nr_zone_active_anon 0
        nr_zone_inactive_file 393462
        nr_zone_active_file 6931
        nr_zone_unevictable 0
        nr_zone_write_pending 0
        nr_mlock 0
        nr_slab_reclaimable 10273
        nr_slab_unreclaimable 6
        nr_page_table_pages 0
        nr_kernel_stack 0
        nr_bounce 0
        nr_zspages 0
        numa_hit 470243
        numa_miss 0
        numa_foreign 0
        numa_interleave 0
        numa_local 470243
        numa_other 0
        nr_free_cma 0
        protection: (0, 0, 62443, 62443, 62443)
  pagesets
    cpu: 0  count: 172  high: 186  batch: 31  vm stats threshold: 40
    cpu: 1  count: 177  high: 186  batch: 31  vm stats threshold: 40
    cpu: 2  count: 166  high: 186  batch: 31  vm stats threshold: 40
    cpu: 3  count: 70   high: 186  batch: 31  vm stats threshold: 40
    cpu: 4  count: 174  high: 186  batch: 31  vm stats threshold: 40
    cpu: 5  count: 68   high: 186  batch: 31  vm stats threshold: 40
    cpu: 6  count: 56   high: 186  batch: 31  vm stats threshold: 40
    cpu: 7  count: 60   high: 186  batch: 31  vm stats threshold: 40
  node_unreclaimable: 0
  start_pfn: 4096
  node_inactive_ratio: 0
Node 0, zone   Normal
  pages free     72132
        min      16409
        low      32394
        high     48379
        node_scanned  0
        spanned  16252928
        present  16252928
        managed  15986638
        nr_free_pages 72132
        nr_zone_inactive_anon 4889
        nr_zone_active_anon 24649
        nr_zone_inactive_file 14423343
        nr_zone_active_file 972358
        nr_zone_unevictable 0
        nr_zone_write_pending 24
        nr_mlock 0
        nr_slab_reclaimable 414520
        nr_slab_unreclaimable 21659
        nr_page_table_pages 1863
        nr_kernel_stack 3568
        nr_bounce 0
        nr_zspages 0
        numa_hit 177613519
        numa_miss 0
        numa_foreign 0
        numa_interleave 40826
        numa_local 177613519
        numa_other 0
        nr_free_cma 0
        protection: (0, 0, 0, 0, 0)
  pagesets
    cpu: 0  count: 178  high: 186  batch: 31  vm stats threshold: 80
    cpu: 1  count: 181  high: 186  batch: 31  vm stats threshold: 80
    cpu: 2  count: 176  high: 186  batch: 31  vm stats threshold: 80
    cpu: 3  count: 130  high: 186  batch: 31  vm stats threshold: 80
    cpu: 4  count: 165  high: 186  batch: 31  vm stats threshold: 80
    cpu: 5  count: 76   high: 186  batch: 31  vm stats threshold: 80
    cpu: 6  count: 173  high: 186  batch: 31  vm stats threshold: 80
    cpu: 7  count: 134  high: 186  batch: 31  vm stats threshold: 80
  node_unreclaimable: 0
  start_pfn: 1048576
  node_inactive_ratio: 0
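To make the second observation above concrete, this is roughly how the stuck cache can be probed (a minimal sketch, run as root; the grep patterns only pick out the counters relevant here). On the affected machines the Inactive(file) and buffer_head numbers barely move after the drop:

# before: note Inactive(file) (~59 GB here) and the buffer_head slab (~1.5 GB here)
grep -E 'MemFree|MemAvailable|Active\(file\)|Inactive\(file\)|SReclaimable' /proc/meminfo
grep buffer_head /proc/slabinfo

# ask the kernel to drop clean page cache and reclaimable slab
sync
echo 3 > /proc/sys/vm/drop_caches

# after: on the affected machines the numbers stay essentially the same
grep -E 'MemFree|MemAvailable|Active\(file\)|Inactive\(file\)|SReclaimable' /proc/meminfo
grep buffer_head /proc/slabinfo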
Jan Kara and Vlastimil Babka of SUSE tracked this down to kernel commit 99579ccec4e2 ("xfs: skip dirty pages in ->releasepage()"); a patch has been proposed for stable.
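For anyone who wants to check whether a particular kernel tree already carries the offending change, something along these lines should work (a sketch assuming a local clone of the mainline or stable tree; the path is only an example):

# in a clone of the kernel sources (path is illustrative)
cd ~/src/linux
git merge-base --is-ancestor 99579ccec4e2 HEAD \
    && echo "this tree contains the offending commit" \
    || echo "this tree does not contain it"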
*********** MASS BUG UPDATE **************

We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Because of this, we are doing a mass bug update across all of the Fedora 25 kernel bugs.

Fedora 25 has now been rebased to 4.9.3-200.fc25. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 26 and are still experiencing this issue, please change the version to Fedora 26. If you experience different issues, please open a new bug report for those.
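For reference, a minimal way to pull in the rebased kernel on Fedora 25 and confirm what is running afterwards (standard dnf workflow, nothing specific to this bug) would be:

sudo dnf upgrade --refresh kernel
sudo reboot
# after the reboot:
uname -r    # expect 4.9.3-200.fc25.x86_64 or newer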
Fixed by commit 0a417b8dc1f10b03e8f558b8a831f07ec4c23795 in mainline for 4.10.
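To see which releases already ship the fix, listing the tags that contain the commit in a mainline clone is enough (a sketch; the path is illustrative, and a 4.9.y stable tag will only appear if/when the fix is backported there):

git -C ~/src/linux tag --contains 0a417b8dc1f10b03e8f558b8a831f07ec4c23795
# v4.10-rc* and later tags should be listed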