Bug 1066702
| Summary: | Hugepage allocations hang on numa nodes with insufficient memory | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Sterling Alexander <stalexan> |
| Component: | kernel | Assignee: | Rafael Aquini <aquini> |
| Status: | CLOSED ERRATA | QA Contact: | Li Wang <liwan> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 6.5 | CC: | aquini, kernel-mgr, liwan, loberman, lwoodman, pdwyer, pholasek, stalexan, yanwang |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | kernel-2.6.32-542.el6 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-07-22 08:04:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1075802, 1159933 | | |
Description
Sterling Alexander
2014-02-18 22:59:55 UTC
Hi Sterling,

thank you for your report; this really does look like buggy behaviour on the 6.5 side. How exactly does it hang? Could you please provide the dmesg output after the hang? I've tried to reserve a similar HP DL580 G7 in Beaker, but without success so far. Would it be possible to get access to your machine?

thanks,
Petr

Hello Petr,

I helped the customer capture a dump on boot after the hang. We will get the dump.

Notes: We captured a crashdump from the boot hang. In it, the only active task that is not swapper is sysctl:

    crash> files 37729
    PID: 37729  TASK: ffff89bfa52a6040  CPU: 50  COMMAND: "sysctl"
    ROOT: /    CWD: /
     FD       FILE            DENTRY           INODE       TYPE PATH
      0 ffff884059093540 ffff89bfa7ce7f00 ffff8a3fa6a53108 CHR  /dev/console
      1 ffff8a3fa2bd0b00 ffff88bfa7cbd180 ffff893fa7b0d108 CHR  /dev/null
      2 ffff8a3fa2bd0b00 ffff88bfa7cbd180 ffff893fa7b0d108 CHR  /dev/null
      3 ffff8aafb601c8c0 ffff8aafb34b4780 ffff8aafb7814ca8 REG  /etc/sysctl.conf
      4 ffff8aafb3a48080 ffff8aafb34b4900 ffff8aafb37d7ab8 REG  /proc/sys/vm/nr_hugepages

    crash> bt 37729
    PID: 37729  TASK: ffff89bfa52a6040  CPU: 50  COMMAND: "sysctl"
    ...
        RIP: ffffffff8152a357  RSP: ffff89bfa0d8da08  RFLAGS: 00000246
        RAX: 0000000000000000  RBX: ffff89bfa0d8da08  RCX: ffffea0c7e632090
        RDX: ffff8b8000019380  RSI: 0000000000000246  RDI: 0000000000000246
        RBP: ffffffff8100bb8e   R8: 0000000000000001   R9: ffff89bfa0d8dae8
        R10: ffff8b8000445e80  R11: 0000000000000000  R12: ffff8b8000444e00
        R13: 0000000000000000  R14: ffff89bfa0d8da08  R15: ffffffff8100bb8e
        ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
    #16 [ffff89bfa0d8da10] compact_zone at ffffffff8116a06b
    #17 [ffff89bfa0d8dad0] compact_zone_order at ffffffff8116a81c
    #18 [ffff89bfa0d8db80] try_to_compact_pages at ffffffff8116a951
    #19 [ffff89bfa0d8dbf0] __alloc_pages_direct_compact at ffffffff8112f1ba
    #20 [ffff89bfa0d8dc60] __alloc_pages_nodemask at ffffffff8112f69f
    #21 [ffff89bfa0d8dda0] alloc_fresh_huge_page at ffffffff81160ede
    #22 [ffff89bfa0d8ddd0] set_max_huge_pages at ffffffff81161714
    #23 [ffff89bfa0d8de20] hugetlb_sysctl_handler_common at ffffffff81163873
    #24 [ffff89bfa0d8de70] hugetlb_sysctl_handler at ffffffff811638ee
    #25 [ffff89bfa0d8de80] proc_sys_call_handler at ffffffff811fd6f7
    #26 [ffff89bfa0d8dee0] proc_sys_write at ffffffff811fd744
    #27 [ffff89bfa0d8def0] vfs_write at ffffffff81188f78
    #28 [ffff89bfa0d8df30] sys_write at ffffffff81189871
    #29 [ffff89bfa0d8df80] system_call_fastpath at ...

The zonelist at 0xffff8b800002ade0 was passed to __alloc_pages_direct_compact. That zonelist is node_zonelists[1] of:

    crash> pg_data_t.node_id 0xffff8b80000001c0
      node_id = 0x7

So it seems likely that the reproduction using numactl coupled with nr_hugepages_mempolicy replicates what is happening during boot on this customer's system.

Let me know if you require the vmcore obtained from the boot hang 2014-02-18.

=- Curt

Hi Curt,

thanks a lot for the backtrace. I've just got a similar G7 machine on loan, but if you have a chance, please upload the vmcore somewhere.

thanks,
Petr

Petr,

I can likely reproduce and get a forced crash. I was able to reproduce the hang once booted, using numactl -m 0-1 echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy.

I was never able to reproduce the hard hang on boot when simply setting a value for hugepages in /etc/sysctl.conf, as long as that fitted into the memory across all numa nodes. I only have 256GB though, and that takes around 5 to 6 minutes to allocate, so with 1.2TB it would take some time to complete allocations on boot.

In the numactl -m 0-1 echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy test, the prior kernel version (6.2) exits and does not hang, but does not produce a warning that it could not allocate the memory asked for.

The newer kernel (6.5) indeed does hang.

Let me know if you want a forced crash after bootup using numactl -m 0-1 echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy.

Here are the notes from my testing
------------------------------------
Testing now on the 431 stock 6.5 kernel and attempting to allocate 230GB so I will cross all 4 numa nodes.

The numactl -m 0-1 echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy hangs in my lab on 6.5 (431.5.1). It does not hang on 6.2.

Hello,

(In reply to loberman from comment #5)
> Petr,
>
> I can likely reproduce and get a forced crash. I was able to reproduce the
> issue of the hang once booted using numactl -m 0-1 echo 102400 >
> /proc/sys/vm/nr_hugepages_mempolicy.
>
> I was never able to reproduce the hard hang on boot when simply setting a
> value for hugepages in /etc/sysctl.conf as long as that fitted into the
> memory across all numa nodes. I only have 256GB though, and that takes
> around 5 to 6 minutes to allocate so with 1.2TB that would take some time
> to complete allocations on boot.
>
> In the numactl -m 0-1 echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy test
> the prior kernel version (6.2) exits out and does not hang but does not
> produce a warning that it could not allocate the memory asked for.
>
> The newer kernel (6.5) indeed does hang.
>
> Let me know if you want a forced crash after bootup using numactl -m 0-1
> echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy.

Sorry for the late reply. Yes please, I'd love to look into the vmcore. Thank you!

>
> Here are the notes from my testing
> ------------------------------------
> Testing now on the 431 stock 6.5 kernel and attempting to allocate 230GB so
> I will cross all 4 numa nodes.
>
> The numactl -m 0-1 echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy hangs
> in my lab on 6.5 (431.5.1). It does not hang on 6.2.

Hi,

I've completed testing of a few kernels and can confirm that the issue is not related to libhugetlbfs. I didn't see the issue with kernel -279 (rhel-6.3), but it appeared on the same system with kernel -358 (rhel-6.4). I was even able to reproduce it on a recent vanilla 3.15.0-rc1.

Testing note: I've seen the issue only on machines with > ~100G of memory spread among 2+ nodes.

So reassigning to the kernel component.

6.5 2.6.32-431.11.2.el6.x86_64

Hangs here:

    numactl -m 0-1 echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy

Where are we blocked:

    34.55%  [kernel]  [k] compact_zone
    19.99%  [kernel]  [k] get_pageblock_flags_group
    13.44%  [kernel]  [k] _spin_lock_irqsave
    13.11%  [kernel]  [k] native_write_msr_safe
     8.21%  [kernel]  [k] compact_checklock_irqsave
     0.81%  [kernel]  [k] _spin_unlock_irqrestore
     0.44%  [kernel]  [k] __reset_isolation_suitable
     0.32%  [kernel]  [k] tick_nohz_stop_sched_tick

--------------------------------------------------------------------------
6.3 kernel-2.6.32-279.el6.x86_64

This command returns to the prompt:

    # numactl -m 0-1 echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy
    # sysctl -a | grep huge
    vm.nr_hugepages = 95993
    vm.nr_hugepages_mempolicy = 95993

--------------------------------------------------------------------------
6.4 kernel kernel-2.6.32-358.el6.x86_64

    # sysctl -a | grep huge
    vm.nr_hugepages = 32768
    vm.nr_hugepages_mempolicy = 32768

    numactl -m 0-1 echo 102400 > /proc/sys/vm/nr_hugepages_mempolicy

Hangs here:

    59.35%  [kernel]  [k] _spin_lock_irqsave
    10.97%  [kernel]  [k] native_write_msr_safe
     8.79%  [kernel]  [k] compact_zone
     4.40%  [kernel]  [k] get_pageblock_flags_group
     3.98%  [kernel]  [k] smp_call_function_many
     1.97%  [kernel]  [k] compact_checklock_irqsave
     0.83%  [kernel]  [k] tick_nohz_stop_sched_tick

Forced a crash and will find a place to provide the vmcore after analyzing.

Thanks
Loberman

With the 6.4 kernel I do see the mempolicy hugepages allocated, but we never return from the numactl command. I forced another crash and can make the vmcore available.

Thanks
loberman

Hello,

Please can I have an update? I will also go find Larry Woodman.

Hi Larry,

Any updates to share with the customer yet? Thanks again for helping here.

loberman

Patch(es) available on kernel-2.6.32-542.el6

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1272.html
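
Note for anyone reproducing this: the hang shows up when the huge page request is bound to NUMA nodes that cannot hold it. Below is a minimal sketch of a pre-check that estimates an upper bound on what a set of nodes could hold before writing to nr_hugepages_mempolicy. It assumes 2 MiB huge pages (the usual x86_64 default, not stated in this report), and the per-node free-memory figures are hypothetical. The function names are made up for illustration; the parser expects the "Node N MemFree: X kB" line shape found in /sys/devices/system/node/nodeN/meminfo. Free memory is only an upper bound, since fragmentation can make the real limit lower.

```python
# Rough capacity check: could a huge page request fit in the NUMA nodes
# it is bound to? Illustrative only; not taken from the fix for this bug.

HUGEPAGE_KIB = 2048  # ASSUMPTION: 2 MiB huge pages (x86_64 default)


def parse_memfree_kib(meminfo_text):
    """Extract MemFree (in KiB) from the text of a per-node meminfo file.

    Expected line shape: "Node 0 MemFree:   12345 kB".
    """
    for line in meminfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "MemFree:":
            return int(fields[3])
    raise ValueError("MemFree line not found")


def max_hugepages(free_kib_by_node, nodes):
    """Upper bound on huge pages the given nodes could hold.

    Each node is counted separately because a huge page cannot span nodes.
    """
    return sum(free_kib_by_node[n] // HUGEPAGE_KIB for n in nodes)


def request_fits(requested, free_kib_by_node, nodes):
    """True if the request does not exceed the upper bound for these nodes."""
    return requested <= max_hugepages(free_kib_by_node, nodes)


# HYPOTHETICAL machine: 4 nodes with 60 GiB free on each.
free = {n: 60 * 1024 * 1024 for n in range(4)}

print(max_hugepages(free, [0, 1]))               # 61440
print(request_fits(102400, free, [0, 1]))        # False
print(request_fits(102400, free, [0, 1, 2, 3]))  # True
```

With these made-up figures, the 102400 pages used in the reproduction would not fit on nodes 0-1 alone, which is the kind of request that returned early on the -279 kernel and hung on -358 and -431.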