Bug 756471

Summary: khugepaged slows down systems
Product: Red Hat Enterprise Linux 6
Reporter: Bernd Schubert <bernd.schubert>
Component: kernel
Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 6.1
CC: aarcange, aquini, vscdieter
Target Milestone: rc
Flags: bernd.schubert: needinfo-
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-09-08 11:37:39 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Bernd Schubert 2011-11-23 17:47:40 UTC
Description of problem:

We got a number of customer reports that FhGFS was running extremely slowly and that even stopping our fhgfs daemons did not work. It then turned out that those issues were not related to FhGFS at all; instead, khugepaged was slowing those systems down massively.

Version-Release number of selected component (if applicable):

2.6.32-131.17.1.el6.x86_64 

How reproducible:

Unknown, as there are many fhgfs client nodes involved and those in turn run different applications.

  
Actual results:

khugepaged running at 100% CPU and slowing down everything.
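
One way to confirm on an affected node that it really is khugepaged pinning a CPU (a minimal sketch; assumes top and pgrep are available, and /proc/<pid>/stack only exists if the kernel was built with stack tracing support):

# show the khugepaged kernel thread and its CPU usage
top -b -n 1 | grep -i khugepaged
# dump the kernel stack of the stuck thread, if /proc/<pid>/stack is available
cat /proc/$(pgrep khugepaged)/stack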


Expected results:

Although our customers report approximately 10% higher IO throughput when transparent huge pages are enabled (and working correctly), the feature should IMHO still be disabled in default setups until it has been stabilized.
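
Until then, a possible runtime workaround sketch (the paths are assumptions: RHEL 6 kernels expose the knobs under redhat_transparent_hugepage, mainline kernels under transparent_hugepage):

# pick whichever sysfs directory this kernel provides
THP=/sys/kernel/mm/redhat_transparent_hugepage
[ -d "$THP" ] || THP=/sys/kernel/mm/transparent_hugepage

cat "$THP/enabled"             # current setting, e.g. "[always] madvise never"
echo never > "$THP/enabled"    # stop collapsing pages into huge pages
echo never > "$THP/defrag"     # stop direct compaction on page faults
# Alternatively, booting with transparent_hugepage=never disables the feature entirely.

With "enabled" set to never, khugepaged should go idle, so this also serves as a quick test of whether it is the culprit.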

Additional info:

kernel version: 2.6.32-131.17.1.el6.x86_64

Hardware
--------
* Motherboard ASUS KGMH-D16/QDR
* CPU AMD Opteron(tm) Processor 6132 HE
* 32 GB RAM (8x4 GB DDR3)


Nov 23 14:23:34 r03n32 kernel: BUG: soft lockup - CPU#1 stuck for 67s! [khugepaged:183]
Nov 23 14:23:34 r03n32 kernel: Modules linked in: fhgfs(P)(U) fhgfs_client_opentk(U) nfs lockd fscache nfs_acl auth_rpcgss sunrpc rdma_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6 ib_uverbs(U) ib_umad(U) iw_nes(U) libcrc32c iw_cxgb3(U) cxgb3(U) ib_qib(U) dca mlx4_ib(U) mlx4_en(U) mlx4_core(U) ib_mthca(U) ib_mad(U) ib_core(U) dm_mirror dm_region_hash dm_log e1000e serio_raw ghes hed k10temp hwmon amd64_edac_mod edac_core edac_mce_amd i2c_piix4 i2c_core sg shpchp ext3 jbd mbcache sd_mod crc_t10dif pata_acpi ata_generic pata_atiixp ahci dm_mod [last unloaded: scsi_wait_scan]
Nov 23 14:23:34 r03n32 kernel: Call Trace:
Nov 23 14:23:34 r03n32 kernel: [<ffffffff81046728>] ? flush_tlb_others_ipi+0x128/0x130
Nov 23 14:23:34 r03n32 kernel: [<ffffffff810467a6>] ? native_flush_tlb_others+0x76/0x90
Nov 23 14:23:34 r03n32 kernel: [<ffffffff8104697c>] ? flush_tlb_mm+0x5c/0xa0
Nov 23 14:23:34 r03n32 kernel: [<ffffffff8116b1a7>] ? khugepaged+0xb87/0x1210
Nov 23 14:23:34 r03n32 kernel: [<ffffffff8106067b>] ? dequeue_task_fair+0x8b/0x90
Nov 23 14:23:34 r03n32 kernel: [<ffffffff8108e180>] ? autoremove_wake_function+0x0/0x40
Nov 23 14:23:34 r03n32 kernel: [<ffffffff8116a620>] ? khugepaged+0x0/0x1210
Nov 23 14:23:34 r03n32 kernel: [<ffffffff8108de16>] ? kthread+0x96/0xa0
Nov 23 14:23:34 r03n32 kernel: [<ffffffff8100c1ca>] ? child_rip+0xa/0x20
Nov 23 14:23:34 r03n32 kernel: [<ffffffff8108dd80>] ? kthread+0x0/0xa0
Nov 23 14:23:34 r03n32 kernel: [<ffffffff8100c1c0>] ? child_rip+0x0/0x20

Nov 23 14:24:58 r03n32 kernel: BUG: soft lockup - CPU#1 stuck for 67s! [khugepaged:183]
Nov 23 14:24:58 r03n32 kernel: Modules linked in: fhgfs(P)(U) fhgfs_client_opentk(U) nfs lockd fscache nfs_acl auth_rpcgss sunrpc rdma_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6 ib_uverbs(U) ib_umad(U) iw_nes(U) libcrc32c iw_cxgb3(U) cxgb3(U) ib_qib(U) dca mlx4_ib(U) mlx4_en(U) mlx4_core(U) ib_mthca(U) ib_mad(U) ib_core(U) dm_mirror dm_region_hash dm_log e1000e serio_raw ghes hed k10temp hwmon amd64_edac_mod edac_core edac_mce_amd i2c_piix4 i2c_core sg shpchp ext3 jbd mbcache sd_mod crc_t10dif pata_acpi ata_generic pata_atiixp ahci dm_mod [last unloaded: scsi_wait_scan]
Nov 23 14:24:58 r03n32 kernel: Call Trace:
Nov 23 14:24:58 r03n32 kernel: [<ffffffff810467a6>] ? native_flush_tlb_others+0x76/0x90
Nov 23 14:24:58 r03n32 kernel: [<ffffffff8104697c>] ? flush_tlb_mm+0x5c/0xa0
Nov 23 14:24:58 r03n32 kernel: [<ffffffff8116b1a7>] ? khugepaged+0xb87/0x1210
Nov 23 14:24:58 r03n32 kernel: [<ffffffff8106067b>] ? dequeue_task_fair+0x8b/0x90
Nov 23 14:24:58 r03n32 kernel: [<ffffffff8108e180>] ? autoremove_wake_function+0x0/0x40
Nov 23 14:24:58 r03n32 kernel: [<ffffffff8116a620>] ? khugepaged+0x0/0x1210
Nov 23 14:24:58 r03n32 kernel: [<ffffffff8108de16>] ? kthread+0x96/0xa0
Nov 23 14:24:58 r03n32 kernel: [<ffffffff8100c1ca>] ? child_rip+0xa/0x20
Nov 23 14:24:58 r03n32 kernel: [<ffffffff8108dd80>] ? kthread+0x0/0xa0
Nov 23 14:24:58 r03n32 kernel: [<ffffffff8100c1c0>] ? child_rip+0x0/0x20

Comment 2 Andrea Arcangeli 2011-11-25 13:57:05 UTC
It looks like the TLB flush couldn't be delivered; this looks quite unrelated to khugepaged. Could you verify that the NMI watchdog is enabled, by checking that the NMI counts in `grep NMI /proc/interrupts` are increasing?
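
For example, a minimal check (the nmi_watchdog sysctl may not be present on every build; the per-CPU NMI counters in /proc/interrupts are the part that matters):

cat /proc/sys/kernel/nmi_watchdog   # 1 means the NMI watchdog is enabled, if this file exists
# sample the per-CPU NMI counts twice; under load they should keep climbing
grep NMI /proc/interrupts; sleep 10; grep NMI /proc/interrupts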

If something is keeping irqs disabled, I guess similar issues would happen if the system were swapping and an IPI had to be delivered for other reasons. Did you try a swapping workload? Does that hang or not?

Can we check the source of fhgfs for IPI delivery or paths that keep irqs disabled?

The only bug that could lead to high khugepaged utilization was a compaction bug that was fixed in kernel-2.6.32-169.el6, but I don't see compaction in the above stack traces. It may still be worth trying a more recent RHEL 6.2 kernel just in case it's related to that, though it doesn't look like it.

Comment 3 Dieter Kvasnicka 2011-11-28 09:43:15 UTC
The output of 'grep NMI /proc/interrupts' on r03n32:
  
NMI:       2816       2812       2811       2811       2811       2811       2810       2810       2811       2812       2810       2810       2811       2811       2810       2810   Non-maskable interrupts

and climbing when there is load on the compute node.

There is no swap space defined on r03n32.

Comment 5 RHEL Program Management 2012-05-03 05:28:19 UTC
Since the RHEL 6.3 External Beta has begun and this bug remains
unresolved, it has been rejected, as it has not been proposed as an
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 6 Jes Sorensen 2012-12-06 14:32:08 UTC
Dieter,

Is this problem still an issue for you, or did it get resolved by upgrading to
a more recent kernel?

Thanks,
Jes

Comment 7 Dieter Kvasnicka 2012-12-06 15:36:57 UTC
Jes,

the problem is still there, although it does not occur every day.

We have not yet upgraded to another kernel version. Which version would you recommend?

Best regards

Dieter

Comment 8 Bernd Schubert 2014-09-08 11:17:49 UTC
I think this bugzilla can be closed. I can't provide the required information, as I simply don't have it.
I would close it myself, but so far the system does not allow me to do so.