| Summary: | khugepaged slows down systems | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Bernd Schubert <bernd.schubert> |
| Component: | kernel | Assignee: | Red Hat Kernel Manager <kernel-mgr> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.1 | CC: | aarcange, aquini, vscdieter |
| Target Milestone: | rc | Flags: | bernd.schubert: needinfo- |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-09-08 11:37:39 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
Bernd Schubert
2011-11-23 17:47:40 UTC
It looks like the TLB flush couldn't be delivered; this looks quite unrelated to khugepaged. Could you verify that the NMI watchdog is enabled by checking that the counts shown by `grep NMI /proc/interrupts` are increasing? If something is keeping irqs disabled, I guess similar issues would happen if the system was swapping and an IPI had to be delivered for other reasons. Did you try some swapping workload, and does that hang or not? Can we check the source of ghgfs to search for IPI delivery or paths that keep irqs disabled?

The only bug that could lead to high khugepaged utilization was a compaction bug that has been fixed in kernel-2.6.32-169.el6, but I don't see compaction in the above stack traces. It may still be worth trying a more recent RHEL6.2 kernel just in case it's related to that, but it doesn't look like it.

The output of `grep NMI /proc/interrupts` on r03n32:

NMI: 2816 2812 2811 2811 2811 2811 2810 2810 2811 2812 2810 2810 2811 2811 2810 2810 Non-maskable interrupts

and climbing when there is load on the compute node. There is no swap space defined on r03n32.

Since RHEL 6.3 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

Dieter,

Is this problem still an issue for you, or did it get resolved by upgrading to a more recent kernel?

Thanks,
Jes

Jes,

the problem is still there, although it does not occur every day. We have not yet upgraded to another kernel version. Which version would you recommend?

Best regards
Dieter

I think this bugzilla can be closed. I can't provide the required information, as I simply don't have it. I would close it myself, but the system so far does not allow me to do so. |
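
The NMI watchdog check requested in the comments above (run `grep NMI /proc/interrupts` twice and see whether the counts climb) could be scripted roughly as follows. This is only a sketch, not something from the original thread: the function names `parse_nmi_counts`, `nmi_deltas` and `sample_nmi_deltas` are made up for illustration, and it assumes the `NMI:` line has the layout shown in the output above (one counter per CPU followed by a text description).

```python
import time
from typing import List


def parse_nmi_counts(interrupts_text: str) -> List[int]:
    """Return the per-CPU NMI counters found in /proc/interrupts text.

    Assumes the line layout quoted in this report:
    'NMI: <count cpu0> <count cpu1> ... Non-maskable interrupts'.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the trailing description text
                counts.append(int(field))
            return counts
    raise ValueError("no NMI line found")


def nmi_deltas(before: List[int], after: List[int]) -> List[int]:
    """Per-CPU increase between two samples of the NMI counters."""
    if len(before) != len(after):
        raise ValueError("CPU count changed between samples")
    return [b - a for a, b in zip(before, after)]


def sample_nmi_deltas(interval_s: float = 10.0,
                      path: str = "/proc/interrupts") -> List[int]:
    """Read the counters twice, interval_s seconds apart (hypothetical helper)."""
    with open(path) as f:
        before = parse_nmi_counts(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_nmi_counts(f.read())
    return nmi_deltas(before, after)
```

Calling `sample_nmi_deltas()` on the affected node while it is under load and getting nonzero deltas would match the "climbing" behaviour reported for r03n32. Deltas that stay at zero on a loaded node would suggest the watchdog NMIs are not being delivered.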