Bug 735946
Summary: | khugepaged stalls system | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Slawomir Czarko <slawomir.czarko> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 15 | CC: | andyrojas, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | kernel-2.6.41.1-1.fc15 | Doc Type: | Bug Fix |
Last Closed: | 2011-11-17 23:28:55 UTC | Type: | --- |
Description
Slawomir Czarko
2011-09-06 09:00:33 UTC
Unfortunately, after executing:

    echo madvise > /sys/kernel/mm/transparent_hugepage/defrag

I get OOM errors and applications being killed even though the system has lots of free RAM. OOM errors get triggered by rsync on a large directory, for example.

This patch fixed the problem for me: https://lkml.org/lkml/2011/7/26/103

(In reply to comment #2)
> This patch fixed the problem for me:
>
> https://lkml.org/lkml/2011/7/26/103

Thanks for the URL. That thread seems to have stalled with no alternative solution. I've contacted the two developers involved and hopefully we'll get some kind of resolution.

A new set of patches has been posted for this issue. You can find the thread here: http://thread.gmane.org/gmane.linux.kernel/1200542

Would it be possible for you to test those two patches and let us know if they resolve the issues you were seeing?

I'm building a kernel with these patches now. Will let you know in a few days if these patches work.

The new patches work. Btw, I was unable to reproduce the problem with VMware Player 4.0 and an unpatched kernel. I can reproduce it with VMware Player 3.1.4.

I've added these patches to the f15 kernel (as well as rawhide/f16). They will be in the next build.

kernel-2.6.40.7-0.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.40.7-0.fc15

Package kernel-2.6.40.7-0.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-2.6.40.7-0.fc15'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2011-14513
then log in and leave karma (feedback).

kernel-2.6.40.7-3.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.40.7-3.fc15

I'm running a kernel patched with the patches from comment #4, and today I was getting some stalls again when working with a Windows VM. It's not as bad as before but noticeable.

Output of cat /proc/`pgrep khugepaged`/io shows:

rchar: 0
wchar: 0
syscr: 0
syscw: 0
read_bytes: 0
write_bytes: 8192
cancelled_write_bytes: 0

(this command was mentioned here: https://lkml.org/lkml/2011/9/20/261)

kernel-2.6.40.8-2.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.40.8-2.fc15

kernel-2.6.40.8-4.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.40.8-4.fc15

Kernel 2.6.40.6-0.fc15.i686 on Toshiba NB305

After around 1-1/2 hours of leaving the laptop untouched, but with at least one application open (e.g., Firefox, LibreOffice), the computer becomes unresponsive (mouse, keyboard, black screen when left on X desktop).

Last time it happened I left it on tty2, and once the hour or so had passed, everything was unresponsive, but this time the screen was displaying repeated bug messages. The message that kept being displayed again and again was something like:

BUG: soft lockup CPU #1 khugepaged: 28

Sorry for not being able to be more specific.

(In reply to comment #15)
Update to kernel-2.6.40.8-4.fc15 seems to have fixed the BUG.

(In reply to comment #16)
> (In reply to comment #15)
> Update to kernel-2.6.40.8-4.fc15 seems to have fixed the BUG.

It didn't. It made it worse. Now it just stalls after a few minutes. I can't even switch to TTY2 to try to see if there are any debugging messages.

kernel-2.6.41.1-1.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.41.1-1.fc15

kernel-2.6.41.1-1.fc15 has been pushed to the Fedora 15 stable repository. If problems still persist, please make note of it in this bug report.
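
For anyone repeating the two checks used in the comments above (the transparent_hugepage defrag setting and the khugepaged I/O counters), here is a minimal Python sketch. It is not part of the original thread. The function names are hypothetical, and it assumes the sysfs file marks the active value with square brackets (for example "always [madvise] never") and that /proc/<pid>/io uses the "name: value" layout shown in the comment. Reading the io file of a kernel thread usually needs root.

```python
import os

THP_DEFRAG = "/sys/kernel/mm/transparent_hugepage/defrag"


def parse_thp_setting(text):
    """Return (active, options) for text such as 'always [madvise] never'."""
    active = None
    options = []
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            word = word[1:-1]  # the bracketed word is the active setting
            active = word
        options.append(word)
    return active, options


def parse_proc_io(text):
    """Parse 'name: value' lines from /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[key.strip()] = int(value)
    return counters


def find_pid_by_comm(name):
    """Return the first PID whose /proc/<pid>/comm equals name, else None."""
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join("/proc", entry, "comm")) as f:
                if f.read().strip() == name:
                    return int(entry)
        except OSError:
            continue  # the process exited or is not readable
    return None


if __name__ == "__main__":
    try:
        with open(THP_DEFRAG) as f:
            print("defrag:", parse_thp_setting(f.read())[0])
    except OSError as err:
        print("cannot read defrag setting:", err)

    pid = find_pid_by_comm("khugepaged")
    if pid is None:
        print("khugepaged not found")
    else:
        try:
            with open("/proc/%d/io" % pid) as f:
                print("khugepaged io:", parse_proc_io(f.read()))
        except OSError as err:
            print("cannot read io counters (try as root):", err)
```

The parsing is kept separate from the file access so it can be checked against pasted output, such as the counters quoted in the comment above, without a live system.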