Bug 633825

Summary:

kswapd0 100%

Product:

Red Hat Enterprise Linux 6

Reporter:

Leslie <lphartm>

Component:

kernel

Assignee:

Johannes Weiner <jweiner>

Status:

CLOSED ERRATA

QA Contact:

Caspar Zhang <czhang>

Severity:

urgent

Docs Contact:

Priority:

high

Version:

6.1

CC:

dhoward, esandeen, ian.chard, jweiner, jwest, lwang, lwoodman, mzywusko, qcai, riel

Target Milestone:

Keywords:

ZStream

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

kernel-2.6.32-112.el6

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2011-05-19 12:19:12 UTC

Bug Depends On:

Bug Blocks:

694186

Attachments:

Description	Flags
/var/log/message output	none
/proc/zoneinfo	none
/var/log/messages	none
[patch] mm: skip rebalance on hopeless zone	none
[patch v2] mm: skip rebalance on hopeless zone	none

Description Leslie 2010-09-14 13:46:52 UTC

Description of problem:
Installed RHEL6.0-20100715.2-Server-x86_64-DVD1.iso in VMWare Workstation 7.1.1.
The install went fine, but after the system came up, kswapd0 was running at 100% CPU.

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_zebra-lv_root
                       25G  5.2G   18G  23% /
tmpfs                 1.4G  272K  1.4G   1% /dev/shm
/dev/sda1             485M   40M  420M   9% /boot

I assigned 3 Gigs of memory to the VM.

I haven't had a chance to reload it again and see if it does the same thing.

How reproducible:

Haven't tried yet.


Steps to Reproduce:
1.
2.
3.

Comment 2 Eric Sandeen 2010-09-14 16:28:39 UTC

sysrq-t may give you a backtrace of kswapd to see where it's at.

Comment 3 Leslie 2010-09-15 05:28:13 UTC

(In reply to comment #2)
> sysrq-t may give you a backtrace of kswapd to see where it's at.

what is the key sequence for sysrq-t?

Comment 4 Eric Sandeen 2010-09-15 15:08:29 UTC

Using sysrq is described in the kernel docs, i.e.

/usr/share/doc/kernel-doc-2.6.32/Documentation/sysrq.txt


from the kernel-doc rpm.

It depends on whether you're on the physical console, etc.  Output will go to dmesg and/or /var/log/messages.

Thanks,
-Eric
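
For readers without easy access to the console key combination (Alt+SysRq+t on a PC keyboard), the same task dump can be requested by writing the command character to /proc/sysrq-trigger as root. The following is only a minimal sketch of that idea: the helper name `sysrq_write` is made up for illustration, and the path is a parameter so the function can be tried against an ordinary file first.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the kernel docs): write one SysRq
 * command character to `path`.  On a real system the path would be
 * /proc/sysrq-trigger and the caller would need root privileges.
 * Returns 0 on success, -1 on failure.
 */
int sysrq_write(const char *path, char cmd)
{
    assert(path != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = (fputc(cmd, f) != EOF);
    if (fclose(f) != 0)   /* the write is only flushed at close */
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling `sysrq_write("/proc/sysrq-trigger", 't')` as root should produce the same task list in dmesg that the keyboard sequence does.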

Comment 5 Leslie 2010-09-16 14:14:58 UTC

Created attachment 447756 [details]
/var/log/message output

output from sysrq t

Comment 6 Eric Sandeen 2010-09-16 16:26:21 UTC

So, kswapd is here:

Sep 16 07:05:51 zebra kernel: kswapd0       R  running task        0    34      2 0x00000000
...
Sep 16 07:05:51 zebra kernel: Call Trace:
Sep 16 07:05:51 zebra kernel: [<ffffffff810668ea>] __cond_resched+0x2a/0x40
Sep 16 07:05:51 zebra kernel: [<ffffffff814d8800>] _cond_resched+0x30/0x40
Sep 16 07:05:51 zebra kernel: [<ffffffff81124d05>] balance_pgdat+0x335/0x760
Sep 16 07:05:51 zebra kernel: [<ffffffff811253f0>] ? isolate_pages_global+0x0/0x250
Sep 16 07:05:51 zebra kernel: [<ffffffff8112524e>] kswapd+0x11e/0x2c0
Sep 16 07:05:51 zebra kernel: [<ffffffff81090d50>] ? autoremove_wake_function+0x0/0x40
Sep 16 07:05:51 zebra kernel: [<ffffffff81125130>] ? kswapd+0x0/0x2c0
Sep 16 07:05:51 zebra kernel: [<ffffffff810909e6>] kthread+0x96/0xa0
Sep 16 07:05:51 zebra kernel: [<ffffffff810141ca>] child_rip+0xa/0x20
Sep 16 07:05:51 zebra kernel: [<ffffffff81090950>] ? kthread+0x0/0xa0


If we're in cond_resched, we're here:

out:
        if (!all_zones_ok) {
                cond_resched();
                ...
                goto loop_again;
        }

so it looks like we're never getting out of this function.

I'm no expert here, I'll pass this off to someone who is :)

-Eric
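
To make the failure mode concrete, here is a rough user-space model of the retry loop quoted above. It is not the kernel's balance_pgdat(): `toy_zone`, `toy_balance`, `TOY_BATCH` and the pass cap are all invented for illustration, and reclaim is reduced to moving a fixed batch of pages per pass. The cap stands in for the fact that the real loop has no such limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a memory zone. */
struct toy_zone {
    long free;        /* pages currently free */
    long high_wmark;  /* level the balancer tries to reach */
    long reclaimable; /* pages that reclaim could actually free */
};

#define TOY_BATCH 32  /* pages freed per zone per pass (arbitrary) */

/*
 * Returns the number of passes needed until every zone is at or above
 * its high watermark, or -1 if that is still not the case after
 * max_passes.  The -1 case models "goto loop_again" forever.
 */
int toy_balance(struct toy_zone *zones, size_t n, int max_passes)
{
    assert(zones != NULL && max_passes > 0);

    for (int pass = 1; pass <= max_passes; pass++) {
        int all_zones_ok = 1;

        for (size_t i = 0; i < n; i++) {
            struct toy_zone *z = &zones[i];

            if (z->free >= z->high_wmark)
                continue;

            long got = z->reclaimable < TOY_BATCH ? z->reclaimable
                                                  : TOY_BATCH;
            z->reclaimable -= got;
            z->free += got;

            if (z->free < z->high_wmark)
                all_zones_ok = 0;
        }

        if (all_zones_ok)
            return pass;
        /* kernel equivalent: cond_resched(); goto loop_again; */
    }
    return -1;
}
```

A zone with nothing reclaimable never makes progress in this model, so the loop only ends because of the artificial cap, which matches a kswapd that stays runnable and keeps passing through cond_resched().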

Comment 7 Rik van Riel 2010-09-16 16:37:58 UTC

Leslie,

would you have output of /proc/zoneinfo so we can see if any of the memory zones really are low on memory?

Also, are you running any programs on the system that could be using the memory?

Comment 8 Leslie 2010-09-17 13:22:20 UTC

Created attachment 448008 [details]
/proc/zoneinfo

Comment 9 Leslie 2010-09-17 13:24:17 UTC

(In reply to comment #7)
> Leslie,
> 
> would you have output of /proc/zoneinfo so we can see if any of the memory
> zones really are low on memory?
> 
> Also, are you running any programs on the system that could be using the
> memory?

Rik:

I assigned 3 Gig to the VM and have not run anything but a browser.

Top is reporting over 2 Gigs free.

Mem:   2914984k total,   790280k used,  2124704k free,   183660k buffers
Swap:  5144568k total,        0k used,  5144568k free,   188704k cached

Leslie

Comment 10 Rik van Riel 2010-09-17 13:52:23 UTC

Looks like we found the problem.  This virtual machine has a tiny (24MB) ZONE_NORMAL, which has been pretty much completely filled up with unreclaimable slab pages.

As a consequence, kswapd tries to free pages from this zone, but is not succeeding. 

Node 0, zone   Normal
  pages free     0
        min      131
        low      163
        high     196
        scanned  0
        spanned  6144
        present  6060
    nr_free_pages 0
    nr_inactive_anon 0
    nr_active_anon 0
    nr_inactive_file 0
    nr_active_file 0
    nr_unevictable 0
    nr_mlock     0
    nr_anon_pages 0
    nr_mapped    0
    nr_file_pages 0
    nr_dirty     0
    nr_writeback 0
    nr_slab_reclaimable 0
    nr_slab_unreclaimable 6136

As a test, could you decrease the amount of memory the virtual machine has by 50MB and see if the issue still happens?

Also, could you attach the full output of "dmesg", so we can see the memory layout in the virtual machine?
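
As a worked check of the numbers above, the sketch below converts the page counts to sizes and tests whether a zone is both below its high watermark and out of reclaimable pages. It assumes 4 KiB pages (the x86_64 default); `zone_snapshot`, `pages_to_kib` and `zone_looks_hopeless` are illustrative names, not kernel symbols, and the "hopeless" test is only one plausible reading of the zoneinfo output.

```c
#include <assert.h>

#define PAGE_KIB 4L  /* assumption: 4 KiB pages */

/* Hypothetical subset of the /proc/zoneinfo fields for one zone. */
struct zone_snapshot {
    long free;             /* "pages free" */
    long high;             /* "high" watermark */
    long lru_pages;        /* sum of the nr_{in,}active_{anon,file} counters */
    long slab_reclaimable; /* nr_slab_reclaimable */
};

/* Convert a page count to KiB. */
long pages_to_kib(long pages)
{
    assert(pages >= 0);
    return pages * PAGE_KIB;
}

/*
 * Nonzero when the zone is below its high watermark and has no pages
 * on the LRU lists or in reclaimable slab, i.e. nothing for reclaim
 * to work on.
 */
int zone_looks_hopeless(const struct zone_snapshot *z)
{
    assert(z != 0);
    return z->free < z->high &&
           z->lru_pages + z->slab_reclaimable == 0;
}
```

With the values from the Normal zone (free 0, high 196, all LRU counters 0, nr_slab_reclaimable 0) the check is true, and 6144 spanned pages at 4 KiB each come to 24576 KiB, which is the 24MB mentioned above.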

Comment 11 Leslie 2010-09-17 14:26:43 UTC

Created attachment 448023 [details]
/var/log/messages

Comment 12 Leslie 2010-09-17 14:30:15 UTC

(In reply to comment #10)
> Looks like we found the problem.  This virtual machine has a tiny (24MB)
> ZONE_NORMAL, which has been pretty much completely filled up with unreclaimable
> slab pages.
> 
> As a consequence, kswapd tries to free pages from this zone, but is not
> succeeding. 
> 
> Node 0, zone   Normal
>   pages free     0
>         min      131
>         low      163
>         high     196
>         scanned  0
>         spanned  6144
>         present  6060
>     nr_free_pages 0
>     nr_inactive_anon 0
>     nr_active_anon 0
>     nr_inactive_file 0
>     nr_active_file 0
>     nr_unevictable 0
>     nr_mlock     0
>     nr_anon_pages 0
>     nr_mapped    0
>     nr_file_pages 0
>     nr_dirty     0
>     nr_writeback 0
>     nr_slab_reclaimable 0
>     nr_slab_unreclaimable 6136
> 
> As a test, could you decrease the amount of memory the virtual machine has by
> 50MB and see if the issue still happens?
> 
> Also, could you attach the full output of "dmesg", so we can see the memory
> layout in the virtual machine?

Rik:

Sure enough, I reduced the memory by 50 megs and kswapd stopped running at
100%.

Top:
Tasks: 174 total,   3 running, 171 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.7%us,  2.8%sy,  0.0%ni, 91.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2933932k total,   730124k used,  2203808k free,    36252k buffers
Swap:  5144568k total,        0k used,  5144568k free,   268932k cached

Comment 13 Johannes Weiner 2010-09-22 19:31:51 UTC

Created attachment 449028 [details]
[patch] mm: skip rebalance on hopeless zone

Leslie,

could you test this patch and tell us if it fixes the problem?

Comment 14 Rik van Riel 2010-09-22 19:55:22 UTC

@@ -2320,7 +2338,7 @@ void wakeup_kswapd(struct zone *zone, int order)
 		return;
 
 	pgdat = zone->zone_pgdat;
-	if (zone_watermark_ok(zone, order, low_wmark_pages(zone), 0, 0))
+	if (zone_needs_scan(zone, order, low_wmark_pages(zone), 0))
 		return;
 	if (pgdat->kswapd_max_order < order)
 		pgdat->kswapd_max_order = order;

Johannes, shouldn't the above be "if (!zone_needs_scan(zone, ...." ?
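
The effect of the missing negation can be shown with a small user-space sketch. The patch body is not reproduced in this report, so `toy_needs_scan` below is only a guess at the intent of zone_needs_scan() (below the watermark and something left to reclaim); `wake_zone`, `wakeup_v1` and `wakeup_fixed` are likewise invented names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified zone state. */
struct wake_zone {
    long free;
    long low_wmark;
    long reclaimable;
};

/* Assumed meaning: scanning is worthwhile only if the zone is low
 * on free pages AND reclaim has something to work with. */
static bool toy_needs_scan(const struct wake_zone *z)
{
    assert(z != NULL);
    return z->free < z->low_wmark && z->reclaimable > 0;
}

/* Shape of the first patch: returns early when a scan IS needed. */
bool wakeup_v1(const struct wake_zone *z)
{
    if (toy_needs_scan(z))
        return false;  /* kswapd not woken */
    return true;       /* kswapd woken */
}

/* Shape of the suggested fix: returns early when no scan is needed. */
bool wakeup_fixed(const struct wake_zone *z)
{
    if (!toy_needs_scan(z))
        return false;
    return true;
}
```

In this model the v1 form wakes the reclaim thread for healthy zones and skips exactly the zones that need work, while the negated form wakes it only for zones that are low and still reclaimable.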

Comment 15 Johannes Weiner 2010-09-22 23:57:16 UTC

Created attachment 449070 [details]
[patch v2] mm: skip rebalance on hopeless zone

D'oh, you are right, Rik, thanks for spotting it.  Rebasing manually was a bad idea.  Here is the revision.

Comment 16 Leslie 2010-09-28 05:38:15 UTC

Johannes:

 How do I apply the patch?

 Thanks.

 Leslie Hartman


(In reply to comment #15)
> Created attachment 449070 [details]
> [patch v2] mm: skip rebalance on hopeless zone
> 
> D'oh, you are right, Rik, thanks for spotting it.  Rebasing manually was a bad
> idea.  Here is the revision.

Comment 17 Johannes Weiner 2010-10-06 00:20:59 UTC

(In reply to comment #16)
> Johannes:
> 
>  How do I apply the patch?

I prebuilt a kernel for you; please find it at http://people.redhat.com/~jweiner/bz633825/ .  `rpm -i kernel*.rpm' should install it alongside the old kernel and also set the bootloader to boot this kernel by default.

Comment 18 Leslie 2010-10-27 16:17:02 UTC

I applied the patch and tried a number of different memory settings. It worked correctly every time. I noticed that VMware now uses multiples of 4 in their latest version, so I was not even able to get the exact amount of memory I had selected the first time. Anyway, I think you have resolved the problem. Thank you for your assistance.

Comment 19 RHEL Program Management 2010-12-17 04:20:17 UTC

This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 20 Aristeu Rozanski 2011-02-03 15:19:30 UTC

Patch(es) available on kernel-2.6.32-112.el6

Comment 28 Caspar Zhang 2011-05-01 11:08:01 UTC

Re-tested many times and failed to reproduce the problem. Confirmed the patch is included in 131.0.9.el6. Marking SanityOnly.

Leslie, can you help to verify?

Comment 29 errata-xmlrpc 2011-05-19 12:19:12 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html