Bug 525898

Summary: soft lockups with kswapd in RHEL 5.4 kernel 2.6.18-164.el5 x86_64
Product: Red Hat Enterprise Linux 5
Reporter: David Tauriainen <david.tauriainen>
Component: kernel-xen
Assignee: Andrew Jones <drjones>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: low
Version: 5.4
CC: aloga, amos.shapira, dev_nll, drjones, ian.chard, jzheng, lwoodman, mshao, pasteur, phan, qcai, stephen.vaughan, xen-maint
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Paravirt Xen guests used to allocate all low memory (all memory for 64-bit) to ZONE_DMA, rather than also using ZONE_DMA32 and ZONE_NORMAL. The guest kernels now use all three zones, the same as natively running kernels do.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-07-21 10:24:43 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 514489
Attachments:
  Description                                Flags
  proposed patch                             none
  /var/log/messages (trimmed)                none
  dmesg output from second event - 1Mar10    none
  /proc/zoneinfo -164                        none
  sysrq 'show memory' -164                   none
  /proc/zoneinfo -257                        none
  sysrq 'show memory' -257                   none

Description David Tauriainen 2009-09-26 21:22:53 UTC
Description of problem:
In a RHEL 5.4 system using x86_64 2.6.18-164.el5, kswapd isn't paging anything to swap.  RAM maxes out while swap stays at zero, then kswapd0 starts chewing up CPU on all cores, leaving soft lockup errors in the system log (BUG: soft lockup - CPU#[0/1/2/3/...] stuck for 10s! [kswapd0:PID]), and the load quickly skyrockets, leaving the system unresponsive.

The errors are virtually identical to this CentOS posting: https://www.centos.org/modules/newbb/viewtopic.php?post_id=86319&topic_id=22351

x86_64 2.6.18-128.7.1.el5 (the most recent 5.3 kernel) seems to work fine.

Version-Release number of selected component (if applicable):
x86_64 2.6.18-164.el5

How reproducible: 100%

Steps to Reproduce:
1. Run fsck or something that eats up RAM.
2. Sit back and watch

Actual results:
System lockup

Expected results:
Normal system functioning

Additional info:
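One hypothetical way to generate comparable memory pressure combined with disk I/O for testing (a sketch only; the sizes and paths below are placeholders, not the workload actually used here):

  # fill the page cache with sustained writes and reads
  dd if=/dev/zero of=/var/tmp/bigfile bs=1M count=8192
  dd if=/var/tmp/bigfile of=/dev/null bs=1M &
  # allocate a few GB of anonymous memory alongside the I/O
  perl -e '$x = "x" x (3 * 1024 * 1024 * 1024); sleep 600' &
  # watch free memory, swap activity and kswapd CPU usage
  vmstat 1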

Comment 1 Cong Wang 2009-10-20 06:10:12 UTC
Could you please provide your full kernel log?

And I can't reproduce it with fsck. I believe this is a memory + I/O related problem; running a memory-consuming program without I/O should not trigger it.

Can you reproduce it on more than one machine, or just on one?

Thanks.

Comment 2 Cong Wang 2009-10-22 02:12:07 UTC
I also tried ltp test cases, still can't reproduce it.

And do you have other way to reproduce it? (The simpler, the better.) So that I can try.

Thanks.

Comment 3 Cong Wang 2009-10-22 08:31:38 UTC
Created attachment 365673 [details]
proposed patch

Proposed patch to fix this, from upstream commit 73ce02e9.

Comment 4 Cong Wang 2009-10-22 08:33:09 UTC
Hi,

Could you try my proposed patch in the attachment? Check if it fixes the problem.

Thanks.

Comment 5 Cong Wang 2010-01-08 09:03:02 UTC
(In reply to comment #4)
> Hi,
> 
> Could you try my proposed patch in the attachment? Check if it fixes the
> problem.

Any reply?? :)

Comment 6 Rob Moser 2010-02-24 17:13:37 UTC
Hi Amerigo

We had a problem come up yesterday with exactly these symptoms.  The machine is running the Xen modification of RH:

cat /proc/version
Linux version 2.6.18-164.11.1.el5xen (mockbuild.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Wed Jan 6 13:43:33 EST 2010

Yesterday, in the midst of transferring a large amount of data from NFS mounted drives to local drives, kswapd0 suddenly shot up to 100% CPU and stayed there, and the machine became extremely unresponsive.  Looking at some other, older bugtraq issues with similar symptoms, it sounded like a sysrq memory report would be useful, and I managed to eventually get this out of it:

SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: high 186, batch 31 used:24
cpu 0 cold: high 62, batch 15 used:10
cpu 1 hot: high 186, batch 31 used:119
cpu 1 cold: high 62, batch 15 used:60
cpu 2 hot: high 186, batch 31 used:133
cpu 2 cold: high 62, batch 15 used:47
cpu 3 hot: high 186, batch 31 used:172
cpu 3 cold: high 62, batch 15 used:50
cpu 4 hot: high 186, batch 31 used:148
cpu 4 cold: high 62, batch 15 used:52
cpu 5 hot: high 186, batch 31 used:126
cpu 5 cold: high 62, batch 15 used:49
DMA32 per-cpu: empty
Normal per-cpu: empty
HighMem per-cpu: empty
Free pages:       14148kB (0kB HighMem)
Active:447657 inactive:735078 dirty:106239 writeback:0 unstable:26
free:3537 slab:309765 mapped-file:19268 mapped-anon:261372 pagetables:28641
DMA free:14148kB min:10032kB low:12540kB high:15048kB active:1790628kB
inactive:2940312kB present:6291456kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 2111*4kB 99*8kB 15*16kB 8*32kB 7*64kB 1*128kB 1*256kB 1*512kB
1*1024kB 1*2048kB 0*4096kB = 14148kB
DMA32: empty
Normal: empty
HighMem: empty
921271 pagecache pages
Swap cache: add 153, delete 153, find 55/72, race 0+0
Free swap  = 6144756kB
Total swap = 6144852kB
Free swap:       6144756kB
1572864 pages of RAM
33682 reserved pages
1391020 pages shared
0 pages swap cached

dmesg shows this about every 10 seconds or so:

> BUG: soft lockup - CPU#1 stuck for 10s! [kswapd0:157]
> CPU 1:
> Modules linked in: nls_utf8 hfsplus ip_conntrack_netbios_ns xt_state
> ip_conntrack nfnetlink iptable_filter ip_tables ipv6 xfrm_nalgo
> crypto_api autofs4 hidp l2cap bluetooth nfs fscache nfs_acl lockd sunrpc
> ipt_recent ipt_LOG ipt_REJECT xt_tcpudp x_tables dm_mirror dm_multipath
> scsi_dh scsi_mod parport_pc lp parport xennet pcspkr dm_raid45
> dm_message dm_region_hash dm_log dm_mod dm_mem_cache xenblk ext3 jbd
> uhci_hcd ohci_hcd ehci_hcd
> Pid: 157, comm: kswapd0 Not tainted 2.6.18-164.11.1.el5xen #1
> RIP: e030:[<ffffffff88055b8d>]  [<ffffffff88055b8d>]
> :ext3:ext3_journal_start_sb+0x0/0x46
> RSP: e02b:ffff88017faebca8  EFLAGS: 00000202
> RAX: 0000000000080000 RBX: ffff880108becd80 RCX: ffff880083432408
> RDX: ffff88017c6f0c00 RSI: 0000000000000002 RDI: ffff88017c6f0c00
> RBP: 0000000000000000 R08: ffffffff804f8e00 R09: ffff880127853b80
> R10: ffff8800dc28c740 R11: ffffffff88282ef9 R12: ffff880108becd80
> R13: ffff88017faebd60 R14: 0000000000000080 R15: 0000000000000300
> FS:  00002b846195eb40(0000) GS:ffffffff805ca080(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000
> 
> Call Trace:
>  [<ffffffff802f1fbd>] dqput+0x81/0x19f
>  [<ffffffff802f265d>] dquot_drop+0x30/0x5e
>  [<ffffffff88057f55>] :ext3:ext3_dquot_drop+0x45/0x6b
>  [<ffffffff802233ff>] clear_inode+0xb4/0x123
>  [<ffffffff802362df>] dispose_list+0x41/0xe0
>  [<ffffffff8022e04e>] shrink_icache_memory+0x1b7/0x1e6
>  [<ffffffff802410c0>] shrink_slab+0xdc/0x154
>  [<ffffffff8025a1da>] kswapd+0x337/0x460
>  [<ffffffff8026ef47>] monotonic_clock+0x35/0x7b
>  [<ffffffff8029bd72>] autoremove_wake_function+0x0/0x2e
>  [<ffffffff8029bb5a>] keventd_create_kthread+0x0/0xc4
>  [<ffffffff80259ea3>] kswapd+0x0/0x460
>  [<ffffffff8029bb5a>] keventd_create_kthread+0x0/0xc4
>  [<ffffffff80233b8f>] kthread+0xfe/0x132
>  [<ffffffff80260b2c>] child_rip+0xa/0x12
>  [<ffffffff8029bb5a>] keventd_create_kthread+0x0/0xc4
>  [<ffffffff80233a91>] kthread+0x0/0x132
>  [<ffffffff80260b22>] child_rip+0x0/0x12

I got a vmstat to run for a bit, and there didn't seem to be any significant amount of actual swapping going on.  We rebooted the machine and it seems to be running normally now.

I'm sorry if this is somewhat garbled or missing critical info; this is my first time posting a bug here, and I couldn't find any guidelines about what you might need.  If there's any other information that I might be able to collect from the logs, please let me know.  If you think that my problem is unrelated to this bug, just let me know and I'll repost it separately.  (Unfortunately, as it's a live production system, I can't see anyone letting me test your kernel patch on it.)

Thanks

     - rob.

Comment 7 Cong Wang 2010-02-25 06:01:52 UTC
(In reply to comment #6)
> Hi Amerigo
> 
> We had a problem come up yesterday with exactly these symptoms.  The machine is
> running the Xen modification of RH:
> 
> cat /proc/version
> Linux version 2.6.18-164.11.1.el5xen (mockbuild.redhat.com)
> (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Wed Jan 6 13:43:33 EST
> 2010
> 
> Yesterday, in the midst of transferring a large amount of data from NFS mounted
> drives to local drives, kswapd0 suddenly shot up to 100% CPU and stayed there,
> and the machine became extremely unresponsive.  Looking at some other, older
> bugtraq issues with similar symptoms, it sounded like a sysrq memory report
> would be useful, and I managed to eventually get this out of it:


It would be useful if you could provide your steps of hitting this.


> 
> SysRq : Show Memory
...
> Active:447657 inactive:735078 dirty:106239 writeback:0 unstable:26
> free:3537 slab:309765 mapped-file:19268 mapped-anon:261372 pagetables:28641
> DMA free:14148kB min:10032kB low:12540kB high:15048kB active:1790628kB
> inactive:2940312kB present:6291456kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 2111*4kB 99*8kB 15*16kB 8*32kB 7*64kB 1*128kB 1*256kB 1*512kB
> 1*1024kB 1*2048kB 0*4096kB = 14148kB
> DMA32: empty
> Normal: empty
> HighMem: empty


Oh, there are only free pages in DMA zone... Bad.

> 921271 pagecache pages
> Swap cache: add 153, delete 153, find 55/72, race 0+0
> Free swap  = 6144756kB
> Total swap = 6144852kB
> Free swap:       6144756kB
> 1572864 pages of RAM
> 33682 reserved pages
> 1391020 pages shared
> 0 pages swap cached


But you have enough free swap; kswapd didn't swap as expected.

> 
> dmesg shows this about every 10 seconds or so:
> 
> > BUG: soft lockup - CPU#1 stuck for 10s! [kswapd0:157]
> > CPU 1:
> > Modules linked in: nls_utf8 hfsplus ip_conntrack_netbios_ns xt_state
> > ip_conntrack nfnetlink iptable_filter ip_tables ipv6 xfrm_nalgo
> > crypto_api autofs4 hidp l2cap bluetooth nfs fscache nfs_acl lockd sunrpc
> > ipt_recent ipt_LOG ipt_REJECT xt_tcpudp x_tables dm_mirror dm_multipath
> > scsi_dh scsi_mod parport_pc lp parport xennet pcspkr dm_raid45
> > dm_message dm_region_hash dm_log dm_mod dm_mem_cache xenblk ext3 jbd
> > uhci_hcd ohci_hcd ehci_hcd
> > Pid: 157, comm: kswapd0 Not tainted 2.6.18-164.11.1.el5xen #1
> > RIP: e030:[<ffffffff88055b8d>]  [<ffffffff88055b8d>]
> > :ext3:ext3_journal_start_sb+0x0/0x46
> > RSP: e02b:ffff88017faebca8  EFLAGS: 00000202
> > RAX: 0000000000080000 RBX: ffff880108becd80 RCX: ffff880083432408
> > RDX: ffff88017c6f0c00 RSI: 0000000000000002 RDI: ffff88017c6f0c00
> > RBP: 0000000000000000 R08: ffffffff804f8e00 R09: ffff880127853b80
> > R10: ffff8800dc28c740 R11: ffffffff88282ef9 R12: ffff880108becd80
> > R13: ffff88017faebd60 R14: 0000000000000080 R15: 0000000000000300
> > FS:  00002b846195eb40(0000) GS:ffffffff805ca080(0000) knlGS:0000000000000000
> > CS:  e033 DS: 0000 ES: 0000
> > 
> > Call Trace:
> >  [<ffffffff802f1fbd>] dqput+0x81/0x19f
> >  [<ffffffff802f265d>] dquot_drop+0x30/0x5e
> >  [<ffffffff88057f55>] :ext3:ext3_dquot_drop+0x45/0x6b
> >  [<ffffffff802233ff>] clear_inode+0xb4/0x123
> >  [<ffffffff802362df>] dispose_list+0x41/0xe0
> >  [<ffffffff8022e04e>] shrink_icache_memory+0x1b7/0x1e6
> >  [<ffffffff802410c0>] shrink_slab+0xdc/0x154
> >  [<ffffffff8025a1da>] kswapd+0x337/0x460
> >  [<ffffffff8026ef47>] monotonic_clock+0x35/0x7b
> >  [<ffffffff8029bd72>] autoremove_wake_function+0x0/0x2e
> >  [<ffffffff8029bb5a>] keventd_create_kthread+0x0/0xc4
> >  [<ffffffff80259ea3>] kswapd+0x0/0x460
> >  [<ffffffff8029bb5a>] keventd_create_kthread+0x0/0xc4
> >  [<ffffffff80233b8f>] kthread+0xfe/0x132
> >  [<ffffffff80260b2c>] child_rip+0xa/0x12
> >  [<ffffffff8029bb5a>] keventd_create_kthread+0x0/0xc4
> >  [<ffffffff80233a91>] kthread+0x0/0x132
> >  [<ffffffff80260b22>] child_rip+0x0/0x12
> 

Have you enabled quota on your swap partition?

> 
> If there's any other information that I might be able to collect from
> the logs, please let me know.  If you think that my problem is unrelated to
> this bug, just let me know and I'll repost it separately.  (Unfortunately, as
> its a live production system, I can't see anyone letting me test your kernel
> patch on it.)
> 

I am not sure, perhaps it's related.

It would be better if you could provide your dmesg output, in case some
other errors appeared before this soft lockup happened.

Thanks.

Comment 8 Cong Wang 2010-02-25 06:07:48 UTC
Larry, any ideas about what happened in comment #6?
Please also check the attached patch in this BZ.

Thanks!

Comment 9 Larry Woodman 2010-02-25 13:21:48 UTC
I'm guessing every CPU is stuck on the zone->lru_lock 
inside shrink_zone() after calling try_to_free_pages()
out of __alloc_pages()???  An AltSysrq-W output when 
the system is at 100% system time will tell us for sure.

Did that patch help in comment 3???

Larry

Comment 10 Rob Moser 2010-02-25 23:17:28 UTC
(In reply to comment #7)

> It would be useful if you could provide your steps of hitting this.

Not sure if you mean how did the system end up hung, or how did I generate the memory report; I'll give you both.

1) Moving user home directories from an old solaris box to a new Xen virtual RH instance.  Mounted the home directories from the solaris box onto the RH via NFS, and used a little Perl script running on the new machine to do a bit of housekeeping and then copy the home directories from the NFS mounted drives to a local drive via cpio.  So lots of NFS reads and local writes.

2) echo m > /proc/sysrq-trigger
dmesg
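
Roughly the kind of transfer described in step 1 above (host names and paths here are placeholders, not the actual ones):

  mount -t nfs oldsolaris:/export/home /mnt/oldhome
  cd /mnt/oldhome
  # pass-through copy preserving directories and mtimes
  find . -depth -print | cpio -pdm /home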

> Have you enabled quota on your swap partition?

No.

> I am not sure, perhaps it's related.
> 
> It's better if you could provide your dmesg output, in case that there
> are some other errors may appear before this soft lockup happened.

Yeah, sorry about that; the machine had cycled by the time I thought to get the info, and apparently the kernel ring buffer had been overwritten, so the dmesg output of the error occurring was lost.  The quoted text was taken directly from the console and emailed to me, but as you say, it is incomplete.

If it happens again, I'll get the full message and an AltSysrq-w while the error is occurring - I assume I do that in the same way as the AltSysrq-m?

Comment 11 Cong Wang 2010-02-26 01:39:28 UTC
(In reply to comment #9)
> I'm guessing every CPU is stuck on the zone->lru_lock 
> inside shrink_zone() after calling try_to_free_pages()
> out of __alloc_pages()???  An AltSysrq-W output when 
> the system is at 100% sysem time will tell us for sure.

Yeah.
Rob, is there any chance you could do an AltSysrq-W when it occurs?

> 
> Did that patch help in comment3???
> 

Rob said he doesn't have a chance to test the patch on the victim machine.

Comment 12 Cong Wang 2010-02-26 01:45:18 UTC
(In reply to comment #10)
> (In reply to comment #7)
> 
> > It would be useful if you could provide your steps of hitting this.
> 
> Not sure if you mean how did the system end up hung, or how did I generate the
> memory report; I'll give you both.


Oh, I mean how did you make the system hang. ;)

> 
> 1) Moving user home directories from an old solaris box to a new Xen virtual RH
> instance.  Mounted the home directories from the solaris box onto the RH via
> NFS, and used a little Perl script running on the new machine to do a bit of
> housekeeping and then copy the home directories from the NFS mounted drives to
> a local drive via cpio.  So lots of NFS reads and local writes.
> 

I see, you mean lots of NFS load.
Does the hang happen every time?

 
> > I am not sure, perhaps it's related.
> > 
> > It's better if you could provide your dmesg output, in case that there
> > are some other errors may appear before this soft lockup happened.
> 
> Yeah, sorry about that; the machine had cycled by the time I thought to get the
> info, and apparently the kernel ring buffer had been overwritten, so the dmesg
> output of the error occurring was lost.  The quoted text was taken directly
> from the console and emailed to me, but as you say, it is incomplete.
> 
> If it happens again, I'll get the full message and an AltSysrq-w while the
> error is occurring - I assume I do that in the same way as the AltSysrq-m?    

Yes, exactly. Please do that.

Plus, if the dmesg buffer got overwritten, you may be able to use /var/log/messages.

Thanks for testing!
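
For reference, a minimal capture sequence while the lockup is in progress might look like this (the output file name is just an example):

  echo 1 > /proc/sys/kernel/sysrq        # make sure magic SysRq is enabled
  echo w > /proc/sysrq-trigger           # same mechanism as AltSysrq-W
  echo m > /proc/sysrq-trigger           # same mechanism as AltSysrq-M
  dmesg > /var/tmp/sysrq-capture.txt     # save it before the ring buffer wraps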

Comment 13 Rob Moser 2010-02-26 19:11:14 UTC
Created attachment 396639 [details]
/var/log/messages (trimmed)

I'm attaching /var/log/messages from the time around the lock-up.

Still no repeat of the problem (good for us; not so good for debugging...)

Comment 14 Rob Moser 2010-03-02 15:27:42 UTC
Created attachment 397352 [details]
dmesg output from second event - 1Mar10

To be honest, I think that this might not be the same problem.  But, on the general principle that too much info is better than not enough...

Yesterday we had a superficially similar problem, where the CPU on the machine seemed to be spinning out of control.  However in this case, it appeared to be a series of samba daemons using all the CPU rather than kswapd0, which is why I think it might be an unrelated problem.  Restarting the samba daemon seemed to fix the problem.  On the other hand, this morning the CPU load is creeping up again, and the kswapd0 process, while not spitting out the errors we got before, seems to be using an unreasonable amount of CPU.  We've restarted the server now, and it seems to be running normally again.

I collected two AltSysrq-w and AltSysrq-m reports last night before we restarted samba, and another this morning.  Output attached.

Comment 15 Cong Wang 2010-03-03 06:33:37 UTC
Rob,

Thanks for those info!

May I ask if you could try a non-Xen RHEL kernel, so that we will know whether this is a Xen-specific problem?

In fact, I tried a 2.6.18-164.11.1.el5 kernel (non-Xen) and couldn't reproduce this problem while copying some 2G files via NFS (free memory dropped to only about 800M).

Thank you.

Comment 16 Rob Moser 2010-03-03 16:42:33 UTC
Hi Amerigo,

Thanks for looking into this for us.  Well, I can't try a non-Xen RH on anything like the same machine, because it's a Xen virtual server - without Xen it doesn't really exist!  When I started this report I hadn't realised the now-obvious fact that Citrix had modified the RH kernel in order to make it run under their virtual server architecture - if you can't reproduce it on a non-Xen RH, then I guess I'd better take it up with Citrix.

Comment 17 Cong Wang 2010-03-09 10:10:26 UTC
(In reply to comment #16)
> Hi Amerigo,
> 
> Thanks for looking into this for us.  Well I can't try a non-xen RH on anything
> like the same machine, because its a xen virtual server - without xen it
> doesn't really exist!  When I started this report I hadn't realised the
> now-obvious fact that Citrix had modified the RH kernel in order to make it run
> under their virtual server architecture - if you can't reproduce it on a
> non-xen RH, then I guess I'd better take it up with Citrix.    

No problem. I am not sure at all; I just wanted to rule out that case.
I will check more about it.

Thanks.

Comment 18 Andrew Jones 2010-12-10 08:55:35 UTC
While kswapd probably could be improved to handle cases like this, i.e. it could look down in the DMA zone as a last resort, it's also likely that kswapd isn't the only component making this assumption. Kernel subcomponents likely all believe that if you're down to only having DMA memory left, then you're just in trouble. So the root of the problem is that the xen kernel assumed it could just shove all memory into the DMA zone. This was initially done to avoid any issues with drivers not having enough "DMA memory". They don't really have any anyway, because the real zoning is actually handled by the hypervisor. However, upstream long ago deemed this DMA-only zoning unnecessary, and we can now see that it actually disturbs the rest of the kernel.

I'll backport the upstream patch that changes xen kernels back to the normal zone size setup. We can then do a round of testing to ensure no regressions and that this problem goes away.
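
As a rough check of the zone layout (this is essentially what the /proc/zoneinfo attachments later in this bug show), something like the following should make the difference visible; on an unpatched PV guest all present pages sit in the DMA zone, while a patched kernel also populates DMA32 and Normal:

  # list each zone and how many pages it actually contains
  grep -E 'zone|present' /proc/zoneinfo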

Comment 19 Andrew Jones 2010-12-10 14:24:47 UTC
The original description of this bug doesn't mention Xen. Was this originally seen on a bare-metal kernel, or was it always on Xen? The patch we've posted will only fix things with respect to the problem described in comment 6, i.e. kswapd can't find any memory since it isn't looking in the DMA zone on a Xen system. Bare-metal systems won't have this problem, and thus this patch wouldn't fix them.

Comment 20 Andrew Jones 2010-12-13 21:02:55 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Paravirt Xen guests used to allocate all low memory (all memory for 64-bit) to ZONE_DMA, rather than also using ZONE_DMA32 and ZONE_NORMAL. The guest kernels now use all three zones, the same as natively running kernels do.

Comment 21 RHEL Program Management 2011-02-01 16:53:11 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 27 Jarod Wilson 2011-02-18 22:39:02 UTC
in kernel-2.6.18-244.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 29 Jinxin Zheng 2011-04-27 08:17:29 UTC
I can't reproduce this by copying a large (10G) file over NFS on the -164 kernel-xen, either on dom0 or domU. kswapd0 was always fine.

Have you reproduced it before?

Comment 30 Andrew Jones 2011-04-27 08:49:04 UTC
I never reproduced it. Maybe you could increase /proc/sys/vm/swappiness before trying the NFS transfer? Otherwise, you can certainly see that the patch changes the memory layout by checking the sysrq mem info: before the patch all the memory is in the DMA zone, and afterwards all zones are populated.
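
A sketch of that check (the values are only suggestions), assuming the guest is otherwise idle:

  echo 100 > /proc/sys/vm/swappiness     # bias reclaim toward swapping before the NFS copy
  echo m > /proc/sysrq-trigger           # dump Mem-info, then compare the per-zone numbers
  dmesg | tail -n 100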

Comment 31 Jinxin Zheng 2011-04-27 10:34:13 UTC
(In reply to comment #30)
> I never reproduced it. Maybe you increase /proc/sys/vm/swappiness before trying
> the NFS transfer? Otherwise you can certainly see that the patch changes the
> memory layout by simply checking sysrq mem info to see that before all the
> memory is in DMA zone and after it populates all zones.

Increased dom0's /proc/sys/vm/swappiness from 60 to 100, still cannot reproduce.

Can we just do sanity checking for this?

Comment 32 Andrew Jones 2011-04-27 10:51:28 UTC
(In reply to comment #31)
> Can we just do sanity checking for this?

Fine by me. The new memory layout is used all the time, so it gets plenty of exercise. Maybe the reporter can state whether the kswapd issue has gone away when running with a patched kernel?

Comment 33 Jinxin Zheng 2011-04-28 09:31:08 UTC
OK. We checked the -257 code, confirming that the patch from comment 27 is included. As we cannot reproduce this problem on -164 or -257, setting this 'Verified:SanityOnly'.

Comment 34 Andrew Jones 2011-04-28 13:36:59 UTC
(In reply to comment #33)
> OK. We checked the -257 code, confirming that the patch in comment 27 is
> inside. As we cannot reproduce this problem on -164 or -257, setting this
> 'Verified:SanityOnly'.

Well, as I said in comment 30, you can still make sure the patch works by checking with sysrq, i.e. before the patch all memory is in the DMA zone and afterwards it's distributed throughout. So you don't have to look at the code to check that the patch is there. Reproducing the kswapd problem might be difficult though, so I'm OK with SanityOnly for that particular bug.

Comment 35 Jinxin Zheng 2011-04-29 04:28:14 UTC
Created attachment 495708 [details]
/proc/zoneinfo -164

on -164 kernel-xen:

cat /proc/zoneinfo

Comment 36 Jinxin Zheng 2011-04-29 04:29:37 UTC
Created attachment 495709 [details]
sysrq 'show memory' -164

on -164 kernel-xen:

echo m > /proc/sysrq-trigger

dmesg

Comment 37 Jinxin Zheng 2011-04-29 04:30:25 UTC
Created attachment 495710 [details]
/proc/zoneinfo -257

on -257 kernel-xen:

cat /proc/zoneinfo

Comment 38 Jinxin Zheng 2011-04-29 04:31:19 UTC
Created attachment 495711 [details]
sysrq 'show memory' -257

on -257 kernel-xen:

echo m > /proc/sysrq-trigger
dmesg

Comment 39 Jinxin Zheng 2011-04-29 04:37:31 UTC
Based on the above attachments, I have verified that the patch makes dom0 map available memory into the three zones ZONE_DMA, ZONE_DMA32 and ZONE_NORMAL, compared to -164 where all memory is mapped only into ZONE_DMA.

Comment 40 Stephen 2011-05-17 03:45:30 UTC
So, when is this likely to be included in a future update of RHEL5?

Comment 41 Andrew Jones 2011-05-17 07:32:33 UTC
(In reply to comment #40)
> So, when is this likely to be included in a future update of RHEL5?

It'll be in 5.7.

Comment 42 errata-xmlrpc 2011-07-21 10:24:43 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html