Bug 164959 - CRM# 619256 RHEL4 lvm2 memory allocation failures locking up system with snapshots
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: lvm2
Version: 4.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Alasdair Kergon
QA Contact:
URL:
Whiteboard:
Duplicates: 166975 168970 169162
Depends On: 173163 173164 173166
Blocks: 168429
Reported: 2005-08-03 02:59 UTC by Issue Tracker
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: RHBA-2006-0137
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-07 21:33:38 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2006:0099 0 qe-ready SHIPPED_LIVE device-mapper bug fix and enhancement update 2006-03-06 05:00:00 UTC
Red Hat Product Errata RHBA-2006:0137 0 qe-ready SHIPPED_LIVE lvm2 bug fix and enhancement update 2006-03-06 05:00:00 UTC

Comment 6 Alasdair Kergon 2005-09-21 19:54:31 UTC
*** Bug 168970 has been marked as a duplicate of this bug. ***

Comment 7 Alasdair Kergon 2005-09-21 19:59:32 UTC
*** Bug 166975 has been marked as a duplicate of this bug. ***

Comment 8 Alasdair Kergon 2005-09-21 20:08:49 UTC
This is the RHEL4 version of bug 132057

Comment 11 Damian Menscher 2005-09-28 18:17:01 UTC
According to the other bug report, this problem has been "fully understood"
since January?  What's the holdup on getting it fixed?

Logs from a CentOS 4.1 machine:

Sep 28 03:00:03 astro kernel: lvcreate: page allocation failure. order:0, mode:0xd0
Sep 28 03:00:03 astro kernel:  [<c013fa77>] __alloc_pages+0x28b/0x29d
Sep 28 03:00:03 astro kernel:  [<f8884a3b>] alloc_pl+0x27/0x3d [dm_mod]
Sep 28 03:00:03 astro kernel:  [<f8884b16>] client_alloc_pages+0x15/0x47 [dm_mod]
Sep 28 03:00:03 astro kernel:  [<f88854b6>] kcopyd_client_create+0x64/0x9f [dm_mod]
Sep 28 03:00:03 astro kernel:  [<f884b697>] snapshot_ctr+0x231/0x2b8 [dm_snapshot]
Sep 28 03:00:03 astro kernel:  [<f8881185>] dm_table_add_target+0xfc/0x169 [dm_mod]
Sep 28 03:00:03 astro kernel:  [<f888320c>] populate_table+0x8a/0xaf [dm_mod]
Sep 28 03:00:03 astro kernel:  [<f8883268>] table_load+0x37/0x123 [dm_mod]
Sep 28 03:00:03 astro kernel:  [<f8883ce3>] ctl_ioctl+0xd1/0x144 [dm_mod] 
Sep 28 03:00:03 astro kernel:  [<f8883231>] table_load+0x0/0x123 [dm_mod]
Sep 28 03:00:03 astro kernel:  [<c0165b5e>] sys_ioctl+0x227/0x269
Sep 28 03:00:03 astro kernel:  [<c02c7377>] syscall_call+0x7/0xb
Sep 28 03:00:03 astro kernel: Mem-info:
Sep 28 03:00:03 astro kernel: DMA per-cpu:
Sep 28 03:00:03 astro kernel: cpu 0 hot: low 2, high 6, batch 1
Sep 28 03:00:03 astro kernel: cpu 0 cold: low 0, high 2, batch 1
Sep 28 03:00:03 astro kernel: cpu 1 hot: low 2, high 6, batch 1 
Sep 28 03:00:03 astro kernel: cpu 1 cold: low 0, high 2, batch 1
Sep 28 03:00:03 astro kernel: cpu 2 hot: low 2, high 6, batch 1 
Sep 28 03:00:03 astro kernel: cpu 2 cold: low 0, high 2, batch 1
Sep 28 03:00:03 astro kernel: cpu 3 hot: low 2, high 6, batch 1 
Sep 28 03:00:03 astro kernel: cpu 3 cold: low 0, high 2, batch 1
Sep 28 03:00:03 astro kernel: Normal per-cpu:
Sep 28 03:00:03 astro kernel: cpu 0 hot: low 32, high 96, batch 16 
Sep 28 03:00:03 astro kernel: cpu 0 cold: low 0, high 32, batch 16
Sep 28 03:00:03 astro kernel: cpu 1 hot: low 32, high 96, batch 16
Sep 28 03:00:03 astro kernel: cpu 1 cold: low 0, high 32, batch 16
Sep 28 03:00:03 astro kernel: cpu 2 hot: low 32, high 96, batch 16
Sep 28 03:00:03 astro kernel: cpu 2 cold: low 0, high 32, batch 16
Sep 28 03:00:03 astro kernel: cpu 3 hot: low 32, high 96, batch 16
Sep 28 03:00:03 astro kernel: cpu 3 cold: low 0, high 32, batch 16
Sep 28 03:00:03 astro kernel: HighMem per-cpu:
Sep 28 03:00:03 astro kernel: cpu 0 hot: low 14, high 42, batch 7
Sep 28 03:00:03 astro kernel: cpu 0 cold: low 0, high 14, batch 7
Sep 28 03:00:03 astro kernel: cpu 1 hot: low 14, high 42, batch 7
Sep 28 03:00:03 astro kernel: cpu 1 cold: low 0, high 14, batch 7
Sep 28 03:00:03 astro kernel: cpu 2 hot: low 14, high 42, batch 7
Sep 28 03:00:03 astro kernel: cpu 2 cold: low 0, high 14, batch 7
Sep 28 03:00:03 astro kernel: cpu 3 hot: low 14, high 42, batch 7
Sep 28 03:00:03 astro kernel: cpu 3 cold: low 0, high 14, batch 7
Sep 28 03:00:03 astro kernel: 
Sep 28 03:00:03 astro kernel: Free pages:       14836kB (280kB HighMem)
Sep 28 03:00:03 astro kernel: Active:197720 inactive:36044 dirty:48 writeback:0 unstable:0 free:3709 slab:15165 mapped:158110 pagetables:2309
Sep 28 03:00:03 astro kernel: DMA free:12636kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:53734 all_unreclaimable? yes
Sep 28 03:00:03 astro kernel: protections[]: 0 0 0
Sep 28 03:00:03 astro kernel: Normal free:1920kB min:928kB low:1856kB high:2784kB active:670208kB inactive:140912kB present:901120kB pages_scanned:0 all_unreclaimable? no
Sep 28 03:00:03 astro kernel: protections[]: 0 0 0
Sep 28 03:00:03 astro kernel: HighMem free:308kB min:128kB low:256kB high:384kB active:120672kB inactive:3264kB present:130496kB pages_scanned:0 all_unreclaimable? no
Sep 28 03:00:03 astro kernel: protections[]: 0 0 0
Sep 28 03:00:03 astro kernel: DMA: 1*4kB 3*8kB 4*16kB 4*32kB 4*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12636kB
Sep 28 03:00:03 astro kernel: Normal: 480*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1920kB
Sep 28 03:00:03 astro kernel: HighMem: 91*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 364kB
Sep 28 03:00:04 astro kernel: Swap cache: add 1685, delete 1683, find 453/671, race 0+0
Sep 28 03:00:04 astro kernel: Free swap:       2096248kB
Sep 28 03:00:04 astro kernel: 262000 pages of RAM
Sep 28 03:00:04 astro kernel: 32624 pages of HIGHMEM 
Sep 28 03:00:04 astro kernel: 3522 reserved pages
Sep 28 03:00:04 astro kernel: 169391 pages shared
Sep 28 03:00:04 astro kernel: 2 pages swap cached
Sep 28 03:00:04 astro kernel: device-mapper: Could not create kcopyd client
Sep 28 03:00:04 astro kernel: device-mapper: error adding target to table
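
For anyone comparing logs across the duplicate reports, the per-zone free-list lines above (the ones of the form "count*sizekB ... = totalkB") can be checked with a few lines of Python. This is only a reading aid, not part of any fix; the helper name buddy_total_kb is made up for illustration, and the sample strings are copied from the log in this comment.

```python
import re

def buddy_total_kb(line: str) -> int:
    """Sum the 'count*sizekB' terms of a per-zone free-list line.

    The trailing '= NNNNkB' total has no '*' in it, so the pattern
    skips it and only the individual block counts are added up.
    """
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    return sum(int(count) * int(size) for count, size in terms)

# Lines copied from the log above (syslog prefix dropped).
dma = ("DMA: 1*4kB 3*8kB 4*16kB 4*32kB 4*64kB 1*128kB "
       "1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12636kB")
normal = ("Normal: 480*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB "
          "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1920kB")
highmem = ("HighMem: 91*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB "
           "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 364kB")

for zone in (dma, normal, highmem):
    print(zone.split(":")[0], buddy_total_kb(zone), "kB")
```

The sums agree with the totals the kernel printed, and they show that all free memory in the Normal and HighMem zones was in single 4kB pages at the time of the failure.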



Comment 12 Alasdair Kergon 2005-10-03 15:40:06 UTC
*** Bug 169162 has been marked as a duplicate of this bug. ***

Comment 20 Alasdair Kergon 2005-12-01 19:51:37 UTC
There are now some changes as follows:

The steps necessary to create or activate a snapshot have been resequenced so
that if there isn't enough memory available this should not cause the system to
lock up.

Changes are being made to the kernel, lvm2 and device-mapper packages for U3.

The same amount of memory as before is still needed, but the memory now gets
reserved *before* the snapshot becomes live, rather than during critical parts
of the process where failure could leave the machine in an unusable state.


Work will continue separately aimed at providing control over the amount of
memory used.
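
The resequencing described in this comment can be pictured with a small toy model. This is a sketch of the general reserve-before-commit idea only, not the actual kernel, lvm2 or device-mapper code; PagePool, Origin and activate_snapshot are hypothetical names invented for the illustration.

```python
class PagePool:
    """Toy stand-in for the memory available to the snapshot code."""

    def __init__(self, free_pages: int) -> None:
        self.free_pages = free_pages

    def reserve(self, pages: int) -> int:
        # Fail cleanly if the request cannot be met; nothing is taken.
        if pages > self.free_pages:
            raise MemoryError(f"need {pages} pages, only {self.free_pages} free")
        self.free_pages -= pages
        return pages


class Origin:
    """Toy stand-in for the origin volume being snapshotted."""

    def __init__(self) -> None:
        self.suspended = False
        self.snapshots = 0


def activate_snapshot(pool: PagePool, origin: Origin, pages_needed: int) -> int:
    # Step 1: reserve all memory up front. A failure here happens
    # before any state has changed, so the origin stays usable.
    reserved = pool.reserve(pages_needed)

    # Step 2: the critical part. No allocation happens in here, so it
    # cannot fail for lack of memory while the origin is suspended.
    origin.suspended = True
    origin.snapshots += 1
    origin.suspended = False
    return reserved


# Hypothetical numbers: the pool is too small for the request.
pool, origin = PagePool(free_pages=100), Origin()
try:
    activate_snapshot(pool, origin, pages_needed=256)
except MemoryError as exc:
    print("activation refused:", exc)
print("origin suspended:", origin.suspended, "snapshots:", origin.snapshots)
```

In the model, as in the description above, the amount of memory needed is unchanged; only the point at which a shortage is detected moves to before the snapshot becomes live.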


Comment 22 Marc Bejarano 2006-02-07 19:00:00 UTC
I just ran into this. Are the current bits available for beta testing the fix?
Is this issue being tracked in another open bug tracker?

Comment 24 Red Hat Bugzilla 2006-03-07 18:40:35 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0099.html


Comment 25 Red Hat Bugzilla 2006-03-07 21:33:38 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0137.html


