Bug 132057 - page allocation failure on snapshot lvcreate
page allocation failure on snapshot lvcreate
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: device-mapper-obsolete (Show other bugs)
rawhide
i686 Linux
high Severity high
: ---
: ---
Assigned To: Alasdair Kergon
:
: 143650 144874 169597 (view as bug list)
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2004-09-08 09:02 EDT by Alexandre Oliva
Modified: 2007-11-30 17:10 EST (History)
21 users (show)

See Also:
Fixed In Version: 1.02.02
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-11-09 07:45:56 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Alexandre Oliva 2004-09-08 09:02:22 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2)
Gecko/20040809

Description of problem:
Sometimes, when I try to create a snapshot of a live filesystem, I get
an error like this in /var/log/messages, and the entire VG becomes
unavailable as it can't be brought back up:

lvcreate: page allocation failure. order:0, mode:0 [<02142fb8>]
__alloc_pages+0x28b/0x298
 [<22a4c447>] alloc_pl+0x27/0x3d [dm_mod]
 [<22a4c6d7>] client_alloc_pages+0x15/0x47 [dm_mod]
 [<22a4d6e0>] kcopyd_client_create+0x77/0xb5 [dm_mod]
 [<22a55d99>] dm_create_persistent+0x94/0xf4 [dm_snapshot]
 [<22a546dc>] snapshot_ctr+0x251/0x2d8 [dm_snapshot]
 [<22a49073>] dm_table_add_target+0xfc/0x169 [dm_mod]
 [<22a4b048>] populate_table+0x8a/0xaf [dm_mod]
 [<22a4b0a4>] table_load+0x37/0xf6 [dm_mod]
 [<22a4b74b>] ctl_ioctl+0xcb/0x10d [dm_mod]
 [<22a4b06d>] table_load+0x0/0xf6 [dm_mod]
 [<021760f3>] sys_ioctl+0x296/0x337
device-mapper: : Could not create kcopyd client
device-mapper: error adding target to table


Version-Release number of selected component (if applicable):
kernel-2.6.8-1.541 device-mapper-1.00.19-1 lvm2-2.00.21-2

How reproducible:
Sometimes

Steps to Reproduce:
1.Add a physical volume with free space to an existing volume group
2.Create a snapshot of a live, mounted logical volume

Actual Results:  The kernel error above, and an error from lvcreate
that the volume group couldn't be brought back up.  vgscan doesn't
seem to fix it.  The mounted filesystem eventually gets I/O errors and
is remounted read-only, with all accesses to it blocked.

Expected Results:  Should ideally just work :-)

Additional info:

I don't know whether this is related, but right after the vgextend,
/var/log/messages got this line:
udev[17225]: PROGRAM execution of '/sbin/devmap_name 253 4' failed
udev[17225]: creating device node '/dev/dm-4'

and right after the `lvcreate: page allocation failure' line (before
the rest of the stack trace):
udev[17249]: PROGRAM execution of '/sbin/devmap_name 253 4' failed
udev[17249]: creating device node '/dev/dm-4'
udev[17257]: PROGRAM execution of '/sbin/devmap_name 253 5' failed
udev[17257]: creating device node '/dev/dm-5'
udev[17264]: PROGRAM execution of '/sbin/devmap_name 253 6' failed
udev[17264]: creating device node '/dev/dm-6'
Comment 1 Alexandre Oliva 2004-09-09 11:45:53 EDT
A significantly different stack trace for an identical scenario.

lvcreate: page allocation failure. order:0, mode:0xd2
 [<02142fb8>] __alloc_pages+0x28b/0x298
 [<02159df9>] __vmalloc+0xb7/0xf0
 [<22a4469d>] dm_vcalloc+0x1d/0x42 [dm_mod]
 [<25c9c278>] init_exception_table+0x18/0x4f [dm_snapshot]
 [<25c9c467>] init_hash_tables+0xb4/0xd8 [dm_snapshot]
 [<25c9c683>] snapshot_ctr+0x1f8/0x2d8 [dm_snapshot]
 [<22a45073>] dm_table_add_target+0xfc/0x169 [dm_mod]
 [<22a47048>] populate_table+0x8a/0xaf [dm_mod]
 [<22a470a4>] table_load+0x37/0xf6 [dm_mod]
 [<22a4774b>] ctl_ioctl+0xcb/0x10d [dm_mod]
 [<22a4706d>] table_load+0x0/0xf6 [dm_mod]
 [<021760f3>] sys_ioctl+0x296/0x337
 [<0210808b>] do_IRQ+0x2f7/0x303
device-mapper: : Unable to allocate hash table space

device-mapper: error adding target to table

FWIW, in this case /sbin/devmap_name didn't fail.  After the error, I got:

udev[13335]: creating device node '/dev/dm-4'
udev[13337]: creating device node '/dev/dm-5'
udev[13340]: creating device node '/dev/dm-6'
Comment 2 Need Real Name 2004-09-15 08:41:35 EDT
Same problem here.

Sometimes, lvcreate will make the snapshot, sometimes it will freeze
but, as I have all my filesystems on one volume group, I can't see
debug info on the logs. I still can see it on the console, though.


Comment 3 Alasdair Kergon 2004-09-26 14:26:00 EDT
So insufficient memory for the snapshot?  Try different parameters 
(fewer concurrent snapshots, smaller device, larger chunksize etc.).

[And snapshots of volumes used by the lvm2 tools themselves are not 
supported yet; in other cases, lvremove-ing the failed snapshot ought 
to tidy things up after a failure.]
Comment 4 Alexandre Oliva 2004-09-26 15:52:53 EDT
The system had lots of free swap space, so it shouldn't be it.  FWIW,
I once got the same error while lvremoving the snapshot LV.  No other
snapshots were present, and this was a single-LV VG in external disks
only, totally unrelated with the root filesystem.
Comment 5 Alexandre Oliva 2004-09-26 16:00:46 EDT
Nice, tre system had frozen, but the oops made it to
/var/log/messages.  I figured a snapshot was hurting performance too
much (rsyncing big files around), so I tried to remove it while there
was a lot of stuff yet to be flushed to disk.  And then it died:

slab error in kmem_cache_destroy(): cache `dm-io-bio': Can't free all
objects
 [<021471a8>] kmem_cache_destroy+0x97/0x11c
 [<229efa00>] bio_set_exit+0x1c/0x45 [dm_mod]
 [<229efd2a>] resize_pool+0x2c/0x9b [dm_mod]
 [<229f1729>] kcopyd_client_destroy+0xb/0x26 [dm_mod]
 [<229fe7b2>] snapshot_dtr+0x4f/0x58 [dm_snapshot]
 [<229ec886>] table_destroy+0x39/0x83 [dm_mod]
 [<229ee26a>] __hash_remove+0x3e/0x52 [dm_mod]
 [<229ee941>] dev_remove+0x7a/0x96 [dm_mod]
 [<229ef74b>] ctl_ioctl+0xcb/0x10d [dm_mod]
 [<229ee8c7>] dev_remove+0x0/0x96 [dm_mod]
 [<021760f3>] sys_ioctl+0x296/0x337
slab error in kmem_cache_destroy(): cache `dm-io-1': Can't free all
objects
 [<021471a8>] kmem_cache_destroy+0x97/0x11c
 [<229efa20>] bio_set_exit+0x3c/0x45 [dm_mod]
 [<229efd2a>] resize_pool+0x2c/0x9b [dm_mod]
 [<229f1729>] kcopyd_client_destroy+0xb/0x26 [dm_mod]
 [<229fe7b2>] snapshot_dtr+0x4f/0x58 [dm_snapshot]
 [<229ec886>] table_destroy+0x39/0x83 [dm_mod]
 [<229ee26a>] __hash_remove+0x3e/0x52 [dm_mod]
 [<229ee941>] dev_remove+0x7a/0x96 [dm_mod]
 [<229ef74b>] ctl_ioctl+0xcb/0x10d [dm_mod]
 [<229ee8c7>] dev_remove+0x0/0x96 [dm_mod]
 [<021760f3>] sys_ioctl+0x296/0x337
------------[ cut here ]------------
kernel BUG at drivers/md/kcopyd.c:145!
invalid operand: 0000 [#1]
Modules linked in: tun nfsd exportfs lockd md5 ipv6 parport_pc lp
parport autofs4 sunrpc ds ipt_REJECT ipt_LOG ipt_state iptable_filter
iptable_nat ip_conntrack iptable_mangle ip_tables button battery
asus_acpi ac yenta_socket pcmcia_core uhci_hcd i8xx_tco snd_maestro3
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer
snd_page_alloc snd soundcore e100 mii floppy dm_snapshot dm_mirror
ext3 jbd raid1 dm_mod usb_storage sbp2 ohci1394 ieee1394 sd_mod scsi_mod
CPU:    0
EIP:    0060:[<229f0714>]    Not tainted VLI
EFLAGS: 00010287   (2.6.8-1.541) 
EIP is at client_free_pages+0xb/0x32 [dm_mod]
eax: 00000100   ebx: 219efd20   ecx: 03397018   edx: 00000015
esi: 22a23080   edi: 00000000   ebp: 00000000   esp: 1512cf38
ds: 007b   es: 007b   ss: 0068
Process lvremove (pid: 27924, threadinfo=1512c000 task=1a88d950)
Stack: 219efd20 229f1730 2010a080 229fe7b2 22a23080 21e03480 229ec886
1e97e5e0 
       22e87000 00000004 08c66300 229ee26a 229f5c20 229ee941 00000000
1512c000 
       229ef74b 229ee8c7 22e87000 229f5ce0 c134fd04 063ccec0 ffffffe7
021760f3 
Call Trace:
 [<229f1730>] kcopyd_client_destroy+0x12/0x26 [dm_mod]
 [<229fe7b2>] snapshot_dtr+0x4f/0x58 [dm_snapshot]
 [<229ec886>] table_destroy+0x39/0x83 [dm_mod]
 [<229ee26a>] __hash_remove+0x3e/0x52 [dm_mod]
 [<229ee941>] dev_remove+0x7a/0x96 [dm_mod]
 [<229ef74b>] ctl_ioctl+0xcb/0x10d [dm_mod]
 [<229ee8c7>] dev_remove+0x0/0x96 [dm_mod]
 [<021760f3>] sys_ioctl+0x296/0x337
Code: <3>Debug: sleeping function called from invalid context at
include/linux/rwsem.h:43
in_atomic():0[expected: 0], irqs_disabled():1
 [<0211c50c>] __might_sleep+0x7d/0x89
 [<0215e0fe>] rw_vm+0x216/0x482
 [<229f06e9>] client_alloc_pages+0x27/0x47 [dm_mod]
 [<229f06e9>] client_alloc_pages+0x27/0x47 [dm_mod]
 [<0215e850>] get_user_size+0x30/0x57
 [<229f06e9>] client_alloc_pages+0x27/0x47 [dm_mod]
 [<0210664d>] show_registers+0x109/0x15e
 [<02106822>] die+0x11b/0x219
 [<02106b0b>] do_invalid_op+0x0/0x95
 [<02106b9a>] do_invalid_op+0x8f/0x95
 [<229f0714>] client_free_pages+0xb/0x32 [dm_mod]
 [<021425e3>] __free_pages_ok+0xb2/0xba
 [<021467c0>] kmem_freepages+0x6d/0x87
 [<229f0714>] client_free_pages+0xb/0x32 [dm_mod]
 [<229f1730>] kcopyd_client_destroy+0x12/0x26 [dm_mod]
 [<229fe7b2>] snapshot_dtr+0x4f/0x58 [dm_snapshot]
 [<229ec886>] table_destroy+0x39/0x83 [dm_mod]
 [<229ee26a>] __hash_remove+0x3e/0x52 [dm_mod]
 [<229ee941>] dev_remove+0x7a/0x96 [dm_mod]
 [<229ef74b>] ctl_ioctl+0xcb/0x10d [dm_mod]
 [<229ee8c7>] dev_remove+0x0/0x96 [dm_mod]
 [<021760f3>] sys_ioctl+0x296/0x337
 Bad EIP value.
 <1>Unable to handle kernel paging request at virtual address 22d19000
 printing eip:
229ff94f
*pde = 1e0a7067
Oops: 0002 [#2]
Modules linked in: tun nfsd exportfs lockd md5 ipv6 parport_pc lp
parport autofs4 sunrpc ds ipt_REJECT ipt_LOG ipt_state iptable_filter
iptable_nat ip_conntrack iptable_mangle ip_tables button battery
asus_acpi ac yenta_socket pcmcia_core uhci_hcd i8xx_tco snd_maestro3
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer
snd_page_alloc snd soundcore e100 mii floppy dm_snapshot dm_mirror
ext3 jbd raid1 dm_mod usb_storage sbp2 ohci1394 ieee1394 sd_mod scsi_mod
CPU:    0
EIP:    0060:[<229ff94f>]    Not tainted VLI
EFLAGS: 00010206   (2.6.8-1.541) 
EIP is at write_exception+0x18/0x2d [dm_snapshot]
eax: 00f33909   ebx: 1e89bf0c   ecx: 22d19000   edx: 00000000
esi: 229fea4b   edi: 219efda0   ebp: 229fea50   esp: 1e89bf04
ds: 007b   es: 007b   ss: 0068
Process kcopyd (pid: 3958, threadinfo=1e89b000 task=201da030)
Stack: 00000000 229ffc4b 00f33909 00000000 0008ba26 00000000 17786b94
2010a080 
       17786b94 229fea50 229fea80 17786b94 0e2f2b74 00000000 229f083d
00000000 
       0e2f2b74 229f5dd8 00000246 229f07fe 229f0c31 00000006 229f65c0
00000206 
Call Trace:
 [<229ffc4b>] persistent_commit+0x39/0xd4 [dm_snapshot]
 [<229fea50>] copy_callback+0x0/0x34 [dm_snapshot]
 [<229fea80>] copy_callback+0x30/0x34 [dm_snapshot]
 [<229f083d>] run_complete_job+0x3f/0x47 [dm_mod]
 [<229f07fe>] run_complete_job+0x0/0x47 [dm_mod]
 [<229f0c31>] process_jobs+0x19/0x3bc [dm_mod]
 [<229f0fe3>] do_work+0xf/0x2d [dm_mod]
 [<02130705>] worker_thread+0x21f/0x33d
 [<229f0fd4>] do_work+0x0/0x2d [dm_mod]
 [<0211b5bd>] default_wake_function+0x0/0xc
 [<0211b5bd>] default_wake_function+0x0/0xc
 [<021304e6>] worker_thread+0x0/0x33d
 [<02135401>] kthread+0x69/0x91
 [<02135398>] kthread+0x0/0x91
 [<021041d9>] kernel_thread_helper+0x5/0xb
Code: 8b 41 08 89 53 04 8b 51 0c 89 43 08 31 c0 89 53 0c 5b c3 53 89
cb e8 b9 ff ff ff 89 c1 85 c9 b8 ea ff ff ff 74 18 8b 03 8b 53 04 <89>
01 8b 43 08 89 51 04 8b 53 0c 89 41 08 31 c0 89 51 0c 5b c3 
Comment 6 Alasdair Kergon 2004-09-26 19:44:36 EDT
Only physical memory counts: snapshots can't use swap.

Need to locate the source for the dm components (dm* and kcopyd* in 
drivers/md/) you were running and check whether it matches current 
upstream [e.g. to confirm whether or not an upstream fix affecting 
kcopyd_client_destroy got applied yet to the version you were 
running].  If the code you're running *does* match upstream, then 
I'll spend time investigating.
Comment 7 Alasdair Kergon 2004-10-11 16:31:48 EDT
Also similar report on linux-kernel last week.

I think the problems will go away once the lvm2 activation code has
been changed to make use of the device-mapper create/load ioctl
splitting.  But this is a significant amount of work and it is not yet
scheduled.
Comment 8 Alasdair Kergon 2005-01-12 09:04:33 EST
*** Bug 144874 has been marked as a duplicate of this bug. ***
Comment 9 Alasdair Kergon 2005-01-12 09:08:14 EST
Update: this problem is now fully understood and work on the solution
is underway.
Comment 10 Alasdair Kergon 2005-01-18 11:18:05 EST
*** Bug 143650 has been marked as a duplicate of this bug. ***
Comment 12 David Juran 2005-04-25 05:10:53 EDT
How is the solution comming along? I think I stumbled upon this problem on a
machine running the kernel-smp-2.6.9-5.0.3.EL.x86_64 on RHEL4. I got the
following messages in my syslog:

Apr 22 16:32:06 livingston kernel: kjournald starting.  Commit interval 5 seconds
Apr 22 16:32:06 livingston kernel: EXT3-fs: dm-21: orphan cleanup on readonly fs
Apr 22 16:32:06 livingston kernel: EXT3-fs warning (device dm-21):
ext3_orphan_get: bad orphan inode 18874!  e2fsck was run?
Apr 22 16:32:06 livingston kernel:
Apr 22 16:32:06 livingston kernel: ext3_test_bit(bit=2489, block=33794) = 0
Apr 22 16:32:06 livingston kernel: inode=0000000000000000
Apr 22 16:32:06 livingston kernel: EXT3-fs: recovery complete.
Apr 22 16:32:06 livingston kernel: EXT3-fs: mounted filesystem with ordered data
mode.
Apr 22 16:32:06 livingston kernel: SELinux: initialized (dev dm-21, type ext3),
uses xattr
Apr 22 16:32:07 livingston kernel: kjournald starting.  Commit interval 5 seconds
Apr 22 16:32:07 livingston kernel: EXT3-fs: mounted filesystem with ordered data
mode.
Apr 22 16:32:07 livingston kernel: SELinux: initialized (dev dm-24, type ext3),
uses xattr
Apr 22 16:32:08 livingston kernel: kjournald starting.  Commit interval 5 seconds
Apr 22 16:32:08 livingston kernel: EXT3-fs: mounted filesystem with ordered data
mode.
Apr 22 16:32:08 livingston kernel: SELinux: initialized (dev dm-27, type ext3),
uses xattr
Apr 22 16:32:11 livingston kernel: lvcreate: page allocation failure. order:4,
mode:0xd0
Apr 22 16:32:11 livingston kernel:
Apr 22 16:32:11 livingston kernel: Call
Trace:<ffffffff801ea63a>{vgacon_cursor+0} <ffffffff80157ba6>{__alloc_pages+768}
Apr 22 16:32:11 livingston kernel:       
<ffffffff80157c3a>{__get_free_pages+11} <ffffffff8015aab4>{kmem_getpages+36}
Apr 22 16:32:11 livingston kernel:       
<ffffffff8015b24f>{cache_alloc_refill+615} <ffffffff8015af17>{__kmalloc+123}
Apr 22 16:32:11 livingston kernel:        <ffffffff80156522>{mempool_resize+136}
<ffffffffa003d6d3>{:dm_mod:resize_pool+79}
Apr 22 16:32:11 livingston kernel:       
<ffffffffa003eade>{:dm_mod:kcopyd_client_create+598}
Apr 22 16:32:11 livingston kernel:       
<ffffffffa0088488>{:dm_snapshot:snapshot_ctr+740}
<ffffffffa003ac5e>{:dm_mod:dm_table_add_target+554}
Apr 22 16:32:11 livingston kernel:       
<ffffffffa003cde6>{:dm_mod:table_load+225} <ffffffffa003cd05>{:dm_mod:table_load+0}
Apr 22 16:32:11 livingston kernel:       
<ffffffffa003d49b>{:dm_mod:ctl_ioctl+569} <ffffffff801833ca>{sys_ioctl+858}
Apr 22 16:32:11 livingston kernel:        <ffffffff8011000e>{system_call+126}
Apr 22 16:32:11 livingston kernel: device-mapper: : Could not create kcopyd client
Apr 22 16:32:11 livingston kernel:
Apr 22 16:32:11 livingston kernel: device-mapper: error adding target to table
Apr 22 16:32:11 livingston SnapVault[18629]: OSSV: /snapshot/master
sudbury:/vol/vol3/livingston_lfs_master /snapshot/master is not an existing
directory
Apr 22 16:32:11 livingston kernel: FAT: Unrecognized mount option "acl" or
missing value
Apr 22 16:32:12 livingston SnapVault[18629]: OSSV: /snapshot/patch
sudbury:/vol/vol3/livingston_lfs_patch /snapshot/patch is not an existing directory
Apr 22 16:32:13 livingston kernel: lvcreate: page allocation failure. order:4,
mode:0xd0
Apr 22 16:32:13 livingston kernel:
Apr 22 16:32:13 livingston kernel: Call
Trace:<ffffffff801ea63a>{vgacon_cursor+0} <ffffffff80157ba6>{__alloc_pages+768}
Apr 22 16:32:13 livingston kernel:       
<ffffffff80157c3a>{__get_free_pages+11} <ffffffff8015aab4>{kmem_getpages+36}
Apr 22 16:32:13 livingston kernel:       
<ffffffff8015b24f>{cache_alloc_refill+615} <ffffffff8015af17>{__kmalloc+123}
Apr 22 16:32:13 livingston kernel:        <ffffffff80156522>{mempool_resize+136}
<ffffffffa003d6d3>{:dm_mod:resize_pool+79}
Apr 22 16:32:13 livingston kernel:       
<ffffffffa003eade>{:dm_mod:kcopyd_client_create+598}
Apr 22 16:32:13 livingston kernel:       
<ffffffffa0088488>{:dm_snapshot:snapshot_ctr+740}
<ffffffffa003ac5e>{:dm_mod:dm_table_add_target+554}
Apr 22 16:32:13 livingston kernel:       
<ffffffffa003cde6>{:dm_mod:table_load+225} <ffffffffa003cd05>{:dm_mod:table_load+0}
Apr 22 16:32:13 livingston kernel:       
<ffffffffa003d49b>{:dm_mod:ctl_ioctl+569} <ffffffff801833ca>{sys_ioctl+858}
Apr 22 16:32:13 livingston kernel:        <ffffffff8011000e>{system_call+126}
Apr 22 16:32:13 livingston kernel: device-mapper: : Could not create kcopyd client
Apr 22 16:32:13 livingston kernel:
Apr 22 16:32:13 livingston kernel: device-mapper: error adding target to table

Apr 22 16:32:47 livingston kernel: lvremove: page allocation failure. order:4,
mode:0xd0
Apr 22 16:32:47 livingston kernel:
Apr 22 16:32:48 livingston kernel: Call
Trace:<ffffffff801ea63a>{vgacon_cursor+0} <ffffffff80157ba6>{__alloc_pages+768}
Apr 22 16:32:48 livingston kernel:       
<ffffffff80157c3a>{__get_free_pages+11} <ffffffff8015aab4>{kmem_getpages+36}
Apr 22 16:32:48 livingston kernel:       
<ffffffff8015b24f>{cache_alloc_refill+615} <ffffffff8015af17>{__kmalloc+123}
Apr 22 16:32:48 livingston kernel:        <ffffffff80156522>{mempool_resize+136}
<ffffffffa003d6d3>{:dm_mod:resize_pool+79}
Apr 22 16:32:48 livingston kernel:       
<ffffffffa003eade>{:dm_mod:kcopyd_client_create+598}
Apr 22 16:32:48 livingston kernel:       
<ffffffffa0088488>{:dm_snapshot:snapshot_ctr+740}
<ffffffffa003ac5e>{:dm_mod:dm_table_add_target+554}
Apr 22 16:32:48 livingston kernel:       
<ffffffffa003cde6>{:dm_mod:table_load+225} <ffffffffa003cd05>{:dm_mod:table_load+0}
Apr 22 16:32:48 livingston kernel:       
<ffffffffa003d49b>{:dm_mod:ctl_ioctl+569} <ffffffff801833ca>{sys_ioctl+858}
Apr 22 16:32:48 livingston kernel:        <ffffffff8011000e>{system_call+126}
Apr 22 16:32:48 livingston kernel: device-mapper: : Could not create kcopyd client
Apr 22 16:32:48 livingston kernel:
Apr 22 16:32:48 livingston kernel: device-mapper: error adding target to table
Apr 22 16:32:49 livingston rpc.mountd: authenticated unmount request from
murray.carmen.se:627 for /lfs/local_share (/lfs/local_share)
Apr 22 16:32:49 livingston rpc.mountd: authenticated unmount request from
murray.carmen.se:627 for /lfs/local_share (/lfs/local_share)
Apr 22 16:32:49 livingston kernel: lvremove: page allocation failure. order:4,
mode:0xd0
Apr 22 16:32:49 livingston kernel:
Apr 22 16:32:50 livingston kernel: Call
Trace:<ffffffff801ea63a>{vgacon_cursor+0} <ffffffff80157ba6>{__alloc_pages+768}
Apr 22 16:32:50 livingston kernel:       
<ffffffff80157c3a>{__get_free_pages+11} <ffffffff8015aab4>{kmem_getpages+36}
Apr 22 16:32:50 livingston kernel:       
<ffffffff8015b24f>{cache_alloc_refill+615} <ffffffff8015af17>{__kmalloc+123}
Apr 22 16:32:50 livingston kernel:        <ffffffff80156522>{mempool_resize+136}
<ffffffffa003d6d3>{:dm_mod:resize_pool+79}
Apr 22 16:32:50 livingston kernel:       
<ffffffffa003eade>{:dm_mod:kcopyd_client_create+598}
Apr 22 16:32:50 livingston kernel:       
<ffffffffa0088488>{:dm_snapshot:snapshot_ctr+740}
<ffffffffa003ac5e>{:dm_mod:dm_table_add_target+554}
Apr 22 16:32:50 livingston kernel:       
<ffffffffa003cde6>{:dm_mod:table_load+225} <ffffffffa003cd05>{:dm_mod:table_load+0}
Apr 22 16:32:50 livingston kernel:       
<ffffffffa003d49b>{:dm_mod:ctl_ioctl+569} <ffffffff801833ca>{sys_ioctl+858}
Apr 22 16:32:50 livingston kernel:        <ffffffff8011000e>{system_call+126}
Apr 22 16:32:50 livingston kernel: device-mapper: : Could not create kcopyd client
Apr 22 16:32:50 livingston kernel:
Apr 22 16:32:50 livingston kernel: device-mapper: error adding target to table
Comment 13 Alasdair Kergon 2005-09-21 16:10:21 EDT
The RHEL4 version of this bug is tracked on bug 164959
Comment 14 Damian Menscher 2005-09-28 14:14:53 EDT
Seeing this on RHEL 4.1; will add my logs to that report.  This problem has been
"fully understood" since January... why hasn't a fix been released?
Comment 15 Alasdair Kergon 2005-10-03 11:28:59 EDT
*** Bug 169597 has been marked as a duplicate of this bug. ***
Comment 16 Alasdair Kergon 2005-10-03 11:34:05 EDT
The number of problem reports was small and other work was assigned higher
priority.  The fix involves a substantial amount of work, but this is now
underway and has a high priority.
Comment 17 Alasdair Kergon 2006-01-13 16:20:23 EST
The userspace packages have been updated and kernel patches are starting to
trickle upstream.
Comment 18 Adam Heath 2006-04-10 17:09:34 EDT
Have all the patches been applied for this bug yet?  If so, what kernel versions?  
Comment 19 Alasdair Kergon 2006-04-10 17:48:04 EDT
2.6.17-rc1 has all the patches that are ready so far.
Comment 20 Adam Heath 2006-04-10 17:58:19 EDT
What's the first kernel version that had these patches?  We are running 2.6.12.6
in xen.  Are they easy to backport?  Upgrading is difficult for us.  I'm willing
to do them myself.

I've looked at the device-mapper and lvm2 cvs repositories, but can't really
find the exact series of fixes that deal with this.

Does it require both kernel and userspace changes?
Comment 21 Alasdair Kergon 2006-04-10 18:12:03 EDT
2.6.17-rc1, yes, yes.

Diff the dm files in the two kernels to work out what to change.

For userspace just keep up with the most up-to-date releases.
Comment 22 Adam Heath 2006-04-10 18:23:46 EDT
Sorry for pestering again; I see the redhat advisory that mentions these being
fixed.  Is that based on an unreleased version from cvs, or the last versions
released on Feb 7, 2006?  If the latter, then upgrading is a simple matter of
apt-get(we run debian).

19 files changed, 937 insertions(+), 603 deletions(-)

I haven't tried just copying the files verbatim; I doubt that would work,
anyways.  I'll plug away at this.
Comment 23 Adam Heath 2006-04-10 18:41:42 EDT
Btw, I saw mails about this, with one suggestion that the user should get more
physical ram.  Is that really the case?

Also, this fix talks about preallocating the needed memory.  What if the
preallocation fails?  How big of a contiguous block is needed?

Details about my hardware: 256m machine(xen dom0).  23 logical volumes, on a
545G VG with 99G free.  6 snapshots, which get created/removed for syncing.
Comment 24 Petr Rockai 2006-11-09 07:45:56 EST
This should be fixed in both FC6 and FC devel and to my best knowledge also 
FC5.

Note You need to log in before you can comment on or make changes to this bug.