Bug 143650 - order 0 page allocation failure during lvcreate
Status: CLOSED DUPLICATE of bug 132057
Product: Fedora
Classification: Fedora
Component: kernel
Hardware: i686 Linux
Priority: medium, Severity: high
Assigned To: Alasdair Kergon
QA Contact: Brian Brock
Reported: 2004-12-23 05:13 EST by Paul P Komkoff Jr
Modified: 2007-11-30 17:10 EST (History)

Doc Type: Bug Fix
Last Closed: 2006-02-21 14:07:47 EST

Attachments
output of dmsetup info -c (3.92 KB, application/octet-stream)
2005-04-21 06:01 EDT, Graham King

Description Paul P Komkoff Jr 2004-12-23 05:13:50 EST
Description of problem:
To make consistent backups of user maildirs I use the following
procedure: lvcreate -L 1G -s blabla && mount -o ro && backup && umount
&& lvremove -f
I run this procedure 4 times a day (every 6 hours). After the last
kernel update (kernel 2.6.9-1.6_FC2.i686) I started getting the following:
Dec 23 00:00:02 ns kernel: lvcreate: page allocation failure. order:0, mode:0xd0
Dec 23 00:00:02 ns kernel: [<02147890>] __alloc_pages+0x2bd/0x2db
Dec 23 00:00:02 ns kernel: [<42106533>] alloc_pl+0x27/0x3d [dm_mod]
Dec 23 00:00:02 ns kernel: [<421067c6>] client_alloc_pages+0x15/0x47
Dec 23 00:00:02 ns kernel: [<421077db>] kcopyd_client_create+0x74/0xb4 [dm_mod]
Dec 23 00:00:02 ns kernel: [<4214bd7c>] dm_create_persistent+0x94/0xf4 [dm_snapshot]
Dec 23 00:00:02 ns kernel: [<4214a6f0>] snapshot_ctr+0x246/0x2cf
Dec 23 00:00:02 ns kernel: [<421030cd>] dm_table_add_target+0x11c/0x189 [dm_mod]
Dec 23 00:00:02 ns kernel: [<42105143>] populate_table+0x8a/0xaf
Dec 23 00:00:02 ns kernel: [<4210519f>] table_load+0x37/0xf6 [dm_mod]
Dec 23 00:00:02 ns kernel: [<4210583d>] ctl_ioctl+0xcd/0x10f [dm_mod]
Dec 23 00:00:02 ns kernel: [<42105168>] table_load+0x0/0xf6 [dm_mod]
Dec 23 00:00:02 ns kernel: [<0217b0e2>] sys_ioctl+0x29a/0x33c
Dec 23 00:00:02 ns kernel: [<02108a98>] do_IRQ+0x286/0x290
Dec 23 00:00:02 ns kernel: device-mapper: : Could not create kcopyd

After this the system starts misbehaving badly: all processes
working with the snapshot source are stuck forever in D state, or
all hard disk access halts entirely.

This usually happens after 4-5 days of normal work.
Any suggestions?

Version-Release number of selected component (if applicable):
kernel 2.6.9-1.6_FC2.i686

How reproducible:
Well, reproducible

Steps to Reproduce:
1. Create and remove a snapshot enough times
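
The create/mount/backup/remove cycle described in this report could be sketched as the script below. It is only an illustration: the volume group, volume names, mount point and archive path are hypothetical placeholders, not values from the report, and tar stands in for whatever backup tool is actually used. It defaults to a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the snapshot backup cycle from the description.
# VG, LV, SNAP, MNT and DEST are hypothetical placeholders.
VG=${VG:-vg0}
LV=${LV:-mail}
SNAP=${SNAP:-mail_snap}
MNT=${MNT:-/mnt/mail_snap}
DEST=${DEST:-/backup/mail.tar}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_cycle() {
    # Same && chain as in the report: each step runs only if the
    # previous one succeeded, so a failed backup leaves the snapshot
    # mounted and in place.
    run lvcreate -L 1G -s -n "$SNAP" "/dev/$VG/$LV" &&
    run mount -o ro "/dev/$VG/$SNAP" "$MNT" &&
    run tar -cf "$DEST" -C "$MNT" . &&
    run umount "$MNT" &&
    run lvremove -f "/dev/$VG/$SNAP"
}
```

Calling snapshot_cycle with DRY_RUN=1 prints the five commands in order; setting DRY_RUN=0 (as root, with real names) would execute them.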
Comment 1 Dave Jones 2005-01-10 23:18:16 EST
Is this still a problem with today's 2.6.10 updates?
Comment 2 Paul P Komkoff Jr 2005-01-11 04:58:21 EST
I don't know yet. It's a production server; I will try to update this
evening and check.
Comment 3 Alasdair Kergon 2005-01-18 11:04:47 EST
This is a known shortcoming of device-mapper snapshots.
Work is underway to address it.
Comment 4 Alasdair Kergon 2005-01-18 11:15:51 EST
Each snapshot requires a certain amount of physical kernel memory to
be available.  If there isn't enough free kernel memory at the instant
you create or activate a snapshot, you get an error like that.
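
As a rough illustration of the condition described above, a script could look at free memory before attempting lvcreate. This is a sketch only: the threshold MIN_FREE_KB is a hypothetical value, not one given in this bug, and a passing check does not guarantee that the allocation will succeed. On i686 kernels built with highmem support, /proc/meminfo also reports LowFree, which is assumed here to be the more relevant figure for kernel allocations, so it is preferred when present.

```shell
#!/bin/sh
# MIN_FREE_KB is a hypothetical threshold, not a value from this bug.
MIN_FREE_KB=${MIN_FREE_KB:-16384}

# Print free memory in kB from a meminfo-style file (default
# /proc/meminfo), preferring LowFree over MemFree when it exists.
kernel_free_kb() {
    awk '$1 == "LowFree:" { low = $2 }
         $1 == "MemFree:" { free = $2 }
         END { print (low != "" ? low : free) }' "${1:-/proc/meminfo}"
}

# Succeed only if the reported free memory meets the threshold.
enough_memory() {
    free_kb=$(kernel_free_kb "$1")
    [ -n "$free_kb" ] && [ "$free_kb" -ge "$MIN_FREE_KB" ]
}
```

A backup script might call enough_memory before lvcreate and skip that run if the check fails.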

Recent upstream kernel changes have actually made the problem worse.

To recover without rebooting, you need to use 'dmsetup' to reset the
states of the devices in the right sequence, removing the snapshot in
the process.

I hope to have the code fixed in about a month's time. [It involves
rewriting some complicated code, unfortunately.]
Comment 5 Alasdair Kergon 2005-01-18 11:17:58 EST

*** This bug has been marked as a duplicate of 132057 ***
Comment 6 Graham King 2005-04-13 11:21:04 EDT
(In reply to comment #4)
> To recover without rebooting, you need to use 'dmsetup' to reset the
> states of the devices in the right sequence, removing the snapshot in
> the process.

Please could you post a simple example (for an LVM wedged with one snapshot in
use)?  This would really help me while we're waiting for the real fix!  Thanks.
(The man page for dmsetup assumes far more low-level knowledge than I have).
Comment 7 Alasdair Kergon 2005-04-13 11:30:30 EDT
It depends on how it failed.

Post the output of 'dmsetup info -c' here.
Comment 8 Graham King 2005-04-13 12:11:09 EDT
Too late to do that, I'm afraid.  I blundered around for several hours and
eventually resuscitated the machine, without understanding how or why.
But when the problem first arose, I noted that vgob01-lv_home (the source of the
snapshot) was INACTIVE and suspended, whereas the corresponding snapshot,
vgob01-hdup_snapshot, was INACTIVE and available.

I managed to regain use of the machine with dmsetup resume vgob01-lv_home and
lvremove /dev/vgob01/hdup_snapshot.  But a reboot then failed with:
"device-mapper: : unknown target type
Couldn't load device 'vgob01-hdup_snapshot' "

If that's not enough info, I'll post the requested output next time the problem
happens (seems to be once every few weeks at present).  Thanks for your help.
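
For reference, finding a suspended device as described in this comment could be sketched as below. It is not the recovery recipe asked for in comment 6, which, per comment 7, depends on how the failure happened. The helper name suspended_devices is made up for illustration, and it assumes that plain 'dmsetup info' prints "Name:" and "State:" lines for each device, which may differ between dmsetup versions.

```shell
#!/bin/sh
# Print the names of device-mapper devices whose State is SUSPENDED,
# reading `dmsetup info` style text on stdin.
# Assumes each device block has "Name: <name>" before "State: <state>".
suspended_devices() {
    awk '$1 == "Name:"  { name = $2 }
         $1 == "State:" && $2 == "SUSPENDED" { print name }'
}

# Usage on a live system (as root):
#   dmsetup info | suspended_devices
#
# The two commands reported in this comment, for its device names:
#   dmsetup resume vgob01-lv_home
#   lvremove /dev/vgob01/hdup_snapshot
```

The function only reports state; it deliberately does not resume or remove anything on its own.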
Comment 9 Graham King 2005-04-21 06:01:27 EDT
Created attachment 113461
output of dmsetup info -c

as requested in comment 7.  This was run from a virtual console in multi-user
mode before attempting any recovery.
If this leads to a dmsetup "recipe" I'd be grateful.  I can post a log of my
present (embarrassingly ignorant, long-winded, and probably dangerous) recovery
procedure, if that would help.
Comment 10 Red Hat Bugzilla 2006-02-21 14:07:47 EST
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.
