Bug 179751 - lvremove panic in dm_mod:kcopyd_client_destroy while attempting to remove a snapshot
Summary: lvremove panic in dm_mod:kcopyd_client_destroy while attempting to remove a s...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.3
Hardware: All
OS: Linux
high
high
Target Milestone: ---
: ---
Assignee: Alasdair Kergon
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 168430
TreeView+ depends on / blocked
 
Reported: 2006-02-02 17:30 UTC by Corey Marthaler
Modified: 2007-11-30 22:07 UTC (History)
3 users (show)

Fixed In Version: RHSA-2006-0132
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-07 21:21:47 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2006:0132 0 qe-ready SHIPPED_LIVE Moderate: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 3 2006-03-09 16:31:00 UTC

Description Corey Marthaler 2006-02-02 17:30:51 UTC
Description of problem:
1. Made a 10G filesystem
2. Put a 3G (or so) file in it using dd
3. Created two 3G snapshots of that filesystem
4. Started dd'ing to another file in that filesystem
5. With the dd I/O running, attempted to remove both of the snapshots 
[that caused the machine to panic]


----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kcopyd:145
invalid operand: 0000 [1]
CPU 0
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core
sunrpc ds yenta_socket pcmcia_core button battery ac ohci_hcd hw_random tg3
floppy dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx
scsi_transport_fc mptscsih mptsas mptspi mptfc mptscsi mptbase sd_mod scsi_mod
Pid: 4410, comm: lvremove Not tainted 2.6.9-29.EL
RIP: 0010:[<ffffffffa00ba779>] <ffffffffa00ba779>{:dm_mod:client_free_pages+12}
RSP: 0000:000001001f4b1d18  EFLAGS: 00010283
RAX: 0000000000000100 RBX: 00000100062d3680 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000100062d3680
RBP: ffffff0000032080 R08: 000001003ffd2010 R09: 000001003ffef280
R10: 0000000000000046 R11: 00000100062d3680 R12: ffffffffa00c4e60
R13: 00000000c138fd04 R14: 00000000006aa940 R15: ffffffffa00b8714
FS:  0000002a955783e0(0000) GS:ffffffff80538980(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006bb92b CR3: 0000000000101000 CR4: 00000000000006e0
Process lvremove (pid: 4410, threadinfo 000001001f4b0000, task 00000100195e8570)
Stack: 00000100062d3680 ffffffffa00bb9d5 000001002f0fe2c0 000001002f0fe2c0
       0000000000000000 ffffffffa010d72c ffffffff80216690 ffffff0000032080
       000001001d106000 ffffffffa00b67b5
Call Trace:<ffffffffa00bb9d5>{:dm_mod:kcopyd_client_destroy+30}
       <ffffffffa010d72c>{:dm_snapshot:snapshot_dtr+242}
<ffffffff80216690>{kobject_release+0}
       <ffffffffa00b67b5>{:dm_mod:table_destroy+72}
<ffffffffa00b7fde>{:dm_mod:__hash_remove+124}
       <ffffffffa00b87ad>{:dm_mod:dev_remove+153}
<ffffffffa00b9a96>{:dm_mod:ctl_ioctl+599}
       <ffffffff801a37e2>{sys_ioctl+1006} <ffffffff80110b76>{tracesys+209}


Code: 0f 0b 36 cc 0b a0 ff ff ff ff 91 00 48 8b 7b 40 e8 b3 ff ff
RIP <ffffffffa00ba779>{:dm_mod:client_free_pages+12} RSP <000001001f4b1d18>
 <0>Kernel panic - not syncing: Oops


Here is how far the two lvremove processes made it:

[...]
stat("/dev/mapper/VolGroup00-LogVol00", {st_mode=S_IFBLK|0660,
st_rdev=makedev(253, 0), ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev/mapper", {st_mode=S_IFDIR|0755, st_size=220, ...}) = 0
lstat("/dev/VolGroup00", {st_mode=S_IFDIR|0700, st_size=80, ...}) = 0
lstat("/dev/mapper/VolGroup00-LogVol00", {st_mode=S_IFBLK|0660,
st_rdev=makedev(253, 0), ...}) = 0
lstat("/dev/VolGroup00/LogVol00", {st_mode=S_IFLNK|0777, st_size=31, ...}) = 0
stat("/dev/dm-2", {st_mode=S_IFBLK|0640, st_rdev=makedev(253, 2), ...}) = 0
stat("/dev/hda1", {st_mode=S_IFBLK|0660, st_rdev=makedev(3, 1), ...}) = 0
stat("/dev/ram1", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 1), ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev/ram1", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 1), ...}) = 0
lstat("/dev/ram", {st_mode=S_IFLNK|0777, st_size=4, ...}) = 0
stat("/dev/sda1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 1), ...}) = 0
stat("/dev/sdb1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 17), ...}) = 0
stat("/dev/sdc1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 33), ...}) = 0
stat("/dev/sdd1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 49), ...}) = 0
stat("/dev/sde1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 65), ...}) = 0
stat("/dev/sdf1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
stat("/dev/ramdisk", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 0), ...}) = 0
stat("/dev/mapper/VolGroup00-LogVol01", {st_mode=S_IFBLK|0660,
st_rdev=makedev(253, 1), ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev/mapper", {st_mode=S_IFDIR|0755, st_size=220, ...}) = 0
lstat("/dev/VolGroup00", {st_mode=S_IFDIR|0700, st_size=80, ...}) = 0
lstat("/dev/mapper/VolGroup00-LogVol01", {st_mode=S_IFBLK|0660,
st_rdev=makedev(253, 1), ...}) = 0
lstat("/dev/VolGroup00/LogVol01", {st_mode=S_IFLNK|0777, st_size=31, ...}) = 0
stat("/dev/ram0", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 0), ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev/ram0", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 0), ...}) = 0
lstat("/dev/ramdisk", {st_mode=S_IFLNK|0777, st_size=4, ...}) = 0
stat("/dev/root", {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 0), ...}) = 0
stat("/dev/dm-1", {st_mode=S_IFBLK|0640, st_rdev=makedev(253, 1), ...}) = 0
stat("/dev/hda2", {st_mode=S_IFBLK|0660, st_rdev=makedev(3, 2), ...}) = 0
stat("/dev/ram2", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 2), ...}) = 0
stat("/dev/ram8", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 8), ...}) = 0
stat("/dev/ram9", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 9), ...}) = 0
stat("/dev/dm-0", {st_mode=S_IFBLK|0640, st_rdev=makedev(253, 0), ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=5780, ...}) = 0
lstat("/dev/dm-0", {st_mode=S_IFBLK|0640, st_rdev=makedev(253, 0), ...}) = 0
lstat("/dev/root", {st_mode=S_IFLNK|0777, st_size=24, ...}) = 0
stat("/dev/ram15", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 15), ...}) = 0
stat("/dev/hda3", {st_mode=S_IFBLK|0660, st_rdev=makedev(3, 3), ...}) = 0
stat("/dev/ram3", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 3), ...}) = 0
stat("/dev/ram4", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 4), ...}) = 0
stat("/dev/ram7", {st_mode=S_IFBLK|0660, st_rdev=makedev(1, 7), ...}) = 0
stat("/dev/snapper/origin", {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 2),
...}) = 0
getpid()                                = 4412
stat("/etc/lvm/lvm.conf", {st_mode=S_IFREG|0644, st_size=10437, ...}) = 0
stat("/var/lock/lvm", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
access("/var/lock/lvm", R_OK|W_OK|X_OK) = 0
stat("/proc/lvm/VGs/snapper", 0x7fbfffb4c0) = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
open("/var/lock/lvm/V_snapper", O_RDWR|O_APPEND|O_CREAT, 0777) = 3
rt_sigaction(SIGINT, {0x44fd60, [INT], SA_RESTORER|SA_RESTART, 0x336f82e380},
{SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[INT RTMIN RT_1], NULL, 8) = 0
rt_sigaction(SIGINT, NULL, {0x44fd60, [INT], SA_RESTORER|SA_RESTART,
0x336f82e380}, 8) = 0
rt_sigaction(SIGINT, {0x44fd60, [INT], SA_RESTORER, 0x336f82e380}, NULL, 8) = 0
flock(3, LOCK_EX  



[...]
read(5, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
lseek(6, 0, SEEK_SET)                   = 0
read(6, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(6, 4096, SEEK_SET)                = 4096
read(6, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 512) = 512
lseek(6, 52736, SEEK_SET)               = 52736
read(6, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
lseek(7, 0, SEEK_SET)                   = 0
read(7, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(7, 4096, SEEK_SET)                = 4096
read(7, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 512) = 512
lseek(7, 52736, SEEK_SET)               = 52736
read(7, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
lseek(8, 0, SEEK_SET)                   = 0
read(8, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(8, 4096, SEEK_SET)                = 4096
read(8, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 2048) = 2048
lseek(8, 51200, SEEK_SET)               = 51200
read(8, "OQ-t2W0-qx5w-ty2x-LX7NVg\"\nstatus"..., 2048) = 2048
lseek(9, 0, SEEK_SET)                   = 0
read(9, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(9, 4096, SEEK_SET)                = 4096
read(9, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 512) = 512
lseek(9, 52736, SEEK_SET)               = 52736
read(9, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
lseek(9, 4096, SEEK_SET)                = 4096
read(9, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 512) = 512
lseek(9, 52736, SEEK_SET)               = 52736
read(9, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
lseek(9, 52736, SEEK_SET)               = 52736
read(9, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 2560) = 2560
lseek(4, 0, SEEK_SET)                   = 0
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(4, 4096, SEEK_SET)                = 4096
read(4, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 512) = 512
lseek(4, 52736, SEEK_SET)               = 52736
read(4, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
lseek(5, 0, SEEK_SET)                   = 0
read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(5, 4096, SEEK_SET)                = 4096
read(5, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 512) = 512
lseek(5, 52736, SEEK_SET)               = 52736
read(5, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
lseek(6, 0, SEEK_SET)                   = 0
read(6, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(6, 4096, SEEK_SET)                = 4096
read(6, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 512) = 512
lseek(6, 52736, SEEK_SET)               = 52736
read(6, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
lseek(7, 0, SEEK_SET)                   = 0
read(7, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(7, 4096, SEEK_SET)                = 4096
read(7, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 512) = 512
lseek(7, 52736, SEEK_SET)               = 52736
read(7, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
lseek(8, 0, SEEK_SET)                   = 0
read(8, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(8, 4096, SEEK_SET)                = 4096
read(8, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 2048) = 2048
lseek(8, 51200, SEEK_SET)               = 51200
read(8, "OQ-t2W0-qx5w-ty2x-LX7NVg\"\nstatus"..., 2048) = 2048
lseek(9, 0, SEEK_SET)                   = 0
read(9, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(9, 4096, SEEK_SET)                = 4096
read(9, "\17A\0335 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 512) = 512
lseek(9, 52736, SEEK_SET)               = 52736
read(9, "snapper {\nid = \"Vm8vhZ-i5oP-dStF"..., 512) = 512
ioctl(10, DM_DEV_STATUS, 0x6aa940)      = 0
mlockall(MCL_CURRENT|MCL_FUTURE)        = 0
brk(0x12bb000)                          = 0x12bb000
brk(0xabb000)                           = 0xabb000
getpriority(PRIO_PROCESS, 0)            = 20
setpriority(PRIO_PROCESS, 0, 4294967278


Version-Release number of selected component (if applicable):
[root@link-02 ~]# lvremove --version
  LVM version:     2.02.01 (2005-11-23)
  Library version: 1.02.02 (2006-01-04)
  Driver version:  4.5.0
[root@link-02 ~]# rpm -q device-mapper
device-mapper-1.02.02-3.0.RHEL4
[root@link-02 ~]# rpm -q lvm2
lvm2-2.02.01-1.3.RHEL4
[root@link-02 ~]# uname -ar
Linux link-02 2.6.9-29.EL #1 Wed Jan 25 11:18:11 EST 2006 x86_64 x86_64 x86_64
GNU/Linux

It was actually link-08 that paniced, but link-02 has that same installed rpm
versions.

Comment 1 Corey Marthaler 2006-02-02 19:27:17 UTC
Just a note that with mulitple snapshots, this is very reproducable.

Comment 2 Corey Marthaler 2006-02-02 23:53:12 UTC
I can reproduce this without the filesystem involved, (unless this is a
different bug).

I was running b_doio/b_iogen to an origin volume (no filesystems involved at
all). And while that I/O was going on, I created two snapshots and then
successfully deleted one, and then attempted to delete the other, which caused this.

slab error <1>Unable to handle kernel NULL pointer dereference at
0000000000000030 RIP:
<ffffffff8016a618>{mempool_free+11}
PML4 38196067 PGD 38b32067 PMD 0
Oops: 0000 [1]
CPU 0
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core
sunrpc ds yenta_socket pcmcia_core button battery ac ohci_hcd hw_random tg3
floppy dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx
scsi_transport_fc mptscsih mptsas mptspi mptfc mptscsi mptbase sd_mod scsi_mod
Pid: 4155, comm: lvremove Not tainted 2.6.9-29.EL
RIP: 0010:[<ffffffff8016a618>] <ffffffff8016a618>{mempool_free+11}
RSP: 0000:ffffffff804a4258  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000002000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000100357d9580
RBP: 0000000000000000 R08: 0000000000000000 R09: 00000100357d9580
R10: 0000000000000246 R11: ffffffffa00b9f2c R12: 00000100362381d0
R13: ffffffffa00ba87c R14: 0000000000000000 R15: 0000000000080000
FS:  0000002a955783e0(0000) GS:ffffffff80538980(0000) knlGS:00000000f7fdd6c0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000030 CR3: 0000000000101000 CR4: 00000000000006e0
Process lvremove (pid: 4155, threadinfo 00000100383dc000, task 00000100389280f0)
Stack: 0000000000000246 0000000000000246 0000000000000000 ffffffffa00b9f15
       0000010035fa3d80 0000010038ffe268 00000100390c4a00 ffffffffa00b9fcd
       00000100352598b8 ffffffffa00b5203
Call Trace:<IRQ> <ffffffffa00b9f15>{:dm_mod:dec_count+67}
<ffffffffa00b9fcd>{:dm_mod:endio+161}
       <ffffffffa00b5203>{:dm_mod:clone_endio+205}
<ffffffff8028785c>{__end_that_request_first+238}
       <ffffffffa0008914>{:scsi_mod:scsi_end_request+44}
<ffffffffa0008cf8>{:scsi_mod:scsi_io_completion+474}
       <ffffffffa000330c>{:scsi_mod:scsi_softirq+219}
<ffffffff8013faec>{__do_softirq+76}
       <ffffffff8013fb73>{do_softirq+49} <ffffffff80113c57>{do_IRQ+664}
       <ffffffff80110fdb>{ret_from_intr+0}  <EOI>
<ffffffff80196180>{bdev_destroy_inode+0}
       <ffffffff8027bac7>{serial_in+83}
<ffffffff8027e702>{serial8250_console_write+113}
       <ffffffffa00b8714>{:dm_mod:dev_remove+0}
<ffffffff80139e1f>{__call_console_drivers+68}
       <ffffffff8013a166>{release_console_sem+495} <ffffffff8013a5e6>{vprintk+873}
       <ffffffff8013a709>{printk+141} <ffffffff8016ff30>{kmem_freepages+192}
       <ffffffff80196180>{bdev_destroy_inode+0}
<ffffffff8016ff30>{kmem_freepages+192}
       <ffffffff80196180>{bdev_destroy_inode+0} <ffffffff80170cb2>{slab_destroy+115}
       <ffffffff80171314>{__cache_shrink+519}
<ffffffff8017155a>{kmem_cache_destroy+191}
       <ffffffffa00b9b1a>{:dm_mod:bio_set_exit+34}
<ffffffffa00b9ca8>{:dm_mod:resize_pool+61}
       <ffffffffa00b8714>{:dm_mod:dev_remove+0}
<ffffffffa00bb9cd>{:dm_mod:kcopyd_client_destroy+22}
       <ffffffffa010d72c>{:dm_snapshot:snapshot_dtr+242}
<ffffffff80216690>{kobject_release+0}
       <ffffffffa00b67b5>{:dm_mod:table_destroy+72}
<ffffffffa00b7fde>{:dm_mod:__hash_remove+124}
       <ffffffffa00b87ad>{:dm_mod:dev_remove+153}
<ffffffffa00b9a96>{:dm_mod:ctl_ioctl+599}
       <ffffffff801a37e2>{sys_ioctl+1006} <ffffffff80110a12>{system_call+126}


Code: 8b 46 30 39 46 34 0f 8d 3c 01 00 00 9c 8f 44 24 08 fa 48 81
RIP <ffffffff8016a618>{mempool_free+11} RSP <ffffffff804a4258>
CR2: 0000000000000030
 <0>Kernel panic - not syncing: Oops

Comment 3 Alasdair Kergon 2006-02-03 19:58:42 UTC
What's happening here is that the snapshot device gets removed while kcopyd is
still using it.

Possible fixes:

  Maintain a counter of the number of outstanding bios referencing the snapshot
structure and wait for it to drop to 0 before deletion;
[Disadvantage - still have to wait for the backlog of now-worthless I/O to be
generated and completed.  Advantage - fairly quick and straightforward to code.]

  Provide a mechanism to cancel outstanding kcopyd jobs referencing a device.
[Disadvantage - involves more work as the jobs that need cancelling are not
indexed anywhere yet.  Advantage - should avoid the long delays you can get when
running 'lvremove', as I/O can get discarded.]


Comment 7 Corey Marthaler 2006-02-24 16:33:16 UTC
This does appear to be fixed however with bz182935 appearing, I'm worried about
closing this bug just yet.

Comment 8 Alasdair Kergon 2006-02-24 16:41:20 UTC
I doubt they're related.

What I think is happening in general is that now that snapshots are usable at
last, you're able to stress them in ways that you couldn't before, and this is
finding new issues that used to be masked by the other problems.

Comment 10 Red Hat Bugzilla 2006-03-07 21:21:47 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html



Note You need to log in before you can comment on or make changes to this bug.