Bug 181574 - device-mapper mirror removal stuck on kcopyd_client_destroy (pvmove hangs)
Summary: device-mapper mirror removal stuck on kcopyd_client_destroy (pvmove hangs)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
high
medium
Target Milestone: ---
: ---
Assignee: Alasdair Kergon
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 168430
TreeView+ depends on / blocked
 
Reported: 2006-02-14 22:40 UTC by Corey Marthaler
Modified: 2007-11-30 22:07 UTC (History)
6 users (show)

Fixed In Version: RHSA-2006-0132
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-07 21:23:38 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2006:0132 0 qe-ready SHIPPED_LIVE Moderate: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 3 2006-03-09 16:31:00 UTC

Description Corey Marthaler 2006-02-14 22:40:27 UTC
Description of problem:
Mirror creation and deletion has been deadlocking.

strace of a mirror lvremove:
[...]
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = 0
mlockall(MCL_CURRENT|MCL_FUTURE)        = 0
brk(0xedb000)                           = 0xedb000
brk(0x10c3000)                          = 0x10c3000
getpriority(PRIO_PROCESS, 0)            = 20
setpriority(PRIO_PROCESS, 0, 4294967278) = 0
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = 0
ioctl(8, DM_TABLE_DEPS, 0x6b01e0)       = 0
ioctl(8, DM_TABLE_DEPS, 0x6c2230)       = 0
ioctl(8, DM_TABLE_DEPS, 0x6c2230)       = 0
ioctl(8, DM_TABLE_DEPS, 0x6c2230)       = 0
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = 0
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = 0
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = -1 ENXIO (No such device or address)
ioctl(8, DM_DEV_STATUS, 0x6b01e0)       = 0
ioctl(8, DM_DEV_REMOVE
[HANG]

strace of an lvscan during this deadlock:
[...]
stat("/dev/sde1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 65), ...}) = 0
stat("/dev/sde1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 65), ...}) = 0
open("/dev/sde1", O_RDONLY|O_DIRECT|O_NOATIME) = 3
fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 65), ...}) = 0
ioctl(3, BLKBSZGET, 0x687fe8)           = 0
lseek(3, 4096, SEEK_SET)                = 4096
read(3, "\301k\352\217 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 2048) = 2048
lseek(3, 6144, SEEK_SET)                = 6144
read(3, " = [\"ALLOCATABLE\"]\npe_start = 38"..., 2048) = 2048
close(3)                                = 0
stat("/dev/sdf1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
stat("/dev/sdf1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
open("/dev/sdf1", O_RDONLY|O_DIRECT|O_NOATIME) = 3
fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
ioctl(3, BLKBSZGET, 0x6880b8)           = 0
lseek(3, 0, SEEK_SET)                   = 0
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
close(3)                                = 0
stat("/proc/lvm/VGs/VG", 0x7fbfffb500)  = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
open("/var/lock/lvm/V_VG", O_RDWR|O_APPEND|O_CREAT, 0777) = 3
rt_sigaction(SIGINT, {0x44fd60, [INT], SA_RESTORER|SA_RESTART, 0x2a9579e380},
{SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[INT RTMIN RT_1], NULL, 8) = 0
rt_sigaction(SIGINT, NULL, {0x44fd60, [INT], SA_RESTORER|SA_RESTART,
0x2a9579e380}, 8) = 0
rt_sigaction(SIGINT, {0x44fd60, [INT], SA_RESTORER, 0x2a9579e380}, NULL, 8) = 0
flock(3, LOCK_SH
[HANG]


Version-Release number of selected component (if applicable):
[root@link-08 lvm]# uname -ar
Linux link-08 2.6.9-31.ELsmp #1 SMP Thu Feb 9 19:02:57 EST 2006 x86_64 x86_64
x86_64 GNU/Linux
[root@link-08 lvm]# rpm -q device-mapper
device-mapper-1.02.02-3.0.RHEL4
[root@link-08 lvm]# rpm -q lvm2
lvm2-2.02.01-1.3.RHEL4


How reproducible:
often

Comment 1 Alasdair Kergon 2006-02-15 16:28:58 UTC
With problems where a device-mapper operation is hanging in the kernel, I need
diagnostics similar to those requested on bug 179786, such as sysrq t for every
process on the machine, and dmsetup info -c etc.

Comment 3 Corey Marthaler 2006-02-15 21:11:48 UTC
This has nothing to do with mirror creation. The mirror gets created just fine,
from there though, any LVM cmd on that mirror, it's VG, or PVs, deadlock.

Comment 4 Corey Marthaler 2006-02-15 21:30:42 UTC
[root@link-08 lvm]# ps -ef | grep vg
root      7152  3709  0 09:49 pts/1    00:00:00 qarsh root link-08 vgchange -an
root      7154  7153  0 09:49 ?        00:00:00 vgchange -an
root      7429  7232  0 10:03 pts/2    00:00:00 grep vg


[root@link-08 lvm]# lsof -p 7154
COMMAND  PID USER   FD   TYPE DEVICE     SIZE    NODE NAME
lvm     7154 root  cwd    DIR  253,0     4096       2 /
lvm     7154 root  rtd    DIR  253,0     4096       2 /
lvm     7154 root  txt    REG  253,0   497688 4176867 /usr/sbin/lvm
lvm     7154 root  mem    REG  253,0 48524848   81922 /usr/lib/locale/locale-archive
lvm     7154 root  mem    REG  253,0   105080 1409026 /lib64/ld-2.3.4.so
lvm     7154 root  mem    REG  253,0  1493186 1409049 /lib64/tls/libc-2.3.4.so
lvm     7154 root  mem    REG  253,0    17943 1409051 /lib64/libdl-2.3.4.so
lvm     7154 root  mem    REG  253,0    64904 1409075 /lib64/libdevmapper.so.1.02
lvm     7154 root  mem    REG  253,0   229824 4172535 /usr/lib64/libreadline.so.4.3
lvm     7154 root  mem    REG  253,0    62504 1409260 /lib64/libselinux.so.1
lvm     7154 root  mem    REG  253,0  1018858 4167671 /usr/lib64/libncurses.so.5.4
lvm     7154 root    0u  IPv4  26789              TCP
link-08.lab.msp.redhat.com:34666->link-08.lab.msp.redhat.com:5010 (ESTABLISHED)
lvm     7154 root    1u  IPv4  26791              TCP
link-08.lab.msp.redhat.com:34667->link-08.lab.msp.redhat.com:5011 (ESTABLISHED)
lvm     7154 root    2u  IPv4  26793              TCP
link-08.lab.msp.redhat.com:34668->link-08.lab.msp.redhat.com:5012 (ESTABLISHED)
lvm     7154 root    3uR  REG  253,0        0  722964 /var/lock/lvm/V_mirror_6_9215
lvm     7154 root    4r   BLK    8,1            22662 /dev/sda1
lvm     7154 root    5r   BLK    8,2            22681 /dev/sda2
lvm     7154 root    6r   BLK   8,17            22798 /dev/sdb1
lvm     7154 root    7r   BLK   8,18            22811 /dev/sdb2
lvm     7154 root    8r   BLK   8,33            22923 /dev/sdc1
lvm     7154 root    9r   BLK   8,34            22933 /dev/sdc2
lvm     7154 root   10r   BLK   8,49            23046 /dev/sdd1
lvm     7154 root   11r   BLK   8,50            23064 /dev/sdd2
lvm     7154 root   12r   BLK   8,65            23185 /dev/sde1
lvm     7154 root   13r   BLK   8,66            23180 /dev/sde2
lvm     7154 root   14r   BLK   8,81            23308 /dev/sdf1
lvm     7154 root   15r   BLK   8,82            23327 /dev/sdf2
lvm     7154 root   16u   CHR  10,63             1258 /dev/mapper/control

[root@link-08 bin]# dmsetup info -c --noopencount
[HANG]

Feb 15 10:16:39 link-08 kernel: lvm           D 000001003c8075a8     0  7154  
7153                     (NOTLB)
Feb 15 10:16:39 link-08 kernel: 000001001bbb9cb8 0000000000000002
000001001f4d38e0 ffffffff0000006a
Feb 15 10:16:39 link-08 kernel:        000001001ab097f0 000000000000006a
0000010001709fc0 000000001bbb9c20
Feb 15 10:16:39 link-08 kernel:        000001001c9f4030 0000000000007f85
Feb 15 10:16:39 link-08 kernel: Call
Trace:<ffffffffa00ad764>{:dm_mod:dev_remove+0}
<ffffffffa00b0276>{:dm_mod:kcopyd_clie
nt_destroy+118}
Feb 15 10:16:39 link-08 kernel:       
<ffffffff80134dce>{autoremove_wake_function+0} <ffffffff80134dce>{autoremove_wake_f
unction+0}
Feb 15 10:16:39 link-08 kernel:       
<ffffffffa00f35cc>{:dm_mirror:mirror_dtr+101} <ffffffffa00ab7d9>{:dm_mod:table_dest
roy+72}
Feb 15 10:16:39 link-08 kernel:       
<ffffffffa00ad02e>{:dm_mod:__hash_remove+124} <ffffffffa00ad7fd>{:dm_mod:dev_remove
+153}
Feb 15 10:16:39 link-08 kernel:        <ffffffffa00aeae6>{:dm_mod:ctl_ioctl+599}
<ffffffff80189241>{sys_ioctl+853}
Feb 15 10:16:39 link-08 kernel:        <ffffffff801101c6>{system_call+126}
Feb 15 10:16:39 link-08 kernel: sshd          S 0000000000000008     0  7230  
2622  7232    7437  3638 (NOTLB)
Feb 15 10:16:39 link-08 kernel: 00000100177e3d78 0000000000000006
0000000000003fd0 00000100177e3d80
Feb 15 10:16:39 link-08 kernel:        000000d000000246 0000000000000246
0000000000000256 000000000000c780
Feb 15 10:16:39 link-08 kernel:        000001001ab097f0 0000000000000b91
Feb 15 10:16:39 link-08 kernel: Call Trace:<ffffffff80223e81>{tty_ldisc_try+60}
<ffffffff80305461>{schedule_timeout+101}
Feb 15 10:16:39 link-08 kernel:        <ffffffff80223fb4>{tty_ldisc_deref+103}
<ffffffff80189c3b>{do_select+939}
Feb 15 10:16:39 link-08 kernel:        <ffffffff801897d5>{__pollwait+0}
<ffffffff80189fba>{sys_select+820}
Feb 15 10:16:39 link-08 kernel:        <ffffffff8019154c>{dnotify_parent+34}
<ffffffff801101c6>{system_call+126}
Feb 15 10:16:39 link-08 kernel:
Feb 15 10:16:39 link-08 kernel: bash          S 0000007fbffff7b4     0  7232  
7230  7433               (NOTLB)
Feb 15 10:16:39 link-08 kernel: 000001001ac1beb8 0000000000000002
0000000100000246 0000000000000075
Feb 15 10:16:39 link-08 kernel:        000001001f27b030 0000000000000075
0000010020713cc0 0000000100000000
Feb 15 10:16:39 link-08 kernel:        000001001b2547f0 0000000000005d23
Feb 15 10:16:39 link-08 kernel: Call Trace:<ffffffff8013af23>{do_wait+3298}
<ffffffff801333a4>{default_wake_function+0}
Feb 15 10:16:39 link-08 kernel:       
<ffffffff801333a4>{default_wake_function+0} <ffffffff801101c6>{system_call+126}
Feb 15 10:16:39 link-08 kernel:
Feb 15 10:16:39 link-08 kernel: dmsetup       D 0000000000000000     0  7433  
7232                     (NOTLB)
Feb 15 10:16:39 link-08 kernel: 000001003b653d08 0000000000000006
0000000000000000 000001003a4af608
Feb 15 10:16:39 link-08 kernel:        0000000100000001 0000000000000000
0000002a983bf000 000000011f27b030
Feb 15 10:16:39 link-08 kernel:        000001001f27b030 00000000001046fc
Feb 15 10:16:39 link-08 kernel: Call
Trace:<ffffffff801a8ca7>{proc_root_lookup+49} <ffffffff8030562b>{__down_write+134}
Feb 15 10:16:39 link-08 kernel:       
<ffffffffa00ad150>{:dm_mod:list_devices+63} <ffffffff8016e9d4>{__vmalloc+216}
Feb 15 10:16:40 link-08 kernel:       
<ffffffffa00ad111>{:dm_mod:list_devices+0} <ffffffffa00aeae6>{:dm_mod:ctl_ioctl+599
}
Feb 15 10:16:40 link-08 kernel:        <ffffffff80189241>{sys_ioctl+853}
<ffffffff801101c6>{system_call+126}
Feb 15 10:16:40 link-08 kernel:
Feb 15 10:16:40 link-08 kernel: dmsetup       D 0000000000000000     0  7434  
3595                     (NOTLB)
Feb 15 10:16:40 link-08 kernel: 000001001befdd08 0000000000000002
0000000000000000 00000100177fa608
Feb 15 10:16:40 link-08 kernel:        0000000100000001 0000000000000000
0000002a983bf000 000000003a72a7f0
Feb 15 10:16:40 link-08 kernel:        000001003a72a7f0 0000000000150d32
Feb 15 10:16:40 link-08 kernel: Call
Trace:<ffffffff801a8ca7>{proc_root_lookup+49} <ffffffff8030562b>{__down_write+134}
Feb 15 10:16:40 link-08 kernel:       
<ffffffffa00ad150>{:dm_mod:list_devices+63} <ffffffff8016e9d4>{__vmalloc+216}
Feb 15 10:16:40 link-08 kernel:       
<ffffffffa00ad111>{:dm_mod:list_devices+0} <ffffffffa00aeae6>{:dm_mod:ctl_ioctl+599
}
Feb 15 10:16:40 link-08 kernel:        <ffffffff80189241>{sys_ioctl+853}
<ffffffff801101c6>{system_call+126}
Feb 15 10:16:40 link-08 kernel:


Comment 5 Jonathan Earl Brassow 2006-02-16 21:53:57 UTC
These are the rpms I'm using:

[root@tng1-1 ~]# rpm -q device-mapper
device-mapper-1.02.03-1.0.RHEL4
[root@tng1-1 ~]# rpm -q lvm2
lvm2-2.02.02-1.0.RHEL4
[root@tng1-1 ~]#

They seem to work fine.

Comment 7 Corey Marthaler 2006-02-16 23:06:43 UTC
on x86_64 with the lastest rpms, it's the same problem.

ioctl(8, DM_DEV_REMOVE

[root@link-08 ~]# rpm -q lvm2
lvm2-2.02.02-1.0.RHEL4
[root@link-08 ~]# rpm -q device-mapper
device-mapper-1.02.03-1.0.RHEL4


Comment 8 Jonathan Earl Brassow 2006-02-17 18:16:29 UTC
yup, just confirmed this bug.

I was using an earlier kernel (with patches) and the latest rpms, and it works
fine.  Switching to the new kernel causes lockups.  I'll see if I can figure out
what has changed.


Comment 9 Jonathan Earl Brassow 2006-02-17 20:44:19 UTC
There was a snapshot fix that involved tracking I/O's that go through kcopyd and
making sure all were completed before returning from kcopyd_client_destroy. 
This patch was introduced somewhere between 28.EL and 32.EL.  the mirror_dtr is
hanging on this.

Specifically, after 400 kcopyd_copy requests, and trying to remove the mirror, I
get:
Feb 17 15:46:58 tng1-1 kernel: Entering kcopyd_client_destroy -- nr_jobs == -400

Obviously, there is a problem incrementing the counter.  I'm looking for this.


Comment 10 Jonathan Earl Brassow 2006-02-17 23:50:02 UTC
it's the extra atomic_dec in segment_complete.  Either that has to be removed,
or it needs to be balanced with an atomic_inc in split_job.


Comment 11 Jonathan Earl Brassow 2006-02-18 00:34:58 UTC
switching severity to regression, because 'pvmove' will never work with this bug
in place.

Comment 12 Jonathan Earl Brassow 2006-02-18 00:38:23 UTC
patch posted to rhkernel

Comment 13 Alasdair Kergon 2006-02-19 17:20:07 UTC
Does setting mirror_region_size = 32 in the activation section of lvm.conf
workaround the problem?

Do you also see the problem with snapshots if you set a chunksize larger than 32KB?


Comment 16 Heinz Mauelshagen 2006-02-20 14:47:46 UTC
Alasdair,

because unloading dmraid mappings online is rare, this should be a minor issue.

Comment 17 Heinz Mauelshagen 2006-02-20 14:49:57 UTC
FYI on comment #13:

should theoretically be 64KB, because SUB_JOB_SIZE is defined as 128 sectors of
512 bytes in kcopyd.c

Comment 18 Alasdair Kergon 2006-02-20 15:30:02 UTC
But the test is < not <=.

Comment 19 Corey Marthaler 2006-02-20 17:41:25 UTC
Does setting mirror_region_size = 32 in the activation section of lvm.conf
workaround the problem?

It does appear to be a work around in this case.

Do you also see the problem with snapshots if you set a chunksize larger than 32KB?

Nope, snaps appear to work fine.

Comment 20 Heinz Mauelshagen 2006-02-20 19:02:27 UTC
Corey,
are you saying, you can delete snapshots with > 64KB chunk_size ok
*after* you wrote to either originor snapshot ?

Would be astonishing...

Comment 21 Corey Marthaler 2006-02-20 19:23:00 UTC
No one said anything about attempting to write anything earlier. :) 

Once I wrote to the snap and the origin, this issue is present when attempting
to remove.

Comment 22 Heinz Mauelshagen 2006-02-20 20:18:26 UTC
I was afraid so.

Kernel respin scheduled fo tonight anyway AFAIK.
Can you run a recursion with the new kernel tomorrow, please ?

Comment 23 Heinz Mauelshagen 2006-02-20 20:25:22 UTC
BTW: there's another minor issue with the kcopyd_client_destroy() functionality,
which *doesn't* cause a regression.

In case a client request a multi-segment copy, multiple wake_up() calls will be
carried out redundantly in run_complete_job() rather than just one.
Minor overhead not causing a functional problem.


Comment 30 Corey Marthaler 2006-02-21 21:21:09 UTC
fix verified.

Comment 32 Red Hat Bugzilla 2006-03-07 21:23:38 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html



Note You need to log in before you can comment on or make changes to this bug.