Description of problem: Mirror creation and deletion has been deadlocking. strace of a mirror lvremove: [...] ioctl(8, DM_DEV_STATUS, 0x6b01e0) = 0 mlockall(MCL_CURRENT|MCL_FUTURE) = 0 brk(0xedb000) = 0xedb000 brk(0x10c3000) = 0x10c3000 getpriority(PRIO_PROCESS, 0) = 20 setpriority(PRIO_PROCESS, 0, 4294967278) = 0 ioctl(8, DM_DEV_STATUS, 0x6b01e0) = 0 ioctl(8, DM_TABLE_DEPS, 0x6b01e0) = 0 ioctl(8, DM_TABLE_DEPS, 0x6c2230) = 0 ioctl(8, DM_TABLE_DEPS, 0x6c2230) = 0 ioctl(8, DM_TABLE_DEPS, 0x6c2230) = 0 ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = 0 ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = 0 ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = -1 ENXIO (No such device or address) ioctl(8, DM_DEV_STATUS, 0x6b01e0) = 0 ioctl(8, DM_DEV_REMOVE [HANG] strace of an lvscan during this deadlock: [...] stat("/dev/sde1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 65), ...}) = 0 stat("/dev/sde1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 65), ...}) = 0 open("/dev/sde1", O_RDONLY|O_DIRECT|O_NOATIME) = 3 fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 65), ...}) = 0 ioctl(3, BLKBSZGET, 0x687fe8) = 0 lseek(3, 4096, SEEK_SET) = 4096 read(3, "\301k\352\217 LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 2048) = 2048 lseek(3, 6144, SEEK_SET) = 6144 read(3, " = [\"ALLOCATABLE\"]\npe_start = 38"..., 2048) = 2048 close(3) = 0 stat("/dev/sdf1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0 stat("/dev/sdf1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0 open("/dev/sdf1", O_RDONLY|O_DIRECT|O_NOATIME) = 3 fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0 ioctl(3, BLKBSZGET, 0x6880b8) = 0 lseek(3, 0, SEEK_SET) = 0 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048 close(3) = 0 stat("/proc/lvm/VGs/VG", 0x7fbfffb500) = -1 ENOENT (No such file or directory) rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0 open("/var/lock/lvm/V_VG", O_RDWR|O_APPEND|O_CREAT, 0777) = 3 rt_sigaction(SIGINT, {0x44fd60, [INT], SA_RESTORER|SA_RESTART, 0x2a9579e380}, {SIG_DFL}, 8) = 0 rt_sigprocmask(SIG_SETMASK, ~[INT RTMIN RT_1], NULL, 8) = 0 rt_sigaction(SIGINT, NULL, {0x44fd60, [INT], SA_RESTORER|SA_RESTART, 0x2a9579e380}, 8) = 0 rt_sigaction(SIGINT, {0x44fd60, [INT], SA_RESTORER, 0x2a9579e380}, NULL, 8) = 0 flock(3, LOCK_SH [HANG] Version-Release number of selected component (if applicable): [root@link-08 lvm]# uname -ar Linux link-08 2.6.9-31.ELsmp #1 SMP Thu Feb 9 19:02:57 EST 2006 x86_64 x86_64 x86_64 GNU/Linux [root@link-08 lvm]# rpm -q device-mapper device-mapper-1.02.02-3.0.RHEL4 [root@link-08 lvm]# rpm -q lvm2 lvm2-2.02.01-1.3.RHEL4 How reproducible: often
With problems where a device-mapper operation is hanging in the kernel, I need diagnostics similar to those requested on bug 179786, such as sysrq t for every process on the machine, and dmsetup info -c etc.
This has nothing to do with mirror creation. The mirror gets created just fine, from there though, any LVM cmd on that mirror, it's VG, or PVs, deadlock.
[root@link-08 lvm]# ps -ef | grep vg root 7152 3709 0 09:49 pts/1 00:00:00 qarsh root link-08 vgchange -an root 7154 7153 0 09:49 ? 00:00:00 vgchange -an root 7429 7232 0 10:03 pts/2 00:00:00 grep vg [root@link-08 lvm]# lsof -p 7154 COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME lvm 7154 root cwd DIR 253,0 4096 2 / lvm 7154 root rtd DIR 253,0 4096 2 / lvm 7154 root txt REG 253,0 497688 4176867 /usr/sbin/lvm lvm 7154 root mem REG 253,0 48524848 81922 /usr/lib/locale/locale-archive lvm 7154 root mem REG 253,0 105080 1409026 /lib64/ld-2.3.4.so lvm 7154 root mem REG 253,0 1493186 1409049 /lib64/tls/libc-2.3.4.so lvm 7154 root mem REG 253,0 17943 1409051 /lib64/libdl-2.3.4.so lvm 7154 root mem REG 253,0 64904 1409075 /lib64/libdevmapper.so.1.02 lvm 7154 root mem REG 253,0 229824 4172535 /usr/lib64/libreadline.so.4.3 lvm 7154 root mem REG 253,0 62504 1409260 /lib64/libselinux.so.1 lvm 7154 root mem REG 253,0 1018858 4167671 /usr/lib64/libncurses.so.5.4 lvm 7154 root 0u IPv4 26789 TCP link-08.lab.msp.redhat.com:34666->link-08.lab.msp.redhat.com:5010 (ESTABLISHED) lvm 7154 root 1u IPv4 26791 TCP link-08.lab.msp.redhat.com:34667->link-08.lab.msp.redhat.com:5011 (ESTABLISHED) lvm 7154 root 2u IPv4 26793 TCP link-08.lab.msp.redhat.com:34668->link-08.lab.msp.redhat.com:5012 (ESTABLISHED) lvm 7154 root 3uR REG 253,0 0 722964 /var/lock/lvm/V_mirror_6_9215 lvm 7154 root 4r BLK 8,1 22662 /dev/sda1 lvm 7154 root 5r BLK 8,2 22681 /dev/sda2 lvm 7154 root 6r BLK 8,17 22798 /dev/sdb1 lvm 7154 root 7r BLK 8,18 22811 /dev/sdb2 lvm 7154 root 8r BLK 8,33 22923 /dev/sdc1 lvm 7154 root 9r BLK 8,34 22933 /dev/sdc2 lvm 7154 root 10r BLK 8,49 23046 /dev/sdd1 lvm 7154 root 11r BLK 8,50 23064 /dev/sdd2 lvm 7154 root 12r BLK 8,65 23185 /dev/sde1 lvm 7154 root 13r BLK 8,66 23180 /dev/sde2 lvm 7154 root 14r BLK 8,81 23308 /dev/sdf1 lvm 7154 root 15r BLK 8,82 23327 /dev/sdf2 lvm 7154 root 16u CHR 10,63 1258 /dev/mapper/control [root@link-08 bin]# dmsetup info -c --noopencount [HANG] Feb 15 10:16:39 link-08 kernel: lvm D 000001003c8075a8 0 7154 7153 (NOTLB) Feb 15 10:16:39 link-08 kernel: 000001001bbb9cb8 0000000000000002 000001001f4d38e0 ffffffff0000006a Feb 15 10:16:39 link-08 kernel: 000001001ab097f0 000000000000006a 0000010001709fc0 000000001bbb9c20 Feb 15 10:16:39 link-08 kernel: 000001001c9f4030 0000000000007f85 Feb 15 10:16:39 link-08 kernel: Call Trace:<ffffffffa00ad764>{:dm_mod:dev_remove+0} <ffffffffa00b0276>{:dm_mod:kcopyd_clie nt_destroy+118} Feb 15 10:16:39 link-08 kernel: <ffffffff80134dce>{autoremove_wake_function+0} <ffffffff80134dce>{autoremove_wake_f unction+0} Feb 15 10:16:39 link-08 kernel: <ffffffffa00f35cc>{:dm_mirror:mirror_dtr+101} <ffffffffa00ab7d9>{:dm_mod:table_dest roy+72} Feb 15 10:16:39 link-08 kernel: <ffffffffa00ad02e>{:dm_mod:__hash_remove+124} <ffffffffa00ad7fd>{:dm_mod:dev_remove +153} Feb 15 10:16:39 link-08 kernel: <ffffffffa00aeae6>{:dm_mod:ctl_ioctl+599} <ffffffff80189241>{sys_ioctl+853} Feb 15 10:16:39 link-08 kernel: <ffffffff801101c6>{system_call+126} Feb 15 10:16:39 link-08 kernel: sshd S 0000000000000008 0 7230 2622 7232 7437 3638 (NOTLB) Feb 15 10:16:39 link-08 kernel: 00000100177e3d78 0000000000000006 0000000000003fd0 00000100177e3d80 Feb 15 10:16:39 link-08 kernel: 000000d000000246 0000000000000246 0000000000000256 000000000000c780 Feb 15 10:16:39 link-08 kernel: 000001001ab097f0 0000000000000b91 Feb 15 10:16:39 link-08 kernel: Call Trace:<ffffffff80223e81>{tty_ldisc_try+60} <ffffffff80305461>{schedule_timeout+101} Feb 15 10:16:39 link-08 kernel: <ffffffff80223fb4>{tty_ldisc_deref+103} <ffffffff80189c3b>{do_select+939} Feb 15 10:16:39 link-08 kernel: <ffffffff801897d5>{__pollwait+0} <ffffffff80189fba>{sys_select+820} Feb 15 10:16:39 link-08 kernel: <ffffffff8019154c>{dnotify_parent+34} <ffffffff801101c6>{system_call+126} Feb 15 10:16:39 link-08 kernel: Feb 15 10:16:39 link-08 kernel: bash S 0000007fbffff7b4 0 7232 7230 7433 (NOTLB) Feb 15 10:16:39 link-08 kernel: 000001001ac1beb8 0000000000000002 0000000100000246 0000000000000075 Feb 15 10:16:39 link-08 kernel: 000001001f27b030 0000000000000075 0000010020713cc0 0000000100000000 Feb 15 10:16:39 link-08 kernel: 000001001b2547f0 0000000000005d23 Feb 15 10:16:39 link-08 kernel: Call Trace:<ffffffff8013af23>{do_wait+3298} <ffffffff801333a4>{default_wake_function+0} Feb 15 10:16:39 link-08 kernel: <ffffffff801333a4>{default_wake_function+0} <ffffffff801101c6>{system_call+126} Feb 15 10:16:39 link-08 kernel: Feb 15 10:16:39 link-08 kernel: dmsetup D 0000000000000000 0 7433 7232 (NOTLB) Feb 15 10:16:39 link-08 kernel: 000001003b653d08 0000000000000006 0000000000000000 000001003a4af608 Feb 15 10:16:39 link-08 kernel: 0000000100000001 0000000000000000 0000002a983bf000 000000011f27b030 Feb 15 10:16:39 link-08 kernel: 000001001f27b030 00000000001046fc Feb 15 10:16:39 link-08 kernel: Call Trace:<ffffffff801a8ca7>{proc_root_lookup+49} <ffffffff8030562b>{__down_write+134} Feb 15 10:16:39 link-08 kernel: <ffffffffa00ad150>{:dm_mod:list_devices+63} <ffffffff8016e9d4>{__vmalloc+216} Feb 15 10:16:40 link-08 kernel: <ffffffffa00ad111>{:dm_mod:list_devices+0} <ffffffffa00aeae6>{:dm_mod:ctl_ioctl+599 } Feb 15 10:16:40 link-08 kernel: <ffffffff80189241>{sys_ioctl+853} <ffffffff801101c6>{system_call+126} Feb 15 10:16:40 link-08 kernel: Feb 15 10:16:40 link-08 kernel: dmsetup D 0000000000000000 0 7434 3595 (NOTLB) Feb 15 10:16:40 link-08 kernel: 000001001befdd08 0000000000000002 0000000000000000 00000100177fa608 Feb 15 10:16:40 link-08 kernel: 0000000100000001 0000000000000000 0000002a983bf000 000000003a72a7f0 Feb 15 10:16:40 link-08 kernel: 000001003a72a7f0 0000000000150d32 Feb 15 10:16:40 link-08 kernel: Call Trace:<ffffffff801a8ca7>{proc_root_lookup+49} <ffffffff8030562b>{__down_write+134} Feb 15 10:16:40 link-08 kernel: <ffffffffa00ad150>{:dm_mod:list_devices+63} <ffffffff8016e9d4>{__vmalloc+216} Feb 15 10:16:40 link-08 kernel: <ffffffffa00ad111>{:dm_mod:list_devices+0} <ffffffffa00aeae6>{:dm_mod:ctl_ioctl+599 } Feb 15 10:16:40 link-08 kernel: <ffffffff80189241>{sys_ioctl+853} <ffffffff801101c6>{system_call+126} Feb 15 10:16:40 link-08 kernel:
These are the rpms I'm using: [root@tng1-1 ~]# rpm -q device-mapper device-mapper-1.02.03-1.0.RHEL4 [root@tng1-1 ~]# rpm -q lvm2 lvm2-2.02.02-1.0.RHEL4 [root@tng1-1 ~]# They seem to work fine.
They're somewhat hard to find... http://porkchop.devel.redhat.com/dist/4E-qu-candidate/device-mapper/1.02.03-1.0.RHEL4/SRPMS/device-mapper-1.02.03-1.0.RHEL4.src.rpm http://porkchop.devel.redhat.com/dist/4E-cluster-qu-candidate/lvm2/2.02.02-1.0.RHEL4/SRPMS/lvm2-2.02.02-1.0.RHEL4.src.rpm
on x86_64 with the lastest rpms, it's the same problem. ioctl(8, DM_DEV_REMOVE [root@link-08 ~]# rpm -q lvm2 lvm2-2.02.02-1.0.RHEL4 [root@link-08 ~]# rpm -q device-mapper device-mapper-1.02.03-1.0.RHEL4
yup, just confirmed this bug. I was using an earlier kernel (with patches) and the latest rpms, and it works fine. Switching to the new kernel causes lockups. I'll see if I can figure out what has changed.
There was a snapshot fix that involved tracking I/O's that go through kcopyd and making sure all were completed before returning from kcopyd_client_destroy. This patch was introduced somewhere between 28.EL and 32.EL. the mirror_dtr is hanging on this. Specifically, after 400 kcopyd_copy requests, and trying to remove the mirror, I get: Feb 17 15:46:58 tng1-1 kernel: Entering kcopyd_client_destroy -- nr_jobs == -400 Obviously, there is a problem incrementing the counter. I'm looking for this.
it's the extra atomic_dec in segment_complete. Either that has to be removed, or it needs to be balanced with an atomic_inc in split_job.
switching severity to regression, because 'pvmove' will never work with this bug in place.
patch posted to rhkernel
Does setting mirror_region_size = 32 in the activation section of lvm.conf workaround the problem? Do you also see the problem with snapshots if you set a chunksize larger than 32KB?
Alasdair, because unloading dmraid mappings online is rare, this should be a minor issue.
FYI on comment #13: should theoretically be 64KB, because SUB_JOB_SIZE is defined as 128 sectors of 512 bytes in kcopyd.c
But the test is < not <=.
Does setting mirror_region_size = 32 in the activation section of lvm.conf workaround the problem? It does appear to be a work around in this case. Do you also see the problem with snapshots if you set a chunksize larger than 32KB? Nope, snaps appear to work fine.
Corey, are you saying, you can delete snapshots with > 64KB chunk_size ok *after* you wrote to either originor snapshot ? Would be astonishing...
No one said anything about attempting to write anything earlier. :) Once I wrote to the snap and the origin, this issue is present when attempting to remove.
I was afraid so. Kernel respin scheduled fo tonight anyway AFAIK. Can you run a recursion with the new kernel tomorrow, please ?
BTW: there's another minor issue with the kcopyd_client_destroy() functionality, which *doesn't* cause a regression. In case a client request a multi-segment copy, multiple wake_up() calls will be carried out redundantly in run_complete_job() rather than just one. Minor overhead not causing a functional problem.
fix verified.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0132.html