Description of problem: I don't have a lot to go on right now, but I saw this hang trying to create a snapshot. I'll try and reproduce this and then add more info. [root@link-08 ~]# ps -C lvcreate PID TTY TIME CMD 5044 pts/0 00:00:01 lvcreate [root@link-08 ~]# lsof -p 5044 COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME lvcreate 5044 root cwd DIR 253,0 61440 3227649 /tmp lvcreate 5044 root rtd DIR 253,0 4096 2 / lvcreate 5044 root txt REG 253,0 497688 4167049 /usr/sbin/lvm lvcreate 5044 root mem REG 253,0 48524848 4165944 /usr/lib/locale/locale-archive lvcreate 5044 root mem REG 253,0 21546 4194946 /usr/lib64/gconv/gconv-modules.cache lvcreate 5044 root mem REG 253,0 105080 1409026 /lib64/ld-2.3.4.so lvcreate 5044 root mem REG 253,0 1493186 1409050 /lib64/tls/libc-2.3.4.so lvcreate 5044 root mem REG 253,0 17943 1409081 /lib64/libdl-2.3.4.so lvcreate 5044 root mem REG 253,0 64904 1409235 /lib64/libdevmapper.so.1.02 lvcreate 5044 root mem REG 253,0 229824 4166319 /usr/lib64/libreadline.so.4.3 lvcreate 5044 root mem REG 253,0 62504 1409151 /lib64/libselinux.so.1 lvcreate 5044 root mem REG 253,0 1018858 4168026 /usr/lib64/libncurses.so.5.4 lvcreate 5044 root 0u CHR 136,0 2 /dev/pts/0 lvcreate 5044 root 1u CHR 136,0 2 /dev/pts/0 lvcreate 5044 root 2u CHR 136,0 2 /dev/pts/0 lvcreate 5044 root 3u CHR 10,63 1250 /dev/mapper/control lvcreate 5044 root 4uW REG 253,0 0 722834 /var/lock/lvm/V_snapper lvcreate 5044 root 5u BLK 8,1 1101 /dev/sda1 Feb 2 10:19:13 link-08 kernel: lvcreate D ffffffff804df5c0 0 5044 5093 (NOTLB) Feb 2 10:19:13 link-08 kernel: 00000100345b7c88 0000000000000006 0000010019ec71d0 0000007400000246 Feb 2 10:19:13 link-08 kernel: 000001003542c570 00000000000004f2 0000010019ec71d0 000001003542c858 Feb 2 10:19:13 link-08 kernel: 00000021ea19da00 0000000000000206 Feb 2 10:19:13 link-08 kernel: Call Trace:<ffffffff80196417>{bdev_test+0} <ffffffff80196426>{bdev_set+0} Feb 2 10:19:13 link-08 kernel: <ffffffff801ae07b>{__wait_on_inode+113} <ffffffff80134bc6>{default_wake_function+0} Feb 2 10:19:13 link-08 kernel: <ffffffff80134bc6>{default_wake_function+0} <ffffffff801af77b>{iget5_locked+393} Feb 2 10:19:13 link-08 kernel: <ffffffffa00b8c63>{:dm_mod:dev_status+0} <ffffffff8019646d>{bdget+60} Feb 2 10:19:13 link-08 kernel: <ffffffffa00b8468>{:dm_mod:__dev_status+96} <ffffffffa00b8d55>{:dm_mod:dev_status+2 42} Feb 2 10:19:13 link-08 kernel: <ffffffffa00b9a96>{:dm_mod:ctl_ioctl+599} <ffffffff801a37e2>{sys_ioctl+1006} Feb 2 10:19:13 link-08 kernel: <ffffffff80110a12>{system_call+126} Version: [root@link-08 tmp]# uname -ar Linux link-08 2.6.9-29.EL #1 Wed Jan 25 11:18:11 EST 2006 x86_64 x86_64 x86_64 GNU/Linux [root@link-08 tmp]# rpm -q lvm2 lvm2-2.02.01-1.3.RHEL4
If an lvm2 process hangs like that happens again, another standard diagnostic to attempt is 'dmsetup info -c'. (And if that hangs too, 'dmsetup ls' for a list of devices then run 'dmsetup info -c' separately on each, like 'dmsetup info -c vg1-lvol0' to see which one is causing the hang.)
Sometimes adding --noopencount avoids the hang: 'dmsetup info -c --noopencount'.
It's hard to track this one down without more context - deducing exactly which ioctl caused the problem and what state the logical volumes were in at the time. If you see it again, also try to get a full task dump e.g. 'echo t > /proc/sysrq-trigger'. In short, the trace shows that the 'lvcreate' was waiting for a lock - we need to find out why whatever was holding it never released it. (We've seen similar problems before with dm multipath under low memory.)
And another diagnostice if you see it again: a listing of any dm* devices in the 'block' directory under the sysfs mountpoint.
Able to reproduce this now: strace: [...] ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = -1 ENXIO (No such device or address) ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = -1 ENXIO (No such device or address) ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = -1 ENXIO (No such device or address) ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = 0 ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = 0 ioctl(3, DM_TABLE_DEPS, 0x6a4ac0) = 0 ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = -1 ENXIO (No such device or address) ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = -1 ENXIO (No such device or address) ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = -1 ENXIO (No such device or address) ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = 0 ioctl(3, DM_TABLE_DEPS, 0x6a4ac0) = 0 ioctl(3, DM_DEV_STATUS, 0x6a4ac0) = 0 ioctl(3, DM_DEV_SUSPEND [root@link-08 ~]# ps -C lvcreate PID TTY TIME CMD 20215 pts/0 00:00:01 lvcreate [root@link-08 ~]# lsof -p 20215 COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME lvcreate 20215 root cwd DIR 253,2 4096 2 /mnt/origin lvcreate 20215 root rtd DIR 253,0 4096 2 / lvcreate 20215 root txt REG 253,0 497688 4167049 /usr/sbin/lvm lvcreate 20215 root mem REG 253,0 48524848 4165944 /usr/lib/locale/locale-archive lvcreate 20215 root mem REG 253,0 21546 4194946 /usr/lib64/gconv/gconv-modules.cache lvcreate 20215 root mem REG 253,0 105080 1409026 /lib64/ld-2.3.4.so lvcreate 20215 root mem REG 253,0 1493186 1409050 /lib64/tls/libc-2.3.4.so lvcreate 20215 root mem REG 253,0 17943 1409081 /lib64/libdl-2.3.4.so lvcreate 20215 root mem REG 253,0 64904 1409235 /lib64/libdevmapper.so.1.02 lvcreate 20215 root mem REG 253,0 229824 4166319 /usr/lib64/libreadline.so.4.3 lvcreate 20215 root mem REG 253,0 62504 1409151 /lib64/libselinux.so.1 lvcreate 20215 root mem REG 253,0 1018858 4168026 /usr/lib64/libncurses.so.5.4 lvcreate 20215 root 0u CHR 136,0 2 /dev/pts/0 lvcreate 20215 root 1u CHR 136,0 2 /dev/pts/0 lvcreate 20215 root 2u CHR 136,0 2 /dev/pts/0 lvcreate 20215 root 3u CHR 10,63 1340 /dev/mapper/control lvcreate 20215 root 4uW REG 253,0 0 722848 /var/lock/lvm/V_snapper lvcreate 20215 root 5u BLK 8,1 1131 /dev/sda1 [root@link-08 ~]# dmsetup info -c Name Maj Min Stat Open Targ Event UUID snapper-lvol4 253 17 L--w 0 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmlOlOhrGqU1H764pZ2B0lLpNflZVzcObs snapper-lvol5-cow 253 18 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmyupKyBdnGbdjDEPuv15YiiJXC1JTaIq4-cow snapper-lvol3 253 15 L--w 0 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmceFDtLfdZpSmOLhkJ5296fLYpEE1dsSY snapper-lvol2 253 13 L--w 0 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQAOS8eMMFm4svRFEv249XKm3Ti1HyFIK snapper-block_snap1-cow 253 4 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmkOwVujEF1zwNULkeXB3LmyQb0wxmFYTu-cow snapper-lvol0-cow 253 8 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmWPxz7Nsb5jhB7V3pX3JlnERlJh248zJY-cow snapper-lvol1 253 11 L--w 0 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmfLm6Fda5WTHrWhMfN27w31xxiukxndJV snapper-lvol6-cow 253 20 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgmi1LRbFmQ1BG28IdaH3846x4MSh7y4J7C-cow snapper-block_snap1 253 5 L--w 0 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmkOwVujEF1zwNULkeXB3LmyQb0wxmFYTu snapper-lvol0 253 9 L--w 0 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmWPxz7Nsb5jhB7V3pX3JlnERlJh248zJY snapper-lvol1-cow 253 10 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmfLm6Fda5WTHrWhMfN27w31xxiukxndJV-cow snapper-lvol7-cow 253 22 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgme7xnjQOdcLToqcJ25UHB6H3gVGofurxy-cow snapper-origin-real 253 3 L--w 12 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQkko3GWkfsOEhlYeFS1fPNZlzwQYT3Wi-real snapper-lvol2-cow 253 12 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQAOS8eMMFm4svRFEv249XKm3Ti1HyFIK-cow snapper-lvol8-cow 253 24 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMAxMOpVAthyy7BoK8R77QKxRmfArSIVQ-cow snapper-lvol8 253 25 -I-w 0 0 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMAxMOpVAthyy7BoK8R77QKxRmfArSIVQ VolGroup00-LogVol01 253 1 L--w 0 1 0 LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxFLlUlLBlQdoGxn84xAMvc9QnYz7u0DfS snapper-lvol3-cow 253 14 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmceFDtLfdZpSmOLhkJ5296fLYpEE1dsSY-cow snapper-lvol7 253 23 L--w 0 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgme7xnjQOdcLToqcJ25UHB6H3gVGofurxy snapper-fs_snap1-cow 253 6 L--w 1 1 2 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMammTsqrl02PyP5pcmkJIoajl7WG3tjJ-cow VolGroup00-LogVol00 253 0 L--w 1 1 0 LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxIn0DTEG9BvNPGlXNaP75R49PXzGI1YRg snapper-lvol6 253 21 L--w 0 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgmi1LRbFmQ1BG28IdaH3846x4MSh7y4J7C snapper-fs_snap1 253 7 L--w 0 1 2 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMammTsqrl02PyP5pcmkJIoajl7WG3tjJ snapper-lvol4-cow 253 16 L--w 1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmlOlOhrGqU1H764pZ2B0lLpNflZVzcObs-cow snapper-lvol5 253 19 L--w 0 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmyupKyBdnGbdjDEPuv15YiiJXC1JTaIq4 [HANG] [root@link-08 origin]# lvs [HANG] [root@link-08 ~]# dmsetup info -c --noopencount Name Maj Min Stat Open Targ Event UUID snapper-lvol4 253 17 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmlOlOhrGqU1H764pZ2B0lLpNflZVzcObs snapper-lvol5-cow 253 18 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmyupKyBdnGbdjDEPuv15YiiJXC1JTaIq4-cow snapper-lvol3 253 15 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmceFDtLfdZpSmOLhkJ5296fLYpEE1dsSY snapper-lvol2 253 13 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQAOS8eMMFm4svRFEv249XKm3Ti1HyFIK snapper-block_snap1-cow 253 4 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmkOwVujEF1zwNULkeXB3LmyQb0wxmFYTu-cow snapper-lvol0-cow 253 8 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmWPxz7Nsb5jhB7V3pX3JlnERlJh248zJY-cow snapper-lvol1 253 11 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmfLm6Fda5WTHrWhMfN27w31xxiukxndJV snapper-lvol6-cow 253 20 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgmi1LRbFmQ1BG28IdaH3846x4MSh7y4J7C-cow snapper-block_snap1 253 5 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmkOwVujEF1zwNULkeXB3LmyQb0wxmFYTu snapper-lvol0 253 9 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmWPxz7Nsb5jhB7V3pX3JlnERlJh248zJY snapper-lvol1-cow 253 10 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmfLm6Fda5WTHrWhMfN27w31xxiukxndJV-cow snapper-lvol7-cow 253 22 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgme7xnjQOdcLToqcJ25UHB6H3gVGofurxy-cow snapper-origin-real 253 3 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQkko3GWkfsOEhlYeFS1fPNZlzwQYT3Wi-real snapper-lvol2-cow 253 12 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQAOS8eMMFm4svRFEv249XKm3Ti1HyFIK-cow snapper-lvol8-cow 253 24 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMAxMOpVAthyy7BoK8R77QKxRmfArSIVQ-cow snapper-lvol8 253 25 -I-w -1 0 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMAxMOpVAthyy7BoK8R77QKxRmfArSIVQ VolGroup00-LogVol01 253 1 L--w -1 1 0 LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxFLlUlLBlQdoGxn84xAMvc9QnYz7u0DfS snapper-lvol3-cow 253 14 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmceFDtLfdZpSmOLhkJ5296fLYpEE1dsSY-cow snapper-lvol7 253 23 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgme7xnjQOdcLToqcJ25UHB6H3gVGofurxy snapper-fs_snap1-cow 253 6 L--w -1 1 2 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMammTsqrl02PyP5pcmkJIoajl7WG3tjJ-cow VolGroup00-LogVol00 253 0 L--w -1 1 0 LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxIn0DTEG9BvNPGlXNaP75R49PXzGI1YRg snapper-lvol6 253 21 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgmi1LRbFmQ1BG28IdaH3846x4MSh7y4J7C snapper-fs_snap1 253 7 L--w -1 1 2 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMammTsqrl02PyP5pcmkJIoajl7WG3tjJ snapper-lvol4-cow 253 16 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmlOlOhrGqU1H764pZ2B0lLpNflZVzcObs-cow snapper-lvol5 253 19 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmyupKyBdnGbdjDEPuv15YiiJXC1JTaIq4 snapper-origin 253 2 L--w -1 1 0 LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQkko3GWkfsOEhlYeFS1fPNZlzwQYT3Wi
This is the specific volume which is hung: [root@link-08 ~]# dmsetup info -c snapper-origin [HANG] I/O to the origin filesystem is also blocked as a result (or cause) of this.
Hit this again today and was able to grab the requested system info this time. I/O is hung to the volume snapper-origin, as well as an lvcreate snapshot attempt. [root@link-08 tmp]# dmsetup info -c --noopencount Name Maj Min Stat Open Targ Event UUID snapper-lvol4 253 17 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9qWuv8tmwaNPtBGgsU6pu9rzroMd1ujZK snapper-lvol5-cow 253 18 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9pyFPjLkCI9UdBW7qOdSmaDnoT9jWSNnm-cow snapper-lvol3 253 15 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9Q1InKR38AzvhtnjqcV3kOx7pEkxUK62y snapper-fs_snap3-cow 253 20 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9fKnZKose1Tr6EiAbk6sTWxt2l3Plt4kA-cow snapper-lvol2 253 13 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9CIBB000uydlF0K7bKI8DFh7RUAsksDbx snapper-block_snap1-cow 253 4 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ99WGc0yeL12yMYHqD4JeIh40XBYnluDav-cow snapper-lvol1 253 11 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9e7lJnUqGcqdc6owZNuN8UBESaHS8wDwi snapper-block_snap1 253 5 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ99WGc0yeL12yMYHqD4JeIh40XBYnluDav snapper-lvol1-cow 253 10 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9e7lJnUqGcqdc6owZNuN8UBESaHS8wDwi-cow snapper-32_1-cow 253 22 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ97fK86KrtBzQX6D49J0ROPvR9Tf3xLQPm-cow snapper-32_2 253 25 -Isw -1 0 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ911t00T8qWeYOVLAECF4yDD5Pq5M56V3a snapper-32_1 253 23 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ97fK86KrtBzQX6D49J0ROPvR9Tf3xLQPm snapper-origin-real 253 3 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9I3ybJcpSCq275KtOSVD0Lh8amqu6zU3W-real snapper-lvol2-cow 253 12 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9CIBB000uydlF0K7bKI8DFh7RUAsksDbx-cow snapper-32_2-cow 253 24 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ911t00T8qWeYOVLAECF4yDD5Pq5M56V3a-cow snapper-fs_snap3 253 21 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9fKnZKose1Tr6EiAbk6sTWxt2l3Plt4kA VolGroup00-LogVol01 253 1 L--w -1 1 0 LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxFLlUlLBlQdoGxn84xAMvc9QnYz7u0DfS snapper-lvol3-cow 253 14 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9Q1InKR38AzvhtnjqcV3kOx7pEkxUK62y-cow snapper-fs_snap1-cow 253 6 L-sw -1 1 2 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9T7WCxOm5Y6EPN6zr5X0wsBnlAZiMwa4m-cow snapper-fs_snap2 253 9 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9v0IDZleyHWkqZTmOV9FhJmdHxQUF7d2D VolGroup00-LogVol00 253 0 L--w -1 1 0 LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxIn0DTEG9BvNPGlXNaP75R49PXzGI1YRg snapper-fs_snap1 253 7 L-sw -1 1 2 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9T7WCxOm5Y6EPN6zr5X0wsBnlAZiMwa4m snapper-lvol4-cow 253 16 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9qWuv8tmwaNPtBGgsU6pu9rzroMd1ujZK-cow snapper-lvol5 253 19 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9pyFPjLkCI9UdBW7qOdSmaDnoT9jWSNnm snapper-origin 253 2 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9I3ybJcpSCq275KtOSVD0Lh8amqu6zU3W snapper-fs_snap2-cow 253 8 L-sw -1 1 0 LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9v0IDZleyHWkqZTmOV9FhJmdHxQUF7d2D-cow Feb 9 07:47:00 link-08 kernel: snapper.sh S 00000000fffffe00 0 6522 3873 7419 (NOTLB) Feb 9 07:47:00 link-08 kernel: 0000010028323eb8 0000000000000002 00000000006beccc 0000007c00000206 Feb 9 07:47:00 link-08 kernel: 0000010002296920 000000000001dc14 00000100355342b0 0000010002296c08 Feb 9 07:47:00 link-08 kernel: 0000010028323f58 ffffffff8036ed05 Feb 9 07:47:00 link-08 kernel: Call Trace:<ffffffff8013ea2f>{do_wait+3862} <ffffffff80134bc6>{default_wake_function+0} Feb 9 07:47:00 link-08 kernel: <ffffffff8014c455>{sys_rt_sigaction+133} <ffffffff80134bc6>{default_wake_function+0} Feb 9 07:47:00 link-08 kernel: <ffffffff80110a12>{system_call+126} Feb 9 07:47:00 link-08 kernel: kcopyd S 000001003f500a00 0 6594 3 7514 6204 (L-TLB) Feb 9 07:47:00 link-08 kernel: 000001003d695e98 0000000000000046 0000010033afb4f0 00000073a00ba824 Feb 9 07:47:00 link-08 kernel: 000001000a304130 00000000000288bf 000001000a3049a0 000001000a304418 Feb 9 07:47:00 link-08 kernel: 00000100103c9a48 00000000fffffffc Feb 9 07:47:00 link-08 kernel: Call Trace:<ffffffff8015012f>{worker_thread+0} <ffffffff80150205>{worker_thread+214} Feb 9 07:47:00 link-08 kernel: <ffffffff80134bc6>{default_wake_function+0} <ffffffff80134bc6>{default_wake_function+0} Feb 9 07:47:00 link-08 kernel: <ffffffff80155dcb>{kthread+188} <ffffffff8015012f>{worker_thread+0} Feb 9 07:47:00 link-08 kernel: <ffffffff801114bf>{child_rip+8} <ffffffff8015012f>{worker_thread+0} Feb 9 07:47:00 link-08 kernel: <ffffffff80155df4>{keventd_create_kthread+0} <ffffffff80155d0f>{kthread+0} Feb 9 07:47:00 link-08 kernel: <ffffffff801114b7>{child_rip+0} Feb 9 07:47:00 link-08 kernel: dd D 0000010037a0c500 0 7419 6522 7469 (NOTLB) Feb 9 07:47:00 link-08 kernel: 0000010013ab77c8 0000000000000006 00000100022d3420 ffffffff8028566a Feb 9 07:47:00 link-08 kernel: 00000100355342b0 000000000009668e ffffffff80420300 0000010035534598 Feb 9 07:47:00 link-08 kernel: 0000000000000002 0000010037e2aa00 Feb 9 07:47:00 link-08 kernel: Call Trace:<ffffffff8028566a>{__generic_unplug_device+19} <ffffffff80355ead>{io_schedule+15} Feb 9 07:47:00 link-08 kernel: <ffffffff8018e43a>{__lock_buffer+125} <ffffffff8018e36b>{bh_wake_function+0} Feb 9 07:47:00 link-08 kernel: <ffffffff8018e36b>{bh_wake_function+0} <ffffffff80191573>{__block_write_full_page+306} Feb 9 07:47:00 link-08 kernel: <ffffffff80195f63>{blkdev_get_block+0} <ffffffff801b9fca>{mpage_writepages+563} Feb 9 07:47:00 link-08 kernel: <ffffffff80196057>{blkdev_writepage+0} <ffffffff801dc8d4>{read_block_bitmap+50} Feb 9 07:47:00 link-08 kernel: <ffffffff8018fd50>{__find_get_block+377} <ffffffff80192509>{__getblk+43} Feb 9 07:47:00 link-08 kernel: <ffffffff801dc8d4>{read_block_bitmap+50} <ffffffff801dce52>{ext2_free_blocks+647} Feb 9 07:47:00 link-08 kernel: <ffffffff801b835f>{__writeback_single_inode+580} <ffffffff801b8fd7>{sync_sb_inodes+470} Feb 9 07:47:00 link-08 kernel: <ffffffff801b95d2>{writeback_inodes+500} <ffffffff8016d187>{balance_dirty_pages_ratelimited+205} Feb 9 07:47:00 link-08 kernel: <ffffffff80169040>{generic_file_buffered_write+1043} Feb 9 07:47:00 link-08 kernel: <ffffffff8013fa71>{current_fs_time+89} <ffffffff8016957f>{__generic_file_aio_write_nolock+734} Feb 9 07:47:00 link-08 kernel: <ffffffff80169651>{__generic_file_write_nolock+158} Feb 9 07:47:00 link-08 kernel: <ffffffff801366b4>{autoremove_wake_function+0} <ffffffff8021a825>{__clear_user+29} Feb 9 07:47:00 link-08 kernel: <ffffffff80169791>{generic_file_write+102} <ffffffff8018ceab>{vfs_write+207} Feb 9 07:47:00 link-08 kernel: <ffffffff8018cf93>{sys_write+69} <ffffffff80110a12>{system_call+126} Feb 9 07:47:00 link-08 kernel: Feb 9 07:47:00 link-08 kernel: lvcreate D ffffffff804df700 0 7469 6522 7419 (NOTLB) Feb 9 07:47:00 link-08 kernel: 0000010010bcdc88 0000000000000002 000001002834d750 0000007600000246 Feb 9 07:47:00 link-08 kernel: 0000010035535390 000000000000042c 000001002834d750 0000010035535678 Feb 9 07:47:00 link-08 kernel: 000000087a671400 0000000000000206 Feb 9 07:47:00 link-08 kernel: Call Trace:<ffffffff80196417>{bdev_test+0} <ffffffff80196426>{bdev_set+0} Feb 9 07:47:00 link-08 kernel: <ffffffff801ae07b>{__wait_on_inode+113} <ffffffff80134bc6>{default_wake_function+0} Feb 9 07:47:00 link-08 kernel: <ffffffff80134bc6>{default_wake_function+0} <ffffffff801af77b>{iget5_locked+393} Feb 9 07:47:00 link-08 kernel: <ffffffffa00b8c63>{:dm_mod:dev_status+0} <ffffffff8019646d>{bdget+60} Feb 9 07:47:00 link-08 kernel: <ffffffffa00b8468>{:dm_mod:__dev_status+96} <ffffffffa00b8d55>{:dm_mod:dev_status+242} Feb 9 07:47:00 link-08 kernel: <ffffffffa00b9a96>{:dm_mod:ctl_ioctl+599} <ffffffff801a37e2>{sys_ioctl+1006} Feb 9 07:47:00 link-08 kernel: <ffffffff80110a12>{system_call+126} Feb 9 07:47:00 link-08 kernel: pdflush R running task 0 7514 3 6594 (L-TLB) Feb 9 07:47:00 link-08 kernel: bash S 000009a702927dfb 0 7598 3078 7629 (NOTLB) Feb 9 07:47:00 link-08 kernel: 00000100191bbeb8 0000000000000002 00000000006c3f28 0000007500000202 Feb 9 07:47:00 link-08 kernel: 000001000a305210 00000000000234ab 0000010035534b20 000001000a3054f8 Feb 9 07:47:00 link-08 kernel: 00000100191bbf58 ffffffff8036ed05 Feb 9 07:47:00 link-08 kernel: Call Trace:<ffffffff8013ea2f>{do_wait+3862} <ffffffff80134bc6>{default_wake_function+0} Feb 9 07:47:00 link-08 kernel: <ffffffff80134bc6>{default_wake_function+0} <ffffffff80110a12>{system_call+126} Feb 9 07:47:00 link-08 kernel: Feb 9 07:47:00 link-08 kernel: dmsetup D ffffffff804df700 0 7635 3907 (NOTLB) Feb 9 07:47:00 link-08 kernel: 000001001e67bc88 0000000000000006 000001003405b000 ffffffff80260628 Feb 9 07:47:00 link-08 kernel: 0000010023669110 00000000000096b0 ffffffff80420300 00000100236693f8 Feb 9 07:47:00 link-08 kernel: 000001003405b000 ffffffff80260628 Feb 9 07:47:00 link-08 kernel: Call Trace:<ffffffff80260628>{n_tty_receive_buf+2677} <ffffffff80260628>{n_tty_receive_buf+2677} Feb 9 07:47:00 link-08 kernel: <ffffffff80196417>{bdev_test+0} <ffffffff80196426>{bdev_set+0} Feb 9 07:47:00 link-08 kernel: <ffffffff801ae07b>{__wait_on_inode+113} <ffffffff80134bc6>{default_wake_function+0} Feb 9 07:47:00 link-08 kernel: <ffffffff80134bc6>{default_wake_function+0} <ffffffff801af77b>{iget5_locked+393} Feb 9 07:47:00 link-08 kernel: <ffffffffa00b8c63>{:dm_mod:dev_status+0} <ffffffff8019646d>{bdget+60} Feb 9 07:47:00 link-08 kernel: <ffffffffa00b8468>{:dm_mod:__dev_status+96} <ffffffffa00b8d55>{:dm_mod:dev_status+242} Feb 9 07:47:00 link-08 kernel: <ffffffffa00b9a96>{:dm_mod:ctl_ioctl+599} <ffffffff801a37e2>{sys_ioctl+1006} Feb 9 07:47:00 link-08 kernel: <ffffffff80110a12>{system_call+126}
Great - a full stack trace of all the processes! I'll tidy up the output and try to work out what's happening...
The dmsetup and the lvcreate are both stuck at the same place in bdget. The only other relevant process (the dump has a lot of processes missing unfortunately) is a dd in io_schedule() in a write(). Will trace through the code to determine whether or not this process is also stuck or not.
I haven't spotted the problem - really need to see all the other processes. 'ps' including the wchan field, run a couple of times separated by a minute or so would show whether that 'dd' process was really stuck or not, and might show whether any other processes are involved. Or try the sysrq again and see if you can find a way to capture the start of its output, not just the end.
It is bad that iget is called while something is suspended. Iget can wait for I/O or allocate memory, so the deadlock is quite obvious. It needs to be moved somewhere else, probably to table construction. I'm looking at it.
So I looked at upstream code and the bug is real. When you have an origin and a snapshot and you create another snapshot, LVM calls suspend with DM_SUSPEND_LOCKFS_FLAG on the origin, then suspend with DM_SUSPEND_LOCKFS_FLAG on the existing snapshot, then suspend with DM_SUSPEND_LOCKFS_FLAG on the -real and -cow devices, then loads new table and then calls resume on -real, -cow, the snapshots and finally on the origin. So the inner suspend operations call bdget_disk and it might go into __wait_for_inode (waiting for the IO on the outer device), as seen in this trace.
Created attachment 309020 [details] First patch (doesn't change functionality)
Created attachment 309021 [details] fix the deadlock (in upstream code) This patch fixes the deadlock in upstream code. Load reference to bdev when we create mapped device and keep it until we destroy it --- so that we won't have to ask for the reference while we are suspended (which triggered the deadlock).
RHEL 4 is in maintenance, so this likely won't be fixed at all. It is already fixed in upstream and the fix won't be backported.