Bug 179786 - snapshot creation hang in __wait_on_inode
Summary: snapshot creation hang in __wait_on_inode
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: lvm2
Version: 4.0
Hardware: All
OS: Linux
medium
medium
Target Milestone: ---
: ---
Assignee: Mikuláš Patočka
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2006-02-02 22:12 UTC by Corey Marthaler
Modified: 2010-04-28 14:57 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-04-28 14:57:00 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
First patch (doesn't change functionality) (3.56 KB, patch)
2008-06-12 01:48 UTC, Mikuláš Patočka
no flags Details | Diff
fix the deadlock (in upstream code) (3.42 KB, patch)
2008-06-12 01:51 UTC, Mikuláš Patočka
no flags Details | Diff

Description Corey Marthaler 2006-02-02 22:12:08 UTC
Description of problem:
I don't have a lot to go on right now, but I saw this hang trying to create a
snapshot. I'll try and reproduce this and then add more info.

[root@link-08 ~]# ps -C lvcreate
  PID TTY          TIME CMD
 5044 pts/0    00:00:01 lvcreate

[root@link-08 ~]# lsof -p 5044
COMMAND   PID USER   FD   TYPE DEVICE     SIZE    NODE NAME
lvcreate 5044 root  cwd    DIR  253,0    61440 3227649 /tmp
lvcreate 5044 root  rtd    DIR  253,0     4096       2 /
lvcreate 5044 root  txt    REG  253,0   497688 4167049 /usr/sbin/lvm
lvcreate 5044 root  mem    REG  253,0 48524848 4165944
/usr/lib/locale/locale-archive
lvcreate 5044 root  mem    REG  253,0    21546 4194946
/usr/lib64/gconv/gconv-modules.cache
lvcreate 5044 root  mem    REG  253,0   105080 1409026 /lib64/ld-2.3.4.so
lvcreate 5044 root  mem    REG  253,0  1493186 1409050 /lib64/tls/libc-2.3.4.so
lvcreate 5044 root  mem    REG  253,0    17943 1409081 /lib64/libdl-2.3.4.so
lvcreate 5044 root  mem    REG  253,0    64904 1409235 /lib64/libdevmapper.so.1.02
lvcreate 5044 root  mem    REG  253,0   229824 4166319 /usr/lib64/libreadline.so.4.3
lvcreate 5044 root  mem    REG  253,0    62504 1409151 /lib64/libselinux.so.1
lvcreate 5044 root  mem    REG  253,0  1018858 4168026 /usr/lib64/libncurses.so.5.4
lvcreate 5044 root    0u   CHR  136,0                2 /dev/pts/0
lvcreate 5044 root    1u   CHR  136,0                2 /dev/pts/0
lvcreate 5044 root    2u   CHR  136,0                2 /dev/pts/0
lvcreate 5044 root    3u   CHR  10,63             1250 /dev/mapper/control
lvcreate 5044 root    4uW  REG  253,0        0  722834 /var/lock/lvm/V_snapper
lvcreate 5044 root    5u   BLK    8,1             1101 /dev/sda1


Feb  2 10:19:13 link-08 kernel: lvcreate      D ffffffff804df5c0     0  5044  
5093                     (NOTLB)
Feb  2 10:19:13 link-08 kernel: 00000100345b7c88 0000000000000006
0000010019ec71d0 0000007400000246
Feb  2 10:19:13 link-08 kernel:        000001003542c570 00000000000004f2
0000010019ec71d0 000001003542c858
Feb  2 10:19:13 link-08 kernel:        00000021ea19da00 0000000000000206
Feb  2 10:19:13 link-08 kernel: Call Trace:<ffffffff80196417>{bdev_test+0}
<ffffffff80196426>{bdev_set+0}
Feb  2 10:19:13 link-08 kernel:        <ffffffff801ae07b>{__wait_on_inode+113}
<ffffffff80134bc6>{default_wake_function+0}

Feb  2 10:19:13 link-08 kernel:       
<ffffffff80134bc6>{default_wake_function+0} <ffffffff801af77b>{iget5_locked+393}
Feb  2 10:19:13 link-08 kernel:        <ffffffffa00b8c63>{:dm_mod:dev_status+0}
<ffffffff8019646d>{bdget+60}
Feb  2 10:19:13 link-08 kernel:       
<ffffffffa00b8468>{:dm_mod:__dev_status+96} <ffffffffa00b8d55>{:dm_mod:dev_status+2
42}
Feb  2 10:19:13 link-08 kernel:        <ffffffffa00b9a96>{:dm_mod:ctl_ioctl+599}
<ffffffff801a37e2>{sys_ioctl+1006}
Feb  2 10:19:13 link-08 kernel:        <ffffffff80110a12>{system_call+126}


Version:
[root@link-08 tmp]# uname -ar
Linux link-08 2.6.9-29.EL #1 Wed Jan 25 11:18:11 EST 2006 x86_64 x86_64 x86_64
GNU/Linux
[root@link-08 tmp]# rpm -q lvm2
lvm2-2.02.01-1.3.RHEL4

Comment 1 Alasdair Kergon 2006-02-03 14:54:29 UTC
If an lvm2 process hangs like that happens again, another standard diagnostic to
attempt is 'dmsetup info -c'.  (And if that hangs too, 'dmsetup ls' for a list
of devices then run 'dmsetup info -c' separately on each, like 'dmsetup info -c
vg1-lvol0' to see which one is causing the hang.)

Comment 2 Alasdair Kergon 2006-02-03 14:57:35 UTC
Sometimes adding --noopencount avoids the hang:  'dmsetup info -c --noopencount'.

Comment 3 Alasdair Kergon 2006-02-03 15:11:31 UTC
It's hard to track this one down without more context - deducing exactly which
ioctl caused the problem and what state the logical volumes were in at the time.

If you see it again, also try to get a full task dump e.g. 'echo t >
/proc/sysrq-trigger'.


In short, the trace shows that the 'lvcreate' was waiting for a lock - we need
to find out why whatever was holding it never released it.  (We've seen similar
problems before with dm multipath under low memory.)


Comment 4 Alasdair Kergon 2006-02-03 15:18:32 UTC
And another diagnostice if you see it again: a listing of any dm* devices in the
'block' directory under the sysfs mountpoint.

Comment 5 Corey Marthaler 2006-02-07 15:41:22 UTC
Able to reproduce this now:

strace:
[...]
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = -1 ENXIO (No such device or address)
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = -1 ENXIO (No such device or address)
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = -1 ENXIO (No such device or address)
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = 0
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = 0
ioctl(3, DM_TABLE_DEPS, 0x6a4ac0)       = 0
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = -1 ENXIO (No such device or address)
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = -1 ENXIO (No such device or address)
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = -1 ENXIO (No such device or address)
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = 0
ioctl(3, DM_TABLE_DEPS, 0x6a4ac0)       = 0
ioctl(3, DM_DEV_STATUS, 0x6a4ac0)       = 0
ioctl(3, DM_DEV_SUSPEND


[root@link-08 ~]# ps -C lvcreate
  PID TTY          TIME CMD
20215 pts/0    00:00:01 lvcreate
[root@link-08 ~]# lsof -p 20215
COMMAND    PID USER   FD   TYPE DEVICE     SIZE    NODE NAME
lvcreate 20215 root  cwd    DIR  253,2     4096       2 /mnt/origin
lvcreate 20215 root  rtd    DIR  253,0     4096       2 /
lvcreate 20215 root  txt    REG  253,0   497688 4167049 /usr/sbin/lvm
lvcreate 20215 root  mem    REG  253,0 48524848 4165944
/usr/lib/locale/locale-archive
lvcreate 20215 root  mem    REG  253,0    21546 4194946
/usr/lib64/gconv/gconv-modules.cache
lvcreate 20215 root  mem    REG  253,0   105080 1409026 /lib64/ld-2.3.4.so
lvcreate 20215 root  mem    REG  253,0  1493186 1409050 /lib64/tls/libc-2.3.4.so
lvcreate 20215 root  mem    REG  253,0    17943 1409081 /lib64/libdl-2.3.4.so
lvcreate 20215 root  mem    REG  253,0    64904 1409235 /lib64/libdevmapper.so.1.02
lvcreate 20215 root  mem    REG  253,0   229824 4166319
/usr/lib64/libreadline.so.4.3
lvcreate 20215 root  mem    REG  253,0    62504 1409151 /lib64/libselinux.so.1
lvcreate 20215 root  mem    REG  253,0  1018858 4168026 /usr/lib64/libncurses.so.5.4
lvcreate 20215 root    0u   CHR  136,0                2 /dev/pts/0
lvcreate 20215 root    1u   CHR  136,0                2 /dev/pts/0
lvcreate 20215 root    2u   CHR  136,0                2 /dev/pts/0
lvcreate 20215 root    3u   CHR  10,63             1340 /dev/mapper/control
lvcreate 20215 root    4uW  REG  253,0        0  722848 /var/lock/lvm/V_snapper
lvcreate 20215 root    5u   BLK    8,1             1131 /dev/sda1


[root@link-08 ~]# dmsetup info -c
Name             Maj Min Stat Open Targ Event  UUID
snapper-lvol4    253  17 L--w    0    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmlOlOhrGqU1H764pZ2B0lLpNflZVzcObs
snapper-lvol5-cow 253  18 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmyupKyBdnGbdjDEPuv15YiiJXC1JTaIq4-cow
snapper-lvol3    253  15 L--w    0    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmceFDtLfdZpSmOLhkJ5296fLYpEE1dsSY
snapper-lvol2    253  13 L--w    0    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQAOS8eMMFm4svRFEv249XKm3Ti1HyFIK
snapper-block_snap1-cow 253   4 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmkOwVujEF1zwNULkeXB3LmyQb0wxmFYTu-cow
snapper-lvol0-cow 253   8 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmWPxz7Nsb5jhB7V3pX3JlnERlJh248zJY-cow
snapper-lvol1    253  11 L--w    0    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmfLm6Fda5WTHrWhMfN27w31xxiukxndJV
snapper-lvol6-cow 253  20 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgmi1LRbFmQ1BG28IdaH3846x4MSh7y4J7C-cow
snapper-block_snap1 253   5 L--w    0    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmkOwVujEF1zwNULkeXB3LmyQb0wxmFYTu
snapper-lvol0    253   9 L--w    0    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmWPxz7Nsb5jhB7V3pX3JlnERlJh248zJY
snapper-lvol1-cow 253  10 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmfLm6Fda5WTHrWhMfN27w31xxiukxndJV-cow
snapper-lvol7-cow 253  22 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgme7xnjQOdcLToqcJ25UHB6H3gVGofurxy-cow
snapper-origin-real 253   3 L--w   12    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQkko3GWkfsOEhlYeFS1fPNZlzwQYT3Wi-real
snapper-lvol2-cow 253  12 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQAOS8eMMFm4svRFEv249XKm3Ti1HyFIK-cow
snapper-lvol8-cow 253  24 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMAxMOpVAthyy7BoK8R77QKxRmfArSIVQ-cow
snapper-lvol8    253  25 -I-w    0    0      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMAxMOpVAthyy7BoK8R77QKxRmfArSIVQ
VolGroup00-LogVol01 253   1 L--w    0    1      0
LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxFLlUlLBlQdoGxn84xAMvc9QnYz7u0DfS
snapper-lvol3-cow 253  14 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmceFDtLfdZpSmOLhkJ5296fLYpEE1dsSY-cow
snapper-lvol7    253  23 L--w    0    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgme7xnjQOdcLToqcJ25UHB6H3gVGofurxy
snapper-fs_snap1-cow 253   6 L--w    1    1      2
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMammTsqrl02PyP5pcmkJIoajl7WG3tjJ-cow
VolGroup00-LogVol00 253   0 L--w    1    1      0
LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxIn0DTEG9BvNPGlXNaP75R49PXzGI1YRg
snapper-lvol6    253  21 L--w    0    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgmi1LRbFmQ1BG28IdaH3846x4MSh7y4J7C
snapper-fs_snap1 253   7 L--w    0    1      2
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMammTsqrl02PyP5pcmkJIoajl7WG3tjJ
snapper-lvol4-cow 253  16 L--w    1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmlOlOhrGqU1H764pZ2B0lLpNflZVzcObs-cow
snapper-lvol5    253  19 L--w    0    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmyupKyBdnGbdjDEPuv15YiiJXC1JTaIq4
[HANG]


[root@link-08 origin]# lvs
[HANG]


[root@link-08 ~]# dmsetup info -c --noopencount
Name             Maj Min Stat Open Targ Event  UUID
snapper-lvol4    253  17 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmlOlOhrGqU1H764pZ2B0lLpNflZVzcObs
snapper-lvol5-cow 253  18 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmyupKyBdnGbdjDEPuv15YiiJXC1JTaIq4-cow
snapper-lvol3    253  15 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmceFDtLfdZpSmOLhkJ5296fLYpEE1dsSY
snapper-lvol2    253  13 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQAOS8eMMFm4svRFEv249XKm3Ti1HyFIK
snapper-block_snap1-cow 253   4 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmkOwVujEF1zwNULkeXB3LmyQb0wxmFYTu-cow
snapper-lvol0-cow 253   8 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmWPxz7Nsb5jhB7V3pX3JlnERlJh248zJY-cow
snapper-lvol1    253  11 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmfLm6Fda5WTHrWhMfN27w31xxiukxndJV
snapper-lvol6-cow 253  20 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgmi1LRbFmQ1BG28IdaH3846x4MSh7y4J7C-cow
snapper-block_snap1 253   5 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmkOwVujEF1zwNULkeXB3LmyQb0wxmFYTu
snapper-lvol0    253   9 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmWPxz7Nsb5jhB7V3pX3JlnERlJh248zJY
snapper-lvol1-cow 253  10 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmfLm6Fda5WTHrWhMfN27w31xxiukxndJV-cow
snapper-lvol7-cow 253  22 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgme7xnjQOdcLToqcJ25UHB6H3gVGofurxy-cow
snapper-origin-real 253   3 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQkko3GWkfsOEhlYeFS1fPNZlzwQYT3Wi-real
snapper-lvol2-cow 253  12 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQAOS8eMMFm4svRFEv249XKm3Ti1HyFIK-cow
snapper-lvol8-cow 253  24 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMAxMOpVAthyy7BoK8R77QKxRmfArSIVQ-cow
snapper-lvol8    253  25 -I-w   -1    0      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMAxMOpVAthyy7BoK8R77QKxRmfArSIVQ
VolGroup00-LogVol01 253   1 L--w   -1    1      0
LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxFLlUlLBlQdoGxn84xAMvc9QnYz7u0DfS
snapper-lvol3-cow 253  14 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmceFDtLfdZpSmOLhkJ5296fLYpEE1dsSY-cow
snapper-lvol7    253  23 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgme7xnjQOdcLToqcJ25UHB6H3gVGofurxy
snapper-fs_snap1-cow 253   6 L--w   -1    1      2
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMammTsqrl02PyP5pcmkJIoajl7WG3tjJ-cow
VolGroup00-LogVol00 253   0 L--w   -1    1      0
LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxIn0DTEG9BvNPGlXNaP75R49PXzGI1YRg
snapper-lvol6    253  21 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8Vgmi1LRbFmQ1BG28IdaH3846x4MSh7y4J7C
snapper-fs_snap1 253   7 L--w   -1    1      2
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmMammTsqrl02PyP5pcmkJIoajl7WG3tjJ
snapper-lvol4-cow 253  16 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmlOlOhrGqU1H764pZ2B0lLpNflZVzcObs-cow
snapper-lvol5    253  19 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmyupKyBdnGbdjDEPuv15YiiJXC1JTaIq4
snapper-origin   253   2 L--w   -1    1      0
LVM-OkcP6KxlqTe2MggyCgHCkKsovr2p8VgmQkko3GWkfsOEhlYeFS1fPNZlzwQYT3Wi


Comment 6 Corey Marthaler 2006-02-07 15:48:17 UTC
This is the specific volume which is hung:
[root@link-08 ~]# dmsetup info -c snapper-origin
[HANG]                                                               

I/O to the origin filesystem is also blocked as a result (or cause) of this.



Comment 7 Corey Marthaler 2006-02-09 19:06:40 UTC
Hit this again today and was able to grab the requested system info this time.

I/O is hung to the volume snapper-origin, as well as an lvcreate snapshot attempt.

[root@link-08 tmp]# dmsetup info -c --noopencount
Name             Maj Min Stat Open Targ Event  UUID
snapper-lvol4    253  17 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9qWuv8tmwaNPtBGgsU6pu9rzroMd1ujZK
snapper-lvol5-cow 253  18 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9pyFPjLkCI9UdBW7qOdSmaDnoT9jWSNnm-cow
snapper-lvol3    253  15 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9Q1InKR38AzvhtnjqcV3kOx7pEkxUK62y
snapper-fs_snap3-cow 253  20 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9fKnZKose1Tr6EiAbk6sTWxt2l3Plt4kA-cow
snapper-lvol2    253  13 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9CIBB000uydlF0K7bKI8DFh7RUAsksDbx
snapper-block_snap1-cow 253   4 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ99WGc0yeL12yMYHqD4JeIh40XBYnluDav-cow
snapper-lvol1    253  11 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9e7lJnUqGcqdc6owZNuN8UBESaHS8wDwi
snapper-block_snap1 253   5 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ99WGc0yeL12yMYHqD4JeIh40XBYnluDav
snapper-lvol1-cow 253  10 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9e7lJnUqGcqdc6owZNuN8UBESaHS8wDwi-cow
snapper-32_1-cow 253  22 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ97fK86KrtBzQX6D49J0ROPvR9Tf3xLQPm-cow
snapper-32_2     253  25 -Isw   -1    0      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ911t00T8qWeYOVLAECF4yDD5Pq5M56V3a
snapper-32_1     253  23 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ97fK86KrtBzQX6D49J0ROPvR9Tf3xLQPm
snapper-origin-real 253   3 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9I3ybJcpSCq275KtOSVD0Lh8amqu6zU3W-real
snapper-lvol2-cow 253  12 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9CIBB000uydlF0K7bKI8DFh7RUAsksDbx-cow
snapper-32_2-cow 253  24 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ911t00T8qWeYOVLAECF4yDD5Pq5M56V3a-cow
snapper-fs_snap3 253  21 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9fKnZKose1Tr6EiAbk6sTWxt2l3Plt4kA
VolGroup00-LogVol01 253   1 L--w   -1    1      0
LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxFLlUlLBlQdoGxn84xAMvc9QnYz7u0DfS
snapper-lvol3-cow 253  14 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9Q1InKR38AzvhtnjqcV3kOx7pEkxUK62y-cow
snapper-fs_snap1-cow 253   6 L-sw   -1    1      2
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9T7WCxOm5Y6EPN6zr5X0wsBnlAZiMwa4m-cow
snapper-fs_snap2 253   9 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9v0IDZleyHWkqZTmOV9FhJmdHxQUF7d2D
VolGroup00-LogVol00 253   0 L--w   -1    1      0
LVM-Jiq2dT75ikN4hqkfXJZKX3KUN46xUvkxIn0DTEG9BvNPGlXNaP75R49PXzGI1YRg
snapper-fs_snap1 253   7 L-sw   -1    1      2
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9T7WCxOm5Y6EPN6zr5X0wsBnlAZiMwa4m
snapper-lvol4-cow 253  16 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9qWuv8tmwaNPtBGgsU6pu9rzroMd1ujZK-cow
snapper-lvol5    253  19 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9pyFPjLkCI9UdBW7qOdSmaDnoT9jWSNnm
snapper-origin   253   2 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9I3ybJcpSCq275KtOSVD0Lh8amqu6zU3W
snapper-fs_snap2-cow 253   8 L-sw   -1    1      0
LVM-h3p5EaVAY4b32zKBy3Il63tJoMoINqZ9v0IDZleyHWkqZTmOV9FhJmdHxQUF7d2D-cow


Feb  9 07:47:00 link-08 kernel: snapper.sh    S 00000000fffffe00     0  6522  
3873  7419               (NOTLB)
Feb  9 07:47:00 link-08 kernel: 0000010028323eb8 0000000000000002
00000000006beccc 0000007c00000206
Feb  9 07:47:00 link-08 kernel:        0000010002296920 000000000001dc14
00000100355342b0 0000010002296c08
Feb  9 07:47:00 link-08 kernel:        0000010028323f58 ffffffff8036ed05
Feb  9 07:47:00 link-08 kernel: Call Trace:<ffffffff8013ea2f>{do_wait+3862}
<ffffffff80134bc6>{default_wake_function+0}
Feb  9 07:47:00 link-08 kernel:        <ffffffff8014c455>{sys_rt_sigaction+133}
<ffffffff80134bc6>{default_wake_function+0}
Feb  9 07:47:00 link-08 kernel:        <ffffffff80110a12>{system_call+126}
Feb  9 07:47:00 link-08 kernel: kcopyd        S 000001003f500a00     0  6594   
  3          7514  6204 (L-TLB)
Feb  9 07:47:00 link-08 kernel: 000001003d695e98 0000000000000046
0000010033afb4f0 00000073a00ba824
Feb  9 07:47:00 link-08 kernel:        000001000a304130 00000000000288bf
000001000a3049a0 000001000a304418
Feb  9 07:47:00 link-08 kernel:        00000100103c9a48 00000000fffffffc
Feb  9 07:47:00 link-08 kernel: Call Trace:<ffffffff8015012f>{worker_thread+0}
<ffffffff80150205>{worker_thread+214}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff80134bc6>{default_wake_function+0}
<ffffffff80134bc6>{default_wake_function+0}
Feb  9 07:47:00 link-08 kernel:        <ffffffff80155dcb>{kthread+188}
<ffffffff8015012f>{worker_thread+0}
Feb  9 07:47:00 link-08 kernel:        <ffffffff801114bf>{child_rip+8}
<ffffffff8015012f>{worker_thread+0}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff80155df4>{keventd_create_kthread+0} <ffffffff80155d0f>{kthread+0}
Feb  9 07:47:00 link-08 kernel:        <ffffffff801114b7>{child_rip+0}
Feb  9 07:47:00 link-08 kernel: dd            D 0000010037a0c500     0  7419  
6522          7469       (NOTLB)
Feb  9 07:47:00 link-08 kernel: 0000010013ab77c8 0000000000000006
00000100022d3420 ffffffff8028566a
Feb  9 07:47:00 link-08 kernel:        00000100355342b0 000000000009668e
ffffffff80420300 0000010035534598
Feb  9 07:47:00 link-08 kernel:        0000000000000002 0000010037e2aa00
Feb  9 07:47:00 link-08 kernel: Call
Trace:<ffffffff8028566a>{__generic_unplug_device+19}
<ffffffff80355ead>{io_schedule+15}
Feb  9 07:47:00 link-08 kernel:        <ffffffff8018e43a>{__lock_buffer+125}
<ffffffff8018e36b>{bh_wake_function+0}
Feb  9 07:47:00 link-08 kernel:        <ffffffff8018e36b>{bh_wake_function+0}
<ffffffff80191573>{__block_write_full_page+306}
Feb  9 07:47:00 link-08 kernel:        <ffffffff80195f63>{blkdev_get_block+0}
<ffffffff801b9fca>{mpage_writepages+563}
Feb  9 07:47:00 link-08 kernel:        <ffffffff80196057>{blkdev_writepage+0}
<ffffffff801dc8d4>{read_block_bitmap+50}
Feb  9 07:47:00 link-08 kernel:        <ffffffff8018fd50>{__find_get_block+377}
<ffffffff80192509>{__getblk+43}
Feb  9 07:47:00 link-08 kernel:        <ffffffff801dc8d4>{read_block_bitmap+50}
<ffffffff801dce52>{ext2_free_blocks+647}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff801b835f>{__writeback_single_inode+580}
<ffffffff801b8fd7>{sync_sb_inodes+470}
Feb  9 07:47:00 link-08 kernel:        <ffffffff801b95d2>{writeback_inodes+500}
<ffffffff8016d187>{balance_dirty_pages_ratelimited+205}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff80169040>{generic_file_buffered_write+1043}
Feb  9 07:47:00 link-08 kernel:        <ffffffff8013fa71>{current_fs_time+89}
<ffffffff8016957f>{__generic_file_aio_write_nolock+734}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff80169651>{__generic_file_write_nolock+158}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff801366b4>{autoremove_wake_function+0} <ffffffff8021a825>{__clear_user+29}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff80169791>{generic_file_write+102} <ffffffff8018ceab>{vfs_write+207}
Feb  9 07:47:00 link-08 kernel:        <ffffffff8018cf93>{sys_write+69}
<ffffffff80110a12>{system_call+126}
Feb  9 07:47:00 link-08 kernel:
Feb  9 07:47:00 link-08 kernel: lvcreate      D ffffffff804df700     0  7469  
6522                7419 (NOTLB)
Feb  9 07:47:00 link-08 kernel: 0000010010bcdc88 0000000000000002
000001002834d750 0000007600000246
Feb  9 07:47:00 link-08 kernel:        0000010035535390 000000000000042c
000001002834d750 0000010035535678
Feb  9 07:47:00 link-08 kernel:        000000087a671400 0000000000000206
Feb  9 07:47:00 link-08 kernel: Call Trace:<ffffffff80196417>{bdev_test+0}
<ffffffff80196426>{bdev_set+0}
Feb  9 07:47:00 link-08 kernel:        <ffffffff801ae07b>{__wait_on_inode+113}
<ffffffff80134bc6>{default_wake_function+0}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff80134bc6>{default_wake_function+0} <ffffffff801af77b>{iget5_locked+393}
Feb  9 07:47:00 link-08 kernel:        <ffffffffa00b8c63>{:dm_mod:dev_status+0}
<ffffffff8019646d>{bdget+60}
Feb  9 07:47:00 link-08 kernel:       
<ffffffffa00b8468>{:dm_mod:__dev_status+96}
<ffffffffa00b8d55>{:dm_mod:dev_status+242}
Feb  9 07:47:00 link-08 kernel:        <ffffffffa00b9a96>{:dm_mod:ctl_ioctl+599}
<ffffffff801a37e2>{sys_ioctl+1006}
Feb  9 07:47:00 link-08 kernel:        <ffffffff80110a12>{system_call+126}
Feb  9 07:47:00 link-08 kernel: pdflush       R  running task       0  7514    
 3                6594 (L-TLB)
Feb  9 07:47:00 link-08 kernel: bash          S 000009a702927dfb     0  7598  
3078  7629               (NOTLB)
Feb  9 07:47:00 link-08 kernel: 00000100191bbeb8 0000000000000002
00000000006c3f28 0000007500000202
Feb  9 07:47:00 link-08 kernel:        000001000a305210 00000000000234ab
0000010035534b20 000001000a3054f8
Feb  9 07:47:00 link-08 kernel:        00000100191bbf58 ffffffff8036ed05
Feb  9 07:47:00 link-08 kernel: Call Trace:<ffffffff8013ea2f>{do_wait+3862}
<ffffffff80134bc6>{default_wake_function+0}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff80134bc6>{default_wake_function+0} <ffffffff80110a12>{system_call+126}
Feb  9 07:47:00 link-08 kernel:
Feb  9 07:47:00 link-08 kernel: dmsetup       D ffffffff804df700     0  7635  
3907                     (NOTLB)
Feb  9 07:47:00 link-08 kernel: 000001001e67bc88 0000000000000006
000001003405b000 ffffffff80260628
Feb  9 07:47:00 link-08 kernel:        0000010023669110 00000000000096b0
ffffffff80420300 00000100236693f8
Feb  9 07:47:00 link-08 kernel:        000001003405b000 ffffffff80260628
Feb  9 07:47:00 link-08 kernel: Call
Trace:<ffffffff80260628>{n_tty_receive_buf+2677}
<ffffffff80260628>{n_tty_receive_buf+2677}
Feb  9 07:47:00 link-08 kernel:        <ffffffff80196417>{bdev_test+0}
<ffffffff80196426>{bdev_set+0}
Feb  9 07:47:00 link-08 kernel:        <ffffffff801ae07b>{__wait_on_inode+113}
<ffffffff80134bc6>{default_wake_function+0}
Feb  9 07:47:00 link-08 kernel:       
<ffffffff80134bc6>{default_wake_function+0} <ffffffff801af77b>{iget5_locked+393}
Feb  9 07:47:00 link-08 kernel:        <ffffffffa00b8c63>{:dm_mod:dev_status+0}
<ffffffff8019646d>{bdget+60}
Feb  9 07:47:00 link-08 kernel:       
<ffffffffa00b8468>{:dm_mod:__dev_status+96}
<ffffffffa00b8d55>{:dm_mod:dev_status+242}
Feb  9 07:47:00 link-08 kernel:        <ffffffffa00b9a96>{:dm_mod:ctl_ioctl+599}
<ffffffff801a37e2>{sys_ioctl+1006}
Feb  9 07:47:00 link-08 kernel:        <ffffffff80110a12>{system_call+126}


Comment 8 Alasdair Kergon 2006-02-09 19:16:20 UTC
Great - a full stack trace of all the processes!  I'll tidy up the output and
try to work out what's happening...

Comment 9 Alasdair Kergon 2006-02-13 16:17:58 UTC
The dmsetup and the lvcreate are both stuck at the same place in bdget.

The only other relevant process (the dump has a lot of processes missing
unfortunately) is a dd in io_schedule() in a write().  Will trace through the
code to determine whether or not this process is also stuck or not.

Comment 10 Alasdair Kergon 2006-02-13 16:49:45 UTC
I haven't spotted the problem - really need to see all the other processes. 
'ps' including the wchan field, run a couple of times separated by a minute or
so would show whether that 'dd' process was really stuck or not, and might show
whether any other processes are involved.

Or try the sysrq again and see if you can find a way to capture the start of its
output, not just the end.

Comment 11 Mikuláš Patočka 2008-06-09 00:43:58 UTC
It is bad that iget is called while something is suspended.

Iget can wait for I/O or allocate memory, so the deadlock is quite obvious. It
needs to be moved somewhere else, probably to table construction. I'm looking at
it.

Comment 17 Mikuláš Patočka 2008-06-10 01:42:49 UTC
So I looked at upstream code and the bug is real.

When you have an origin and a snapshot and you create another snapshot, LVM
calls suspend with DM_SUSPEND_LOCKFS_FLAG on the origin, then suspend with
DM_SUSPEND_LOCKFS_FLAG on the existing snapshot, then suspend with
DM_SUSPEND_LOCKFS_FLAG on the -real and -cow devices, then loads new table and
then calls resume on -real, -cow, the snapshots and finally on the origin.

So the inner suspend operations call bdget_disk and it might go into
__wait_for_inode (waiting for the IO on the outer device), as seen in this
trace.

Comment 18 Mikuláš Patočka 2008-06-12 01:48:03 UTC
Created attachment 309020 [details]
First patch (doesn't change functionality)

Comment 19 Mikuláš Patočka 2008-06-12 01:51:26 UTC
Created attachment 309021 [details]
fix the deadlock (in upstream code)

This patch fixes the deadlock in upstream code. Load reference to bdev when we
create mapped device and keep it until we destroy it --- so that we won't have
to ask for the reference while we are suspended (which triggered the deadlock).

Comment 20 Mikuláš Patočka 2010-04-28 14:57:00 UTC
RHEL 4 is in maintenance, so this likely won't be fixed at all. It is already fixed in upstream and the fix won't be backported.


Note You need to log in before you can comment on or make changes to this bug.