Bug 568503 - snapshot bug: lock held when returning to user space
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 14
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-02-25 20:54 UTC by Zing
Modified: 2010-08-27 06:25 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-08-27 06:25:30 UTC
Type: ---
Embargoed:



Description Zing 2010-02-25 20:54:56 UTC
Description of problem:

Taking a snapshot of my root LV produces this trace:

------------[ cut here ]------------
WARNING: at lib/debugobjects.c:291 __debug_object_init+0x2b9/0x370()
Hardware name: 
Modules linked in: ipv6 snd_ens1370 gameport snd_rawmidi snd_seq snd_seq_device snd_pcm joydev virtio_net virtio_balloon snd_timer i2c_piix4 snd soundcore i2c_core snd_page_alloc virtio_blk virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
Pid: 1075, comm: lvcreate Not tainted 2.6.33-0.48.rc8.git1.fc14.x86_64 #1
Call Trace:
 [<ffffffff8105034c>] warn_slowpath_common+0x7c/0x94
 [<ffffffff81050378>] warn_slowpath_null+0x14/0x16
 [<ffffffff812362b7>] __debug_object_init+0x2b9/0x370
 [<ffffffff8123639b>] debug_object_init+0x14/0x19
 [<ffffffff81067c42>] __init_work+0x27/0x29
 [<ffffffff8139cd1e>] chunk_io+0xba/0x127
 [<ffffffff810ff87b>] ? __vmalloc_area_node+0x12e/0x152
 [<ffffffff8139d100>] ? alloc_area+0x59/0x82
 [<ffffffff810ffa1c>] ? vmalloc+0x2a/0x2c
 [<ffffffff8139a43b>] ? dm_add_exception+0x0/0x4c
 [<ffffffff8139d261>] persistent_read_metadata+0xe1/0x332
 [<ffffffff8122ce65>] ? __up_write+0x42/0x47
 [<ffffffff8139c182>] snapshot_ctr+0x65f/0x7dd
 [<ffffffff81395902>] dm_table_add_target+0x14e/0x1ca
 [<ffffffff813977bc>] table_load+0x268/0x277
 [<ffffffff81397554>] ? table_load+0x0/0x277
 [<ffffffff8139818a>] ctl_ioctl+0x1ca/0x222
 [<ffffffff8107ba6a>] ? lock_release_holdtime+0x34/0xe3
 [<ffffffff813981f5>] dm_ctl_ioctl+0x13/0x17
 [<ffffffff8112bca8>] vfs_ioctl+0x32/0xa6
 [<ffffffff8112c228>] do_vfs_ioctl+0x490/0x4d6
 [<ffffffff8112c2c4>] sys_ioctl+0x56/0x79
 [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b
---[ end trace 24dfbeebf5f624f4 ]---
ODEBUG: object is on stack, but not annotated
------------[ cut here ]------------
WARNING: at lib/debugobjects.c:291 __debug_object_init+0x2b9/0x370()
Hardware name: 
Modules linked in: ipv6 snd_ens1370 gameport snd_rawmidi snd_seq snd_seq_device snd_pcm joydev virtio_net virtio_balloon snd_timer i2c_piix4 snd soundcore i2c_core snd_page_alloc virtio_blk virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
Pid: 1075, comm: lvcreate Tainted: G        W  2.6.33-0.48.rc8.git1.fc14.x86_64 #1
Call Trace:
 [<ffffffff8105034c>] warn_slowpath_common+0x7c/0x94
 [<ffffffff81050378>] warn_slowpath_null+0x14/0x16
 [<ffffffff812362b7>] __debug_object_init+0x2b9/0x370
 [<ffffffff8123639b>] debug_object_init+0x14/0x19
 [<ffffffff81067c42>] __init_work+0x27/0x29
 [<ffffffff8139cd1e>] chunk_io+0xba/0x127
 [<ffffffff810ff87b>] ? __vmalloc_area_node+0x12e/0x152
 [<ffffffff81395581>] ? dm_vcalloc+0x2d/0x47
 [<ffffffff81395581>] ? dm_vcalloc+0x2d/0x47
 [<ffffffff8139a43b>] ? dm_add_exception+0x0/0x4c
 [<ffffffff8139a43b>] ? dm_add_exception+0x0/0x4c
 [<ffffffff8139cde4>] write_header+0x59/0x5b
 [<ffffffff8139d396>] persistent_read_metadata+0x216/0x332
 [<ffffffff8122ce65>] ? __up_write+0x42/0x47
 [<ffffffff8139c182>] snapshot_ctr+0x65f/0x7dd
 [<ffffffff81395902>] dm_table_add_target+0x14e/0x1ca
 [<ffffffff813977bc>] table_load+0x268/0x277
 [<ffffffff81397554>] ? table_load+0x0/0x277
 [<ffffffff8139818a>] ctl_ioctl+0x1ca/0x222
 [<ffffffff8107ba6a>] ? lock_release_holdtime+0x34/0xe3
 [<ffffffff813981f5>] dm_ctl_ioctl+0x13/0x17
 [<ffffffff8112bca8>] vfs_ioctl+0x32/0xa6
 [<ffffffff8112c228>] do_vfs_ioctl+0x490/0x4d6
 [<ffffffff8112c2c4>] sys_ioctl+0x56/0x79
 [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b
---[ end trace 24dfbeebf5f624f5 ]---

================================================
[ BUG: lock held when returning to user space! ]
------------------------------------------------
lvcreate/1075 is leaving the kernel with locks still held!
1 lock held by lvcreate/1075:
 #0:  (&journal->j_barrier){+.+...}, at: [<ffffffff811c6214>] jbd2_journal_lock_updates+0xe1/0xf0

Version-Release number of selected component (if applicable):


How reproducible:
It seems to happen consistently the first time after a fresh boot, and only intermittently when creating further snapshots.

Steps to Reproduce:
1. Take a snapshot: $ lvcreate -s -n snaproot -L1G /dev/myvg/myroot

Actual results:
above BUG

Expected results:
no BUG

Additional info:
I'm also able to make this happen on non-root volumes.

2.6.33-0.48.rc8.git1.fc14.x86_64
lvm2-2.02.61-1.fc13.x86_64
device-mapper-1.02.44-1.fc13.x86_64

Comment 1 Mike Snitzer 2010-02-25 22:17:39 UTC
The fix for this was already included in the recently released 2.6.33 final:
http://git.kernel.org/linus/55f67f2dedec1e3049

Comment 2 Mike Snitzer 2010-02-25 22:46:26 UTC
The fix referenced in Comment #1 only addresses the "ODEBUG: object is on stack, but not annotated" warning.

But the "lock held when returning to user space!" looks very similar to traces that were previously reported:
https://www.redhat.com/archives/dm-devel/2009-November/msg00115.html

and analyzed to be rooted in ext[34] (and a likely false positive):
https://www.redhat.com/archives/dm-devel/2009-November/msg00186.html

The linux-ext4 mailing list was cc'd but never responded:
https://www.redhat.com/archives/dm-devel/2009-November/msg00193.html

A much older instance of this was reported/discussed here (exactly 1 year ago!):
http://lkml.org/lkml/2009/2/25/208

Comment 3 Eric Sandeen 2010-02-25 22:59:24 UTC
Yup, this is an ext3/ext4 bug. We clearly take a mutex in jbd2_journal_lock_updates() and return to userspace with it still held.

-Eric

Comment 4 Mike Snitzer 2010-02-25 23:17:42 UTC
dm_suspend() calls freeze_bdev(), which locks a filesystem and forces it into a consistent state.
  - On Linux >= 2.6.29, freeze_bdev() uses the freeze_fs() hook
    that was added to various filesystems:
    http://git.kernel.org/linus/c4be0c1dc4cdc
  - On Linux < 2.6.29, freeze_bdev() uses the write_super_lockfs() hook,
    which was comparable to freeze_fs().

dm_resume() calls thaw_bdev().

dm_suspend() and dm_resume() are driven by 2 separate ioctls, each of which returns to userspace, so a lock taken during the freeze and released during the thaw is still held when the first ioctl returns.
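
The sequence described in this comment can be sketched as a toy userspace model. This is not kernel code: ToyLockdep, suspend_ioctl() and resume_ioctl() are hypothetical names, and a Python threading.Lock stands in for the journal's j_barrier mutex. It only illustrates why a lock taken in one ioctl and released in another is still held when the first ioctl returns.

```python
import threading


class ToyLockdep:
    """Toy stand-in for lockdep: remembers which locks the task holds."""

    def __init__(self):
        self.held = []

    def acquire(self, name, lock):
        lock.acquire()
        self.held.append(name)

    def release(self, name, lock):
        self.held.remove(name)
        lock.release()

    def return_to_userspace(self, task):
        """Return a warning string if locks are still held, else None."""
        if self.held:
            return "%s is leaving the kernel with locks still held: %s" % (
                task, ", ".join(self.held))
        return None


# Hypothetical stand-in for journal->j_barrier.
j_barrier = threading.Lock()


def suspend_ioctl(lockdep, task):
    # Models dm_suspend() -> freeze_bdev() -> filesystem freeze hook,
    # which takes the barrier and does not drop it before returning.
    lockdep.acquire("j_barrier", j_barrier)
    return lockdep.return_to_userspace(task)


def resume_ioctl(lockdep, task):
    # Models dm_resume() -> thaw_bdev(), which finally drops the barrier.
    lockdep.release("j_barrier", j_barrier)
    return lockdep.return_to_userspace(task)


if __name__ == "__main__":
    ld = ToyLockdep()
    print(suspend_ioctl(ld, "lvcreate/1075"))  # warning: lock still held
    print(resume_ioctl(ld, "lvcreate/1075"))   # nothing held any more
```

In this model the first call reports the held lock and the second reports nothing, which mirrors the freeze in one ioctl and the thaw in a later one.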

Comment 5 Alasdair Kergon 2010-02-26 01:41:30 UTC
As with the old xfs_freeze, which was more recently extended to other filesystems, freezing a filesystem and returning to userspace has been supported for as long as I've been involved.

Comment 6 Eric Sandeen 2010-03-30 16:52:39 UTC
Patch sent upstream: http://marc.info/?l=linux-ext4&m=126990209314147&w=2

Comment 7 Bug Zapper 2010-07-30 10:55:03 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle.
Changing version to '14'.

More information and the reason for this action are here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 8 Chuck Ebbert 2010-08-10 16:56:13 UTC
Fixed by commit 6b0310fbf087ad6 in 2.6.35-rc1. But that commit (now also in 2.6.32.17) causes deadlocks, which are now fixed by commit 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f.
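
This comment names the fixing commits but does not say what they change. One general way to avoid returning to userspace with a mutex held is to hold the lock only while the freeze is being set up, record the frozen state in a flag, and have writers check that flag. The toy model below illustrates that idea only; it is an assumption about the general shape of such a fix, not a description of commit 6b0310fbf087ad6 or 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f. ToyFilesystem and its methods are hypothetical names.

```python
import threading


class ToyFilesystem:
    """Toy model: freeze state is kept in a flag, not in a held lock."""

    def __init__(self):
        self.j_barrier = threading.Lock()  # hypothetical journal barrier
        self.frozen = False
        self.held = []                     # locks held by the calling task

    def freeze(self):
        # Take the barrier only while switching to the frozen state.
        self.j_barrier.acquire()
        self.held.append("j_barrier")
        try:
            # A real filesystem would flush its journal here.
            self.frozen = True
        finally:
            self.held.remove("j_barrier")
            self.j_barrier.release()
        # Locks still held when "returning to userspace".
        return list(self.held)

    def thaw(self):
        self.frozen = False
        return list(self.held)

    def start_write(self):
        # A real kernel would put the writer to sleep until thaw;
        # the toy model just refuses the write.
        if self.frozen:
            return False
        with self.j_barrier:
            return True


if __name__ == "__main__":
    fs = ToyFilesystem()
    print(fs.freeze())       # locks held after freeze
    print(fs.start_write())  # write attempt while frozen
    print(fs.thaw())         # locks held after thaw
    print(fs.start_write())  # write attempt after thaw
```

With this arrangement no lock is held across the two calls, while writes are still kept out between freeze and thaw.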

