Bug 219775
| Summary: | GFS2 list corruption in journal code | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Steve Whitehouse <swhiteho> | ||||
| Component: | GFS-kernel | Assignee: | Steve Whitehouse <swhiteho> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | rawhide | ||||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | 2.6.20-1.2923 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2007-02-23 14:49:42 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 226994 | ||||||
| Attachments: |
Description
Steve Whitehouse
2006-12-15 12:31:24 UTC
Another report of the same or something very similar at any rate:

[ 393.008737] list_add corruption. next->prev should be prev (ffff8101242356a8), but was ffff81012151e500. (next=ffff81012151e500).
[ 393.008903] ------------[ cut here ]------------
[ 393.008953] kernel BUG at lib/list_debug.c:27!
[ 393.008995] invalid opcode: 0000 [1] SMP
[ 393.009113] CPU 0
[ 393.009186] Modules linked in: autofs4 i2c_dev i2c_core hidp rfcomm l2cap bluetooth sunrpc video button battery asus_acpi backlight ac floppy ohci1394 uhci_hcd ehci_hcd ieee1394 e1000 pcspkr shpchp dm_snapshot dm_zero dm_mirror dm_mod
[ 393.010185] Pid: 2406, comm: dbench Not tainted 2.6.20-rc5 #122
[ 393.010227] RIP: 0010:[<ffffffff8034eb82>] [<ffffffff8034eb82>] __list_add+0x27/0x5b
[ 393.010313] RSP: 0018:ffff810110f318e8 EFLAGS: 00010292
[ 393.010354] RAX: 0000000000000088 RBX: ffff810056b8c9b8 RCX: ffffffff80229f03
[ 393.010398] RDX: 0000000000000008 RSI: ffff8101223207c8 RDI: ffff810122320040
[ 393.010442] RBP: ffff810110f318e8 R08: 0000000000000002 R09: 0000000000000080
[ 393.010486] R10: 0000000000000080 R11: 0000000000000002 R12: ffff810054e51f30
[ 393.010532] R13: ffff810124235640 R14: ffff81011799e000 R15: ffff810110f31a04
[ 393.010577] FS: 00002ac6b6a16200(0000) GS:ffffffff80709000(0000) knlGS:0000000000000000
[ 393.010623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 393.010672] CR2: 00002ac6b69f4000 CR3: 000000010f75d000 CR4: 00000000000006e0
[ 393.010723] Process dbench (pid: 2406, threadinfo ffff810110f30000, task ffff810122320040)
[ 393.010778] Stack: ffff810110f318f8 ffffffff8034ebc2 ffff810110f31928 ffffffff80316ead
[ 393.011011] ffff810056b8c9b8 ffff81011799e000 ffff810056b8c9b8 ffff810110f319f0
[ 393.011209] ffff810110f31948 ffffffff803261f6 ffff8100532e97b0 ffff8100532e5200
[ 393.011363] Call Trace:
[ 393.011448] [<ffffffff8034ebc2>] list_add+0xc/0xe
[ 393.011497] [<ffffffff80316ead>] buf_lo_add+0x44/0xd3
[ 393.011544] [<ffffffff803261f6>] gfs2_trans_add_bh+0x4a/0x4f
[ 393.011594] [<ffffffff803084b1>] lookup_block+0xc2/0x110
[ 393.011643] [<ffffffff803086ec>] gfs2_block_map+0x1ed/0x33e
[ 393.011693] [<ffffffff80318d54>] gfs2_get_block+0x11/0x13
[ 393.011743] [<ffffffff8029e638>] __block_prepare_write+0x182/0x414
[ 393.011792] [<ffffffff80318d43>] gfs2_get_block+0x0/0x13
[ 393.011841] [<ffffffff8029e8ec>] block_prepare_write+0x22/0x2f
[ 393.011891] [<ffffffff803199a6>] gfs2_prepare_write+0x1d7/0x222
[ 393.011942] [<ffffffff8025a9ab>] generic_file_buffered_write+0x2e9/0x70e
[ 393.011995] [<ffffffff8022df8d>] current_fs_time+0x3f/0x42
[ 393.012045] [<ffffffff8025b15d>] __generic_file_aio_write_nolock+0x38d/0x400
[ 393.012097] [<ffffffff8024022e>] debug_mutex_free_waiter+0x5b/0x5f
[ 393.012148] [<ffffffff8025b234>] generic_file_aio_write+0x64/0xc0
[ 393.012199] [<ffffffff8027e7b0>] do_sync_write+0xe2/0x126
[ 393.012248] [<ffffffff8030fd29>] gfs2_glock_put+0x13f/0x146
[ 393.012298] [<ffffffff8023c34c>] autoremove_wake_function+0x0/0x38
[ 393.012349] [<ffffffff80234257>] kill_pid_info+0x52/0x70
[ 393.012399] [<ffffffff8027ef9a>] vfs_write+0xae/0x157
[ 393.012448] [<ffffffff8027f65c>] sys_pwrite64+0x55/0x76
[ 393.012497] [<ffffffff80209825>] tracesys+0xdc/0xe1
[ 393.012544]
[ 393.012586]
[ 393.012587] Code: 0f 0b eb fe 4c 8b 00 49 39 f0 74 18 48 89 c1 4c 89 c2 48 c7
[ 393.013407] RIP [<ffffffff8034eb82>] __list_add+0x27/0x5b
[ 393.013490] RSP <ffff810110f318e8>
[ 393.013535]

This was while running dbench.

Created attachment 146514 [details]
Patch to fix list corruption
The attached patch appears to fix the list corruption that we have been seeing
occasionally. Although the transaction structure is private to a single thread,
the structures queued on it are not: while they are being dismantled during an
in-core commit, it's possible for a different thread to be trying to add the
same structure to another, new, transaction at the same time.
To avoid this, the patch takes the log spinlock while the queued structures are
being dismantled.
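
For anyone following along, the locking pattern described above can be sketched in userspace C. This is not the attached patch: struct trans, struct buf_entry, trans_add_buf(), trans_dismantle() and log_lock are hypothetical names made up for illustration, a pthread mutex stands in for the GFS2 log spinlock, and the "already queued" test in the add path is an assumption of the sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct list_node {
    struct list_node *next, *prev;
};

static void node_init(struct list_node *n) { n->next = n->prev = n; }
static int node_unlinked(const struct list_node *n) { return n->next == n; }

static void node_add(struct list_node *n, struct list_node *head)
{
    /* Sanity check in the spirit of a debug list implementation:
     * the node after head must point back at head. */
    assert(head->next->prev == head);
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void node_del_init(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n);
}

/* Hypothetical stand-ins for a queued structure and a transaction. */
struct buf_entry { struct list_node link; };
struct trans { struct list_node bufs; };

/* Stand-in for the log spinlock: one lock covers every queue operation. */
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

void buf_init(struct buf_entry *be) { node_init(&be->link); }
void trans_init(struct trans *tr) { node_init(&tr->bufs); }

/* Add path: queue the entry unless it is still on some transaction.
 * Returns 1 if the entry was queued, 0 if it was already queued. */
int trans_add_buf(struct trans *tr, struct buf_entry *be)
{
    int queued = 0;

    pthread_mutex_lock(&log_lock);
    if (node_unlinked(&be->link)) {
        node_add(&be->link, &tr->bufs);
        queued = 1;
    }
    pthread_mutex_unlock(&log_lock);
    return queued;
}

/* Commit path: dismantle the queue under the same lock, so a concurrent
 * trans_add_buf() never sees a half-unlinked entry. */
void trans_dismantle(struct trans *tr)
{
    pthread_mutex_lock(&log_lock);
    while (!node_unlinked(&tr->bufs))
        node_del_init(tr->bufs.next);
    pthread_mutex_unlock(&log_lock);
}

/* Number of entries currently queued on the transaction. */
int trans_count(struct trans *tr)
{
    struct list_node *p;
    int n = 0;

    pthread_mutex_lock(&log_lock);
    for (p = tr->bufs.next; p != &tr->bufs; p = p->next)
        n++;
    pthread_mutex_unlock(&log_lock);
    return n;
}
```

In this sketch a committing thread would call trans_dismantle() on the old transaction while a writer calls trans_add_buf() on a new one. Because both paths take log_lock, the writer sees the entry either still fully queued or fully unlinked, never in between, which is the state the list debugging check in the report above tripped over.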
Fixed in upstream git tree.