Bug 1883932 - general protection fault in gfs2_withdraw (syzbot)
Summary: general protection fault in gfs2_withdraw (syzbot)
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Robert Peterson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-30 14:16 UTC by Andrew Price
Modified: 2021-03-13 13:16 UTC (History)

Fixed In Version: 5.12-rc3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-13 13:16:01 UTC
Type: Bug
Embargoed:


Attachments
repro.c (200.14 KB, text/x-csrc)
2020-09-30 14:20 UTC, Andrew Price
Proposed patch to fix the problem (1.79 KB, patch)
2021-03-11 20:22 UTC, Robert Peterson

Description Andrew Price 2020-09-30 14:16:52 UTC
The fault occurs because the withdraw happens early in the init_inodes() path (while looking up the jindex), before sdp->sd_jdesc has been set, so the pointer is still NULL when it is dereferenced here:

  static void signal_our_withdraw(struct gfs2_sbd *sdp)
  {
          struct gfs2_glock *gl = sdp->sd_live_gh.gh_gl;
          struct inode *inode = sdp->sd_jdesc->jd_inode; 



gfs2: fsid=syz:syz.0: fatal: invalid metadata block
  bh = 2072 (magic number)
  function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 417
gfs2: fsid=syz:syz.0: about to withdraw this file system
general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 0 PID: 6842 Comm: syz-executor264 Not tainted 5.9.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:signal_our_withdraw fs/gfs2/util.c:97 [inline]
RIP: 0010:gfs2_withdraw+0x2b0/0xe20 fs/gfs2/util.c:294
Code: e8 03 48 89 44 24 38 42 80 3c 38 00 74 08 48 89 ef e8 34 f7 69 fe 48 89 6c 24 20 48 8b 6d 00 48 83 c5 70 48 89 e8 48 c1 e8 03 <42> 80 3c 38 00 74 08 48 89 ef e8 11 f7 69 fe 48 8b 45 00 48 89 44
RSP: 0018:ffffc900057474f0 EFLAGS: 00010202
RAX: 000000000000000e RBX: ffff8880a71e0000 RCX: 98268db4dfe86a00
RDX: ffff888092bb6100 RSI: 0000000000000000 RDI: ffff8880a71e0430
RBP: 0000000000000070 R08: ffffffff834ad50c R09: ffffed1015d041c3
R10: ffffed1015d041c3 R11: 0000000000000000 R12: 1ffff11014e3c04d
R13: ffff8880a71e0050 R14: ffff8880a71e026c R15: dffffc0000000000
FS:  000000000233b880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f74f826d6c0 CR3: 00000000a04cc000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 gfs2_meta_check_ii+0x70/0x80 fs/gfs2/util.c:450
 gfs2_metatype_check_i fs/gfs2/util.h:126 [inline]
 gfs2_meta_indirect_buffer+0x29f/0x380 fs/gfs2/meta_io.c:417
 gfs2_meta_inode_buffer fs/gfs2/meta_io.h:70 [inline]
 gfs2_inode_refresh+0x65/0xc00 fs/gfs2/glops.c:438
 inode_go_lock+0x12c/0x480 fs/gfs2/glops.c:468
 do_promote+0x4db/0xcd0 fs/gfs2/glock.c:390
 finish_xmote+0x907/0x1350 fs/gfs2/glock.c:560
 do_xmote+0xadb/0x14c0 fs/gfs2/glock.c:686
 gfs2_glock_nq+0xac3/0x14d0 fs/gfs2/glock.c:1410
 gfs2_glock_nq_init fs/gfs2/glock.h:238 [inline]
 gfs2_lookupi+0x36f/0x4f0 fs/gfs2/inode.c:317
 gfs2_lookup_simple+0xa4/0x100 fs/gfs2/inode.c:268
 init_journal+0x132/0x1970 fs/gfs2/ops_fstype.c:620
 init_inodes fs/gfs2/ops_fstype.c:756 [inline]
 gfs2_fill_super+0x2717/0x3fe0 fs/gfs2/ops_fstype.c:1125
 get_tree_bdev+0x3e9/0x5f0 fs/super.c:1342
 gfs2_get_tree+0x4c/0x1f0 fs/gfs2/ops_fstype.c:1201
 vfs_get_tree+0x88/0x270 fs/super.c:1547
 do_new_mount fs/namespace.c:2875 [inline]
 path_mount+0x179d/0x29e0 fs/namespace.c:3192
 do_mount fs/namespace.c:3205 [inline]
 __do_sys_mount fs/namespace.c:3413 [inline]
 __se_sys_mount+0x126/0x180 fs/namespace.c:3390
 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x458e1a
Code: b8 08 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 fd ad fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 da ad fb ff c3 66 0f 1f 84 00 00 00 00 00
RSP: 002b:00007ffc76f65c88 EFLAGS: 00000293 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffc76f65ce0 RCX: 0000000000458e1a
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffc76f65ca0
RBP: 00007ffc76f65ca0 R08: 00007ffc76f65ce0 R09: 00007ffc00000015
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000809
R13: 0000000000000004 R14: 0000000000000003 R15: 0000000000000003
Modules linked in:
---[ end trace 1e62174917573e95 ]---
RIP: 0010:signal_our_withdraw fs/gfs2/util.c:97 [inline]
RIP: 0010:gfs2_withdraw+0x2b0/0xe20 fs/gfs2/util.c:294
Code: e8 03 48 89 44 24 38 42 80 3c 38 00 74 08 48 89 ef e8 34 f7 69 fe 48 89 6c 24 20 48 8b 6d 00 48 83 c5 70 48 89 e8 48 c1 e8 03 <42> 80 3c 38 00 74 08 48 89 ef e8 11 f7 69 fe 48 8b 45 00 48 89 44
RSP: 0018:ffffc900057474f0 EFLAGS: 00010202
RAX: 000000000000000e RBX: ffff8880a71e0000 RCX: 98268db4dfe86a00
RDX: ffff888092bb6100 RSI: 0000000000000000 RDI: ffff8880a71e0430
RBP: 0000000000000070 R08: ffffffff834ad50c R09: ffffed1015d041c3
R10: ffffed1015d041c3 R11: 0000000000000000 R12: 1ffff11014e3c04d
R13: ffff8880a71e0050 R14: ffff8880a71e026c R15: dffffc0000000000
FS:  000000000233b880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f74f826d6c0 CR3: 00000000a04cc000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
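
A note on the faulting address: the non-canonical address 0xdffffc000000000e looks like the result of KASAN's shadow-memory check rather than the field load itself. With generic KASAN on x86-64, one shadow byte covers 8 bytes of memory, and the shadow address is (addr >> 3) plus a fixed offset. R15 in the dump holds dffffc0000000000, which matches the default offset, and RBP holds 0x70, presumably the offset of the field being read through the NULL sd_jdesc. The userspace sketch below only reproduces that arithmetic; the helper name kasan_shadow_addr is hypothetical and the constants are assumed to match the syzbot kernel configuration.

```c
#include <stdint.h>

/* Assumed x86-64 generic KASAN parameters: 1 shadow byte per 8 bytes. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/*
 * Hypothetical helper: address of the shadow byte that would be checked
 * before an access to addr.
 */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}
```

Every address in the reported range 0x70-0x77 maps to the same shadow byte, which agrees with both the "null-ptr-deref in range" line and RAX = 0xe in the register dump.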

Comment 1 Andrew Price 2020-09-30 14:20:51 UTC
Created attachment 1717899
repro.c

Comment 2 Robert Peterson 2021-03-11 20:22:29 UTC
Created attachment 1762823
Proposed patch to fix the problem

gfs2: bypass signal_our_withdraw if no journal

Before this patch, function signal_our_withdraw referenced the journal
inode immediately. But corrupt file systems may have some invalid
journals, in which case our attempt to read it in will withdraw and the
resulting signal_our_withdraw would dereference the NULL value.

This patch adds a check to signal_our_withdraw so that if the journal
has not yet been initialized, it simply returns and does the old-style
withdraw.
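
For readers without the attachment, the shape of the check described above can be sketched in userspace C. This is not the attached patch: the structures are heavily simplified, hypothetical stand-ins for the kernel ones, and signal_our_withdraw_sketch is an illustrative name. It only shows the idea of returning early when sd_jdesc has not been set, so the journal inode is never read through a NULL pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct inode { int dummy; };
struct gfs2_jdesc { struct inode *jd_inode; };
struct gfs2_sbd { struct gfs2_jdesc *sd_jdesc; };

/*
 * Returns true if the journal-based withdraw path would be taken,
 * false if it is bypassed because no journal has been set up yet.
 */
static bool signal_our_withdraw_sketch(struct gfs2_sbd *sdp)
{
	struct inode *inode;

	/* Journal not initialized yet: bypass, caller does the old-style withdraw. */
	if (!sdp->sd_jdesc)
		return false;

	/* Safe to dereference only after the check above. */
	inode = sdp->sd_jdesc->jd_inode;
	(void)inode;
	return true;
}
```

The key design point is that the dereference moves below the check instead of sitting in the variable initializer, which is where the original code faulted.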

Comment 3 Robert Peterson 2021-03-11 20:23:06 UTC
Hi Andy. I just attached a proposed patch to fix the problem. Can you check it please?

Comment 4 Andrew Price 2021-03-12 11:47:58 UTC
It looks good - using the reproducer I get a harmless withdraw with the patch applied.

[  517.880553] loop0: detected capacity change from 0 to 33168
[  517.889871] gfs2: fsid=syz:syz: Trying to join cluster "lock_nolock", "syz:syz"
[  517.892273] gfs2: fsid=syz:syz: Now mounting FS (format 1801)...
[  517.895485] gfs2: fsid=syz:syz.0: fatal: invalid metadata block
[  517.895485]   bh = 2072 (magic number)
[  517.895485]   function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 488
[  517.900035] gfs2: fsid=syz:syz.0: about to withdraw this file system
[  517.902609] gfs2: fsid=syz:syz.0: File system withdrawn
<snipped backtrace>


Looking at the original report, the bot wants us to add these tags:

Reported-by: syzbot+50a8a9cf8127f2c6f5df.com
Fixes: 601ef0d52e96 ("gfs2: Force withdraw to replay journals and wait for it to finish")

Comment 5 Andrew Price 2021-03-13 13:16:01 UTC
The fix is now upstream:

commit d5bf630f355d8c532bef2347cf90e8ae60a5f1bd
Author: Bob Peterson <rpeterso>
Date:   Fri Mar 12 07:58:54 2021 -0500

    gfs2: bypass signal_our_withdraw if no journal

