Bug 628795
Summary: | panic in ext3_get_inode_loc() due to corrupt inode orphan list | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Lachlan McIlroy <lmcilroy> | ||||
Component: | kernel | Assignee: | Josef Bacik <jbacik> | ||||
Status: | CLOSED DUPLICATE | QA Contact: | Red Hat Kernel QE team <kernel-qe> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 4.6.z | CC: | esandeen, jbacik, vgaikwad | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2010-09-09 15:40:04 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Lachlan McIlroy
2010-08-31 04:56:35 UTC
Looking at the vmcore: KERNEL: /cores/20100719105748/work/vmlinux DUMPFILE: /cores/20100719105748/work/2041314-ap15.1279420796.dmp CPUS: 8 DATE: Sat Jul 17 22:39:56 2010 UPTIME: 41 days, 20:06:37 LOAD AVERAGE: 3.04, 3.29, 4.25 TASKS: 1448 NODENAME: <hostname> RELEASE: 2.6.9-78.0.17.ELsmp VERSION: #1 SMP Thu Mar 5 04:58:34 EST 2009 MACHINE: x86_64 (2792 Mhz) MEMORY: 64 GB PANIC: "vmdump" PID: 955 COMMAND: "oracle" TASK: 10bc9f297f0 [THREAD_INFO: 10777b58000] CPU: 0 STATE: TASK_RUNNING (PANIC) general protection fault: 0000 [1] SMP CPU 0 Modules linked in: nfs lockd iptable_filter ip_tables hangcheck_timer md5 ipv6 oracleasm(U) autofs4 i2c_dev i2c_core dm_mirror hw_random nfs_acl sunrpc egenera_nmi(U) egenera_veth(U) sd_mod egenera_vscsi(U) scsi_mod egenera_vmdump(U) egenera_dumpdev(U) egenera_ipmi(U) egenera_base(U) egenera_virtual_bus(U) egenera_fs(U) ext3 jbd dm_mod Pid: 955, comm: oracle Tainted: PF 2.6.9-78.0.17.ELsmp RIP: 0010:[<ffffffffa002bfa2>] <ffffffffa002bfa2>{:ext3:ext3_get_inode_loc+61} RSP: 0018:0000010777b59d18 EFLAGS: 00010216 RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0a0d3e6465646f43 RDX: 0000000000000001 RSI: 0000010777b59da8 RDI: 7470697263736544 RBP: 0000010029865fd0 R08: 0000000000000000 R09: 0000000000000088 R10: 0000000000001000 R11: 000001092ffcfad8 R12: 0000010777b59da8 R13: 0000010777b59da8 R14: 0000010a2e407990 R15: 0000010029865fd0 FS: 0000002a970536e0(0000) GS:ffffffff8050d100(0000) knlGS:0000000008095440 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000002a960728c8 CR3: 0000000000101000 CR4: 00000000000006e0 Process oracle (pid: 955, threadinfo 0000010777b58000, task 0000010bc9f297f0) Stack: 0000010029865fd0 0000010e294bb620 0000010777b59da8 0000000000000000 0000010029865fd0 0000010777b59da8 0000010777b59da8 0000010a2e407990 0000010029865fd0 ffffffffa002c9cb Call Trace:<ffffffffa002c9cb>{:ext3:ext3_reserve_inode_write+40} <ffffffffa0031a8d>{:ext3:ext3_orphan_del+379} <ffffffffa00164c6>{:jbd:journal_start+223} <ffffffffa002e73f>{:ext3:ext3_delete_inode+116} <ffffffffa002e6cb>{:ext3:ext3_delete_inode+0} <ffffffff80195b0c>{generic_delete_inode+190} <ffffffff80192214>{dput+463} <ffffffff8017cb16>{__fput+278} <ffffffff8017b645>{filp_close+103} <ffffffff8013a39d>{put_files_struct+101} <ffffffff8013ac14>{do_exit+711} <ffffffff8014533f>{sys_rt_sigaction+133} <ffffffff8013b754>{sys_exit_group+0} <ffffffff801102f6>{system_call+126} Code: 48 8b 87 60 02 00 00 48 63 80 88 00 00 00 48 39 c1 72 12 48 RIP <ffffffffa002bfa2>{:ext3:ext3_get_inode_loc+61} RSP <0000010777b59d18> We've crashed at ext3_get_inode_loc+61, so let's look at that instruction: crash> dis ext3_get_inode_loc 0xffffffffa002bf65 <ext3_get_inode_loc>: push %r15 0xffffffffa002bf67 <ext3_get_inode_loc+2>: push %r14 0xffffffffa002bf69 <ext3_get_inode_loc+4>: mov %rdi,%r14 0xffffffffa002bf6c <ext3_get_inode_loc+7>: push %r13 0xffffffffa002bf6e <ext3_get_inode_loc+9>: push %r12 0xffffffffa002bf70 <ext3_get_inode_loc+11>: push %rbp 0xffffffffa002bf71 <ext3_get_inode_loc+12>: push %rbx 0xffffffffa002bf72 <ext3_get_inode_loc+13>: mov %edx,%ebx 0xffffffffa002bf74 <ext3_get_inode_loc+15>: sub $0x18,%rsp 0xffffffffa002bf78 <ext3_get_inode_loc+19>: mov 0x30(%r14),%rcx 0xffffffffa002bf7c <ext3_get_inode_loc+23>: mov %rsi,0x10(%rsp) 0xffffffffa002bf81 <ext3_get_inode_loc+28>: mov 0x108(%rdi),%rdi 0xffffffffa002bf88 <ext3_get_inode_loc+35>: cmp $0x2,%rcx 0xffffffffa002bf8c <ext3_get_inode_loc+39>: setne %dl 0xffffffffa002bf8f <ext3_get_inode_loc+42>: xor %eax,%eax 0xffffffffa002bf91 <ext3_get_inode_loc+44>: cmp $0x8,%rcx 0xffffffffa002bf95 <ext3_get_inode_loc+48>: setne %al 0xffffffffa002bf98 <ext3_get_inode_loc+51>: test %eax,%edx 0xffffffffa002bf9a <ext3_get_inode_loc+53>: je 0xffffffffa002bfb5 0xffffffffa002bf9c <ext3_get_inode_loc+55>: cmp $0x7,%rcx 0xffffffffa002bfa0 <ext3_get_inode_loc+59>: je 0xffffffffa002bfb5 0xffffffffa002bfa2 <ext3_get_inode_loc+61>: mov 0x260(%rdi),%rax We were dereferencing %rdi and it's value is 7470697263736544. That's definitely not a valid kernel address. Based on these instructions we are actually in this function: static unsigned long ext3_get_inode_block(struct super_block *sb, unsigned long ino, struct ext3_iloc *iloc) { unsigned long desc, group_desc, block_group; unsigned long offset, block; struct buffer_head *bh; struct ext3_group_desc * gdp; if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO && ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(sb)) || #define EXT3_FIRST_INO(s) (EXT3_SB(s)->s_first_ino) static inline struct ext3_sb_info * EXT3_SB(struct super_block *sb) { return sb->s_fs_info; } So it looks like we have a bad superblock pointer value. Let's see what we do have - the inode has been saved into %r14: crash> inode 0000010a2e407990 struct inode struct: seek error: kernel virtual address: 10a2e407990 type: "gdb_readmem_callback" Cannot access memory at address 0x10a2e407990 Well that's no good either. The inode pointer has been successfully dereferenced earlier in the assembly above but for some reason it's no good now. It was already in %r14 when the function was called and has been saved onto the stack: crash> bt -f PID: 955 TASK: 10bc9f297f0 CPU: 0 COMMAND: "oracle" ... #5 [10777b59c60] error_exit at ffffffff80110e1d [exception RIP: ext3_get_inode_loc+61] RIP: ffffffffa002bfa2 RSP: 0000010777b59d18 RFLAGS: 00010216 RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0a0d3e6465646f43 RDX: 0000000000000001 RSI: 0000010777b59da8 RDI: 7470697263736544 RBP: 0000010029865fd0 R8: 0000000000000000 R9: 0000000000000088 R10: 0000000000001000 R11: 000001092ffcfad8 R12: 0000010777b59da8 R13: 0000010777b59da8 R14: 0000010a2e407990 R15: 0000010029865fd0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 10777b59c68: 0000010029865fd0 0000010a2e407990 10777b59c78: 0000010777b59da8 0000010777b59da8 10777b59c88: 0000010029865fd0 0000000000000001 10777b59c98: 000001092ffcfad8 0000000000001000 10777b59ca8: 0000000000000088 0000000000000000 10777b59cb8: 0000000000000001 0a0d3e6465646f43 10777b59cc8: 0000000000000001 0000010777b59da8 10777b59cd8: 7470697263736544 ffffffffffffffff 10777b59ce8: ffffffffa002bfa2 0000000000000010 10777b59cf8: 0000000000010216 0000010777b59d18 10777b59d08: 0000000000000018 0000000000000000 10777b59d18: 0000010029865fd0 0000010e294bb620 10777b59d28: 0000010777b59da8 0000000000000000 10777b59d38: 0000010029865fd0 0000010777b59da8 10777b59d48: 0000010777b59da8 0000010a2e407990 <---- 10777b59d58: 0000010029865fd0 ffffffffa002c9cb #6 [10777b59d60] ext3_reserve_inode_write at ffffffffa002c9cb That's the same value. It was also saved onto the stack in ext3_reserve_inode_write(): #6 [10777b59d60] ext3_reserve_inode_write at ffffffffa002c9cb 10777b59d68: 0000010a2e407958 0000000000000000 10777b59d78: 0000010a2e407990 0000010777b59da8 10777b59d88: 0000010bbac212c0 ffffffffa0031a8d ^^^ crash> inode 0000010bbac212c0 struct inode { i_hash = { next = 0x0, pprev = 0x10c11b76f10 }, i_list = { next = 0x10bbac212d0, prev = 0x10bbac212d0 }, i_dentry = { next = 0x10bbac212e0, prev = 0x10bbac212e0 }, i_ino = 2224091, i_count = { counter = 0 }, i_mode = 33200, i_nlink = 0, i_uid = 2735, i_gid = 28145, i_rdev = 0, i_size = 0, i_atime = { tv_sec = 1275886442, tv_nsec = 0 }, i_mtime = { tv_sec = 1279420796, tv_nsec = 0 }, i_ctime = { tv_sec = 1279420796, tv_nsec = 0 }, i_blkbits = 12, i_blksize = 4096, i_version = 1, i_blocks = 0, i_bytes = 0, i_sock = 0 '\000', i_lock = { lock = 1, magic = 3735899821 }, i_sem = { count = { counter = 1 }, sleepers = 0, wait = { lock = { lock = 1, magic = 3735899821 }, task_list = { next = 0x10bbac21388, prev = 0x10bbac21388 } } }, i_alloc_sem = { activity = 0, wait_lock = { lock = 1, magic = 3735899821 }, wait_list = { next = 0x10bbac213a8, prev = 0x10bbac213a8 } }, i_op = 0xffffffffa0048d60, i_fop = 0xffffffffa0048e00, i_sb = 0x100081da400, i_flock = 0x0, i_mapping = 0x10bbac213e0, i_data = { host = 0x10bbac212c0, page_tree = { height = 0, gfp_mask = 544, rnode = 0x0 }, tree_lock = { lock = 1, magic = 3735899821 }, i_mmap_writable = 0, i_mmap = { prio_tree_node = 0x0, index_bits = 1 }, i_mmap_nonlinear = { next = 0x10bbac21418, prev = 0x10bbac21418 }, i_mmap_lock = { lock = 1, magic = 3735899821 }, truncate_count = { counter = 0 }, nrpages = 0, writeback_index = 0, a_ops = 0xffffffffa0048fa0, flags = 210, backing_dev_info = 0x10bf63a43d0, private_lock = { lock = 1, magic = 3735899821 }, private_list = { next = 0x10bbac21468, prev = 0x10bbac21468 }, assoc_mapping = 0x0 }, i_dquot = {0x0, 0x0}, i_devices = { next = 0x10bbac21490, prev = 0x10bbac21490 }, i_pipe = 0x0, i_bdev = 0x0, i_cdev = 0x0, i_cindex = 0, i_generation = 3739034195, i_dnotify_mask = 0, i_dnotify = 0x0, i_state = 23, dirtied_when = 4375203442, i_flags = 0, i_writecount = { counter = 0 }, i_security = 0x0, u = { generic_ip = 0x0 } } That's better. Strange that it has been corrupted on the stack. Taking a closer look at ext3_orphan_del() there are two inodes being accessed. The primary inode passed into ext3_orphan_del() is okay. It's the previous inode on the orphan list that is corrupt and may be related to the bad orphan message: EXT3-fs warning (device dm-0): ext3_orphan_get: bad orphan inode 17235! e2fsck was run? This looks to be identical to https://bugzilla.redhat.com/show_bug.cgi?id=434878. *nod* and bug #434878 as you noted can't be a dup of bug #196915 since it was hit on a kernel which contains that fix already.... Hrm well the only thing I can think of is the DIO problem. The nlink check in ext3_link is protected by the i_sem so there shouldn't be a race. Is their app doing DIO? Can they test a patch to make sure it works? Created attachment 446064 [details]
possible fix
Heh so I finally get the core opened and it's oracle thats running, so it could very well be the DIO problem. Here is a RHEL4 version of the DIO+orphan patch, let me know if this fixes their problem.
Josef, is that the same patch I posted to 629143? I've confirmed that patch fixes BZ629143 using the test case from BZ588599 but I wasn't entirely convinced it was the same bug as this one since I was never able to get it to crash with the stack trace above - but it makes sense that it could. The customer's panic was a one-off and we currently have no reproducer for this bug. If we can get the patch into BZ629143 and you think it's the same thing then we can dupe this bug to it. Lachlan, Aha, I knew I had seen that patch recently. Yes it's the same thing, if we don't clean up the orphan entry when doing DIO and then remove the file, we'll get a corrupted list and it will blow up like this. I'll close this as a dupe of 629143, thanks. *** This bug has been marked as a duplicate of bug 629143 *** |