Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
For bugs related to Red Hat Enterprise Linux 5 product line. The current stable release is 5.10. For Red Hat Enterprise Linux 6 and above, please visit Red Hat JIRA https://issues.redhat.com/secure/CreateIssue!default.jspa?pid=12332745 to report new issues.

Bug 608962

Summary: terminating process panicked in ocfs2 accessing current task's signal handlers
Product: Red Hat Enterprise Linux 5    Reporter: Lachlan McIlroy <lmcilroy>
Component: kernel    Assignee: Dave Anderson <anderson>
Status: CLOSED WONTFIX    QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium    Docs Contact:
Priority: medium
Version: 5.3    CC: rwheeler, tao, vgaikwad
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-08-09 13:22:08 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Patch to move proc_flush_task() before __exit_signal(). none

Description Lachlan McIlroy 2010-06-29 03:32:10 UTC
Description of problem:

Looking at the vmcore we can see that we've panicked due to a bad pointer value in spin_lock_irq():

Unable to handle kernel NULL pointer dereference at 0000000000000808 RIP:
[<ffffffff80064bce>] _spin_lock_irq+0x1/0xb
PGD 15b4fe5067 PUD 15b3177067 PMD 0
Oops: 0002 [1] SMP
last sysfs file: /class/fc_remote_ports/rport-1:0-9/roles
CPU 7
Modules linked in: rfcomm nfs lockd fscache nfs_acl crashme(U) hangcheck_timer ipv6 xfrm_nalgo crypto_api oracleasm(U) autofs4 ipmi_devintf ipmi_si ipmi_msghandler hidp l2cap bluetooth ocfs2(U) ocfs2_dlmfs(U) ocfs2_dlm(U) ocfs2_nodemanager(U) configfs sunrpc bonding dm_round_robin dm_multipath scsi_dh video hwmon backlight sbs i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev st shpchp sg bnx2 e1000e ide_cd pcspkr cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage qla2xxx scsi_transport_fc ata_piix libata megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 1075, comm: crsd.bin Tainted: G      2.6.18-128.7.1.el5 #1
RIP: 0010:[<ffffffff80064bce>]  [<ffffffff80064bce>] _spin_lock_irq+0x1/0xb
RSP: 0018:ffff8109ee9bfcb0  EFLAGS: 00010002
RAX: ffff8113354097e0 RBX: 0000000000000000 RCX: ffff8111f9f6a680
RDX: ffff8109ee9bfd68 RSI: ffff8109ee9bfd78 RDI: 0000000000000808
RBP: ffff8109ee9bfd68 R08: 0000000000000000 R09: 0000000000000000
R10: ffff81027c3ef290 R11: ffffffff88599b37 R12: ffff8109ee9bfd78
R13: ffff8111ffc70ca0 R14: 0000000000000003 R15: ffff8109ee9bfeb8
FS:  0000000000000000(0000) GS:ffff81202fcc0340(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000808 CR3: 00000015b33f4000 CR4: 00000000000006e0
Process crsd.bin (pid: 1075, threadinfo ffff8109ee9be000, task ffff8113354097e0)
Stack:  ffffffff8001d7e4 ffff8111ffc70ca0 ffffffff8859a195 ffff810170b38400
ffffffff8859a2c5 ffffffff31303735 ffff81027c3ef218 ffff81027c3ef218
0000000000000297 ffff8111ffc70ca0 ffff81027c3ef228 0000000000000246
Call Trace:
[<ffffffff8001d7e4>] sigprocmask+0x28/0xdb
[<ffffffff8859a195>] :ocfs2:ocfs2_delete_inode+0x0/0x1693
[<ffffffff8859a2c5>] :ocfs2:ocfs2_delete_inode+0x130/0x1693
[<ffffffff88585f16>] :ocfs2:ocfs2_drop_lock+0x67a/0x77b
[<ffffffff8858426a>] :ocfs2:ocfs2_remove_lockres_tracking+0x10/0x45
[<ffffffff8859a195>] :ocfs2:ocfs2_delete_inode+0x0/0x1693
[<ffffffff8002f191>] generic_delete_inode+0xc6/0x143
[<ffffffff88599c27>] :ocfs2:ocfs2_drop_inode+0xf0/0x161
[<ffffffff8000d1a8>] dput+0xf6/0x114
[<ffffffff800e62a7>] prune_one_dentry+0x66/0x76
[<ffffffff8002e651>] prune_dcache+0x10f/0x149
[<ffffffff8004d016>] shrink_dcache_parent+0x1c/0xe1
[<ffffffff801013c2>] proc_flush_task+0x17c/0x1f6
[<ffffffff8008e014>] sched_exit+0x27/0xb6
[<ffffffff80017bfb>] release_task+0x324/0x368
[<ffffffff800158b5>] do_exit+0x865/0x911
[<ffffffff80090a5e>] complete_and_exit+0x0/0x16
[<ffffffff8005d28d>] tracesys+0xd5/0xe0

The stack trace above contains some bogus entries; this one looks correct:

crash> bt
PID: 1075   TASK: ffff8113354097e0  CPU: 7   COMMAND: "crsd.bin"
#0 [ffff8109ee9bfa10] crash_kexec at ffffffff800aaa72
#1 [ffff8109ee9bfad0] __die at ffffffff8006520f
#2 [ffff8109ee9bfb10] do_page_fault at ffffffff80066e21
#3 [ffff8109ee9bfc00] error_exit at ffffffff8005dde9
   [exception RIP: _spin_lock_irq+1]
   RIP: ffffffff80064bce  RSP: ffff8109ee9bfcb0  RFLAGS: 00010002
   RAX: ffff8113354097e0  RBX: 0000000000000000  RCX: ffff8111f9f6a680
   RDX: ffff8109ee9bfd68  RSI: ffff8109ee9bfd78  RDI: 0000000000000808
   RBP: ffff8109ee9bfd68   R8: 0000000000000000   R9: 0000000000000000
   R10: ffff81027c3ef290  R11: ffffffff88599b37  R12: ffff8109ee9bfd78
   R13: ffff8111ffc70ca0  R14: 0000000000000003  R15: ffff8109ee9bfeb8
   ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#4 [ffff8109ee9bfcb0] sigprocmask at ffffffff8001d7e4
#5 [ffff8109ee9bfcd0] ocfs2_delete_inode at ffffffff8859a2c5
#6 [ffff8109ee9bfdc0] generic_delete_inode at ffffffff8002f191
#7 [ffff8109ee9bfde0] ocfs2_drop_inode at ffffffff88599c27
#8 [ffff8109ee9bfe00] dput at ffffffff8000d1a8
#9 [ffff8109ee9bfe20] prune_one_dentry at ffffffff800e62a7
#10 [ffff8109ee9bfe40] prune_dcache at ffffffff8002e651
#11 [ffff8109ee9bfe70] shrink_dcache_parent at ffffffff8004d016
#12 [ffff8109ee9bfea0] proc_flush_task at ffffffff801013c2
#13 [ffff8109ee9bff00] release_task at ffffffff80017bfb
#14 [ffff8109ee9bff30] do_exit at ffffffff800158b5
#15 [ffff8109ee9bff80] tracesys at ffffffff8005d28d (via system_call)
   RIP: 00000039a22063e2  RSP: 000000005d293140  RFLAGS: 00000246
   RAX: ffffffffffffffda  RBX: ffffffff8005d28d  RCX: ffffffffffffffff
   RDX: 00002aaab80091d0  RSI: 0000000000a01000  RDI: 0000000000000000
   RBP: 000000005d293940   R8: 00002aaab800b400   R9: 00000000ffffffff
   R10: 000000005d2939e0  R11: 0000000000000246  R12: ffffffff80090a5e
   R13: 000000005d293940  R14: 0000000000000000  R15: 000000005d293940
   ORIG_RAX: 000000000000003c  CS: 0033  SS: 002b

This stack trace shows that the crsd.bin process is terminating, and that termination has triggered a flush of the dentry cache.  The flush caused an ocfs2 dentry to release its final reference and be destroyed.  In ocfs2_delete_inode() the filesystem wants to block signals (so that it does not have to handle the error case caused by signal delivery), so it calls sigprocmask():

int sigprocmask(int how, sigset_t *set, sigset_t *oldset)
{
	int error;

	spin_lock_irq(&current->sighand->siglock);

It's this call to spin_lock_irq() that's run into trouble.  So let's look at the current task's sighand:

crash> px ((struct task_struct *)0xffff8113354097e0)->sighand
$5 = (struct sighand_struct *) 0x0

That's not good: sighand is NULL, and dereferencing it to access siglock is what caused the panic.  So what should sighand look like?

crash> struct -o sighand_struct
struct sighand_struct {
    [0] atomic_t count;
    [8] struct k_sigaction action[64];
 [2056] spinlock_t siglock;
}

crash> px 2056
$9 = 0x808

And 0x808 matches the faulting address: a NULL sighand plus the offset of siglock.  So why is sighand NULL here, and is it safe to be accessing it this late in the process termination cycle?

If we take a look at release_task() we can see that it is not safe: the stack above shows release_task() calling proc_flush_task(), which happens after release_task() has already called __exit_signal().

void release_task(struct task_struct * p)
{
	struct task_struct *leader;
	int zap_leader;
	int inhibited_leader;
repeat:
	tracehook_release_task(p);
	atomic_dec(&p->user->processes);
	BUG_ON(tracehook_check_released(p));
	write_lock_irq(&tasklist_lock);
	__exit_signal(p);    <--- signal handlers terminated here
	...
	proc_flush_task(p);  <--- currently executing in here
	...

In __exit_signal() we do this:

static void __exit_signal(struct task_struct *tsk)
{
	struct signal_struct *sig = tsk->signal;
	struct sighand_struct *sighand;
	...
	tsk->signal = NULL;
	tsk->sighand = NULL;
	...

So the current task's signal handlers have already been cleaned up by the time ocfs2 tries to access them in ocfs2_delete_inode().  This bug could be addressed in ocfs2 by checking for a NULL sighand, but the dentry being purged most likely has nothing to do with the current process (it just happened to be ready for purging at the time), so reordering the code in release_task() may make more sense.

Version-Release number of selected component (if applicable):
kernel-2.6.18-128.7.1

How reproducible:
Unknown

Additional info:
vmcore is available from megatron:/cores/20100628212540

This bug is fixed upstream by reordering the code in release_task() but it's not fixed in RHEL5 yet.

Comment 2 Lachlan McIlroy 2010-06-29 03:57:39 UTC
This is the upstream change that removes this bug, although it appears the change was not made to fix this particular problem.  We only need the part of the change that touches release_task().

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=60347f6716aa49831ac311e04d77ccdc50dc024a

Comment 3 Lachlan McIlroy 2010-06-29 03:59:53 UTC
Created attachment 427561 [details]
Patch to move proc_flush_task() before __exit_signal().

Comment 4 Ric Wheeler 2010-06-29 10:24:09 UTC
We don't support or compile ocfs2 unless I am confused... Does the customer here build this on their own?

Comment 5 Lachlan McIlroy 2010-06-30 00:55:29 UTC
The customer has ocfs2 support from Oracle but that seems irrelevant since the bug is in the kernel proper.  We can tell the customer to go to Oracle but they'll probably come back to us for this fix eventually.

Comment 7 RHEL Program Management 2011-06-20 21:55:40 UTC
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.7, and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.

Comment 8 Dave Anderson 2011-08-08 20:43:34 UTC
> Additional info:
> vmcore is available from megatron:/cores/20100628212540

[anderson@megatron 20110722144432]$ cd /cores/20100628212540
-bash: cd: /cores/20100628212540: No such file or directory
[anderson@megatron 20110722144432]$

Is this vmcore still available?

Comment 9 Lachlan McIlroy 2011-08-08 23:54:38 UTC
(In reply to comment #8)
> > Additional info:
> > vmcore is available from megatron:/cores/20100628212540
> 
> [anderson@megatron 20110722144432]$ cd /cores/20100628212540
> -bash: cd: /cores/20100628212540: No such file or directory
> [anderson@megatron 20110722144432]$
> 
> Is this vmcore still available?

No, the vmcore has been removed since the case was closed over a year ago.  The customer left a message about changes to their Oracle RAC servers and requested that the case be closed.  I think I've captured all of the important details about the crash in this BZ.

Comment 10 Dave Anderson 2011-08-09 12:56:50 UTC
The reason I ask is that the issue has apparently never been seen
before -- not until ocfs2 created some /proc file with an ops->delete handler
that ends up getting pruned at the wrong time.

Comment 13 Ric Wheeler 2011-08-09 13:22:08 UTC
Closing this since we have not seen it in non-OCFS2 deployments.

Please reopen if seen again.

Thanks!