Bug 462150 - Concurrent CIFS mount/umount processes to same windows machine, different shares hangs umount processes or crashes kernel
Summary: Concurrent CIFS mount/umount processes to same windows machine, different sha...
Keywords:
Status: CLOSED DUPLICATE of bug 465143
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: All
OS: Linux
medium
medium
Target Milestone: rc
: ---
Assignee: Jeff Layton
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
: 458210 (view as bug list)
Depends On:
Blocks: 467442
TreeView+ depends on / blocked
 
Reported: 2008-09-13 00:57 UTC by Jeff Layton
Modified: 2018-10-20 03:56 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-04-22 17:01:43 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
...also adds locking arround CIFSTCon() (4.54 KB, patch)
2008-09-15 17:24 UTC, Olivier Fourdan
no flags Details | Diff
lockdep warning and memory corruption detected during reproducer run (11.62 KB, text/plain)
2008-09-15 19:41 UTC, Jeff Layton
no flags Details
Updated patch (5.49 KB, patch)
2008-09-17 10:00 UTC, Olivier Fourdan
no flags Details | Diff
cifs: eliminate usage of kthread_stop for cifsd (6.33 KB, patch)
2008-10-03 20:25 UTC, Jeff Layton
no flags Details | Diff
cifs: make sure to do CIFSTCon while holding session semaphore (1.46 KB, patch)
2008-10-03 20:25 UTC, Jeff Layton
no flags Details | Diff
cifs: handle the TCP_Server_Info->tsk field more carefully (4.55 KB, patch)
2008-10-03 20:26 UTC, Jeff Layton
no flags Details | Diff
cifs: take tcon->useCount reference as soon as it's found (1.70 KB, patch)
2008-10-03 20:26 UTC, Jeff Layton
no flags Details | Diff
disable sharing of mount structures + remove kthread_stop usage (8.29 KB, patch)
2008-10-10 10:19 UTC, Jeff Layton
no flags Details | Diff
serialize cifs mounts + remove kthread_stop usage (10.77 KB, patch)
2008-10-10 12:14 UTC, Jeff Layton
no flags Details | Diff
patch -- fix mount races in RHEL5 (30.76 KB, patch)
2008-10-13 13:26 UTC, Jeff Layton
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Samba Project 5720 0 None None None 2019-03-25 01:33:36 UTC

Description Jeff Layton 2008-09-13 00:57:21 UTC
Description from samba BZ:

https://bugzilla.samba.org/show_bug.cgi?id=5720

--------------------[snip]--------------------

Running scripts to mimick the behaviour of our server (but speeded up), results
in either hanging umount processes or crashing the kernel.

Setup:
1 windows machine, 2 shares (SAMBARAPE3 and SAMBARAPE4).
1 Linux machine mounting cifs shares.

Running the scripts:

while true ; do
mount //win2000machine/SAMBARAPE3 /mnt/SAMBA3/ -o username=XXX,password=YYY &&
echo 2
ls /mnt/SAMBA3
umount -f /mnt/SAMBA3 && echo -2
ls /mnt/SAMBA3
done

and 

Running scripts to mimick the behaviour of our server (but speeded up), results
in either hanging umount processes or crashing the kernel.

Setup:
1 windows machine, 2 shares (SAMBARAPE3 and SAMBARAPE4).
1 Linux machine mounting cifs shares.

Running the scripts:

while true ; do
mount //win2000machine/SAMBARAPE4 /mnt/SAMBA4/ -o username=XXX,password=YYY &&
echo 2
ls /mnt/SAMBA4
umount -f /mnt/SAMBA4 && echo -2
ls /mnt/SAMBA4
done

and

while true ; do
mount //win2000machine/SAMBARAPE3 /mnt/SAMBA3/ -o username=XXX,password=YYY &&
echo 1
ls /mnt/SAMBA3
umount -f /mnt/SAMBA3 && echo -1
ls /mnt/SAMBA3
done

Concurrently either crashes the kernel or result in hanging umount processes.

In the last instance the kernel module unloading gets buggy. The echo
statements and ls commands can be omitted.
Playing aroung with options, -i and -f works better (takes longer time to
crash, but do not prevent the problem).

This is extreme usage of mount/umount processes, and we have minimised the
problem by reducing the risk of these happening at the same time. It may be of
interest for the SAMBA team (and maybe Kernel devs) to investigate anyway.

It happens on RHEL 5 and Ubuntu Hardy Heron and probably other distro's

Best Regards
Esben Bjerrum

Comment 1 Olivier Fourdan 2008-09-15 17:24:43 UTC
Created attachment 316764 [details]
...also adds locking arround CIFSTCon()

Hi Jeff,

This patch is based on your previous patch and adds down(&pSesInfo->sesSem); / up(&pSesInfo->sesSem); arround CIFSTCon() in connect.c in cifs_mount() and get_dfs_path(()

With this the system seems to be handling several mount/umounts in parallel for a few minutes. Probably not perfect yet, but I hope this helps.

Cheers,
Olivier.

Comment 2 Jeff Layton 2008-09-15 19:41:44 UTC
Created attachment 316776 [details]
lockdep warning and memory corruption detected during reproducer run

Thanks Olivier. I forward-ported your patch to rawhide and gave it a run there. It definitely helps keep things moving along. I see a number of places where we get this:

 CIFS VFS: cifs_read_super: get root inode failed

...which may indicate another race -- instrumenting that to give a return code might be interesting, but it probably means that cifs_get_inode_info_unix failed for some reason (probably due to races with the session getting torn down).

After this ran for a few minutes though, I got the attached message. It looks pretty clearly like a use-after-free. This isn't terribly surprising since the tcon struct doesn't have any sort of refcounting.

I think fixing this is going to take some careful thought and redesign to make sure that we aren't tearing down the sessions while another one is being brought up. That will probably mean adding some reference counting to some of these data structures.

Comment 3 Olivier Fourdan 2008-09-17 10:00:47 UTC
Created attachment 316932 [details]
Updated patch

I believe the problems comes from a race condition when doing mount/umounts in parallel. The effect are various memory and slab corruption.

This updated version gives some very good results here, no crash anymore for almost 2 hours (used to crash within a few minutes at best previously).

What this does is to actually nullify the server->tsk structure after force_sig(SIGKILL,...) to avoid a second potention kill of the same task if a umount occurs immediately after.

It also protects pSesInfo->status change in cifs_setup_session().

Comment 4 Jeff Layton 2008-09-17 10:42:47 UTC
Nice work. Zeroing out the tsk pointer makes sense. There is one problem though -- we don't seem to hold any sort of lock while sending the signal so there's nothing to prevent another thread from racing in with a second signal before you can set the tsk to NULL.

The best thing might be a new helper function or inline that:

1) saves off the tsk pointer to a local variable
2) NULLs out the server->tsk
3) sends the signal

Parts 1 and 2 should probably be done while holding a spinlock, or maybe doing it with xchg() would be atomic enough...

Comment 5 Olivier Fourdan 2008-09-18 16:54:33 UTC
The race condition still persists.

After instrumenting the code, this is the last few lines preceeding a crash:


	 fs/cifs/connect.c: CIFS VFS: in cifs_umount as Xid: 1360 with uid: 0
	 fs/cifs/connect.c: XID=1360: Disconnecting tcon 0xccef3e34
	 fs/cifs/cifssmb.c: In tree disconnect
	 fs/cifs/transport.c: For smb_command 113
	 fs/cifs/transport.c: Sending smb:  total_len 39
	 fs/cifs/connect.c: XID=1359: Freeing tcon 0xccef36e0
	 fs/cifs/connect.c: XID=1359: About to do SMBLogoff 
	 fs/cifs/connect.c: XID=1359: Mount active on server: 2, session 0xccef3a04, server 0xcf0b4288, task 0xccd6c070
	 fs/cifs/cifssmb.c: In SMBLogoff for session disconnect
	 fs/cifs/cifssmb.c: Currently in use: 1
	 fs/cifs/connect.c: CIFS VFS: leaving cifs_umount (xid = 1359) rc = -16

	 fs/cifs/connect.c: rfc1002 length 0x27
	 fs/cifs/connect.c: XID=1360: Freeing tcon 0xccef3e34
	 fs/cifs/connect.c: XID=1360: About to do SMBLogoff 
	 fs/cifs/connect.c: XID=1360: Mount active on server: 1, session 0xccef3a04, server 0xcf0b4288, task 0xccd6c070
	 fs/cifs/cifssmb.c: In SMBLogoff for session disconnect
	 fs/cifs/cifssmb.c: Currently in use: 0
	 fs/cifs/transport.c: For smb_command 116
	 fs/cifs/transport.c: Sending smb:  total_len 43
	 fs/cifs/connect.c: rfc1002 length 0x2b
	 fs/cifs/cifsfs.c: Devname: //172.16.239.132/samba4 flags: 64 

	 fs/cifs/connect.c: CIFS VFS: in cifs_mount as Xid: 1361 with uid: 0
	 fs/cifs/connect.c: Entering cifs_mount. Xid: 1361 with: unc=//172.16.239.132\samba4,ip=172.16.239.132,ver=1,rw,username=root,password=redhat
	 fs/cifs/connect.c: XID=1361: Username: root
	 fs/cifs/connect.c: XID=1361: UNC: \\172.16.239.132\samba4 ip: 172.16.239.132
	 fs/cifs/connect.c: cifs_find_tcp_session
	 fs/cifs/connect.c: Found exact match server, session 0xccef3a04, server 0xcf0b4288, task 0xccd6c070
	 fs/cifs/connect.c: XID=1361: Existing tcp session with server found
	 fs/cifs/connect.c: XID=1361: Existing smb sess found, session 0xccef3a04, server 0xcf0b4288, task 0xccd6c070
	 fs/cifs/connect.c: file mode: 0x7f7  dir mode: 0x1ff
	 fs/cifs/connect.c: XID=1361: Allocating tcon 0xccef3e34
	 fs/cifs/connect.c: XID=1361: CIFSTCon tcon 0xccef3e34, session 0xccef3a04, server 0xcf0b4288, task 0xccd6c070
	 fs/cifs/connect.c: XID=1360: Waking up socket by sending signal, tcon 0xccef3e34, session 0xccef3a04, server 0xcf0b4288, task 0xccd6c070
	 fs/cifs/connect.c: XID=1361: CIFS Tcon rc = -2
	 fs/cifs/connect.c: XID=1361: session 0xccef3a04, server 0xcf0b4288, task 0x0
	BUG: unable to handle kernel NULL pointer dereference at virtual address 00000450

XID 1359: cifs_umount
XID 1360: cifs_umount
XID 1361: cifs_mount

- So what seems to happen here is:

1) umount is called (XID 1360) and checks the use on the server, only 1 active mount
2) mount is called (XID 1361) and looks for existing server
3) mount finds one that matches, session 0xccef3a04, server 0xcf0b4288, task 0xccd6c070
4) umount kill the server
5) mount now tries to use a dead server

- How can we fix the issue?

There is fairly huge gap between the time the existing session is found and the time it's actially marked as "in use". The field "inUse" of the CIFS session is incremented fairly late in cifs_mount() and this field is used in CIFSSMBLogoff() to check if the session is in use.

A possible solution would be to mark the session "in use" as soon as it's been selected by cifs_find_tcp_session() so a concurrent umount() would not kill the server in the mean time.

I am working on a patch that implements this, but when adding the write/read locks required, I end up with lockups after a few concurrent mounts/umounts which is odd.

Comment 6 Jeff Layton 2008-10-01 12:13:03 UTC
Agreed on the last idea.

Once we find the session in cifs_find_tcp_session, we should be taking a reference to it. The best thing though might be to try and overhaul the refcounting on the sessions altogether, possibly even convert it to use krefs.

Comment 7 Olivier Fourdan 2008-10-02 12:15:29 UTC
Created attachment 319217 [details]
Patch for current status

Hi Jeff,

Following our discussion on IRC, here is the patch as it is today. This patch is against RHEL5 kernel and it does not work (well, it builds, but you'll get a lockup fairly quickly after a few mounts/umounts and I've not had time to investigate why yet).

Hope this helps,
Cheers,
Olivier.

Comment 8 Jeff Layton 2008-10-02 18:12:29 UTC
Thanks Olivier, I didn't try out your patch directly, but did snarf a few pieces from it. I'll post my patchset in a little while. I ran a test for a while on the one I've been testing and saw this (this is on a rawhide kernel so it catches accesses to uninitialized memory). It looks like we probably tried to use a tcon struct that had already been freed.

I think the patchset I have now does a passable job with making sure that the server tsk pointer doesn't get used when it shouldn't, but I suspect that we need to go through the session and tcon handling and convert them to more sane refcounting.

------------[ cut here ]------------
WARNING: at kernel/lockdep.c:2609 __lock_acquire+0xb33/0xc01() (Not tainted)
Modules linked in: nls_utf8 cifs bridge stp bnep l2cap bluetooth autofs4 sunrpc ipv6 xfs dm_multipath floppy pcspkr 8139cp 8139too mii i2c_piix4 i2c_core dm_snapshot dm_zero dm_mirror dm_log pata_acpi ata_generic [last unloaded: freq_table]
Pid: 4628, comm: mount.cifs Not tainted 2.6.27-0.372.rc8.fc10.x86_64 #1

Call Trace:
 [<ffffffff81045916>] warn_on_slowpath+0x65/0x9a
 [<ffffffff810133e8>] ? dump_trace+0x24f/0x25e
 [<ffffffff81067c7c>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff8101b616>] ? save_stack_trace+0x2f/0x4d
 [<ffffffff81066ad7>] ? save_trace+0x3f/0x95
 [<ffffffff810659c1>] ? graph_unlock+0x5e/0x6a
 [<ffffffff81067a11>] ? mark_lock+0x38d/0x3a2
 [<ffffffff81068f15>] __lock_acquire+0xb33/0xc01
 [<ffffffff81066c39>] ? trace_hardirqs_off_caller+0x21/0xa4
 [<ffffffff81069070>] lock_acquire+0x8d/0xba
 [<ffffffff8137ae62>] ? __down_common+0x98/0xcf
 [<ffffffff8137c1bf>] _spin_lock_irq+0x44/0x74
 [<ffffffff8137ae62>] ? __down_common+0x98/0xcf
 [<ffffffff8137bebe>] ? _spin_unlock_irq+0x30/0x3c
 [<ffffffff8137ae62>] __down_common+0x98/0xcf
 [<ffffffff8137af0c>] __down+0x1d/0x1f
 [<ffffffff8105e4b7>] down+0x2d/0x3e
 [<ffffffffa02e1210>] CIFSSMBTDis+0x4c/0x11a [cifs]
 [<ffffffffa02e67b3>] cifs_umount+0x6a/0x1ba [cifs]
 [<ffffffffa02db86a>] cifs_get_sb+0x1f8/0x26f [cifs]
 [<ffffffff810d4b11>] vfs_kern_mount+0xa2/0x131
 [<ffffffff810d4c07>] do_kern_mount+0x4c/0xe7
 [<ffffffff810ea97a>] do_new_mount+0x64/0x97
 [<ffffffff810eb77f>] do_mount+0x20c/0x23b
 [<ffffffff810e98f2>] ? copy_mount_options+0xd4/0x138
 [<ffffffff8118e9c9>] ? _raw_spin_lock+0x68/0x116
 [<ffffffff810eb83c>] sys_mount+0x8e/0xd6
 [<ffffffff8137ba76>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8101139a>] system_call_fastpath+0x16/0x1b

---[ end trace 91b2b9d8ad1ebe92 ]---
general protection fault: 0000 [1] SMP DEBUG_PAGEALLOC
CPU 1 
Modules linked in: nls_utf8 cifs bridge stp bnep l2cap bluetooth autofs4 sunrpc ipv6 xfs dm_multipath floppy pcspkr 8139cp 8139too mii i2c_piix4 i2c_core dm_snapshot dm_zero dm_mirror dm_log pata_acpi ata_generic [last unloaded: freq_table]
Pid: 4628, comm: mount.cifs Tainted: G        W 2.6.27-0.372.rc8.fc10.x86_64 #1
RIP: 0010:[<ffffffff8118ee80>]  [<ffffffff8118ee80>] list_del+0xc/0x85
RSP: 0018:ffff880010d23c48  EFLAGS: 00010096
RAX: ffff88001d9b4bc8 RBX: 6b6b6b6b6b6b6b6b RCX: ffff88001d402a00
RDX: ffffffff8105e36d RSI: ffff88001d9b4ba0 RDI: 6b6b6b6b6b6b6b6b
RBP: ffff880010d23c58 R08: 0000000000000002 R09: 0000000000000001
R10: ffffffff81b37370 R11: 0000000000100100 R12: 0000000000000292
R13: ffff88001d9b4b88 R14: 00000000000009fb R15: ffff8800131cc000
FS:  00007f371d2506f0(0000) GS:ffff88001f8117d0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fb8c35eb000 CR3: 000000001d44f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mount.cifs (pid: 4628, threadinfo ffff880010d22000, task ffff88001d4022d0)
Stack:  ffff880010d23c78 6b6b6b6b6b6b6b6b ffff880010d23c78 ffffffff8137adb6
 00000000fffffff0 ffff88001d9b4b88 ffff880010d23c98 ffffffff8105e387
 ffff88001d9b4b68 00000000fffffff0 ffff880010d23cd8 ffffffffa02e122e
Call Trace:
 [<ffffffff8137adb6>] __up+0x1a/0x2e
 [<ffffffff8105e387>] up+0x2e/0x3e
 [<ffffffffa02e122e>] CIFSSMBTDis+0x6a/0x11a [cifs]
 [<ffffffffa02e67b3>] cifs_umount+0x6a/0x1ba [cifs]
 [<ffffffffa02db86a>] cifs_get_sb+0x1f8/0x26f [cifs]
 [<ffffffff810d4b11>] vfs_kern_mount+0xa2/0x131
 [<ffffffff810d4c07>] do_kern_mount+0x4c/0xe7
 [<ffffffff810ea97a>] do_new_mount+0x64/0x97
 [<ffffffff810eb77f>] do_mount+0x20c/0x23b
 [<ffffffff810e98f2>] ? copy_mount_options+0xd4/0x138
 [<ffffffff8118e9c9>] ? _raw_spin_lock+0x68/0x116
 [<ffffffff810eb83c>] sys_mount+0x8e/0xd6
 [<ffffffff8137ba76>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8101139a>] system_call_fastpath+0x16/0x1b


Code: 48 8d 53 18 49 8d 7c 24 18 e8 96 00 00 00 4c 89 ef e8 85 fe ff ff 59 5b 41 5c 41 5d c9 c3 90 55 48 89 e5 53 48 89 fb 48 83 ec 08 <48> 8b 47 08 48 8b 00 48 39 f8 74 20 49 89 c0 48 89 f9 48 c7 c2 
RIP  [<ffffffff8118ee80>] list_del+0xc/0x85
 RSP <ffff880010d23c48>
---[ end trace 91b2b9d8ad1ebe92 ]---
=============================================================================
BUG kmalloc-512 (Tainted: G      D W): Poison overwritten
-----------------------------------------------------------------------------

INFO: 0xffff88001d9b4b88-0xffff88001d9b4c48. First byte 0x6c instead of 0x6b
INFO: Allocated in tconInfoAlloc+0x1f/0xd3 [cifs] age=775 cpu=0 pid=4619
INFO: Freed in tconInfoFree+0x6a/0x6e [cifs] age=280 cpu=0 pid=4630
INFO: Slab 0xffffe20000b1a380 objects=14 used=5 fp=0xffff88001d9b4b68 flags=0x200000000020c3
INFO: Object 0xffff88001d9b4b68 @offset=2920 fp=0xffff88001d9b4920

Bytes b4 0xffff88001d9b4b58:  0f e5 ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a .���....ZZZZZZZZ
  Object 0xffff88001d9b4b68:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4b78:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4b88:  6c 6b 6d 6b 6b 6b 6b 6b 01 00 00 00 6b 6b 6b 6b lkmkkkkk....kkkk
  Object 0xffff88001d9b4b98:  d0 22 40 1d 00 88 ff ff 6b 6b 6b 6b 6b 6b 6b 6b �"@...��kkkkkkkk
  Object 0xffff88001d9b4ba8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4bb8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4bc8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4bd8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4be8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4bf8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c08:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c18:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c28:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c38:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c48:  6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c58:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c68:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c78:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c88:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4c98:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4ca8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4cb8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4cc8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4cd8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4ce8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4cf8:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4d08:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4d18:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4d28:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4d38:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4d48:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff88001d9b4d58:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk�
 Redzone 0xffff88001d9b4d68:  bb bb bb bb bb bb bb bb                         ��������        
 Padding 0xffff88001d9b4da8:  5a 5a 5a 5a 5a 5a 5a 5a                         ZZZZZZZZ        
Pid: 4632, comm: mountumount1 Tainted: G      D W 2.6.27-0.372.rc8.fc10.x86_64 #1

Call Trace:
 [<ffffffff810c9e9c>] print_trailer+0x140/0x149
 [<ffffffff810ca505>] check_bytes_and_report+0xb7/0xf7
 [<ffffffff81109900>] ? load_elf_binary+0xf7/0x16c7
 [<ffffffff810ca616>] check_object+0xd1/0x1b4
 [<ffffffff810cc529>] __slab_alloc+0x33a/0x3ed
 [<ffffffff81109900>] ? load_elf_binary+0xf7/0x16c7
 [<ffffffff81109900>] ? load_elf_binary+0xf7/0x16c7
 [<ffffffff810cd78c>] __kmalloc+0xbc/0x111
 [<ffffffff81109809>] ? load_elf_binary+0x0/0x16c7
 [<ffffffff81109900>] load_elf_binary+0xf7/0x16c7
 [<ffffffff81109809>] ? load_elf_binary+0x0/0x16c7
 [<ffffffff810d6f39>] search_binary_handler+0xcf/0x255
 [<ffffffff810d8228>] do_execve+0x1c8/0x27a
 [<ffffffff810dcff1>] ? getname+0x171/0x1d7
 [<ffffffff8100f68e>] sys_execve+0x43/0x5e
 [<ffffffff8101196a>] stub_execve+0x6a/0xc0

FIX kmalloc-512: Restoring 0xffff88001d9b4b88-0xffff88001d9b4c48=0x6b

FIX kmalloc-512: Marking all objects used

Comment 9 Jeff Layton 2008-10-03 13:55:34 UTC
Did an initial patch to try and add some sanity to the tcon struct reference counting and it totally crashed and burned. I've gone over it several times and don't see why my patch would cause more problems than it solves, so I think it must just be changing the timing such that another existing race is exposed.

I'll plan to go over this more closely and do some more incremental testing to see if I can figure out the nature of the race here.

Comment 10 Jeff Layton 2008-10-03 20:25:06 UTC
Created attachment 319417 [details]
cifs: eliminate usage of kthread_stop for cifsd 

Patch 1 -- eliminate usage of kthread_stop for cifsd

Comment 11 Jeff Layton 2008-10-03 20:25:39 UTC
Created attachment 319418 [details]
cifs: make sure to do CIFSTCon while holding session semaphore 

patch 2

Comment 12 Jeff Layton 2008-10-03 20:26:11 UTC
Created attachment 319419 [details]
cifs: handle the TCP_Server_Info->tsk field more carefully 

patch 3

Comment 13 Jeff Layton 2008-10-03 20:26:46 UTC
Created attachment 319420 [details]
cifs: take tcon->useCount reference as soon as it's found

patch 4

Comment 14 Jeff Layton 2008-10-03 20:34:18 UTC
Here's what I have so far. This doesn't seem to cure the problems, however. I still get the occasional use-after-free of the tcon struct or the session.

Right now, I'm testing a patch that eliminates the sharing of the tcon and session structs (and sockets, for that matter). So far, while running that I see no oopses and it's been 15-20 minutes or so. I may try to propose a patch that does this for upstream and plan to work from that to overhaul the tcon and session sharing code. It's probably easier to work from a working state than to try and fix the code that's broken.

There's another somewhat related problem here, occasionally we end up with extra entries in the mtab with this reproducer. Either they're getting added when the mount didn't really work or we're not removing them on a successful umount.

This is almost certainly a problem with the mount.cifs and umount.cifs code. Once we get the kernel-based races patched up, we can have a look and fix that too.

Comment 15 Jeff Layton 2008-10-03 21:45:40 UTC
So far so good with the non-sharing patch. I'll probably let this run for a while overnight to make sure we don't hit any races. Once we do that, I'll have a closer look at what's really required to share tcon/ses/server structs.

It might not even hurt to really do some hatchet work with this code and start with some heavily simplified cifs_mount/cifs_umount functions. We can then go back and try to reimplement sharing the various structures.

Comment 16 Jeff Layton 2008-10-06 17:20:33 UTC
I spent some time over the weekend looking at how to fix this. I started by trying add back in code to share TCP_Server_Info structs between mounts. This also means that we're sharing cifsd kernel threads and sockets. I took great care to try and make this race free, but it ended up still oopsing with this:

 CIFS VFS: cifs_mount failed w/return code = -88
 CIFS VFS: Send error in SessSetup = -88 (tcpStatus = 1)
 CIFS VFS: cifs_mount failed w/return code = -88
NULL pointer dereference at 00000000000004a8
IP: [<ffffffff811579c9>] socket_has_perm+0xd/0x60
PGD 1786d067 PUD 17946067 PMD 1610c067 PTE 0
Oops: 0000 [1] SMP 
CPU 1 
Modules linked in: cifs nls_utf8 sco bridge stp bnep l2cap bluetooth autofs4 sunrpc ipv6 xfs dm_multipath floppy pcspkr 8139cp 8139too mii i2c_piix4 i2c_core pata_acpi ata_generic [last unloaded: cifs]
Pid: 15266, comm: cifsd Not tainted 2.6.27-0.391.rc8.git7.fc10.x86_64.debug #1
RIP: 0010:[<ffffffff811579c9>]  [<ffffffff811579c9>] socket_has_perm+0xd/0x60
RSP: 0018:ffff88000b421bd0  EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff880013d422d0
RBP: ffff88000b421c30 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000004 R11: ffff88000dd11198 R12: 0000000000000000
R13: 0000000000000004 R14: ffff88000b421ea0 R15: ffff88000b421ca0
FS:  0000000000000000(0000) GS:ffff88001f8047d0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000000004a8 CR3: 0000000016116000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process cifsd (pid: 15266, threadinfo ffff88000b420000, task ffff880013d422d0)
Stack:  ffff88000b421c10 ffffffff810674fe ffff88000b421c00 ffff880013d422d0
 ffff88001f950c68 0000000000000002 ffff880013d422d0 ffff88000b421c20
 ffffffff810c8fac ffffe2000043c8c0 5aff88000b421c70 0000000000000000
Call Trace:
 [<ffffffff810674fe>] ? mark_lock+0x22/0x3a2
 [<ffffffff810c8fac>] ? compound_order+0x1a/0x2b
 [<ffffffff81157ace>] selinux_socket_recvmsg+0x22/0x24
 [<ffffffff811536c5>] security_socket_recvmsg+0x16/0x18
 [<ffffffff812c8530>] __sock_recvmsg+0x54/0x7f
 [<ffffffff812c8c2d>] sock_recvmsg+0xcf/0xe8
 [<ffffffff810674fe>] ? mark_lock+0x22/0x3a2
 [<ffffffff8105a241>] ? autoremove_wake_function+0x0/0x3d
 [<ffffffff81066b21>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffff81017e89>] ? native_sched_clock+0x8e/0xa8
 [<ffffffff81017d26>] ? sched_clock+0x9/0xc
 [<ffffffff81066009>] ? lock_release_holdtime+0x2c/0x111
 [<ffffffff81067ad4>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff81036295>] ? __wake_up+0x43/0x4f
 [<ffffffff812c8c8a>] kernel_recvmsg+0x44/0x59
 [<ffffffffa02aa73a>] cifs_demultiplex_thread+0x1d0/0xc25 [cifs]
 [<ffffffffa02aa56a>] ? cifs_demultiplex_thread+0x0/0xc25 [cifs]
 [<ffffffff81059edb>] kthread+0x4e/0x7b
 [<ffffffff810128f9>] child_rip+0xa/0x11
 [<ffffffff81011c0e>] ? restore_args+0x0/0x30
 [<ffffffff81059e68>] ? kthreadd+0x17b/0x1a0
 [<ffffffff81059e8d>] ? kthread+0x0/0x7b
 [<ffffffff810128ef>] ? child_rip+0x0/0x11


Code: 45 e0 e8 7f 4c f7 ff eb 06 41 bc a4 ff ff ff 5b 44 89 e0 41 5c 5b 41 5c 41 5d 41 5e c9 c3 55 48 89 e5 48 83 ec 60 0f 1f 44 00 00 <4c> 8b 96 a8 04 00 00 31 c0 4c 8b 87 18 06 00 00 41 89 d1 41 83 
RIP  [<ffffffff811579c9>] socket_has_perm+0xd/0x60
 RSP <ffff88000b421bd0>
CR2: 00000000000004a8
---[ end trace 5a038d2cae9f6247 ]---


...I think we ended up calling into kernel_recvmsg while ipv4_connect was being run. I don't see anything that prevents this, so it seems plausible. So in addition to fixing the code that shares the data structures mentioned, we probably will also need to overhaul the cifs_reconnect code to make sure we aren't trying to use the socket while it's being reconnected.

I've gone ahead and set a set of patches upstream to remove the kthread_stop usage, and to just disable the sharing code. This allows the reproducer to run for a long time (at least overnight).

Comment 17 Jeff Layton 2008-10-07 10:45:58 UTC
*** Bug 458210 has been marked as a duplicate of this bug. ***

Comment 21 Jeff Layton 2008-10-10 10:19:10 UTC
Created attachment 320000 [details]
disable sharing of mount structures + remove kthread_stop usage

uhhh...sorry...

Those patches are definitely not a complete solution (not even close, actually). 

This patchset actually seems to prevent the panics with the reproducer, but the upstream maintainer has some concerns. Most notably, this will probably break multi-user mounts. Still, it might be something to consider.

The other idea I have for a stopgap fix is to just wrap all of cifs_mount and cifs_umount with a global mutex. It's an ugly "solution" but would probably prevent most of these crashes.

In the meantime, I'm working on a better fix, but the changes required are substantial. It'll probably take some time to get it right.

Comment 22 Jeff Layton 2008-10-10 12:14:46 UTC
Created attachment 320011 [details]
serialize cifs mounts + remove kthread_stop usage

Ugh...

I'm really not fond of this patch, but it should allow the existing code to limp along while preventing races between mounts and unmounts. It just adds a global mutex that we use to serialize the cifs_mount and cifs_umount functions. Essentially, we're papering over the bugs with this...

I'm testing it now and it seems to be working. I still want to clean up the structure sharing code to make this unnecessary (and to make the code simpler and easier to debug).

It might be reasonable to consider this for RHEL as a stopgap measure until a better fix is in place.

Comment 26 Jeff Layton 2008-10-13 13:26:44 UTC
Created attachment 320186 [details]
patch -- fix mount races in RHEL5

First, I notice that you're running a -108.el5 based kernel. That kernel is pretty old by now and doesn't contain the update to the CIFS code that's slated for 5.3. The 5.3 kernel should alreadu include the patch to clean up error handling in cifs_mount.

My guess is that you're running into a different, unrelated bug of some sort. I'd recommend testing this on something closer to what will be released in 5.3 (unless you feel like chasing bugs that have already been fixed).

Here's the RHEL5 patchset I've been testing today. It's based on the patchset that I sent upstream over the weekend. I think it should fix this without needing to wrap everything in a new semaphore.

There are some other patches in the kernel I've been testing too, but I don't think they have any bearing on this problem.

This patchset should apply reasonably cleanly to the current RHEL5.3 kernels.

Comment 28 Jeff Layton 2008-11-24 11:24:33 UTC
A patchset is now in mainline kernels for this. I've backported it to RHEL5 and put it in the test kernels on my people.redhat.com page:

http://people.redhat.com/jlayton/

...if you're able to test them, then please do and let me know how they fare...

Comment 32 RHEL Program Management 2009-01-27 20:42:59 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 33 RHEL Program Management 2009-02-16 15:36:26 UTC
Updating PM score.

Comment 41 Jeff Layton 2009-04-22 17:01:43 UTC

*** This bug has been marked as a duplicate of bug 465143 ***


Note You need to log in before you can comment on or make changes to this bug.