Bug 593422 - busy inodes after umount problem with CIFS
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: All Linux
Priority: high Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Jeff Layton
QA Contact: yanfu,wang
Depends On:
Blocks:
Reported: 2010-05-18 14:39 EDT by Jeff Layton
Modified: 2010-11-11 10:53 EST (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-11-11 10:53:35 EST


External Trackers
Tracker ID Priority Status Summary Last Updated
Samba Project 7433 None None None Never

Description Jeff Layton 2010-05-18 14:39:56 EDT
We've had sporadic reports over the last few months of busy inodes after umount problems with cifs. Recently, Suresh J. came up with a reproducer upstream. With that, I've been able to track down what I think is the cause, which is some really broken handling of how open files are handed off from the create routines to cifs_open. I'm opening this bug to track that problem and to ensure that it doesn't end up falling on the floor for RHEL6.

RHEL5 may also be susceptible to this problem, given the changes that went into the oplock break handling code in 5.5, but I'm not aware of any reports of that as of yet. Not using server inode numbers (which is the default for RHEL5) seems to make this harder to reproduce for some reason, so that could be a factor there.
Comment 1 RHEL Product and Program Management 2010-05-18 14:55:03 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.
Comment 3 Aristeu Rozanski 2010-07-01 12:23:44 EDT
Patch(es) available on kernel-2.6.32-42.el6
Comment 6 yanfu,wang 2010-07-27 02:31:29 EDT
hi Jeff,

Is this problem similar to bz 591229? Could I use https://bugzilla.redhat.com/show_bug.cgi?id=591229#c13 to verify it, or does it need a separate reproducer?
Comment 7 Jeff Layton 2010-07-27 06:26:53 EDT
The reproducer for this is in the associated samba.org bug:

    https://bugzilla.samba.org/show_bug.cgi?id=7433
Comment 8 yanfu,wang 2010-07-28 01:51:59 EDT
(In reply to comment #7)
> The reproducer for this is in the associated samba.org bug:
> 
>     https://bugzilla.samba.org/show_bug.cgi?id=7433    

hi Jeff,

When I run the loop below on the client, inside the cifs mount point, it reports "Permission denied" even after I chmod the files to 777 and chown them:
# while true; do touch 1 2; echo 1 > 1;echo 2 > 2; rm 1 2; done
touch: cannot touch `1': Permission denied
touch: cannot touch `2': Permission denied
-bash: 1: Permission denied
-bash: 2: Permission denied

# cat /proc/mounts
//10.16.64.168/public/ /mnt cifs rw,mand,relatime,unc=\\10.16.64.168\public,username=root,uid=0,noforceuid,gid=0,noforcegid,addr=10.16.64.168,posixpaths,serverino,acl,rsize=16384,wsize=57344 0 0

Meanwhile, the same files (1, 2) are being created by the 'root' user, by running the same loop on the server inside the share:
# while true; do touch 1 2; echo 1 > 1;echo 2 > 2; rm 1 2; done


I tried creating a samba user on the server.
On the server:
I created files as samba_user in the share dir.
On the client:
I mounted with -o username=samba_user and tried to write/access the server's files owned by samba_user, then got Permission denied again.

That means I cannot access files owned by the server's user from the client. How can I do that?
Comment 9 Jeff Layton 2010-07-28 06:38:35 EDT
(In reply to comment #8)
> 
> I tried to create a samba user on the server. 
> on the server:
> create files with the samba_user on the share dir.
> on the client:
> mounted with -o username=samba_user, and write/access the server's file owned
> by the samba_user, then got Permission denied again.
> 
> That's mean I could not access these files owned by the user of the server from
> client, how could I do it?    

CIFS permissions handling can be very confusing:

samba maps the username used during the mount to a local user. So if you create files on the mount on the client, they'll generally end up being owned by the local user to which the username= option maps.

With unix extensions enabled, the server will also send along ownership information in the form of numeric uids/gids and the client will use that info.

So to do this correctly, I think you'll want to run the reproducer as the same user on client and server. The user on client and server should have the same uid, and the mount needs to be done with the samba user that maps to the same username on the server. Opening up permissions on the directory wouldn't hurt either.
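
A minimal sketch of the uid check described above, assuming you have one passwd entry from the client and one from the server (for example from `getent passwd <user>` on each machine). The function names `uid_from_passwd_line` and `same_uid` are hypothetical helpers introduced here for illustration; they are not part of samba or cifs-utils.

```shell
#!/bin/sh
# Hypothetical helper: extract the numeric uid (third colon-separated
# field) from a single passwd entry passed as the first argument.
uid_from_passwd_line() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Hypothetical helper: succeed only when both passwd entries carry the
# same numeric uid. With unix extensions enabled the server sends numeric
# uids/gids, so a mismatch here is one likely cause of ownership confusion.
same_uid() {
    [ "$(uid_from_passwd_line "$1")" = "$(uid_from_passwd_line "$2")" ]
}
```

Usage would look something like `same_uid "$(getent passwd someuser)" "$server_entry" || echo "uid mismatch"`, where `someuser` and `$server_entry` are placeholders for the account name and the entry copied from the server.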
Comment 10 yanfu,wang 2010-07-29 04:35:08 EDT
(In reply to comment #9)
> (In reply to comment #8)
> > 
> > I tried to create a samba user on the server. 
> > on the server:
> > create files with the samba_user on the share dir.
> > on the client:
> > mounted with -o username=samba_user, and write/access the server's file owned
> > by the samba_user, then got Permission denied again.
> > 
> > That's mean I could not access these files owned by the user of the server from
> > client, how could I do it?    
> 
> CIFS permissions handling can be very confusing:
> 
> samba maps the username used during the mount to a local user. So if you create
> files on the mount on the client, they'll generally end up being owned by the
> local user to which the username= option maps.
> 
> With unix extensions enabled, the server will also send along ownership
> information in the form of numeric uids/gids and the client will use that info.
> 
> So to do this correctly, I think you'll want to run the reproducer as the same
> user on client and server. The user on client and server should have the same
> uid, and the mount needs to be done with the samba user that maps to the same
> username on the server. Opening up permissions on the directory wouldn't hurt
> either.    

hi Jeff,

I've resolved the above problem, but I can't get the busy inodes error. I used the 2.6.32-37.el6 kernel, which should not contain the fix. Please see my environment and steps:

server:
[root@dell-pem710-01 tmp]# uname -a
Linux dell-pem710-01.rhts.eng.bos.redhat.com 2.6.32-37.el6.x86_64 #1 SMP Sun Jun 20 19:29:35 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

[root@dell-pem710-01 tmp]# rpm -qa|grep samba
samba-common-3.5.2-62.el6.x86_64
samba-3.5.2-62.el6.x86_64
samba-winbind-clients-3.5.2-62.el6.x86_64
samba-client-3.5.2-62.el6.x86_64

smb.conf:
        [public]
        comment = Public Stuff
        path = /tmp
;       public = yes
        writable = yes
        valid users = autofs
        printable = no

and I've added the autofs user:
# useradd autofs
autofs:x:501:501::/home/autofs:/bin/bash
# smbpasswd -a -L -s autofs

# su - autofs
[autofs@dell-pem710-01 ~]$ while true; do touch 1 2; echo 1 > 1;echo 2 > 2; rm 1 2; done


client:
.qa.[root@x86-64-5s-m1 ~]# uname -a
Linux x86-64-5s-m1.ss.eng.bos.redhat.com 2.6.18-194.8.1.el5 #1 SMP Wed Jun 23 10:52:51 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

.qa.[root@x86-64-5s-m1 ~]# mount -t cifs -o username=autofs,password=autofs,uid=501,gid=501 //10.16.64.168/public /tmp

.qa.[root@x86-64-5s-m1 ~]# cd /tmp

Run the same loop on the client while the server's loop is running.

Then:
* Stop the script on the client
* Stop on the server
* cd to a different dir other than mountpoint on the client
* umount the mountpoint on the client

I did not get the busy inodes error. Could you please give me some hints?
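
The last two steps of the list above (leave the mount point, then unmount) plus a look at the kernel log can be sketched roughly as follows. This is only an illustration: `run`, `DRY_RUN` and `unmount_and_check` are hypothetical names introduced here, and stopping the loops on client and server is still done by hand beforehand.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Leave the mount point, unmount it, then show the tail of the kernel log
# so that a "Busy inodes after unmount" message would be visible.
# The mount point path is supplied by the caller; umount needs root.
unmount_and_check() {
    mountpoint=$1
    cd / || return 1
    run umount "$mountpoint" || return 1
    run sh -c 'dmesg | tail'
}
```

Calling `DRY_RUN=1 unmount_and_check /mnt` only prints the commands, which is a cheap way to check the sequence before running it as root against a real mount.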
Comment 11 Jeff Layton 2010-07-29 07:30:11 EDT
Ahh yes...actually I think you were correct with the first setup you had.

The idea here is that you want to run the script on the client and have it hit errors, particularly on open calls. The bug is that it's possible for the things to be opened during lookup, and then for the actual open call never to happen due to permissions issues or other problems. So getting these sorts of errors on the client is expected:

touch: cannot touch `1': Permission denied
touch: cannot touch `2': Permission denied
-bash: 1: Permission denied
-bash: 2: Permission denied

...though you generally have to let it run for a little while before the busy inodes problem occurs.

So in short, run this as root on the server and non-root on the client. The directory where it runs needs to be writable by the non-root user on the client, of course.
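
For reference, a bounded variant of the loop from comment #8 might look like the sketch below. The iteration count and the function name `run_reproducer` are additions for illustration only; the loop in the thread runs until interrupted, and the bug generally needs it to run for a while. Errors are deliberately not treated as fatal, since "Permission denied" on the client is the expected trigger.

```shell
#!/bin/sh
# Hypothetical wrapper around the reproducer loop from the thread.
#   $1: directory to run in (the cifs mount point on the client)
#   $2: number of iterations (an assumption; the original loop is endless)
run_reproducer() {
    dir=$1
    iterations=$2
    i=0
    while [ "$i" -lt "$iterations" ]; do
        # Run in a subshell so a failing step does not abort the loop;
        # failed opens are the whole point of the test on the client.
        ( cd "$dir" && { touch 1 2; echo 1 > 1; echo 2 > 2; rm 1 2; } )
        i=$((i + 1))
    done
    return 0
}
```

Per the instructions above, this would be run as root on the server inside the share and as a non-root user on the client inside the mount point, for example `run_reproducer /mnt 100000`.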
Comment 12 yanfu,wang 2010-07-30 05:53:29 EDT
(In reply to comment #11)
> Ahh yes...actually I think you were correct with the first setup you had.
> 
> The idea here is that you want to run the script on the client and have it hit
> errors, particularly on open calls. The bug is that it's possible for the
> things to be opened during lookup, and then for the actual open call never to
> happen due to permissions issues or other problems. So getting these sorts of
> errors on the client is expected:
> 
> touch: cannot touch `1': Permission denied
> touch: cannot touch `2': Permission denied
> -bash: 1: Permission denied
> -bash: 2: Permission denied
> 
> ...though you generally have to let it run for a little while before the busy
> inodes problem occurs.
> 
> So in short, run this as root on the server and non-root on the client. The
> directory where it runs needs to be writable by the non-root user on the
> client, of course.    

I think the permission problem is the same as in https://bugzilla.samba.org/show_bug.cgi?id=7433#c20, but I don't know how Suresh Jayaraman dealt with it. I'll check it further.
Comment 13 yanfu,wang 2010-08-02 05:32:08 EDT
I have reproduced it using RHEL6.0-20100414.0 as the client and RHEL6-snapshot8 as the server. I got "VFS: Busy inodes after unmount of cifs..." when I unmounted the cifs mount; please see below:
[root@dell-pem605-01 ~]# dmesg|tail
 CIFS VFS: Send error in Close = -5
 CIFS VFS: Send error in Close = -5
 CIFS VFS: Send error in Close = -5
 CIFS VFS: Send error in Close = -5
 CIFS VFS: Send error in Close = -5
 CIFS VFS: Send error in Close = -5
 CIFS VFS: Send error in Close = -5
 CIFS VFS: Send error in Close = -5
 CIFS VFS: Send error in Close = -5
VFS: Busy inodes after unmount of cifs. Self-destruct in 5 seconds.  Have a nice day...

panic info shown on console:
 BUG: unable to handle kernel NULL pointer dereference at 00000014
IP: [<f9625abc>] cifs_writepages+0x2c/0x870 [cifs]
*pdpt = 0000000033e53001 *pde = 000000011eaaa067 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:0d.0/0000:05:00.0/0000:06:00.0/irq
Modules linked in: nls_utf8(U) cifs(U) autofs4(U) sunrpc(U) cpufreq_ondemand(U) powernow_k8(U) ipv6(U) dm_mirror(U) dm_region_hash(U) dm_log(U) dcdbas(U) i2c_nforce2(U) sr_mod(U) cdrom(U) k8temp(U) sg(U) se]

Pid: 25, comm: bdi-default Not tainted (2.6.32-19.el6.i686 #1) PowerEdge M605
EIP: 0060:[<f9625abc>] EFLAGS: 00010286 CPU: 0
EIP is at cifs_writepages+0x2c/0x870 [cifs]
EAX: 00000000 EBX: ee6b5204 ECX: f9625a90 EDX: ee6b51d4
ESI: ee6b52b4 EDI: c09e7f80 EBP: f70fdeec ESP: f70fddb4
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process bdi-default (pid: 25, ti=f70fc000 task=f70eca90 task.ti=f70fc000)
Stack:
 00000000 c3804534 c3844834 00000000 c05d2d87 00000000 c38c8a40 c0441373
<0> 00000000 00000000 c3844834 c0aaea40 f70fdeec c0aaea40 c380482c fffffffc
<0> ffffffff 00000000 00000000 c3804740 00000001 00008108 ee6b51d4 c3804828
Call Trace:
 [<c05d2d87>] ? cpumask_next_and+0x17/0x30
 [<c0441373>] ? find_busiest_group+0x1b3/0x830
 [<c0409227>] ? __switch_to+0xd7/0x1a0
 [<c04d5414>] ? do_writepages+0x14/0x30
 [<c0532556>] ? writeback_single_inode+0x96/0x270
 [<c053297f>] ? writeback_inodes_wb+0x13f/0x420
 [<c0532d52>] ? wb_writeback+0xf2/0x190
 [<c0532f54>] ? wb_do_writeback+0x164/0x1c0
 [<c04e9371>] ? bdi_forker_task+0x51/0x2a0
 [<c04e95c0>] ? bdi_start_fn+0x0/0xa0
 [<c043c0b0>] ? complete+0x40/0x60
 [<c04e9320>] ? bdi_forker_task+0x0/0x2a0
 [<c046e5b4>] ? kthread+0x74/0x80
 [<c046e540>] ? kthread+0x0/0x80
 [<c040b047>] ? kernel_thread_helper+0x7/0x10
Code: 57 56 89 c6 53 81 ec b4 00 00 00 89 54 24 30 8b 78 40 8b 00 8d 50 d0 89 54 24 58 8b 80 a4 00 00 00 8b 80 84 01 00 00 89 44 24 48 <81> 78 14 ff 0f 00 00 0f 86 c5 02 00 00 8b 4c 24 48 8b 01 8b 40 
EIP: [<f9625abc>] cifs_writepages+0x2c/0x870 [cifs] SS:ESP 0068:f70fddb4
CR2: 0000000000000014
---[ end trace 8d258a8c60accaa3 ]---
Kernel panic - not syncing: Fatal exception
Pid: 25, comm: bdi-default Tainted: G      D    2.6.32-19.el6.i686 #1
Call Trace:
 [<c08055d5>] ? panic+0x42/0xed
 [<c0808bfc>] ? oops_end+0xbc/0xd0
 [<c042e2ea>] ? no_context+0xba/0x190
 [<c042e52f>] ? bad_area_nosemaphore+0xf/0x20
 [<c080a1c0>] ? do_page_fault+0x320/0x400
 [<c0809ea0>] ? do_page_fault+0x0/0x400
 [<c0807fcb>] ? error_code+0x73/0x78
 [<f9625a90>] ? cifs_writepages+0x0/0x870 [cifs]
 [<f9625abc>] ? cifs_writepages+0x2c/0x870 [cifs]
 [<c05d2d87>] ? cpumask_next_and+0x17/0x30
 [<c0409227>] ? __switch_to+0xd7/0x1a0
 [<c04d5414>] ? do_writepages+0x14/0x30
 [<c0532556>] ? writeback_single_inode+0x96/0x270
 [<c053297f>] ? writeback_inodes_wb+0x13f/0x420
 [<c0532d52>] ? wb_writeback+0xf2/0x190
 [<c0532f54>] ? wb_do_writeback+0x164/0x1c0
 [<c04e9371>] ? bdi_forker_task+0x51/0x2a0
 [<c04e95c0>] ? bdi_start_fn+0x0/0xa0
 [<c043c0b0>] ? complete+0x40/0x60
 [<c04e9320>] ? bdi_forker_task+0x0/0x2a0
 [<c046e5b4>] ? kthread+0x74/0x80
 [<c046e540>] ? kthread+0x0/0x80
 [<c040b047>] ? kernel_thread_helper+0x7/0x10
[drm:drm_fb_helper_panic] *ERROR* panic occurred, switching back to text console
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:117 resched_task+0x58/0x70() (Tainted: G      D   )
Hardware name: PowerEdge M605
Modules linked in: nls_utf8(U) cifs(U) autofs4(U) sunrpc(U) cpufreq_ondemand(U) powernow_k8(U) ipv6(U) dm_mirror(U) dm_region_hash(U) dm_log(U) dcdbas(U) i2c_nforce2(U) sr_mod(U) cdrom(U) k8temp(U) sg(U) se]
Pid: 3, comm: migration/0 Tainted: G      D    2.6.32-19.el6.i686 #1
Call Trace:
 [<c044e197>] ? warn_slowpath_common+0x77/0xb0
 [<c0436618>] ? resched_task+0x58/0x70
 [<c044e1e3>] ? warn_slowpath_null+0x13/0x20
 [<c0436618>] ? resched_task+0x58/0x70
 [<c0440361>] ? pull_task+0x41/0x50
 [<c044041b>] ? move_one_task_fair+0xab/0xf0
 [<c044af24>] ? migration_thread+0x224/0x2b0
 [<c044ad00>] ? migration_thread+0x0/0x2b0
 [<c046e5b4>] ? kthread+0x74/0x80
 [<c046e540>] ? kthread+0x0/0x80
 [<c040b047>] ? kernel_thread_helper+0x7/0x10
---[ end trace 8d258a8c60accaa4 ]---


I then verified on kernel 2.6.32-52.el6 (both server and client) on x86_64 and i386: umount succeeds, but "CIFS VFS: Send error in Close = -5" still shows up in dmesg.
# dmesg|tail
CIFS VFS: Send error in Close = -5
CIFS VFS: Send error in Close = -5
CIFS VFS: Send error in Close = -5
CIFS VFS: Send error in Close = -5
CIFS VFS: Send error in Close = -5
CIFS VFS: Send error in Close = -5
CIFS VFS: Send error in Close = -5
CIFS VFS: Send error in Close = -5
CIFS VFS: Send error in Close = -5
CIFS VFS: Send error in Close = -5


hi Jeff, umount now completes without busy inodes, but "CIFS VFS: Send error in Close = -5" can still be found in syslog. Is that expected?
Comment 14 Jeff Layton 2010-08-02 07:07:10 EDT
It's not uncommon. CIFS is chattier than it should be with regard to some of these sorts of errors. If you like, we can have a look at eliminating those, but that should probably be done in the context of a different bug.
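
When verifying, the distinction drawn in the last two comments (the busy-inodes message indicates the bug, while the Close send errors are treated as noise) could be checked with something like the sketch below. The function names `has_busy_inodes` and `count_close_errors` are hypothetical; the match strings are taken from the log output quoted in comment #13.

```shell
#!/bin/sh
# Read kernel log text on stdin; succeed if the busy-inodes message
# is present, which would mean the bug was reproduced.
has_busy_inodes() {
    grep -q 'VFS: Busy inodes after unmount'
}

# Read kernel log text on stdin; print how many Close send errors it
# contains. These are reported in the thread as noisy but not fatal.
count_close_errors() {
    grep -c 'CIFS VFS: Send error in Close'
}
```

A possible use is `dmesg | has_busy_inodes && echo "busy inodes reproduced"`, with `dmesg | count_close_errors` giving a rough measure of how chatty the client was during the run.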
Comment 15 releng-rhel@redhat.com 2010-11-11 10:53:35 EST
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
