Bug 503192 - CIFS crashes server after windows share is unavailable
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: x86_64 Linux
Priority: low, Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Jeff Layton
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks:
Reported: 2009-05-29 10:40 EDT by Jonathan Schwehm
Modified: 2014-06-18 03:38 EDT
Doc Type: Bug Fix
Clone Of: 452028
Last Closed: 2010-01-13 16:31:49 EST

Attachments
Contents of messages file through beginning of reboot (165.61 KB, text/plain)
2009-05-29 10:42 EDT, Jonathan Schwehm
Description Jonathan Schwehm 2009-05-29 10:40:22 EDT
+++ This bug was initially created as a clone of Bug #452028 +++

Description of problem:
I found bug #452028 via a Google search.  We experienced a similar issue last night when a share on a Windows Server 2003 box was unavailable for 7 hours.  The original bug mentioned an updated kernel for a CentOS system (2.6.18-129.el5.jtltest.60), but did not indicate whether that kernel resolved the issue.

After logging multiple kernel messages regarding CIFS VFS (over 2200 lines), the server stopped responding and wrote no additional information to its logs (including messages, secure, and log files for our applications).  The server required a hard reboot and did not report any hardware failures.

Software Versions:
uname -a 
Linux webserve2 2.6.18-92.1.10.el5 #1 SMP Wed Jul 23 03:55:54 EDT 2008 i686 i686 i386 GNU/Linux

cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5 (Tikanga)

samba-3.0.23c-2.el5.2.0.2

Sample from /var/log/messages:
May 28 20:47:26 webserve2 kernel:  CIFS VFS: close with pending writes
May 28 20:47:58 webserve2 kernel:  CIFS VFS: server not responding
May 28 20:47:58 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 30854
May 28 20:48:01 webserve2 kernel:  CIFS VFS: No response to cmd 47 mid 30853
May 28 20:48:01 webserve2 kernel:  CIFS VFS: Write2 ret -11, wrote 0
May 28 20:48:01 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 30855
May 28 20:48:09 webserve2 kernel:  CIFS VFS: Write2 ret -11, wrote 0
May 28 20:48:14 webserve2 kernel:  CIFS VFS: Send error in Close = -9
May 28 20:48:45 webserve2 kernel:  CIFS VFS: close with pending writes
May 28 20:48:58 webserve2 kernel:  CIFS VFS: server not responding
May 28 20:48:58 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 30886
May 28 20:48:58 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 30887
May 28 20:49:00 webserve2 kernel:  CIFS VFS: No response to cmd 47 mid 30885
May 28 20:49:00 webserve2 kernel:  CIFS VFS: No response to cmd 4 mid 30888
May 28 20:49:00 webserve2 kernel:  CIFS VFS: Write2 ret -11, wrote 0
May 28 20:49:00 webserve2 kernel:  CIFS VFS: Send error in Close = -11
May 28 20:49:52 webserve2 kernel:  CIFS VFS: close with pending writes
May 28 20:51:20 webserve2 last message repeated 2 times
May 28 20:52:29 webserve2 kernel:  CIFS VFS: close with pending writes
...
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44712
May 29 01:59:23 webserve2 kernel:  CIFS VFS: server not responding
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44717
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44711
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44713
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44710
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44714
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44719
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44715
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44716
May 29 01:59:23 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 44718
May 29 01:59:33 webserve2 kernel:  CIFS VFS: Error 0xffffff90 on cifs_get_inode_info in lookup of \output\njinvoice226769_702403.pdf
May 29 01:59:43 webserve2 kernel:  CIFS VFS: Error 0xffffff90 on cifs_get_inode_info in lookup of \upload_docs\Correspondence\2009
...
May 29 03:12:37 webserve2 kernel:  CIFS VFS: server not responding
May 29 03:12:37 webserve2 kernel:  CIFS VFS: server not responding
May 29 03:12:37 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 47303
May 29 03:12:37 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 47304
May 29 03:12:37 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 47305
May 29 03:12:37 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 47306
May 29 03:12:42 webserve2 kernel:  CIFS VFS: No response to cmd 47 mid 47302
May 29 03:12:42 webserve2 kernel:  CIFS VFS: Write2 ret -11, wrote 0
May 29 03:12:42 webserve2 kernel:  CIFS VFS: writes pending, delay free of handle
May 29 03:12:48 webserve2 last message repeated 4 times
May 29 03:13:37 webserve2 kernel:  CIFS VFS: server not responding
May 29 03:13:37 webserve2 kernel:  CIFS VFS: server not responding
May 29 03:13:37 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 47314
May 29 03:13:37 webserve2 kernel:  CIFS VFS: server not responding
May 29 03:13:37 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 47317
May 29 03:13:37 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 47315
May 29 03:13:37 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 47316
May 29 03:13:43 webserve2 kernel:  CIFS VFS: No response to cmd 47 mid 47313
May 29 03:13:43 webserve2 kernel:  CIFS VFS: Write2 ret -11, wrote 0
May 29 03:13:43 webserve2 kernel:  CIFS VFS: No response for cmd 50 mid 47318
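
For anyone triaging a similar log, here is a rough sketch (not part of the original report) of how the CIFS VFS messages could be tallied by type and how the hex error value in the lookup lines could be read as a signed errno. The function names `cifs_summary` and `hex_to_errno` are made up for illustration, and 64-bit shell arithmetic is assumed. On Linux, -11 is EAGAIN, -9 is EBADF, and 0xffffff90 read as a signed 32-bit value is -112 (EHOSTDOWN).

```shell
# Tally CIFS VFS kernel messages by type from a syslog-format file.
# The trailing "mid NNNN" and the looked-up path are stripped so that
# otherwise identical messages are grouped together. Lines collapsed by
# syslog ("last message repeated N times") are not counted, so the
# totals are lower bounds.
cifs_summary() {
    grep 'CIFS VFS:' "$1" |
        sed -e 's/^.*CIFS VFS: //' \
            -e 's/ mid [0-9]*$//' \
            -e 's/ in lookup of .*$//' |
        sort | uniq -c | sort -rn
}

# Interpret a 32-bit value printed in hex by the kernel as a signed integer.
# Assumes the shell evaluates arithmetic in at least 64 bits.
hex_to_errno() {
    v=$(( $1 ))
    if [ "$v" -ge 2147483648 ]; then
        v=$(( v - 4294967296 ))
    fi
    echo "$v"
}

# Example usage (paths are illustrative):
#   cifs_summary /var/log/messages
#   hex_to_errno 0xffffff90    # prints -112
```

A summary like this only shows which messages dominate the log; it does not by itself explain why the box stopped responding.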
Comment 1 Jonathan Schwehm 2009-05-29 10:42:34 EDT
Created attachment 345912 [details]
Contents of messages file through beginning of reboot
Comment 2 Jeff Layton 2009-06-02 09:13:47 EDT
Unfortunately, I can't tell much from that log. The box may have oopsed, or just been hung. It may have even had nothing at all to do with CIFS. If it happens again, it would be helpful to capture some sysrq-t data while the box is in this state, or capture a vmcore.

Also, when you say the box "went unresponsive", can you elaborate on what you mean? Did it respond to pings, for instance?
Comment 3 Jonathan Schwehm 2009-06-02 11:45:04 EDT
The box did not respond to pings when it became unresponsive.  The box was still powered on, but the tech who rebooted it did not connect a console to see what it displayed.

If this happens again and it's responsive through the console, how can I capture some sysrq-t data or capture a vmcore?
Comment 4 Jeff Layton 2009-06-23 10:21:05 EDT
Sysrq info:
http://kbase.redhat.com/faq/docs/DOC-2024

How to force a core:
http://kbase.redhat.com/faq/docs/DOC-4264

How to set up netdump:
http://kbase.redhat.com/faq/docs/DOC-6855

Please open a support case if you require more detailed help.
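
As a quick illustration of the sysrq-t capture mentioned earlier in this thread, the sketch below enables the magic SysRq interface and asks the kernel to dump the state of every task. It is a minimal example under stated assumptions, not a substitute for the linked articles: the function name `sysrq_task_dump` is hypothetical, and the two optional path arguments exist only so the function can be tried against ordinary files. On a real system the defaults under /proc are used and root is required.

```shell
# Enable the magic SysRq interface and request a dump of all task states.
# $1 and $2 override the /proc paths (hypothetical convenience for dry runs).
sysrq_task_dump() {
    ctl="${1:-/proc/sys/kernel/sysrq}"
    trigger="${2:-/proc/sysrq-trigger}"
    echo 1 > "$ctl" || return 1   # 1 enables all SysRq functions
    echo t > "$trigger"           # 't' dumps every task's state to the kernel log
}

# Example usage (as root, while the box still accepts commands):
#   sysrq_task_dump
#   dmesg | tail -n 200
```

The dump goes to the kernel log, so it appears in `dmesg` output and usually in /var/log/messages. If the box no longer accepts commands but the console keyboard still works, Alt+SysRq+T on the console requests the same dump, provided SysRq was enabled beforehand.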
Comment 5 Jeff Layton 2010-01-13 08:45:58 EST
Is this still a problem with more recent kernels? In particular with the rhel5.5 test kernels on jwilson's homepage?

http://people.redhat.com/jwilson/
Comment 6 Jonathan Schwehm 2010-01-13 16:24:54 EST
We haven't experienced this problem since June 2009, and not at all since we upgraded the kernel.  We're currently running 2.6.18-164.10.1.

Thanks for checking in; feel free to close this report, since the current version of the kernel doesn't exhibit this behavior.
Comment 7 Jeff Layton 2010-01-13 16:31:49 EST
Thanks, closing case.
