Bug 452028

Summary: CIFS crashes server
Product: Red Hat Enterprise Linux 5
Reporter: Emerson Dorow <emerson.dorow>
Component: kernel
Assignee: Jeff Layton <jlayton>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: low
Docs Contact:
Priority: low
Version: 5.1
CC: gdeschner, hturesson, mgoodwin, ssorce, steved
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 503192
Environment:
Last Closed: 2009-04-13 13:07:50 UTC
Type: ---
Attachments:
Description                                     Flags
excerpt from /var/log/messages                  none
/var/log/messages after a cifs related crash    none
dmesg after a cifs related crash                none

Description Emerson Dorow 2008-06-18 18:47:02 UTC
Description of problem:
  I need to report a bug. When I copy files over CIFS (1.48aRH) from "Red Hat
Enterprise Linux Server release 5.1 (Tikanga)" to a Windows share, my server
crashes. I know that many versions of CIFS have had the same problem, but I
have not seen it reported for this version of CIFS on RHEL 5.1.

Version-Release number of selected component (if applicable):
cifs 1.48aRH

How reproducible:
Occurs when copying files over a CIFS mount

Steps to Reproduce:
1. Mount a share from a Windows system with cifs
2. Copy a large amount of data to the Windows share
3. Wait for the server to crash
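
Step 2 could be scripted roughly as follows. This is only a sketch: the function name, the file name, and the sizes are assumptions for illustration, not values from the report. It simply writes a chosen amount of data into a directory, which here would be the directory where the Windows share is mounted.

```python
import os


def write_test_data(dest_dir, total_mib=1024, chunk_mib=4,
                    name="cifs_load_test.bin"):
    """Write total_mib MiB of zero-filled data into dest_dir.

    dest_dir is assumed to be the CIFS mount point. Returns the path
    of the file that was written.
    """
    chunk = b"\0" * (chunk_mib * 1024 * 1024)
    path = os.path.join(dest_dir, name)
    remaining = total_mib * 1024 * 1024
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(remaining, len(chunk))
            f.write(chunk[:n])
            remaining -= n
        # Push the data out to the share instead of leaving it cached.
        f.flush()
        os.fsync(f.fileno())
    return path
```

For example, `write_test_data("/mnt/winshare", total_mib=2048)` would write 2 GiB to a share mounted at the hypothetical path `/mnt/winshare`.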
  
Actual results:
The server crashes

Expected results:
The server does not crash

Additional info:

Comment 1 Emerson Dorow 2008-06-18 18:49:51 UTC
This problem occurs on many of the servers we monitor.

Comment 2 Hjalmar Turesson 2008-06-27 13:30:21 UTC
Created attachment 310437 [details]
excerpt from /var/log/messages

Comment 3 Hjalmar Turesson 2008-06-27 13:33:23 UTC
I have a similar problem, but with CentOS 5.1.
A Windows file share mounted with cifs causes a complete system lockup. The
crashes occur somewhat randomly, but can generally be reproduced by
browsing and opening files on the Windows share.

Computer:
HP xw9300 Workstation
2 x AMD Opteron Processor 250
4 GB ram
Nvidia Quadro Fx 1400

Kernel: 2.6.18-53.1.21.el5 0000001 SMP, x86_64

I also have the same share mounted on another computer, and that one does not crash.

Comment 4 Jeff Layton 2008-11-04 15:59:04 UTC
>   I need to report a bug. When i copy files through CIFS (1.48aRH) on "Red Hat
> Enterprise Linux Server release 5.1 (Tikanga)" to windows share, my server
> crashes.

Specifically, on what kernels have you seen this problem? The messages file you attached is pretty garbled. Do you happen to have the oops message from any of these crashes?

> I have a similar problem, but with centos 5.1.
> Complete system lockup caused by a windows file-share mounted with cifs. The
> crashes occur somewhat randomly, but can generally be reproduced through
> browsing and opening files from the windows share.

Can you determine whether the box is oopsing? If so, then I'll need to see the stack trace from it to determine much of anything. If it's not oopsing and is just locking up, then we'll probably need to collect some sysrq-t info and see what the hung processes are doing.

In both cases, it might be helpful to try out the kernels on my people.redhat.com page and see if they help you:

http://people.redhat.com/jlayton

...the 5.1 kernel is pretty old and I've done a large number of CIFS updates since then.
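
For the sysrq-t case mentioned above, a minimal sketch of requesting the task dump from a script is shown here. It assumes the machine is still responsive enough to run a command as root and that the kernel exposes `/proc/sysrq-trigger`; the function name and the `trigger_path` parameter are illustrative. On a box that is completely locked up, the keyboard or serial console SysRq sequence would be needed instead.

```python
def trigger_sysrq(key="t", trigger_path="/proc/sysrq-trigger"):
    """Write one SysRq command character to the kernel's trigger file.

    The key 't' asks the kernel to dump the state of all tasks to the
    kernel log. Writing to the real trigger file requires root.
    """
    if len(key) != 1:
        raise ValueError("SysRq key must be a single character")
    with open(trigger_path, "w") as f:
        f.write(key)
```

After calling `trigger_sysrq("t")`, the task list should appear in the kernel log, where `dmesg` or a serial console can capture it.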

Comment 5 Hjalmar Turesson 2008-11-09 20:33:15 UTC
(In reply to comment #4)
> >   I need to report a bug. When i copy files through CIFS (1.48aRH) on "Red Hat
> > Enterprise Linux Server release 5.1 (Tikanga)" to windows share, my server
> > crashes.
> 
> Specifically, on what kernels have you seen this problem? The messages file you
> attached is pretty garbled, do you happen to have the oops message from any of
> these crashes?

It wasn't Emerson who attached the messages file, but Hjalmar.
I was running 2.6.18-53.1.21.el5 at the time, but have since changed to 2.6.18-92.1.6.el5.centos.plus. I "solved" my problem by not mounting the share. But, in response to your questions, I mounted it again and induced some crashes with the new kernel.


> Can you determine whether the box is oopsing? If so, then I'll need to see the
> stack trace from it to determine much of anything. If it's not oopsing and is
> just locking up, then we'll probably need to collect some sysrq-t info and see
> what the hung processes are doing.

Would an oops show up in /var/log/messages and dmesg? If so, then I guess the kernel isn't oopsing (if not, please tell me how to find out whether it is).
I attached the output of both dmesg and messages. I couldn't even find any cifs errors there.
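
One rough way to check a saved log for signs of an oops is to search it for strings that kernel oops and panic reports usually contain. The sketch below does that; the pattern list is an assumption based on commonly seen messages and is not exhaustive. An empty result does not rule out an oops, because a crashing machine often cannot write the message to its logs.

```python
import re

# Strings that commonly appear in kernel oops/panic output (not exhaustive).
OOPS_PATTERNS = re.compile(
    r"Oops|kernel BUG at|Unable to handle kernel|Kernel panic"
    r"|Call Trace:|general protection fault|soft lockup"
)


def find_oops_lines(lines):
    """Return (line_number, text) pairs for lines that look like oops output."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, 1)
        if OOPS_PATTERNS.search(line)
    ]
```

It can be run over a file with `find_oops_lines(open("/var/log/messages"))`, or over the saved output of `dmesg`.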

> In both cases, it might be helpful to try out the kernels on my
> people.redhat.com page and see if they help you:
> 
> http://people.redhat.com/jlayton
> 
> ...the 5.1 kernel is pretty old and I've done a large number of CIFS updates
> since then.

I'll try.
Thanks

Comment 6 Hjalmar Turesson 2008-11-09 20:35:07 UTC
Created attachment 323025 [details]
/var/log/messages after a cifs related crash

Comment 7 Hjalmar Turesson 2008-11-09 20:35:49 UTC
Created attachment 323026 [details]
dmesg after a cifs related crash

Comment 8 Jeff Layton 2008-11-25 12:20:12 UTC
Those files, I'm afraid, don't have much info to go on. I see this:

Nov  9 14:10:18 pnel03 kernel:  CIFS VFS: Error 0x0 on cifs_get_inode_info in lookup of \public_html\.hidden

...and the next message logged is this, which is the box starting up again:

Nov  9 10:12:11 pnel03 syslogd 1.4.1: restart.
Nov  9 10:12:11 pnel03 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Nov  9 10:12:11 pnel03 kernel: Linux version 2.6.18-92.1.6.el5.centos.plus (mockbuild.org) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-42)) #1 SMP Thu Jun 26 12:18:07 EDT 2008
Nov  9 10:12:11 pnel03 kernel: Command line: ro root=LABEL=/ rhgb 0x31F verbose mem=4096M

...unfortunately, it's common for oops messages not to make it to the logs. The machine is often not in a position to write out log data while it's crashing. Usually, collecting oopses means logging a serial console or collecting a core dump. kdump is pretty easy to use, so you may want to give that a try.
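
The check described above, looking at what was logged immediately before each startup, can be sketched as follows. The function name is hypothetical, and the match on "syslogd" plus "restart" is an assumption taken from the excerpt above; other syslog daemons word their startup line differently.

```python
def lines_before_restarts(lines, context=3):
    """For each syslogd restart line, return the lines logged just before it.

    lines is a list of log lines. The result has one entry per restart,
    each holding up to `context` preceding lines.
    """
    results = []
    for index, line in enumerate(lines):
        if "syslogd" in line and "restart" in line:
            start = max(0, index - context)
            results.append([l.rstrip("\n") for l in lines[start:index]])
    return results
```

A clean reboot typically leaves shutdown messages just before the restart line, so a restart preceded only by ordinary activity suggests the machine went down without warning.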

Have newer kernels fared any better here?

Comment 9 Jeff Layton 2009-01-15 12:05:30 UTC
Ping...any more info on this?

Comment 10 Hjalmar Turesson 2009-01-15 18:11:07 UTC
(In reply to comment #9)
> Ping...any more info on this?

I haven't had the opportunity/time to crash the computer yet. Hopefully I'll get time this weekend.

Comment 11 Jeff Layton 2009-01-22 15:43:28 UTC
No response in several months. Reducing severity to low.

Comment 12 Hjalmar Turesson 2009-02-08 18:44:32 UTC
Sorry for the excessive delay.

I installed and tried a kernel (2.6.18-129.el5.jtltest.60) from your page.
It seems to work fine, although I need to use it more to be sure.

Thanks

Comment 13 Jeff Layton 2009-04-13 13:07:50 UTC
No response in several months. Closing case.