247464 – soft lockup in cifs mountpoint leads to unresponsive server

Summary: soft lockup in cifs mountpoint leads to unresponsive server

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.0
Hardware:	All
OS:	Linux
Priority:	low
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Jeff Layton
QA Contact:	Martin Jenner
Docs Contact:
URL:	http://git.kernel.org/?p=linux/kernel...
Whiteboard:
Depends On:
Blocks:

Reported:	2007-07-09 14:23 UTC by Arenas Belon, Carlo Marcelo
Modified:	2008-01-15 11:24 UTC
CC List:	2 users
Fixed In Version:	RHEL5.1
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2008-01-15 11:24:07 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Linux Kernel	7903	0	None	None	None	Never

Description Arenas Belon, Carlo Marcelo 2007-07-09 14:23:05 UTC

Description of problem:
Soft lockup in the cifs module when doing "stat", as shown by:

kernel: BUG: soft lockup detected on CPU#1!
kernel: [<c0447f3f>] softlockup_tick+0x98/0xa6
kernel: [<c042d138>] update_process_times+0x39/0x5c
kernel: [<c04176f0>] smp_apic_timer_interrupt+0x5c/0x64
kernel: [<c04049bf>] apic_timer_interrupt+0x1f/0x24
kernel: [<c0470f5e>] generic_fillattr+0x76/0xa4
kernel: [<f8d9cce8>] cifs_getattr+0x1e/0x2b [cifs]
kernel: [<f8d9ccca>] cifs_getattr+0x0/0x2b [cifs]
kernel: [<c047132e>] vfs_getattr+0x40/0x9b
kernel: [<c0471445>] vfs_stat_fd+0x2a/0x3c
kernel: [<c04714e4>] sys_stat64+0xf/0x23
kernel: [<c04dec11>] copy_to_user+0x31/0x48
kernel: [<c0403eff>] syscall_call+0x7/0xb
kernel: ======================= 

Version-Release number of selected component (if applicable):
2.6.18-8.1.6.el5PAE

How reproducible:
Regularly (4 cases reported in the last 2 days across a population of 336 machines)

Steps to Reproduce:
1. Create a read-only mount point to a Samba server.
2. On an SMP server with a PAE-enabled kernel, run 2 processes (preferably more)
that stat files in that mount point.
3. Wait for the system to lock up (ping still works, but ssh logins no longer do).
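
Step 2 above can be sketched as a small load generator. This is only an illustration, not the workload from the original report: the report does not say how the files were stat'ed, and the helper names (stat_loop, run_workers) and the /mnt/cifs path are hypothetical. On an affected kernel this may hang the machine, so it should only be run on a non-critical box.

```python
import multiprocessing
import os
import time


def stat_loop(root, seconds):
    """Walk `root` repeatedly and stat every file until `seconds` have elapsed.

    Returns the number of successful stat calls. The deadline is checked
    between full walks, so the last walk always runs to completion.
    """
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                try:
                    os.stat(os.path.join(dirpath, name))
                    count += 1
                except OSError:
                    # Files may vanish or be unreadable; keep stressing.
                    pass
    return count


def run_workers(root, workers=2, seconds=60.0):
    """Run `workers` concurrent stat loops against `root` (assumed CIFS mount)."""
    with multiprocessing.Pool(workers) as pool:
        return pool.starmap(stat_loop, [(root, seconds)] * workers)


# Hypothetical invocation, from a script's main guard:
#     run_workers("/mnt/cifs", workers=4, seconds=600.0)
```

Separate processes are used rather than threads so that the stat calls can actually run concurrently on different CPUs, which matches the "2 processes on an SMP server" condition in the steps.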
  
Actual results:
The system locks up and requires a manual reboot (not even sysrq responds on the
console).

Expected results:
No lockup.

Additional info:
On the Samba server, the smb process is still up and still connected to the
defunct server, and it still holds an exclusive oplock on the file that was
being stat'ed.

The TCP/IP stack of the defunct server seems to be operative, but no process
that requires a fork/exec works (rsync/sshd), and TCP connections to those
services hang after the 3-way handshake is completed.
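
The "handshake completes, then nothing" symptom can be told apart from a plain network outage with a small probe. The sketch below is an assumption-laden illustration, not a tool used in the report: the probe function and its return labels are hypothetical, and it assumes the service sends data first after connecting (as sshd does with its version banner).

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a TCP service as seen from a remote client.

    "no_connection":  the TCP connect itself failed or timed out.
    "handshake_only": the connect succeeded but no data arrived in time,
                      which is what the report describes for sshd.
    "banner":         the service sent data, so userspace is responding.
    "closed":         the peer closed the connection without sending data.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "no_connection"
    with sock:
        try:
            data = sock.recv(64)
        except socket.timeout:
            return "handshake_only"
        except OSError:
            return "no_connection"
    return "banner" if data else "closed"


# Hypothetical check against a suspect machine:
#     probe("suspect-host.example.com", 22)
```

A "handshake_only" result alongside working ping is consistent with the kernel still accepting connections while the listening process can no longer make progress.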

Probably already fixed upstream in cifs version 1.48, as shown by the linked
bug and by the commit linked in the URL field.

Comment 1 Jeff Layton 2007-07-10 12:01:44 UTC

There's a pending update of the CIFS code already slated for inclusion into 5.1.
If you have a non-critical place to do so, could you test the kernels on my
people page and see if they help this issue:

http://people.redhat.com/jlayton

Comment 2 Jeff Layton 2008-01-15 11:24:07 UTC

No response to my query from July. I believe this is already fixed in the 5.1
release. Please reopen if that is not the case.
