Bug 247464 - soft lockup in cifs mountpoint leads to unresponsive server
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Priority: low
Severity: high
Assigned To: Jeff Layton
QA Contact: Martin Jenner
URL: http://git.kernel.org/?p=linux/kernel...
Reported: 2007-07-09 10:23 EDT by Arenas Belon, Carlo Marcelo
Modified: 2008-01-15 06:24 EST

Fixed In Version: RHEL5.1
Doc Type: Bug Fix
Last Closed: 2008-01-15 06:24:07 EST
External Trackers: Linux Kernel 7903

Description Arenas Belon, Carlo Marcelo 2007-07-09 10:23:05 EDT
Description of problem:
A soft lockup occurs in the cifs module when doing "stat", as shown by:

kernel: BUG: soft lockup detected on CPU#1!
kernel: [<c0447f3f>] softlockup_tick+0x98/0xa6
kernel: [<c042d138>] update_process_times+0x39/0x5c
kernel: [<c04176f0>] smp_apic_timer_interrupt+0x5c/0x64
kernel: [<c04049bf>] apic_timer_interrupt+0x1f/0x24
kernel: [<c0470f5e>] generic_fillattr+0x76/0xa4
kernel: [<f8d9cce8>] cifs_getattr+0x1e/0x2b [cifs]
kernel: [<f8d9ccca>] cifs_getattr+0x0/0x2b [cifs]
kernel: [<c047132e>] vfs_getattr+0x40/0x9b
kernel: [<c0471445>] vfs_stat_fd+0x2a/0x3c
kernel: [<c04714e4>] sys_stat64+0xf/0x23
kernel: [<c04dec11>] copy_to_user+0x31/0x48
kernel: [<c0403eff>] syscall_call+0x7/0xb
kernel: ======================= 

Version-Release number of selected component (if applicable):
2.6.18-8.1.6.el5PAE

How reproducible:
Regularly (4 cases reported in the last 2 days across a population of 336 machines)

Steps to Reproduce:
1. Create a read-only mount point to a Samba server.
2. On an SMP server with a PAE-enabled kernel, run 2 or more processes (more is
better) that stat files in that mount point.
3. Wait for the system to lock up (ping still works, but ssh no longer does).
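
The stat load in step 2 could be generated with something like the following. This is a minimal sketch and not the workload from the original report: the script, its function names, and the example path /mnt/cifs_share are all hypothetical, and it assumes a current Python 3 on the client. It only issues stat() calls from several processes at once against files under a directory you pass in.

```python
import multiprocessing
import os
import sys


def list_files(root):
    """Walk `root` once and return the paths of all files found under it."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return paths


def stat_loop(paths, iterations):
    """stat() every path `iterations` times; return the count of successful calls."""
    ok = 0
    for _ in range(iterations):
        for path in paths:
            try:
                os.stat(path)
                ok += 1
            except OSError:
                # A file may disappear or the server may return an error; keep going.
                pass
    return ok


def run(root, workers=4, iterations=1000):
    """Start `workers` processes that all stat the same file list concurrently."""
    paths = list_files(root)
    with multiprocessing.Pool(workers) as pool:
        results = pool.starmap(stat_loop, [(paths, iterations)] * workers)
    return sum(results)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python stat_stress.py /mnt/cifs_share [workers]
    # /mnt/cifs_share stands in for the read-only CIFS mount point from step 1.
    root = sys.argv[1]
    workers = int(sys.argv[2]) if len(sys.argv) > 2 else 4
    print("successful stat calls:", run(root, workers=workers))
```

Separate processes are used rather than threads because the report describes multiple processes on an SMP machine. Only run this on a non-critical box, since the expected failure mode is a machine that needs a manual reboot.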
  
Actual results:
The system locks up and requires a manual reboot (not even sysrq responds on
the console).

Expected results:
No lockup.

Additional info:
On the Samba server, the smb process is still up and still shows the connection
from the locked-up server, including an exclusive oplock on the file that was
being stat'ed.

The TCP/IP stack of the locked-up server appears to be operational, but no
process that requires a fork/exec (rsync, sshd) works, and TCP connections hang
after the three-way handshake completes.

This is probably already fixed upstream in cifs version 1.48, as shown by the
linked bug and the commit linked in the URL field.
Comment 1 Jeff Layton 2007-07-10 08:01:44 EDT
There's a pending update of the CIFS code already slated for inclusion into 5.1.
If you have a non-critical place to do so, could you test the kernels on my
people page and see if they help this issue:

http://people.redhat.com/jlayton
Comment 2 Jeff Layton 2008-01-15 06:24:07 EST
No response to my query from July. I believe this is already fixed in the 5.1
release. Please reopen if that is not the case.
