Bug 444653 - oops in free_uid when using smbd
Status: CLOSED DUPLICATE of bug 441282
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.5.z
Hardware: All Linux
Priority: urgent Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Michal Schmidt
QA Contact: Martin Jenner
Depends On:
Blocks: 391511 461297
Reported: 2008-04-29 15:26 EDT by Mike Snitzer
Modified: 2008-09-22 13:19 EDT (History)

Doc Type: Bug Fix
Last Closed: 2008-09-22 13:19:45 EDT

Description Mike Snitzer 2008-04-29 15:26:23 EDT
Description of problem:
When smbd is under heavy load, the RHEL4.5 kernel (and likely all current RHEL
kernels, RHEL5 included) will eventually hit a race that causes free_uid to
dereference a NULL pointer.

Version-Release number of selected component (if applicable):
2.6.9-55.0.12.ELsmp

How reproducible:
Seen in production environments that make heavy use of smbd; the issue has hit
4 times in the past 2 weeks.

Steps to Reproduce:
1. Run the RHEL4.5 kernel.
2. Put heavy load on samba with many users.
3. Eventually you'll lose this race.
  
Actual results:
Unable to handle kernel paging request at 0000000000100108 RIP: 
<ffffffff801411f7>{free_uid+45}
...
Process smbd (pid: 2227, threadinfo 0000010038b72000, task 0000010087814030)
Stack: 0000000000000000 0000000000000002 0000010237e066e8 ffffffff801419c9 
       0000000000000000 0000010038b73e78 0000010087814030 0000010087814708 
       0000010038b73f58 ffffffff80141a7e 
Call Trace:<ffffffff801419c9>{__dequeue_signal+347}
           <ffffffff80141a7e>{dequeue_signal+58} 
           <ffffffff801435ca>{get_signal_to_deliver+338}
           <ffffffff8010f6fb>{do_signal+131} 
           <ffffffff8030c8f6>{thread_return+88}
           <ffffffff801102f3>{sysret_signal+28} 
           <ffffffff801105df>{ptregscall_common+103} 

Code: 48 89 50 08 48 89 02 48 c7 41 08 00 02 20 00 48 8b 7b 38 48 
RIP <ffffffff801411f7>{free_uid+45} RSP <0000010038b73d98>
CR2: 0000000000100108
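
The faulting address in CR2 looks more like a list-poison value than a NULL
pointer. The following is a minimal sketch of that reading, not a confirmed
diagnosis. It assumes the LIST_POISON1/LIST_POISON2 constants from
include/linux/list.h (0x00100100 and 0x00200200) and 8-byte pointers on x86_64.
The helper name classify_fault is hypothetical and exists only for this
illustration.

```python
# Poison values that list_del() writes into a removed list_head
# (assumed from include/linux/list.h in 2.6.x kernels).
LIST_POISON1 = 0x00100100  # stored in entry->next
LIST_POISON2 = 0x00200200  # stored in entry->prev
POINTER_SIZE = 8           # x86_64: offset of 'prev' inside struct list_head


def classify_fault(cr2: int) -> str:
    """Hypothetical helper: guess what a faulting address was aimed at."""
    if cr2 == LIST_POISON1 + POINTER_SIZE:
        # list_del() does next->prev = prev first; with next == LIST_POISON1
        # that store lands at LIST_POISON1 + 8.
        return "store to next->prev with next == LIST_POISON1"
    if cr2 == LIST_POISON2:
        return "store to prev->next with prev == LIST_POISON2"
    if cr2 < 0x1000:
        return "NULL pointer dereference"
    return "unknown"


cr2 = int("0000000000100108", 16)  # CR2 value from the oops above
print(hex(cr2 - LIST_POISON1))     # prints 0x8
print(classify_fault(cr2))
```

If this reading is right, free_uid ran list_del on an entry that had already
been removed from its list, which would fit a reference-counting race rather
than a NULL dereference. The "00 02 20 00" bytes in the Code: line match
0x00200200 in little-endian order, which points the same way.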

Expected results:
No NULL pointer.

Additional info:
Linus Torvalds fixed the issue upstream in 2.6.19-rc4:
http://lkml.org/lkml/2006/11/4/45
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=45c18b0

RHEL5 also doesn't have this fix and it should.
Comment 1 Mike Snitzer 2008-04-29 15:31:48 EDT
I mistakenly said "NULL pointer" in a couple of places where I should've said "Oops".
Comment 2 Issue Tracker 2008-07-15 12:41:59 EDT
(just triaging)

99% sure this is the same problem reported in IT 173279 / BZ 441282.  A
hotfix kernel was released for this issue last week and the 4.7 kernel
will also have the fix.

Hotfix # is 2756.

--vince


This event sent from IssueTracker by vincew 
 issue 191745
Comment 5 RHEL Product and Program Management 2008-09-03 08:56:13 EDT
Updating PM score.
Comment 6 Peter Martuccelli 2008-09-22 13:19:45 EDT

*** This bug has been marked as a duplicate of bug 441282 ***
