476094 – do_page_fault causes kernel OOPS and crash dump

Bug 476094 - do_page_fault causes kernel OOPS and crash dump

Summary: do_page_fault causes kernel OOPS and crash dump

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.2
Hardware:	i686
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Danny Feng
QA Contact:	Red Hat Kernel QE team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2008-12-11 21:47 UTC by Peter Koczan
Modified:	2009-08-19 02:34 UTC (History)
CC List:	1 user (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2009-08-19 02:34:46 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
The dmesg output from a crashdump'd host (100.80 KB, text/plain) 2009-08-17 15:35 UTC, Peter Koczan	no flags	Details
The vmcore from a recent crashdump (2 GB) (54 bytes, text/plain) 2009-08-18 18:16 UTC, Peter Koczan	no flags	Details
View All

Description Peter Koczan 2008-12-11 21:47:56 UTC

Description of problem:
There is a kernel OOPS and crash dump in certain processes during I/O operations. There is little in common except the top of the backtrace, which routinely shows errors being thrown in do_page_fault.

Version-Release number of selected component (if applicable):
2.6.18-92.1.17.el5 stock Linux kernel (version of crash is 4.0-5.0.3)

How reproducible:
Not very. It has happened during a backup process and during an rsync, so it appears to happen during times of high I/O usage. There's no apparent regularity and there's no high load otherwise on the machine, and there's no one module which appears to be tickling the bug.

Actual results:
Kernel OOPS and crash dump.

Expected results:
No kernel OOPS or crash dump.

Additional info:
Here's some basic info using the crash dump utility (just the initial crash output and a backtrace) from two such dumps which happened today.

(rsync):
[root@aardvark scratch]# crash /boot/System.map-2.6.18-92.1.17.el5 /usr/lib/debug/lib/modules/2.6.18-92.1.17.el5/vmlinux /scratch/2008-12-11-13\:17/vmcore 

...

  SYSTEM MAP: /boot/System.map-2.6.18-92.1.17.el5                      
DEBUG KERNEL: /usr/lib/debug/lib/modules/2.6.18-92.1.17.el5/vmlinux (2.6.18-92.1.17.el5)
    DUMPFILE: /scratch/2008-12-11-13:17/vmcore
        CPUS: 2
        DATE: Thu Dec 11 13:16:33 2008
      UPTIME: 3 days, 22:27:13
LOAD AVERAGE: 0.25, 0.37, 0.29
       TASKS: 97
    NODENAME: aardvark.cs.wisc.edu
     RELEASE: 2.6.18-92.1.17.el5
     VERSION: #1 SMP Wed Oct 22 04:19:07 EDT 2008
     MACHINE: i686  (2992 Mhz)
      MEMORY: 1 GB
       PANIC: "Oops: 0000 [#1]" (check log for details)
         PID: 5358
     COMMAND: "rsync"
        TASK: dd2c4000  [THREAD_INFO: dad60000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 5358   TASK: dd2c4000  CPU: 0   COMMAND: "rsync"
 #0 [dad60b80] crash_kexec at c0444202
 #1 [dad60bc4] die at c04064ae
 #2 [dad60bf4] do_page_fault at c060ac56
 #3 [dad60c2c] error_code (via page_fault) at c0405a6f
    EAX: 0000003e  EBX: 30363773  ECX: 7776e92c  EDX: 00000b12  EBP: 00000007 
    DS:  007b      ESI: dad60cd0  ES:  007b      EDI: dad60d2c
    CS:  0060      EIP: f961b9ec  ERR: ffffffff  EFLAGS: 00010202 
 #4 [dad60c60] afs_FindVCache at f961b9ec
 #5 [dad60c74] afs_GetVCache at f961da12
 #6 [dad60ce4] afs_lookup at f962716e
 #7 [dad60d7c] afs_linux_lookup at f964a677
 #8 [dad60dec] do_lookup at c047c33c
 #9 [dad60e10] __link_path_walk at c047e0f0
#10 [dad60e64] link_path_walk at c047e5f2
#11 [dad60ecc] do_path_lookup at c047e9bf
#12 [dad60ee8] __user_walk_fd at c047f103
#13 [dad60f00] vfs_lstat_fd at c0478cba
#14 [dad60f64] sys_lstat64 at c0478d26
#15 [dad60fb8] system_call at c0404ef8
    EAX: 000000c4  EBX: bfbe5470  ECX: bfbe5390  EDX: 00715ff4 
    DS:  007b      ESI: bfbe5390  ES:  007b      EDI: 0000000a
    SS:  007b      ESP: bfbe30f0  EBP: bfbe30f8
    CS:  0073      EIP: 00825402  ERR: 000000c4  EFLAGS: 00000212 

(sushiremote, our backups process):
[root@bombus-vestalis crash]# crash /boot/System.map-2.6.18-92.1.17.el5 /usr/lib/debug/lib/modules/2.6.18-92.1.17.el5/vmlinux 2008-12-11-03\:09/vmcore 

...

  SYSTEM MAP: /boot/System.map-2.6.18-92.1.17.el5                      
DEBUG KERNEL: /usr/lib/debug/lib/modules/2.6.18-92.1.17.el5/vmlinux (2.6.18-92.1.17.el5)
    DUMPFILE: 2008-12-11-03:09/vmcore
        CPUS: 2
        DATE: Thu Dec 11 03:08:43 2008
      UPTIME: 17:03:33
LOAD AVERAGE: 4.53, 1.47, 0.57
       TASKS: 115
    NODENAME: bombus-vestalis.cs.wisc.edu
     RELEASE: 2.6.18-92.1.17.el5
     VERSION: #1 SMP Wed Oct 22 04:19:07 EDT 2008
     MACHINE: i686  (3400 Mhz)
      MEMORY: 1 GB
       PANIC: "Oops: 0000 [#1]" (check log for details)
         PID: 12881
     COMMAND: "sushiremote"
        TASK: dcd41000  [THREAD_INFO: ec7f0000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 12881  TASK: dcd41000  CPU: 0   COMMAND: "sushiremote"
 #0 [ec7f0eb8] crash_kexec at c0444202
 #1 [ec7f0efc] die at c04064ae
 #2 [ec7f0f2c] do_page_fault at c060ac56
 #3 [ec7f0f64] error_code (via page_fault) at c0405a6f
    EAX: 000001b6  EBX: 00008241  ECX: 000001a4  EDX: 000000e8  EBP: 00000004 
    DS:  007b      ESI: ffffff9c  ES:  007b      EDI: 00000003
    CS:  0060      EIP: c046fb67  ERR: ffffffff  EFLAGS: 00010207 
 #4 [ec7f0f98] do_sys_open at c046fb67
 #5 [ec7f0fb0] sys_open at c046fbe9
 #6 [ec7f0fb8] system_call at c0404ef8
    EAX: 00000005  EBX: 0967a0e0  ECX: 00008241  EDX: 000001b6 
    DS:  007b      ESI: 00008241  ES:  007b      EDI: 0029dff4
    SS:  007b      ESP: bfae9a78  EBP: bfae9a98
    CS:  0073      EIP: 00b15402  ERR: 00000005  EFLAGS: 00000206

Comment 1 Peter Koczan 2008-12-15 19:22:22 UTC

We've had about 10 crash dumps in the past two days. They all show the OOPS error and do_page_fault as #2 on the backtrace. Some of them crashed when running vi and vim.

We've also had quite a few kernel panics in this same time. Unfortunately these hosts don't crash dump, but I suspect a similar problem. This might be a bigger problem than I initially thought.

Comment 2 Peter Koczan 2008-12-18 22:19:17 UTC

One of my co-workers sent in a bug report to the OpenAFS folks. I can't say for certain, but it might be related to this.

http://rt.central.org/rt/Ticket/Display.html?id=123995

A few more have crashed in the past couple days. vmcores are available upon request.

Comment 3 Peter Koczan 2009-02-11 19:06:05 UTC

Point of interest, this is still happening on the RHEL 5.3 kernel (2.6.18-128.el5)

Comment 4 Danny Feng 2009-08-11 04:29:57 UTC

Could you please attach the vmcore and the full dmesg log?

Comment 5 Peter Koczan 2009-08-17 15:35:43 UTC

Created attachment 357665 [details]
The dmesg output from a crashdump'd host

Comment 6 Peter Koczan 2009-08-17 21:19:14 UTC

I am unable to upload the 2 GB vmcore file. Please advise.

Comment 7 Peter Koczan 2009-08-18 18:16:53 UTC

Created attachment 357837 [details]
The vmcore from a recent crashdump (2 GB)

The kernel is kernel 2.6.18-128.2.1.el5 on an i686, but this has happened for several kernels.

Comment 8 Dave Anderson 2009-08-18 21:14:26 UTC

Because Danny is in China, he won't be seeing this for several more
hours, so I took the liberty of looking at the vmcore.
  
There's a bunch of AFS-related soft lockup error messages, which in all
probability have something to do with this eventual page fault BUG:

  BUG: unable to handle kernel NULL pointer dereference at virtual address 0000000b

The crash occurred in do_sys_open():
  
  crash> bt
  PID: 22328  TASK: dbdeb000  CPU: 0   COMMAND: "sushiremote"
   #0 [eb62beb8] crash_kexec at c0442d4e
   #1 [eb62befc] die at c04064c6
   #2 [eb62bf2c] do_page_fault at c0611537
   #3 [eb62bf64] error_code (via page_fault) at c0405a87
      EAX: 000001b6  EBX: 00008241  ECX: 000001b6  EDX: 0000007c  EBP: 00000006 
      DS:  007b      ESI: ffffff9c  ES:  007b      EDI: 00000003
      CS:  0060      EIP: c04712df  ERR: ffffffff  EFLAGS: 00010207 
   #4 [eb62bf98] do_sys_open at c04712df
   #5 [eb62bfb0] sys_open at c0471361
   #6 [eb62bfb8] system_call at c0404f10
      EAX: 00000005  EBX: 0a148548  ECX: 00008241  EDX: 000001b6 
      DS:  007b      ESI: 00008241  ES:  007b      EDI: 00379ff4
      SS:  007b      ESP: bf8bc8c8  EBP: bf8bc8e8
      CS:  0073      EIP: 00a2d402  ERR: 00000005  EFLAGS: 00000206 
  crash>

at address c04712df:

  crash> dis -rl c04712df
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1156
  0xc047128b <do_sys_open>:       push   %ebp
  0xc047128c <do_sys_open+1>:     push   %edi
  0xc047128d <do_sys_open+2>:     push   %esi
  0xc047128e <do_sys_open+3>:     mov    %eax,%esi
  0xc0471290 <do_sys_open+5>:     push   %ebx
  0xc0471291 <do_sys_open+6>:     mov    %edx,%eax
  0xc0471293 <do_sys_open+8>:     sub    $0x4,%esp
  0xc0471296 <do_sys_open+11>:    mov    %ecx,%ebx
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1157
  0xc0471298 <do_sys_open+13>:    call   0xc047f47e <getname>
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1160
  0xc047129d <do_sys_open+18>:    cmp    $0xfffff000,%eax
  0xc04712a2 <do_sys_open+23>:    mov    %eax,%ebp
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1157
  0xc04712a4 <do_sys_open+25>:    mov    %eax,(%esp)
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1160
  0xc04712a7 <do_sys_open+28>:    ja     0xc0471331 <do_sys_open+166>
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1161
  0xc04712ad <do_sys_open+34>:    call   0xc0470f87 <get_unused_fd>
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1162
  0xc04712b2 <do_sys_open+39>:    test   %eax,%eax
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1161
  0xc04712b4 <do_sys_open+41>:    mov    %eax,%ebp
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1162
  0xc04712b6 <do_sys_open+43>:    js     0xc0471329 <do_sys_open+158>
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1163
  0xc04712b8 <do_sys_open+45>:    pushl  0x18(%esp)
  0xc04712bc <do_sys_open+49>:    mov    %ebx,%ecx
  0xc04712be <do_sys_open+51>:    mov    0x4(%esp),%edx
  0xc04712c2 <do_sys_open+55>:    mov    %esi,%eax
  0xc04712c4 <do_sys_open+57>:    call   0xc047125a <do_filp_open>
  0xc04712c9 <do_sys_open+62>:    mov    %eax,%edi
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1164
  0xc04712cb <do_sys_open+64>:    cmp    $0xfffff000,%edi
  0xc04712d1 <do_sys_open+70>:    pop    %eax
  0xc04712d2 <do_sys_open+71>:    jbe    0xc04712df <do_sys_open+84>
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1165
  0xc04712d4 <do_sys_open+73>:    mov    %ebp,%eax
  0xc04712d6 <do_sys_open+75>:    mov    %edi,%ebp
  0xc04712d8 <do_sys_open+77>:    call   0xc0470ea5 <put_unused_fd>
  0xc04712dd <do_sys_open+82>:    jmp    0xc0471329 <do_sys_open+158>
  /usr/src/debug/kernel-2.6.18/linux-2.6.18.i686/fs/open.c: 1168
  0xc04712df <do_sys_open+84>:    mov    0x8(%edi),%eax
  crash> 
    
where the call to do_filp_open() returned a value of 00000003,
as seen in EDI, and therefore the IS_ERR() macro presumed it was
a valid file pointer.   do_filp_open() should only return either:

 (1) a valid kernel virtual address of a file structure, or
 (2) a negative errno value.

Those two types of return values are determined by the IS_ERR()
macro.  

  #define MAX_ERRNO       4095
  
  #define IS_ERR_VALUE(x) unlikely((x) >= (unsigned long)-MAX_ERRNO)
  
  static inline long IS_ERR(const void *ptr)
  {
          return IS_ERR_VALUE((unsigned long)ptr);
  }
  
If the return value is greater-than-or-equal-to fffff001 (-4095)
it's considered an error, otherwise it's accepted as legitimate
pointer value.  So a bogus return value of 00000003 will be accepted
as a kernel virtual address.  The page fault occurred when the
00000003 was subsequently used as a file pointer address, when the
call to fsnotify_open() tried to dereference "f->f_dentry", which
is the bad address of 0000000b, or 00000003 + the f_dentry offset
of 8:
  
  long do_sys_open(int dfd, const char __user *filename, int flags, int mode)
  {
          char *tmp = getname(filename);
          int fd = PTR_ERR(tmp);
  
          if (!IS_ERR(tmp)) {
                  fd = get_unused_fd();
                  if (fd >= 0) {
                          struct file *f = do_filp_open(dfd, tmp, flags, mode);
                          if (IS_ERR(f)) {
                                  put_unused_fd(fd);
                                  fd = PTR_ERR(f);
                          } else {
                                  fsnotify_open(f->f_dentry);
                                  fd_install(fd, f);
                          }
                  }
                  putname(tmp);
          }
          return fd;
  }
  
So the question is: how was it possible that do_filp_open() returned
something other than:

 (1) a negative errno value, or
 (2) a legitimate file pointer?
  
Looking at the possible return values from do_filp_open():

  static struct file *do_filp_open(int dfd, const char *filename, int flags,
                                   int mode)
  {
          int namei_flags, error;
          struct nameidata nd;
  
          namei_flags = flags;
          if ((namei_flags+1) & O_ACCMODE)
                  namei_flags++;
  
          error = open_namei(dfd, filename, namei_flags, mode, &nd);
          if (!error)
                  return nameidata_to_filp(&nd, flags);
  
          return ERR_PTR(error);
  }
  
open_namei() can only return 0 or a negative error number.  So if
it did return an error, then do_filp_open() would just pass it back
unchanged, and the IS_ERR() would catch it correctly.  So open_namei()
must have returned a 0 (ok), and nameidata_to_filp() must have been
called -- and in turn must have returned the bogus 00000003:
  
  struct file *nameidata_to_filp(struct nameidata *nd, int flags)
  {
          struct file *filp;
  
          /* Pick up the filp from the open intent */
          filp = nd->intent.open.file;
          /* Has the filesystem initialised the file for us? */
          if (filp->f_dentry == NULL)
                  filp = __dentry_open(nd->dentry, nd->mnt, flags, filp, NULL);
          else
                  path_release(nd);
          return filp;
  }

So the "filp" pointer above taken from nd->intent.open.file must 
have been an accessible kernel address because it was read OK, but 
it must have had a NULL f_dentry pointer.  So __dentry_open() must 
have been called, and *it* must have returned the 00000003 back
to filp.
  
Note that the original filp is passed to __dentry_open() above, and 
if it were to be returned by __dentry_open(), it would get the same 
address back, i.e. from argument/pointer "f" below:

  static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
  					int flags, struct file *f,
  					int (*open)(struct inode *, struct file *))
  {
  	struct inode *inode;
  	int error;
  
  	f->f_flags = flags;
  	f->f_mode = ((flags+1) & O_ACCMODE) | FMODE_LSEEK |
  				FMODE_PREAD | FMODE_PWRITE;
  	inode = dentry->d_inode;
  	if (f->f_mode & FMODE_WRITE) {
  		error = get_write_access(inode);
  		if (error)
  			goto cleanup_file;
  	}
  
  	f->f_mapping = inode->i_mapping;
  	f->f_dentry = dentry;
  	f->f_vfsmnt = mnt;
  	f->f_pos = 0;
  	f->f_op = fops_get(inode->i_fop);
  	file_move(f, &inode->i_sb->s_files);
  
  	if (!open && f->f_op)
  		open = f->f_op->open;
  	if (open) {
  		error = open(inode, f);
  		if (error)
  			goto cleanup_all;
  	}
  
  	f->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
  
  	file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);
  
  	/* NB: we're sure to have correct a_ops only after f_op->open */
  	if (f->f_flags & O_DIRECT) {
  		if (!f->f_mapping->a_ops ||
  		    ((!f->f_mapping->a_ops->direct_IO) &&
  		    (!f->f_mapping->a_ops->get_xip_page))) {
  			fput(f);
  			f = ERR_PTR(-EINVAL);
  		}
  	}
  
  	return f;
  
  cleanup_all:
  	fops_put(f->f_op);
  	if (f->f_mode & FMODE_WRITE)
  		put_write_access(inode);
  	file_kill(f);
  	f->f_dentry = NULL;
  	f->f_vfsmnt = NULL;
  cleanup_file:
  	put_filp(f);
  	dput(dentry);
  	mntput(mnt);
  	return ERR_PTR(error);
  }

So __dentry_open() must have passed through the "cleanup_all" or
"cleanup_file" labels, where ERR_PTR(error) seems to have returned
the 00000003.  But 00000003 cannot have come from get_write_access(), 
because that function can only return 0 or -ETXTBSY.  So that leaves
the filesystem-dependent "open" call, because the calling function
nameidata_to_filp() had passed in a NULL "open" argument:

        if (!open && f->f_op)
                open = f->f_op->open;
        if (open) {
                error = open(inode, f);
                if (error)
                        goto cleanup_all;
        }

And here's where the AFS filesystem comes into play.  The nameidata
structure was a stack variable and it got overwritten.  But we can
probably deduce that the file being opened was in the afs filesystem:
  
  crash> bt
  PID: 22328  TASK: dbdeb000  CPU: 0   COMMAND: "sushiremote"
   #0 [eb62beb8] crash_kexec at c0442d4e
   #1 [eb62befc] die at c04064c6
   #2 [eb62bf2c] do_page_fault at c0611537
   #3 [eb62bf64] error_code (via page_fault) at c0405a87
      EAX: 000001b6  EBX: 00008241  ECX: 000001b6  EDX: 0000007c  EBP: 00000006 
      DS:  007b      ESI: ffffff9c  ES:  007b      EDI: 00000003
      CS:  0060      EIP: c04712df  ERR: ffffffff  EFLAGS: 00010207 
   #4 [eb62bf98] do_sys_open at c04712df
   #5 [eb62bfb0] sys_open at c0471361
   #6 [eb62bfb8] system_call at c0404f10
      EAX: 00000005  EBX: 0a148548  ECX: 00008241  EDX: 000001b6 
      DS:  007b      ESI: 00008241  ES:  007b      EDI: 00379ff4
      SS:  007b      ESP: bf8bc8c8  EBP: bf8bc8e8
      CS:  0073      EIP: 00a2d402  ERR: 00000005  EFLAGS: 00000206 
  crash>

The filename being opened is in the task's userspace, and was passed
into the kernel in the EBX register:

  crash> rd -u 0a148548 50
   a148548:  732f732f 69687375 756f6d2f 6f70746e   /s/sushi/mountpo
   a148558:  73746e69 762e622f 312e6c6f 32303532   ints/b.vol.12502
   a148568:  35353932 30322f33 302e3930 35312e38   29553/2009.08.15
   a148578:  7366413a 632e703a 6f646e6f 77732e72   :Afs:p.condor.sw
   a148588:  77656e2d 3030322f 31383039 38312e35   -new/20090815.18
   a148598:  39323232 6474732f 00727265 007a672e   2229/stderr..gz.
   a1485a8:  00230000 00000071 00000000 00000000   ..#.q...........
   a1485b8:  00000000 00000000 00000000 00000000   ................
   a1485c8:  00000000 00000000 093f6164 0937643c   ........da?.<d7.
   a1485d8:  0937643c 00000000 00000000 00000000   <d7.............
   a1485e8:  00000000 00000000 0000003c 00000000   ........<.......
   a1485f8:  00000000 00000000 00000000 00000000   ................
   a148608:  00000000 00000000                     ........
  crash>
  
I'm guessing that "/s" is perhaps linked to "/afs/cs.wisc.edu/s",
given that the task has another file opened in similarly-named 
directory tree:
  
  crash> files
  PID: 22328  TASK: dbdeb000  CPU: 0   COMMAND: "sushiremote"
  ROOT: /    CWD: /afs/cs.wisc.edu/u/s/u/sushi
   FD    FILE     DENTRY    INODE    TYPE  PATH
    0  f787b0c0  d893e2b0  de8cb6e8  SOCK  socket:/[1448774]
    1  f787b0c0  d893e2b0  de8cb6e8  SOCK  socket:/[1448774]
    2  ddbc22c0  d893e118  cd45e0e8  SOCK  socket:/[1448776]
    3  cb522a80  d9fe4d50  e96bbb28  SOCK  socket:/[1449031]
    4  f7080980  e4a14668  d8111040  REG   /afs/cs.wisc.edu/s/sushi-0.3.16/mountpoints/b.vol.1250229553/2009.08.15:Afs:p.condor.sw-new/20090815.182229/data0.gz
    5  d3212680  e1a48bb8  e12c8e50  PIPE  
    7  e9f3f200  e444d1a0  ca137370  PIPE  
  crash> 

And taking the "data0.gz" file structure's file struct at f7080980,
its f_op.f_open leads to afs_linux_open(): 
    
  crash> struct file.f_op f7080980
  f_op = 0xf95e0460, 
  crash> sym 0xf95e0460
  f95e0460 (d) afs_file_fops
  crash> file_operations.open f95e0460
  open = 0xf95bbca7, 
  crash> sym 0xf95bbca7
  f95bbca7 (t) afs_linux_open  
  crash>       

So it appears that afs_linux_open() returned the 00000003, but there's
no way we can tell given that it's a third-party module.

Danny, can you verify my logic above?

Comment 9 Danny Feng 2009-08-19 02:34:46 UTC

Thanks Dave for the excellent analysis, I totally agree with you. This is a known security issue in OpenAFS, please refer http://www.openafs.org/security/OPENAFS-SA-2009-002.txt and take the patch for openAFS

Since this is an openAFS issue, not kernel bugs, close the bug for rhel.

Note You need to log in before you can comment on or make changes to this bug.