Hi,

Description of problem:
Execution of a particular program from the Arachne suite reliably causes a kernel panic due to a NULL-pointer dereference in nfs4_proc_lock().

Version-Release number of selected component (if applicable):
2.6.18-164.2.1.el5

How reproducible:
Always, on NFSv4-mounted directories.

Steps to Reproduce:
1. wget http://www.genoscope.cns.fr/externe/redhat/XMLMissingField
2. Save a copy on an NFSv4-mounted directory
3. Execute it

Actual results:
Kernel panic

Expected results:
No panic

Additional info:
Console output:

Unable to handle kernel NULL pointer dereference at 0000000000000030
RIP: [<ffffffff8837b210>] :nfs:nfs4_proc_lock+0x21f/0x3ad
PGD 1026eec067 PUD 1026f2f067 PMD 0
Oops: 0000 [1] SMP
last sysfs file: /block/dm-1/range
CPU 0
Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl sunrpc bonding ipv6 xfrm_nalgo crypto_api video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac joydev sg shpchp i2c_nforce2 i2c_core forcedeth dm_snapshot dm_zero dm_mod sata_nv libata mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 4070, comm: XMLMissingField Not tainted 2.6.18-164.2.1.el5 #1
RIP: 0010:[<ffffffff8837b210>]  [<ffffffff8837b210>] :nfs:nfs4_proc_lock+0x21f/0x3ad
RSP: 0018:ffff810819bdbdd8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff810827c52088 RSI: 0000000000000006 RDI: ffff810819bdbe38
RBP: ffff81081a6dfdc0 R08: 0000000000000001 R09: ffff810819bdbd68
R10: ffff810819bdbd68 R11: 00000000000000d0 R12: ffff810827c52088
R13: 0000000000000000 R14: ffff810819a9b930 R15: 0000000000000006
FS:  00002b97d31fc7b0(0000) GS:ffffffff803c1000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000030 CR3: 00000010268cb000 CR4: 00000000000006e0
Process XMLMissingField (pid: 4070, threadinfo ffff810819bda000, task ffff810827d6a7e0)
Stack:  00000000000003e8 0000000000000000
 ffff810819a9b930 ffffffff88373e4f 0000000000000000 0000000000000000
 0000000000000000 0000000019a9ba40 ffff810819bdbe18 ffff810819bdbe18
 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff88373e4f>] :nfs:nfs_sync_inode_wait+0x116/0x1db
 [<ffffffff8836a226>] :nfs:do_setlk+0x55/0x8c
 [<ffffffff80039e72>] fcntl_setlk+0x11e/0x273
 [<ffffffff800b66fa>] audit_syscall_entry+0x180/0x1b3
 [<ffffffff8002e5bb>] sys_fcntl+0x269/0x2dc
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Code: 49 8b 45 30 4c 89 e6 4c 89 ef 45 8a 74 24 58 48 8b 40 18 48
RIP  [<ffffffff8837b210>] :nfs:nfs4_proc_lock+0x21f/0x3ad
 RSP <ffff810819bdbdd8>
CR2: 0000000000000030
<0>Kernel panic - not syncing: Fatal exception

Also see http://www.spinics.net/linux/lists/linux-nfs/msg03357.html
Here's what this binary does:

open("/proc/self/exe", O_RDONLY) = 3
fcntl(3, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}

I'll see if I can roll up a more stripped-down reproducer (with source code).
Here's the assembly from the crash:

   169ec: e8 37 f0 ff ff          callq  15a28 <nfs4_wait_for_completion_rpc_task>
   169f1: 48 89 df                mov    %rbx,%rdi
   169f4: 89 c5                   mov    %eax,%ebp
   169f6: e8 00 00 00 00          callq  169fb <nfs4_proc_lock+0x217>
   169fb: 45 88 b4 24 88 00 00    mov    %r14b,0x88(%r12)
   16a02: 00
   16a03: e9 ab 01 00 00          jmpq   16bb3 <nfs4_proc_lock+0x3cf>
   16a08: bd 00 fe ff ff          mov    $0xfffffe00,%ebp
   16a0d: e9 a1 01 00 00          jmpq   16bb3 <nfs4_proc_lock+0x3cf>
   16a12: fc                      cld
   16a13: 48 8d 7c 24 50          lea    0x50(%rsp),%rdi
   16a18: 31 c0                   xor    %eax,%eax
   16a1a: ab                      stos   %eax,%es:(%rdi)
   16a1b: ab                      stos   %eax,%es:(%rdi)
   16a1c: ab                      stos   %eax,%es:(%rdi)
   16a1d: ab                      stos   %eax,%es:(%rdi)
   16a1e: 49 8b 45 30             mov    0x30(%r13),%rax      <<<< CRASH HERE
   16a22: 4c 89 e6                mov    %r12,%rsi
   16a25: 4c 89 ef                mov    %r13,%rdi
   16a28: 45 8a b4 24 88 00 00    mov    0x88(%r12),%r14b
   16a2f: 00
   16a30: 48 8b 40 48             mov    0x48(%rax),%rax
   16a34: 48 89 44 24 08          mov    %rax,0x8(%rsp)
   16a39: e8 00 00 00 00          callq  16a3e <nfs4_proc_lock+0x25a>
   16a3e: 85 c0                   test   %eax,%eax
   16a40: 89 c3                   mov    %eax,%ebx
   16a42: 0f 85 e4 00 00 00       jne    16b2c <nfs4_proc_lock+0x348>
   16a48: 41 80 8c 24 88 00 00    orb    $0x8,0x88(%r12)
   16a4f: 00 08
   16a51: 49 8b bc 24 80 00 00    mov    0x80(%r12),%rdi
   16a58: 00
   16a59: 4c 89 e6                mov    %r12,%rsi
   16a5c: e8 c9 e8 ff ff          callq  1532a <do_vfs_lock>
Created attachment 367124 [details]
reproducer program

This program can reproduce the problem easily... I was hoping that the patch for bug 511278 would fix this too, but that doesn't seem to be the case. Still working on determining the cause; there are a lot of inlined functions in this area, so it's hard to tell exactly where it crashed.
The crash is actually occurring in nfs4_proc_setlk(). It does this:

	do {
		err = nfs4_handle_exception(NFS_SERVER(state->inode),
				_nfs4_proc_setlk(state, cmd, request),
				&exception);
	} while (exception.retry);

...but the problem here is that "state" is NULL, and dereferencing the inode pointer through it causes the crash. So the real problem appears to be in the open code, which is where state should be getting set.
A little bit of printk debugging:

------------------[snip]--------------------
nfs4_intent_set_file: filp=ffff810029c75a70
Call Trace:
 [<ffffffff88578c54>] :nfs:nfs4_intent_set_file+0x3e/0x6f
 [<ffffffff88579e9e>] :nfs:nfs4_open_revalidate+0x8c/0x43f
 [<ffffffff88564a2b>] :nfs:nfs_open_revalidate+0x13d/0x21d
 [<ffffffff8000d8ac>] do_lookup+0x191/0x1e8
 [<ffffffff8000a5bc>] __link_path_walk+0xa1d/0xf62
 [<ffffffff8000f370>] link_path_walk+0x42/0xb2
 [<ffffffff8000d59a>] do_path_lookup+0x279/0x2f6
 [<ffffffff80024a13>] __path_lookup_intent_open+0x56/0x97
 [<ffffffff8003e3ad>] open_exec+0x24/0xc0
 [<ffffffff80041036>] do_execve+0x32/0x1f7
 [<ffffffff80041036>] do_execve+0x32/0x1f7
 [<ffffffff8000cece>] cache_alloc_debugcheck_after+0xef/0x1c1
 [<ffffffff8004104a>] do_execve+0x46/0x1f7
 [<ffffffff8005796a>] sys_execve+0x36/0x4c
 [<ffffffff8006050b>] stub_execve+0x67/0xb0
nfs4_proc_lock: filp=ffff81002a302c20 ctx=ffff810027702b90
nfs4_proc_setlk: state=0000000000000000 file_lock=ffff810038c88838
nfs4_proc_setlk: calling nfs4_handle_exception
Unable to handle kernel NULL pointer dereference at 0000000000000030
RIP: [<ffffffff88576ac8>] :nfs:nfs4_proc_lock+0x286/0x459
PGD 27abe067 PUD 27abd067 PMD 0
------------------[snip]--------------------

One major problem here: the filp passed to nfs4_proc_lock() is different from the one that was set up in the execve call. Possibly a VFS bug where we're not attaching the filp instantiated in the lookup to the fd?
Of course though...that *should* be the case. We're not interested in the filp associated with the execve call, but rather the one associated with the open(). Very confusing reproducer...I'll keep poking at it.
Turns out that the nfs_debug facility in RHEL5 is broken (and has been since GA); some messages don't get printed out. Here's some more output with that fixed:

---------------------[snip]--------------------
NFS: nfs_fhget(0:18/43651 ct=1)
nfs_file_open: i_ino=43651 filp=ffff810029db5380
nfs4_intent_set_file: filp=ffff810029db5380
Call Trace:
 [<ffffffff885790a0>] :nfs:nfs4_intent_set_file+0x3e/0x6f
 [<ffffffff8857ac44>] :nfs:nfs4_atomic_open+0x184/0x1a4
 [<ffffffff800239af>] d_alloc+0x24/0x1d3
 [<ffffffff8856489b>] :nfs:nfs_atomic_lookup+0xb2/0x105
 [<ffffffff8000d802>] do_lookup+0xe7/0x1e8
 [<ffffffff8000a5bc>] __link_path_walk+0xa1d/0xf62
 [<ffffffff8000f370>] link_path_walk+0x42/0xb2
 [<ffffffff8000d59a>] do_path_lookup+0x279/0x2f6
 [<ffffffff80024a13>] __path_lookup_intent_open+0x56/0x97
 [<ffffffff8003e3ad>] open_exec+0x24/0xc0
 [<ffffffff80041036>] do_execve+0x32/0x1f7
 [<ffffffff80041036>] do_execve+0x32/0x1f7
 [<ffffffff8000cece>] cache_alloc_debugcheck_after+0xef/0x1c1
 [<ffffffff8004104a>] do_execve+0x46/0x1f7
 [<ffffffff8005796a>] sys_execve+0x36/0x4c
 [<ffffffff8006050b>] stub_execve+0x67/0xb0
encode_compound: tag=
NFS: permission(0:18/43651), mask=0x1, res=0
nfs: read(/bz529227, 128@0)
NFS: nfs_readpages (0:18/43651 2)
NFS: 0 initiated read call (req 0:18/43651, 7578 bytes @ offset 0)
encode_compound: tag=
nfs_readpage_result: 111, (status 7578)
NFS: read done (0:18/43651 4096@0)
NFS: read done (0:18/43651 3482@4096)
nfs: read(/bz529227, 448@64)
nfs: read(/bz529227, 28@512)
NFS: permission(0:18/43651), mask=0x4, res=0
nfs: mmap(/bz529227)
nfs: mmap(/bz529227)
nfs: flush(0:18/43651)
^^^^ exec call ends...
NFS: permission(0:18/43651), mask=0x4, res=0
nfs_file_open: i_ino=43651 filp=ffff81001ef10d50
^^^^ open call -- no lookup
NFS: nfs_lock(f=0:18/43651, t=0, fl=1, r=0:9223372036854775807)
nfs4_proc_lock: filp=ffff81001ef10d50 ctx=ffff81001ecaf560
nfs4_proc_setlk: state=0000000000000000 file_lock=ffff810038d58268
nfs4_proc_setlk: calling nfs4_handle_exception
Unable to handle kernel NULL pointer dereference at 0000000000000030
RIP:
^^^^ oops because context state is zeroed out
--------------------[snip]-----------------------

I think we need to ensure that the state is set on the open context when a lookup doesn't occur, or somehow force a lookup on the second open here. I'll need to check more recent kernels and see how this works there.
The really odd thing here is that in this situation, the second open doesn't seem to cause d_revalidate to get called. Obviously we need to do a lookup there, and the dentry is probably in the dcache at that point, so the VFS should call d_revalidate. But I never see nfs_open_revalidate get called, even though it's the d_revalidate operation for nfs4. I need to confirm whether I'm doing something wrong, but I don't think that I am.
Ok, I've identified a patch that prevents this oops in rawhide. The downside is that the SETLK call then returns -ENOLCK instead of acquiring a read lock, so I still think there's something not quite right here. What we'll probably need to do is take that patch in the near term, and then look at fixing this the "right way" upstream.
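From userspace, applications that can live without the lock may want to treat that -ENOLCK as a soft failure. A hedged sketch of that pattern (my own code, not part of the patch):

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/* Try to take a whole-file advisory read lock, treating ENOLCK as a
 * soft failure: with the interim kernel patch described above, F_SETLK
 * on NFSv4 can return -ENOLCK instead of granting the lock. */
int try_read_lock(int fd)
{
    struct flock fl = {
        .l_type   = F_RDLCK,
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,
    };
    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 0;                 /* lock acquired */
    if (errno == ENOLCK) {
        /* No lock manager available (e.g. the patched NFSv4 path):
         * carry on without the lock rather than failing hard. */
        fprintf(stderr, "warning: no lock available, continuing\n");
        return 1;
    }
    return -1;                    /* real error (EBADF, EAGAIN, ...) */
}
```

On a local filesystem the lock simply succeeds; the ENOLCK branch only fires in the degraded NFSv4 case.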
I have some kernels on my people.redhat.com page that have a patch for this issue. If you have a non-critical place to do so, could you test them and report whether they fix the issue for you? The kernels are here: http://people.redhat.com/jlayton
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Hi, thanks for looking into this. I have installed your test kernel (2.6.18-172.el5.jtltest.92) and can confirm that the panic no longer happens. From what I've seen, the lock is acquired principally in order to prevent make from overwriting the executable, which in our case is unlikely -- so returning ENOLCK would be OK from our standpoint. The general problem remains, however: have you been able to contact upstream about this?

For information, below is the corresponding source leading to the lock request:

// Place an advisory lock on the current executable file. This will
// go away automatically when the process ends for any reason, as
// will the file descriptor that's needed to make the lock. We use
// the lock to prevent make from overwriting the current executable,
// as that may have disastrous effects on the running process in
// certain circumstances.
void readLockThisExe()
{
    // Note: While this works on LSF and wga*, it does not work on the
    // alphas. There is no such file so the open() will fail and no
    // lock will be placed.
    const char filename[] = "/proc/self/exe";
    int ret;
    int fid = open(filename, O_RDONLY, 0);
    if (fid > 0) {
        ret = lock_reg(fid, F_SETLK, F_RDLCK, 0, SEEK_SET, 0);
        // Note we ignore errors in setting the lock.
    }
}

Simon
Oops -- here is lock_reg, as called in readLockThisExe():

// Helper for setting advisory locks, from Stevens.
int lock_reg(int fd, int cmd, int type, off_t offset, int whence, off_t len)
{
    struct flock lock;

    lock.l_type = type;
    lock.l_start = offset;
    lock.l_whence = whence;
    lock.l_len = len;
    return fcntl(fd, cmd, &lock);
}

Simon
No, I haven't contacted upstream on this yet. I want to do a bit more research and see if I can determine the cause (and maybe generate a patch). In the interim, though, we'll see about getting the fix for the oops into an errata soon.
Upstream commit: http://git.kernel.org/linus/d953126a28f97ec965d23c69fd5795854c048f30
(In reply to comment #3)
> Created an attachment (id=367124) [details]
> reproducer program
>
> This program can reproduce the problem easily...
>
> I was hoping that the patch for 511278 would fix this too, but that doesn't
> seem to be the case. Still working on determining the cause, there are a lot of
> inlined functions in this area so it's hard to tell exactly where it crashed.

To clarify how to use this: build the program, copy it to any NFSv4 mount, and run it.
For guidance on mitigating NULL-pointer-dereference vulnerabilities, see http://kbase.redhat.com/faq/docs/DOC-20536
This issue has been addressed in the following products:

  MRG for RHEL-5

Via RHSA-2009:1635 https://rhn.redhat.com/errata/RHSA-2009-1635.html
We have had a report of a different manifestation of the same bug, this time while unlocking a file with a NULL state.

============================================================================
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024
 printing eip:
c060f3f1
*pde = 15ea2001
Oops: 0002 [#1]
SMP
last sysfs file: /class/net/eth7/address
Modules linked in: linuxshield(U) lshook(U) mptctl autofs4 hidp nfs lockd fscache nfs_acl rfcomm l2cap bluetooth ipmi(U) smbus(U) sunrpc bonding loop dm_multipath scsi_dh video sr_mod cdrom hwmon backlight sbs i2c_ec button battery asus_acpi ac parport_pc lp parport usb_storage sg i2c_i801 i2c_core tg3 pcspkr libphy bnx2 serio_raw dm_snapshot dm_zero dm_mirror dm_log dm_mod megaraid_sas mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU: 0
EIP: 0060:[<c060f3f1>] Tainted: G VLI
EFLAGS: 00010246 (2.6.18-128.2.1.el5PAE #1)
EIP is at _spin_lock+0x0/0xf
eax: 00000024 ebx: f91e3fa0 ecx: d7777f30 edx: d7777f30
esi: 00000000 edi: d7777f30 ebp: 00000000 esp: d7777e90
ds: 007b es: 007b ss: 0068
Process scanner (pid: 7956, ti=d7777000 task=d781a550 task.ti=d7777000)
Stack: f91dfee1 d7777f30 d75880c0 f91e3fa0 d7777f30 d7777f30 00000006 f91d6849
       00000000 000003e8 f28c4d40 f28c4e88 00000000 f91d0206 00000000 00000000
       00000000 00000000 00000000 d7777edc d7777edc f28c4f30 00000000 d7777f30
Call Trace:
 [<f91dfee1>] nfs4_set_lock_state+0x4a/0xe9 [nfs]
 [<f91d6849>] nfs4_proc_lock+0x16d/0x374 [nfs]
 [<f91d0206>] nfs_sync_inode_wait+0xf5/0x1ab [nfs]
 [<f91c75d7>] do_unlk+0x3f/0x58 [nfs]
 [<f91c76f4>] nfs_lock+0x0/0xfb [nfs]
 [<c0483d7b>] vfs_lock_file+0x2d/0x31
 [<c048489e>] locks_remove_posix+0x7d/0x97
 [<c0436ef1>] hrtimer_try_to_cancel+0x3c/0x42
 [<c0427a3c>] do_setitimer+0xe0/0x4b0
 [<c046fb8e>] filp_close+0x47/0x54
 [<c0470dc1>] sys_close+0x71/0xa0
 [<c0404f17>] syscall_call+0x7/0xb
Code: 79 05 f0 ff 02 30 c9 89 c8 c3 89 c2 f0 81 28 00 00 00 01 0f 94 c0 84 c0 b9 01 00 00 00 75 09 f0 81 02 00 00 00 01 30 c9 89 c8 c3 <f0> fe 08 79 09 f3 90 80 38 00 7e f9 eb f2 c3 f0 81 28 00 00 00
EIP: [<c060f3f1>] _spin_lock+0x0/0xf SS:ESP 0068:d7777e90
=============================================================================

The crash happens at:

static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, fl_owner_t owner)
{
	struct nfs4_lock_state *lsp, *new = NULL;

	for(;;) {
		spin_lock(&state->state_lock);   <-- here
		lsp = __nfs4_find_lock_state(state, owner);
	}

The problem is caused by nfs4_proc_lock() passing a NULL state to nfs4_proc_unlck():

static int nfs4_proc_lock(struct file *filp, int cmd, struct file_lock *request)
{
	..
	if (request->fl_type == F_UNLCK)
		return nfs4_proc_unlck(state, cmd, request);
	..
}
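To illustrate the shape of the guard both crash paths need, here is a stubbed-down model of the control flow. This is my own sketch using stand-in types, not the actual upstream diff (see the upstream commit linked earlier for the real fix); it only shows why a single NULL check before the F_UNLCK branch covers both the lock and unlock oopses.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the kernel structures involved; this only models the
 * control flow of nfs4_proc_lock(), not the real fs/nfs/nfs4proc.c. */
struct nfs4_state { int unused; };
struct file_lock  { int fl_type; };

#ifndef F_UNLCK
#define F_UNLCK 2
#endif

/* Sketch of the guard: with a NULL open-context state, bail out with
 * -ENOLCK before either the unlck or the setlk path can dereference
 * it (previously both paths oopsed on state->...). */
static int nfs4_proc_lock_sketch(struct nfs4_state *state,
                                 struct file_lock *request)
{
    if (state == NULL)
        return -ENOLCK;
    if (request->fl_type == F_UNLCK)
        return 0;   /* would call nfs4_proc_unlck(state, cmd, request) */
    return 0;       /* would call nfs4_proc_setlk(state, cmd, request) */
}
```

The cost, as noted in an earlier comment, is that a lock request that used to succeed (or crash) now fails cleanly with -ENOLCK.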
This issue has been addressed in the following products:

  Red Hat Enterprise Linux 5

Via RHSA-2009:1670 https://rhn.redhat.com/errata/RHSA-2009-1670.html
Acknowledgements: Red Hat would like to thank Simon Vallet for responsibly reporting this issue.
This issue has been addressed in the following products:

  Red Hat Enterprise Linux 4

Via RHSA-2010:0474 https://rhn.redhat.com/errata/RHSA-2010-0474.html
Statement: The Linux kernel as shipped with Red Hat Enterprise Linux 3 did not include NFSv4 support and is therefore not affected by this issue. It was addressed in Red Hat Enterprise Linux 4, Red Hat Enterprise Linux 5, and Red Hat Enterprise MRG via https://rhn.redhat.com/errata/RHSA-2010-0474.html, https://rhn.redhat.com/errata/RHSA-2009-1670.html and https://rhn.redhat.com/errata/RHSA-2009-1635.html respectively.