Bug 487255 - (CVE-2009-0835) CVE-2009-0835 kernel: x86-64: seccomp: 32/64 syscall hole
Status: CLOSED ERRATA
Product: Security Response
Classification: Other
Component: vulnerability
Version: unspecified
Hardware: All Linux
Priority: high, Severity: high
Assigned To: Red Hat Product Security
public=20090225,source=full-disclosur...
Keywords: Security
Depends On:
Blocks: 487741
Reported: 2009-02-24 22:25 EST by Eugene Teo (Security Response)
Modified: 2010-04-08 23:28 EDT (History)

Doc Type: Bug Fix
Last Closed: 2010-04-08 23:28:50 EDT


Description Eugene Teo (Security Response) 2009-02-24 22:25:59 EST
Description of problem:
Various security technologies make use of syscall filtering. Such technologies are very powerful because they restrict a compromise not just in terms of access to files, networks and processes -- but also access to the rich kernel API (a great source of ring 0 bugs).

Syscall filtering technologies typically make an allow / deny decision based on the syscall (identified by a number) and sometimes also the exact arguments to the syscall. A vulnerability exists because the syscall is identified by number alone. On 64-bit aware Linux kernels (x86_64), the same syscall number can map to either the 32-bit or the 64-bit syscall table. Since the syscall tables are different for 32-bit vs. 64-bit, and the user space process gets to control which table it hits, the syscall number check can often be fooled.

For example, a syscall filter technology might be monitoring a 64-bit process, and configured to allow some subset of the very common open() syscall. That's syscall number 2 in 64-bit land. However, the monitored process can switch to 32-bit mode and issue syscall 2. That appears to be open() to the monitor but will execute as fork() in the kernel - possibly leading to an unmonitored process.

Here is a sample piece of code which does a 32-bit syscall:

int
main(int argc, const char* argv[])
{
  /* Syscall 1 is exit on i386 but write on x86_64. */
  asm volatile("movl $1, %eax\n"
               "int $0x80\n");
  for (;;);
}

When built 64-bit, and run under strace on my 64-bit machine, the difference in opinion on the syscall is apparent:

write(1, "\370V\355\365\377\177", 140737319359320 <unfinished ... exit status 208>

The monitor sees write() but the kernel sees (and executes!) exit().

Detecting this situation has some subtleties. Which syscall table the kernel hits depends on both the instruction used to trap into the kernel, and also the "long mode" bit in the current descriptor referred to by the code segment (CS) register. int 0x80 and sysenter always cause a 32-bit syscall, but syscall looks at the descriptor. Therefore, to securely monitor a 32-bit process, it is sufficient just to validate that the CS register references a privileged 32-bit descriptor. Unfortunately, to securely monitor a 64-bit process, not only must the CS register be checked, but the instruction initiating the syscall must be checked as well. This involves reading user-space memory, which is of course subject to well-documented lethal race conditions when other processes share a writeable address space.

http://scary.beasts.org/security/CESA-2009-001.html
http://scarybeastsecurity.blogspot.com/2009/01/bypassing-syscall-filtering.html
Comment 14 Eugene Teo (Security Response) 2009-03-01 22:44:39 EST
Programs affected: Fortunately, pretty much no-one uses seccomp.
Severity: Syscall policy violation.

This is a specific follow-on from CESA-2009-001 which noted a generic Linux issue with syscall filtering.

The Linux kernel actually has a built-in syscall filtering technology called "seccomp". It permits a process to limit itself to an extremely small set of syscalls -- read(), write(), exit(), sigreturn(). That's very powerful if not quite generic enough for wide use. Check out prctl(PR_SET_SECCOMP, ...).

The confusion between 32-bit and 64-bit syscall numbers applies in this context too. The impact is very limited because only a small number of syscalls can be abused through this mix-up. [...]

http://scary.beasts.org/security/CESA-2009-004.html
Comment 15 Eugene Teo (Security Response) 2009-03-01 22:51:59 EST
Proposed patches for upstream kernel:
http://lkml.org/lkml/2009/2/27/451 summary
http://lkml.org/lkml/2009/2/27/453 seccomp
http://lkml.org/lkml/2009/2/28/23 seccomp follow-ups
Comment 16 Eugene Teo (Security Response) 2009-03-01 23:02:25 EST
rhel-5 did not set CONFIG_SECCOMP.
Comment 18 Eugene Teo (Security Response) 2009-03-19 00:13:17 EDT
CVSS2 score of low, 3.6 (AV:L/AC:L/Au:N/C:P/I:P/A:N)
Comment 19 errata-xmlrpc 2009-04-29 05:28:26 EDT
This issue has been addressed in the following products:

  MRG for RHEL-5

Via RHSA-2009:0451 https://rhn.redhat.com/errata/RHSA-2009-0451.html
