Bug 824966 - (CVE-2012-2934) CVE-2012-2934 kernel: denial of service due to AMD Erratum #121
Status: NEW
Product: Security Response
Classification: Other
Component: vulnerability
Version: unspecified
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Red Hat Product Security
impact=moderate,public=20120612,repor...
Keywords: Security
Depends On: 824969 824970
Blocks: 813442 815484 858724
Reported: 2012-05-24 13:42 EDT by Petr Matousek
Modified: 2015-11-06 11:30 EST

Doc Type: Bug Fix
Description Petr Matousek 2012-05-24 13:42:10 EDT
On certain older AMD CPUs, sequential execution across the non-canonical address boundary can lock up the CPU.

On Xen, a guest user or administrator of a 64-bit PV guest on a vulnerable system can cause the processor to lock up, resulting in a denial of service against the host.

For Red Hat Enterprise Linux guests, only privileged guest users can exploit this issue. HVM guests and 32-bit PV guests cannot be used to exploit this issue.
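
For readers unfamiliar with the term, the "non-canonical boundary" can be illustrated with a small sketch. It assumes 48-bit virtual addresses (as in the /proc/cpuinfo output quoted in comment 15); the function and constant names are illustrative only and do not come from any kernel or hypervisor source.

```python
VA_BITS = 48  # assumption: 48 implemented virtual address bits


def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """Return True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    addr &= (1 << 64) - 1
    # Keep the top implemented bit and every bit above it.
    upper = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones


# Last canonical address of the lower half; the next byte is non-canonical.
LAST_LOW_CANONICAL = (1 << (VA_BITS - 1)) - 1

if __name__ == "__main__":
    for addr in (LAST_LOW_CANONICAL, LAST_LOW_CANONICAL + 1, 0xFFFF800000000000):
        print(hex(addr), is_canonical(addr))
```

The erratum concerns instruction fetch running sequentially from the last canonical byte of the lower half into the address that follows it.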

References:
AMD erratum #121, http://support.amd.com/us/Processor_TechDocs/25759.pdf

Acknowledgements:

Red Hat would like to thank the Xen project for reporting this issue.
Comment 1 Petr Matousek 2012-05-24 13:43:46 EDT
The following 130nm and 90nm (DDR1-only) AMD processors are subject
to this erratum:

 * First-generation AMD-Opteron(tm) single and dual core processors
   in either 939 or 940 packages:
   * AMD Opteron(tm) 100-Series Processors
   * AMD Opteron(tm) 200-Series Processors
   * AMD Opteron(tm) 800-Series Processors
 * AMD Athlon(tm) processors in either 754, 939 or 940 packages
 * AMD Sempron(tm) processor in either 754 or 939 packages
 * AMD Turion(tm) Mobile Technology in 754 package

This issue does not affect Intel processors.

The above-mentioned processors do not support AMD virtualization functionality, so HVM guests cannot be used to exploit this issue.
Comment 3 Petr Matousek 2012-05-24 13:46:42 EDT
Statement:

This issue did not affect the versions of the Linux kernel as shipped with Red Hat Enterprise Linux 5 and 6, and Red Hat Enterprise MRG, as those versions have a guard page between the end of the user-mode accessible virtual address space and the beginning of the non-canonical area, due to the fix for CVE-2005-1764.

This issue did affect the versions of the Xen hypervisor as shipped with Red Hat Enterprise Linux 5. A kernel-xen update for Red Hat Enterprise Linux 5 is available to address this flaw.
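
The effect of such a guard page can be sketched as follows. This is only an illustration, assuming 4 KiB pages, 48-bit virtual addresses, and a single guard page; the names are hypothetical and are not taken from the kernel source.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
VA_BITS = 48      # assumption: 48 implemented virtual address bits

# First non-canonical address above the lower canonical half.
NON_CANONICAL_START = 1 << (VA_BITS - 1)

# Assumption: user space ends one page early, leaving an unmappable guard page.
USER_LIMIT = NON_CANONICAL_START - PAGE_SIZE


def user_mappable(start: int, length: int) -> bool:
    """Return True if [start, start + length) lies entirely below the guard page."""
    return start >= 0 and length > 0 and start + length <= USER_LIMIT


if __name__ == "__main__":
    last_page = NON_CANONICAL_START - PAGE_SIZE
    print(user_mappable(last_page, PAGE_SIZE))              # guard page itself
    print(user_mappable(last_page - PAGE_SIZE, PAGE_SIZE))  # page just below it
```

With the last page before the boundary unavailable to user mode, user code cannot be placed where sequential execution would run into the non-canonical area.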
Comment 11 Jan Lieskovsky 2012-06-12 08:10:37 EDT
Public via:
http://www.openwall.com/lists/oss-security/2012/06/12/3
Comment 12 errata-xmlrpc 2012-06-12 10:56:15 EDT
This issue has been addressed in the following products:

  Red Hat Enterprise Linux 5

Via RHSA-2012:0721 https://rhn.redhat.com/errata/RHSA-2012-0721.html
Comment 13 Richard Lynch 2012-06-14 15:16:45 EDT
This text in the CentOS updater left me scratching my head as to whether a 32-bit AMD chip was vulnerable.

* It was found that guests could trigger a bug in earlier AMD CPUs, leading
to a CPU hard lockup, when running on the Xen hypervisor implementation. An
unprivileged user in a 64-bit para-virtualized guest could use this flaw to
crash the host. Warning: After installing this update, hosts that are using
an affected AMD CPU (refer to Red Hat Bugzilla bug #824966 for a list) will
fail to boot. In order to boot such hosts, the new kernel parameter,
allow_unsafe, can be used ("allow_unsafe=on"). This option should only be
used with hosts that are running trusted guests, as setting it to "on"
reintroduces the flaw (allowing guests to crash the host). (CVE-2012-2934,
Moderate)

Further confusion about the utility of the patch, if one then has to set allow_unsafe=on to reboot, and setting it reintroduces the vulnerability.

What would be the point of applying the patch, exactly?

I'm sure it correctly identifies everything, taken the right way, but the word ordering has me wondering what's what.
Comment 14 Petr Matousek 2012-06-14 16:37:59 EDT
(In reply to comment #13)
> This text in the CentOS updater left me scratching my head as to whether a
> 32-bit AMD chip was vulnerable.

32-bit-only AMD chips are not vulnerable, because the processor has to be in 64-bit mode for this vulnerability to be exploited. 32-bit PV guests on a 64-bit capable AMD chip are also not vulnerable.

> Further confusion about the utility of the patch, if one then has to set
> allow_unsafe=on to reboot, and setting it reintroduces the vulnerability.
> 
> What would be the point of applying the patch, exactly?

As I see it, the whole point of this patch (which detects vulnerable hardware configurations) is to inform the administrator, in a way that cannot be skipped by accident, that there is a potential problem that cannot easily be fixed by a software patch (as it is a hardware issue). The administrator can then make an informed decision: accept the risk (for example, because the host runs only trusted or 32-bit PV guests), upgrade the system to a newer processor, or simply ignore the problem.
Comment 15 philippe.camps 2012-06-15 05:44:26 EDT
I have 64-bit AMD chips running Xen under Red Hat 5. After applying the new kernel, 2.6.18-308.8.2.el5xen, and rebooting the machine, my server now reboots constantly with this kernel.
After the GRUB menu, I can see this message for 5 seconds:
 "Panic on CPU 0:
  Xen will not boot on this CPU for security reasons.
  pass "allow_unsafe" if you're trusting all your CPU guest kernels"  

So I tried adding the kernel parameter allow_unsafe=on, but nothing changed.
What does it mean?
I had to go back to the 2.6.18-308.8.1.el5xen kernel.

My CPU (I have 4 processors)
[root@antimoine ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 33
model name      : Dual Core AMD Opteron(tm) Processor 280
stepping        : 2
cpu MHz         : 2393.182
cache size      : 1024 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu tsc msr pae cx8 apic mtrr cmov pat clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips        : 5985.08
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
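
A rough way to flag hosts like this one before updating is to check /proc/cpuinfo. The sketch below is a heuristic only: the vendor and family fields come from the output above, but the model range (family 15, models 0x00 to 0x3f) is an assumption that is not stated anywhere in this bug, and the function names are made up for illustration. The processor list in comment 1 remains the authoritative reference.

```python
SAMPLE = """\
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 33
model name      : Dual Core AMD Opteron(tm) Processor 280
"""


def parse_cpuinfo(text):
    """Return one dict per 'processor' stanza of /proc/cpuinfo-style text."""
    cpus, current = [], {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # blank separator lines carry no field
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor" and current:
            cpus.append(current)
            current = {}
        current[key] = value
    if current:
        cpus.append(current)
    return cpus


def maybe_affected(cpu):
    """Heuristic only: AMD family 15 with an assumed model range of 0x00-0x3f."""
    try:
        family = int(cpu.get("cpu family", -1))
        model = int(cpu.get("model", -1))
    except ValueError:
        return False
    return cpu.get("vendor_id") == "AuthenticAMD" and family == 15 and 0 <= model <= 0x3F


if __name__ == "__main__":
    for cpu in parse_cpuinfo(SAMPLE):
        print(cpu.get("model name"), maybe_affected(cpu))
```

On a real host the text would be read from /proc/cpuinfo instead of the SAMPLE string.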
Comment 16 Paolo Bonzini 2012-06-15 06:22:39 EDT
Philippe, did you put a kernel parameter, or a hypervisor parameter?
Comment 17 philippe.camps 2012-06-15 06:57:08 EDT
(In reply to comment #16)
> Philippe, did you put a kernel parameter, or a hypervisor parameter?

Thank you Paolo,

I understood what I should do.
The first time, I put a hypervisor parameter in:
module /boot/vmlinuz-2.6.18-308.8.2.el5xen ro root=LABEL=/ rhgb quiet single allow_unsafe=on

After your question, I put the parameter on the kernel line:
kernel /boot/xen.gz-2.6.18-308.8.2.el5 allow_unsafe=on

And it boots normally.

In the future, will I have to add this parameter myself for each new kernel, or will it be done automatically?
Comment 18 Paolo Bonzini 2012-06-15 11:39:49 EDT
Actually, "xen.gz" parameters are hypervisor parameters, and "vmlinuz" parameters are for the kernel (confusing, I know).  But anyway you found the problem.  The parameters are copied to the new kernel when upgrading.  Of course you have to add them manually once on every machine.
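
To make the distinction concrete, here is a minimal sketch that appends the option to the hypervisor line of a GRUB stanza and leaves the "module" lines alone. The stanza text is modelled on the lines quoted in comment 17; the function name is hypothetical, and this is not how the update tooling itself handles the file.

```python
STANZA = """\
kernel /boot/xen.gz-2.6.18-308.8.2.el5
module /boot/vmlinuz-2.6.18-308.8.2.el5xen ro root=LABEL=/ rhgb quiet
"""


def add_hypervisor_param(stanza: str, param: str = "allow_unsafe=on") -> str:
    """Append param to the line that loads xen.gz, if it is not already there."""
    out = []
    for line in stanza.splitlines():
        words = line.split()
        # In a Xen stanza the "kernel" line loads the hypervisor (xen.gz);
        # the dom0 Linux kernel (vmlinuz) is loaded by a "module" line.
        is_xen_line = len(words) >= 2 and words[0] == "kernel" and "xen.gz" in words[1]
        if is_xen_line and param not in words[2:]:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(add_hypervisor_param(STANZA), end="")
```

Running the function twice leaves the stanza unchanged the second time, so it is safe to apply to a file that has already been edited by hand.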
Comment 19 GordonL 2012-09-19 04:05:57 EDT
I love Red Hat, but I think rigging the new kernel to refuse to boot on these AMD processors was a terrible mistake.  I have a mail server which has been running well for years and is quite stable, but today I decided to 'yum update' and reboot for the first time in quite a while.  It is always scary to do that when the system is running well, but I didn't want to ignore the patches and risk a remote security vulnerability.

Unfortunately when I rebooted after the update, the machine didn't come back online.  It is hosted at a datacenter in another city, but I do have access to a remote power cycle switch.  I cut and restored the power, but the machine still never came back up.  I tried again and waited a while in case it was doing a filesystem check or something had to time out, but the host never came back online.  So I had to cancel all of my afternoon meetings and immediately drive to Sunnyvale, California to physically access the host.

Once I arrived at the datacenter and plugged a monitor and keyboard into the machine, I found that it was continually rebooting with this message:

"Xen will not boot on this CPU for security reasons.
pass "allow_unsafe" if you're trusting all your CPU guest kernels"  

So in order to warn me of a potential denial of service vulnerability, you intentionally denied service to my machine by repeatedly crashing and rebooting the kernel?!  I don't care about this risk and was able to boot by adding "allow_unsafe" to my kernel boot parameters, but my afternoon was still ruined.  And even if I did care about the risk, it is a hardware problem and so there isn't much I could do short of buying a new motherboard and processor.

If users care about this sort of risk, they can subscribe to the security announcement list.  If you really think more notification is needed, you could send an email to the root/postmaster account or something.  But please don't intentionally crash our kernels as a way to get our attention. That was a really terrible "solution".  The damage is done in this case, but I hope that Red Hat will adopt a policy against doing this in the future.  Thanks!
Comment 20 Laszlo Ersek 2012-09-19 06:46:07 EDT
Hello Gordon,

let me apologize for the inconvenience.

Please allow me to raise two points:

(1) This behavior has been documented in the relevant errata (linked in comment 12):

    * It was found that guests could trigger a bug in earlier AMD CPUs,
    leading to a CPU hard lockup, when running on the Xen hypervisor
    implementation. An unprivileged user in a 64-bit para-virtualized guest
    could use this flaw to crash the host. Warning: After installing this
    update, hosts that are using an affected AMD CPU (refer to Red Hat
    Bugzilla bug #824966 for a list) will fail to boot. In order to boot
    such hosts, the new kernel parameter, allow_unsafe, can be used
    ("allow_unsafe=on"). This option should only be used with hosts that are
    running trusted guests, as setting it to "on" reintroduces the flaw
    (allowing guests to crash the host). (CVE-2012-2934, Moderate)

(2) The difference between the "potential denial of service vulnerability" and the "intentionally denied service" at hand is that the former can be triggered maliciously and unexpectedly for the system administrator, while the latter has been documented and its scheduling (and override) are in the system administrator's control.

That said, again, please accept my apology for the inconvenience.

Laszlo Ersek
Comment 21 Petr Matousek 2012-09-19 13:39:57 EDT
Hello Gordon,

please let me apologize for the inconvenience. Back then we used the upstream-proposed fix, which I thought was sufficient and appropriate provided the changes were documented in the corresponding errata.

I've opened a new bug #858724 to change the default behaviour of the CVE-2012-2934 fix from host boot denial to guest (DomU) creation denial, similar to upstream changeset 25765:e6ca45ca03c2 [1]. Once fixed, it should avoid the problems you ran into.

  [1] http://xenbits.xensource.com/hg/xen-unstable.hg/rev/e6ca45ca03c2

Best regards,
--
Petr Matousek / Red Hat Security Response Team
Comment 22 GordonL 2012-09-19 17:03:39 EDT
Thanks, Petr, for the quick response!  That does sound like a much better solution.
Comment 23 Petr Matousek 2012-11-16 04:37:22 EST
Hello, Gordon.

(In reply to comment #22)
> Thanks, Petr, for the quick response!  That does sound like a much better
> solution.

To follow up: RHSA-2012-1445 [1] was released on 2012-11-13. It changes the default behaviour of the CVE-2012-2934 fix to guest (DomU) creation denial on affected hardware configurations.

  [1] http://rhn.redhat.com/errata/RHSA-2012-1445.html

Best regards,
--
Petr Matousek / Red Hat Security Response Team
