Bug 719228 - (CVE-2011-2521) CVE-2011-2521 kernel: perf, x86: fix Intel fixed counters base initialization
Status: CLOSED ERRATA
Product: Security Response
Classification: Other
Component: vulnerability
Version: unspecified
Hardware: All Linux
Priority: medium  Severity: medium
Target Milestone: ---
Target Release: ---
Assigned To: Red Hat Product Security
Whiteboard: public=20110319,reported=20110706,sou...
Keywords: Security
Duplicates: 717049 CVE-2011-2693
Depends On: 719229 736284 748669
Blocks: 719216
Reported: 2011-07-06 02:02 EDT by Eugene Teo (Security Response)
Modified: 2015-02-16 10:47 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-04-25 07:58:58 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
Sosreport comment #17 (1.28 MB, application/x-xz)
2011-11-25 06:50 EST, IBM Bug Proxy
Crash report comment #17 (44.87 KB, text/plain)
2011-11-25 07:02 EST, IBM Bug Proxy

Description Eugene Teo (Security Response) 2011-07-06 02:02:13 EDT
The following patch solves the problems introduced by Robert's commit 41bf498 and reported by Arun Sharma. That commit got rid of the base + index notation for reading and writing PMU MSRs.

The problem is that, for fixed counters, the new base calculation did not take the fixed counter index into account, so all fixed counters were read from and written to fixed counter 0. Although all fixed counters share the same config MSR, each has its own counter register.
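
For illustration only, the difference between the broken and the corrected base calculation can be sketched as below. This is not the patch itself. The helper names are hypothetical, and the constants (0x309 for the first fixed counter MSR, 0x38d for the shared fixed counter control MSR, 32 as the first fixed counter index) are assumptions taken from the Intel SDM and the kernel headers of that period.

```c
#include <stdint.h>

/* Assumed values, see the note above. */
#define MSR_ARCH_PERFMON_FIXED_CTR0      0x309u
#define MSR_ARCH_PERFMON_FIXED_CTR_CTRL  0x38du
#define X86_PMC_IDX_FIXED                32

/* Hypothetical helper: every fixed counter uses the same config MSR. */
static inline uint32_t fixed_config_base(int idx)
{
    (void)idx;
    return MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
}

/* Broken behaviour: the index is ignored, so every fixed counter
 * resolves to the register of fixed counter 0. */
static inline uint32_t fixed_event_base_buggy(int idx)
{
    (void)idx;
    return MSR_ARCH_PERFMON_FIXED_CTR0;
}

/* Corrected behaviour: the offset of the counter within the fixed
 * range is added to the base register. */
static inline uint32_t fixed_event_base_fixed(int idx)
{
    return MSR_ARCH_PERFMON_FIXED_CTR0 + (uint32_t)(idx - X86_PMC_IDX_FIXED);
}
```

With these helpers, index X86_PMC_IDX_FIXED + 1 maps to 0x309 in the broken variant and to 0x30a in the corrected one, while the config MSR is 0x38d in both cases.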

Without:

 $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1
 noploop for 1 seconds

  242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892)
 2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892)
      49473 baclears             (0.00% scaling, ena=1000790892, run=1000790892)

With:

 $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1
 noploop for 1 seconds

 2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809)
 2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809)
      47863 baclears             (0.00% scaling, ena=1000840809, run=1000840809)
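
One way to see that the "Without" numbers are wrong is to compare instructions per cycle for the two runs. The sketch below is a hypothetical helper, not part of the patch or of the `task` tool. The macro values are copied from the output above.

```c
#include <stdint.h>

/* Counter values copied from the two runs above. */
#define CYCLES_WITHOUT  242202299ULL
#define INSTR_WITHOUT   2389685946ULL
#define CYCLES_WITH     2392703238ULL
#define INSTR_WITH      2389793744ULL

/* Hypothetical helper: instructions per cycle from two raw counts.
 * Returns 0.0 when no cycles were counted to avoid dividing by zero. */
static inline double ipc(uint64_t instructions, uint64_t cycles)
{
    return cycles ? (double)instructions / (double)cycles : 0.0;
}
```

Here ipc(INSTR_WITHOUT, CYCLES_WITHOUT) comes out near 9.9, far higher than one would expect from a simple loop, while ipc(INSTR_WITH, CYCLES_WITH) is just under 1.0.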

Acknowledgements:

Red Hat would like to thank Li Yu for reporting this issue.
Comment 4 Eugene Teo (Security Response) 2011-07-06 04:11:44 EDT
Statement:

This issue did not affect the versions of the Linux kernel as shipped with Red Hat Enterprise Linux 4, 5, and Red Hat Enterprise MRG, as they did not backport the upstream commit 41bf498 that introduced the issue. This has been addressed in Red Hat Enterprise Linux 6 via https://rhn.redhat.com/errata/RHSA-2011-1350.html.
Comment 7 Petr Matousek 2011-09-06 14:39:14 EDT
*** Bug 717049 has been marked as a duplicate of this bug. ***
Comment 8 Petr Matousek 2011-09-06 14:42:43 EDT
*** Bug 721283 has been marked as a duplicate of this bug. ***
Comment 12 IBM Bug Proxy 2011-10-03 08:50:24 EDT
------- Comment From ranittal@linux.vnet.ibm.com 2011-10-03 08:44 EDT-------
Hi Don,

Do we have any update on this? Can you please confirm which build would include this fix?

Thanks.
Comment 13 errata-xmlrpc 2011-10-05 17:47:54 EDT
This issue has been addressed in the following products:

  Red Hat Enterprise Linux 6

Via RHSA-2011:1350 https://rhn.redhat.com/errata/RHSA-2011-1350.html
Comment 14 Eugene Teo (Security Response) 2011-10-24 23:41:30 EDT
Created kernel tracking bugs for this issue

Affects: fedora-all [bug 748669]
Comment 15 IBM Bug Proxy 2011-11-25 06:50:46 EST
------- Comment From nabharay@in.ibm.com 2011-11-25 06:47 EDT-------
While running the pounder test on an HS22 with RHEL 6.2 RC1 (64-bit), the machine crashed after 44 hours of test run and a vmcore was generated. The backtrace from the vmcore is the same as the one mentioned in the bug description.

---uname output----

Linux hs22.in.ibm.com 2.6.32-220.el6.x86_64 #1 SMP Wed Nov 9 08:03:13 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

Machine Type = HS22

Attaching the crash report and sos report.

I am pasting the back trace below:

This GDB was configured as "x86_64-unknown-linux-gnu"...

KERNEL: /usr/lib/debug/lib/modules/2.6.32-220.el6.x86_64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2011-11-25-07:20:47/vmcore  [PARTIAL DUMP]
CPUS: 16
DATE: Fri Nov 25 07:18:47 2011
UPTIME: 1 days, 20:33:37
LOAD AVERAGE: 190.73, 316.30, 344.43
TASKS: 776
NODENAME: hs22.in.ibm.com
RELEASE: 2.6.32-220.el6.x86_64
VERSION: #1 SMP Wed Nov 9 08:03:13 EST 2011
MACHINE: x86_64  (2666 Mhz)
MEMORY: 70 GB
PANIC: "Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 13"
PID: 2016
COMMAND: "timed_loop"
TASK: ffff880a9eb20b00  [THREAD_INFO: ffff8800623ee000]
CPU: 13
STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 2016   TASK: ffff880a9eb20b00  CPU: 13  COMMAND: "timed_loop"
#0 [ffff8808234a7b00] machine_kexec at ffffffff81031fcb
#1 [ffff8808234a7b60] crash_kexec at ffffffff810b8f72
#2 [ffff8808234a7c30] panic at ffffffff814ec348
#3 [ffff8808234a7cb0] watchdog_overflow_callback at ffffffff810d8fad
#4 [ffff8808234a7cd0] __perf_event_overflow at ffffffff8110a89d
#5 [ffff8808234a7d70] perf_event_overflow at ffffffff8110ae54
#6 [ffff8808234a7d80] intel_pmu_handle_irq at ffffffff8101e096
#7 [ffff8808234a7e90] perf_event_nmi_handler at ffffffff814f09f9
#8 [ffff8808234a7ea0] notifier_call_chain at ffffffff814f2545
#9 [ffff8808234a7ee0] atomic_notifier_call_chain at ffffffff814f25aa
#10 [ffff8808234a7ef0] notify_die at ffffffff81096bce
#11 [ffff8808234a7f20] do_nmi at ffffffff814f01c3
#12 [ffff8808234a7f50] nmi at ffffffff814efad0
[exception RIP: _spin_lock_irqsave+47]
RIP: ffffffff814ef22f  RSP: ffff8800623ef928  RFLAGS: 00000083
RAX: 0000000000008c60  RBX: ffff880800033db8  RCX: 0000000000008c54
RDX: 0000000000000282  RSI: 0000000000000001  RDI: ffff880800033db8
RBP: ffff8800623ef928   R8: 0000000000000002   R9: 00000000000030e6
R10: 0000000000000001  R11: 0000000000000000  R12: ffff880800000040
R13: 0000000000000001  R14: 0000000000000001  R15: 0000000000000000
ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
--- <NMI exception stack> ---
#13 [ffff8800623ef928] _spin_lock_irqsave at ffffffff814ef22f
#14 [ffff8800623ef930] __wake_up at ffffffff810517f2
#15 [ffff8800623ef970] wakeup_kswapd at ffffffff811295ae
#16 [ffff8800623ef9b0] __alloc_pages_nodemask at ffffffff81123e5b
#17 [ffff8800623efad0] alloc_pages_current at ffffffff81158b7a
#18 [ffff8800623efb00] __page_cache_alloc at ffffffff81110e57
#19 [ffff8800623efb30] __do_page_cache_readahead at ffffffff81126a7b
#20 [ffff8800623efbc0] ra_submit at ffffffff81126bd1
#21 [ffff8800623efbd0] filemap_fault at ffffffff81112123
#22 [ffff8800623efc40] __do_fault at ffffffff8113b2c4
#23 [ffff8800623efcd0] handle_pte_fault at ffffffff8113b877
#24 [ffff8800623efdb0] handle_mm_fault at ffffffff8113c4b4
#25 [ffff8800623efe00] __do_page_fault at ffffffff81042b39
#26 [ffff8800623eff20] do_page_fault at ffffffff814f248e
#27 [ffff8800623eff50] page_fault at ffffffff814ef845
RIP: 00000036c824e950  RSP: 00007fff4c167988  RFLAGS: 00010246
RAX: 0000000000000000  RBX: 00000000020d9010  RCX: 00007fff4c16928a
RDX: 0000000000401389  RSI: 0000000000000100  RDI: 00007fff4c167990
RBP: 00007fff4c1680ec   R8: 000000000000ffff   R9: 000000000000000f
R10: fffffffffffff38f  R11: 0000000000000000  R12: 00007fff4c168208
R13: 00000000004012e0  R14: 00007fff4c168218  R15: 0000000000004093
ORIG_RAX: ffffffffffffffff  CS: 0033  SS: 002b

This is a regression from RHEL 6.2 SNAP3 or SNAP4.

Thanks
Comment 16 IBM Bug Proxy 2011-11-25 06:50:59 EST
Created attachment 536207 [details]
Sosreport comment #17


------- Comment on attachment From nabharay@in.ibm.com 2011-11-25 06:49 EDT-------


Sosreport comment #17
Comment 17 IBM Bug Proxy 2011-11-25 07:02:12 EST
Created attachment 536208 [details]
Crash report comment #17


------- Comment on attachment From nabharay@in.ibm.com 2011-11-25 06:51 EDT-------


Crash report as mentioned in comment #17
Comment 18 Linda Wang 2011-11-30 01:09:48 EST
- [kernel] perf: Optimize event scheduling locking (Steve Best) [744986]

is the only patch that may have caused this regression since 
snap3/4.
Comment 19 Steve Best 2011-11-30 14:03:21 EST
Linda,

I'm not sure why bzs are getting dupped to this bz. checking out kernel 220 it has
Upstream commit:
http://git.kernel.org/linus/fc66c5210ec2539e800e87d7b3a985323c7be96e

anyone have any idea why this is still open? I assume it is for RHEL 6.2, maybe it is for another RHEL 6.x release?

-Steve
Comment 20 Eugene Teo (Security Response) 2011-12-01 00:17:36 EST
(In reply to comment #19)
> Linda,
> 
> I'm not sure why bzs are getting dupped to this bz. checking out kernel 220 it
> has
> Upstream commit:
> http://git.kernel.org/linus/fc66c5210ec2539e800e87d7b3a985323c7be96e
> 
> anyone have any idea why this is still open? I assume it is for RHEL 6.2, maybe
> it is for another RHEL 6.x release?
> 
> -Steve

This is a top-level security bug. It is meant to keep track of the trackers (see Depends on). This will remain opened until all the trackers have been addressed, including rhel-6.2. Thanks.
Comment 21 IBM Bug Proxy 2012-04-25 07:42:20 EDT
------- Comment From tpnoonan@us.ibm.com 2012-04-25 11:32 EDT-------
(In reply to comment #21)
> ------- Comment From eteo@redhat.com 2011-12-01 00:17:36 EDT-------
> > I'm not sure why bzs are getting dupped to this bz. checking out kernel 220 it
> >
> > anyone have any idea why this is still open? I assume it is for RHEL 6.2, maybe
> >
>
> This is a top-level security bug. It is meant to keep track of the trackers (see Depends on).
> This will remain opened until all the trackers have been addressed,
> including rhel-6.2. Thanks.

hi red hat, have all the trackers been addressed/can this one now be closed? thanks
Comment 22 Petr Matousek 2012-04-25 07:58:58 EDT
(In reply to comment #21)
> ------- Comment From tpnoonan@us.ibm.com 2012-04-25 11:32 EDT-------
> hi red hat, have all the trackers been addressed/can this one now be closed?

Hi IBM, yes, closing.
