750555 – (CVE-2012-1150) CVE-2012-1150 python: hash table collisions CPU usage DoS (oCERT-2011-003)

Bug 750555 (CVE-2012-1150) - CVE-2012-1150 python: hash table collisions CPU usage DoS (oCERT-2011-003)

Summary: CVE-2012-1150 python: hash table collisions CPU usage DoS (oCERT-2011-003)

Keywords:
Status:	CLOSED ERRATA
Alias:	CVE-2012-1150
Product:	Security Response
Classification:	Other
Component:	vulnerability
Sub Component:
Version:	unspecified
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Red Hat Product Security
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:	771528 771534 802209 802210 805382 805383 805384 805385
Blocks:	750561 hashdos, oCERT-2011-003 790031
TreeView+	depends on / blocked

Reported:	2011-11-01 15:03 UTC by Jan Lieskovsky
Modified:	2023-05-12 15:01 UTC (History)
CC List:	15 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2012-07-18 10:16:47 UTC
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Priority	Status	Summary	Last Updated
Python	13703	None	None	None	Never
Red Hat Product Errata	RHSA-2012:0744	normal	SHIPPED_LIVE	Moderate: python security update	2012-06-18 16:31:56 UTC
Red Hat Product Errata	RHSA-2012:0745	normal	SHIPPED_LIVE	Moderate: python security update	2012-06-18 16:42:17 UTC

Description Jan Lieskovsky 2011-11-01 15:03:19 UTC

Julian Wälde and Alexander Klink reported a flaw in the hash function used in the implementation of the Python dictionaries (associative arrays).

A specially-crafted set of keys could trigger hash function collisions, which degrade dictionary performance by changing hash table operations complexity from an expected/average O(1) to the worst case O(n).  Reporters were able to find colliding strings efficiently using meet in the middle attack.

As various web application frameworks for Python automatically pre-fill certain dictionaries with data from the HTTP request (such as GET or POST parameters) for Python web application, a remote attacker could use this flaw to make Python interpreter use excessive amount of CPU time by sending a POST request with large amount of parameters which hash to the same value.

This problem is similar to the issue that was previously reported for and fixed in e.g. perl:
  http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003.pdf

Comment 2 Jan Lieskovsky 2011-11-01 15:07:17 UTC

Acknowledgements:

Red Hat would like to thank oCERT for reporting this issue. oCERT acknowledges Julian Wälde and Alexander Klink as the original reporters.

Comment 9 Tomas Hoger 2011-12-29 15:43:51 UTC

This issue was presented on 28C3:
http://events.ccc.de/congress/2011/Fahrplan/events/4680.en.html

Details were posted to full-disclosure:
http://seclists.org/fulldisclosure/2011/Dec/477

oCERT advisory:
http://www.ocert.org/advisories/ocert-2011-003.html

n.runs advisory (copy of the full-disclosure post):
http://www.nruns.com/_downloads/advisory28122011.pdf

28C3 slides and recording:
http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
http://www.youtube.com/28c3#p/u/22/R2Cq3CLI6H8

Another good write-up of the issue:
http://cryptanalysis.eu/blog/2011/12/28/effective-dos-attacks-against-web-application-plattforms-hashdos/

Comment 10 Tomas Hoger 2011-12-30 13:38:04 UTC

New and old threads on python-dev related to this issue:

http://mail.python.org/pipermail/python-dev/2011-December/115116.html
http://mail.python.org/pipermail/python-dev/2003-May/035874.html

Comment 11 Tomas Hoger 2011-12-30 13:42:43 UTC

(In reply to comment #10)
> http://mail.python.org/pipermail/python-dev/2011-December/115116.html
> http://mail.python.org/pipermail/python-dev/2003-May/035874.html

Same discussions in gmane archive, which is better at showing discussions that span over more than one month:

http://thread.gmane.org/gmane.comp.python.devel/128267
http://thread.gmane.org/gmane.comp.python.devel/50762

Comment 13 Dave Malcolm 2012-01-03 23:56:07 UTC

(this is being tracked upstream as http://bugs.python.org/issue13703 )

Comment 21 Dave Malcolm 2012-01-13 05:15:46 UTC

python-dev status update: http://comments.gmane.org/gmane.comp.python.devel/128617

Comment 31 Dave Malcolm 2012-01-20 04:27:26 UTC

The collision-counting approach is now favored upstream (over hash randomization):
  http://mail.python.org/pipermail/python-dev/2012-January/115690.html
[+1 from Guido here: http://mail.python.org/pipermail/python-dev/2012-January/115692.html ]

Comment 32 Dave Malcolm 2012-01-20 04:32:46 UTC

For reference, latest version of lemburg's patch using collision-counting approach is currently this one:
  http://bugs.python.org/file24151/hash-attack.patch
(from 2012-01-06)

Comment 35 Dave Malcolm 2012-01-24 02:55:09 UTC

Upstream seems to be dithering between collision-counting and hash randomization, or some combination, and the discussion seems to have grown out-of-control.

For my part, I've posted some patches upstream:

(A) "hash-collision-counting-dmalcolm-2012-01-20-001.patch"
http://bugs.python.org/file24286
(see http://bugs.python.org/issue13703#msg151707 )
in which I further developed lemburg's first collision-counting patch

I then had a brainstorm and tried another approach:
(B) "amortized-probe-counting-dmalcolm-2012-01-20-002.patch"
http://bugs.python.org/file24288
(see http://bugs.python.org/issue13703#msg151714 )

(C) "amortized-probe-counting-dmalcolm-2012-01-21-003.patch"
http://bugs.python.org/file24289
(see http://bugs.python.org/issue13703#msg151735 )
tracking per-dict statistics, in particular: the amortized average cost per modification of each dict, and raising an exception when the average exceeds a given threshold for a given dict. This seems to work for all of python's test suite, and very quickly detects and complains about hostile input. It's rather experimental though, and the ratio/scale-factor is a number I chose arbitrarily, and I haven't gone through the math or gathered real numbers on real code. There are a few issues that would need fixing (see Antoine Pitrou's comments on it).

However, there's an argument that any approach that can lead to dict (or set) raising an exception is flawed: see http://bugs.python.org/issue13703#msg151796 It strikes me that this "protection" may allow yet another new class of attack, it which the attacker slowly poisons inputs, so that when a dict is constructed at some later point, an exception is raised you weren't expecting. If the protection wasn't there you'd get the extreme slowdown, rather an exception. But: my patch is too good :) it's too efficient at detecting when it's under attack: the exception is raised too quickly, and so arguably we've made the job easier for the attacker. But maybe I'm overthinking this? Perhaps I just need to tune things?

(D) http://bugs.python.org/issue13703#msg151847
in which I backported haypo's hash randomization patch from the main branch to the 2.7 branch. This patch disables the hash randomization by default, and only does it for string-like types, not numeric types (for reasons noted in that comment). When enabled with "PYTHONHASHRANDOMIZATION=1 python", this patch successfully blocks the reproducer. (there's also another hash randomization patch beyond haypo's, which does a few things a little differently; see python issue 13704).

Note that hash randomization is known to break Django's test suite, which is why I'm wary about enabling it by default.

Comment 37 Dave Malcolm 2012-01-25 11:54:52 UTC

I came up with a hybrid approach, which I believe combines the best aspects of the collision-counting and hash-randomization approaches.

I've posted it for upstream review here:

   http://bugs.python.org/issue13703#msg151939

In this approach, by default, hash() and dict don't use hash randomization - preserving existing behavior.  It detects when a given dict is under attack, and transitions just that dict to use randomized hashes.

It preserves ABI - however, in order to do this, the randomization only happens for str/unicode/bytes keys.  I believe this is a reasonable compromise, given the various attack scenarios on a web server (e.g. it should cover HTTP headers, POST fields, JSON, XML-RPC).

It will probably need performance tuning and extensive testing.

Comment 38 Dave Malcolm 2012-01-28 02:19:36 UTC

Upstream has reached a consensus about a security release (adding hash-randomization in disabled form with an envvar to enable it):
http://mail.python.org/pipermail/python-dev/2012-January/115892.html

I've been volunteered by the release managers of those upstream branches to create the patches (based on the work I've already done; see (D) above:
http://bugs.python.org/issue13703#msg151847 ).

Comment 55 Kurt Seifried 2012-03-10 05:52:17 UTC

Added CVE as per http://www.openwall.com/lists/oss-security/2012/03/10/3

Comment 56 Huzaifa S. Sidhpurwala 2012-03-12 05:41:23 UTC

Created python tracking bugs for this issue

Affects: fedora-all [bug 802209]

Comment 57 Huzaifa S. Sidhpurwala 2012-03-12 05:41:31 UTC

Created python3 tracking bugs for this issue

Affects: fedora-all [bug 802210]

Comment 69 Fedora Update System 2012-05-02 04:49:27 UTC

python-2.7.3-3.fc17, python-docs-2.7.3-1.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 70 Fedora Update System 2012-05-03 07:28:12 UTC

python3-3.2.3-1.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 71 Tomas Hoger 2012-05-03 12:43:59 UTC

This issue was fixed upstream in versions 2.6.8, 2.7.3, 3.1.5 and 3.2.3:

http://python.org/download/releases/2.6.8/
http://python.org/download/releases/2.7.3/
http://python.org/download/releases/3.1.5/
http://python.org/download/releases/3.2.3/

The final upstream solution is to use hash randomization.  However, randomization is disabled by default to avoid breaking code that incorrectly relies on specific ordering of dictionary keys.  Updated python versions provide following ways to enable randomization:

- PYTHONHASHSEED environment variable, which is expected to be set to a decimal number in the range [0,4294967295], or "random", with the following meaning:
  - "random" - new random seed is generated during interpreter start, seed
    is not predictable across python interpreter invocations
  - 0 - use the same seed as was used by older python versions, effectively
    disabling randomization
  - other values - use specific fixed seed value, which is different from
    seed that was used in older python versions; this may be useful in test
    suites or environments when multiple python instances are expected to
    produce the same hash values
- -R command line option, its use is equivalent to setting PYTHONHASHSEED environment variable to "random"

Refer to python command manual page for further details:
http://docs.python.org/release/2.6.8/using/cmdline.html#cmdoption-R
http://docs.python.org/release/2.6.8/using/cmdline.html#envvar-PYTHONHASHSEED

Comment 73 Fedora Update System 2012-05-06 01:25:59 UTC

python-2.7.3-1.fc16, python-docs-2.7.3-1.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 74 Fedora Update System 2012-05-07 04:16:33 UTC

python3-3.2.3-5.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 75 Fedora Update System 2012-05-07 22:09:44 UTC

python26-2.6.8-1.el5 has been pushed to the Fedora EPEL 5 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 76 Tomas Hoger 2012-06-08 10:05:36 UTC

As part of this fix, an implementation of os.random was moved from the os module (part of the Python standard library) the posix module (built in to the Python executable binary).  This causes no issues for most use cases, but may cause problems where updated Python standard library is used with not updated Python executable.  Such problem was reported for Python virtualenv use cases, see:

http://bugs.python.org/issue14444#msg158172

The above link also provides instructions on how to fix the problem.  These instructions are also available on the virtualenv project news page:

http://www.virtualenv.org/en/latest/news.html

  Warning: Python bugfix releases 2.6.8, 2.7.3, 3.1.5 and 3.2.3 include a
  change that will cause “import random” to fail with “cannot import name
  urandom” on any virtualenv created on a Unix host with an earlier release
  of Python 2.6/2.7/3.1/3.2, if the underlying system Python is upgraded.
  This is due to the fact that a virtualenv uses the system Python’s standard
  library but contains its own copy of the Python interpreter, so an upgrade
  to the system Python results in a mismatch between the version of the Python
  interpreter and the version of the standard library. It can be fixed by
  removing $ENV/bin/python and re-running virtualenv on the same target
  directory with the upgraded Python.

Comment 77 errata-xmlrpc 2012-06-18 12:32:36 UTC

This issue has been addressed in following products:

  Red Hat Enterprise Linux 6

Via RHSA-2012:0744 https://rhn.redhat.com/errata/RHSA-2012-0744.html

Comment 78 errata-xmlrpc 2012-06-18 12:42:54 UTC

This issue has been addressed in following products:

  Red Hat Enterprise Linux 5

Via RHSA-2012:0745 https://rhn.redhat.com/errata/RHSA-2012-0745.html

Comment 79 Fedora Update System 2012-06-19 14:53:04 UTC

python3-3.2.3-2.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 80 Ratul Gupta 2013-12-10 10:36:05 UTC

CVE-2013-7040 is filed as a followup for this bug.

Note You need to log in before you can comment on or make changes to this bug.