Bug 2274779 (CVE-2024-3651)

Summary: CVE-2024-3651 python-idna: potential DoS via resource consumption via specially crafted inputs to idna.encode()
Product: [Other] Security Response
Reporter: Marco Benatto <mbenatto>
Component: vulnerability
Assignee: Product Security <prodsec-ir-bot>
Status: NEW
Severity: medium
Priority: medium    
Version: unspecified
CC: aarif, adudiak, agarcial, aoconnor, aprice, asegurap, bbuckingham, bcourt, bdettelb, caswilli, davidn, dfreiber, dkuc, drow, ehelms, epacific, fjansen, gtanzill, hhorak, hkataria, jburrell, jcammara, jhardy, jmitchel, jneedle, jobarker, jorton, jsamir, jsherril, jtanner, kaycoth, kholdawa, kshier, lbalhar, lzap, mabashia, mhulan, mminar, mpierce, nmoumoul, oezr, omaciel, orabin, osapryki, pcreech, psegedy, python-maint, rbiba, rbobbitt, rchan, rogbas, sidakwo, simaishi, smcdonal, sskracic, stcannon, sthirugn, teagle, vkrizan, vkumar, xiaoxwan, yguenane, zsadeh, zzhou
Target Milestone: ---
Keywords: Security
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Text:
A flaw was found in the python-idna library. A malicious argument sent to the idna.encode() function can trigger uncontrolled resource consumption, resulting in a denial of service.
Bug Depends On: 2274783, 2274784, 2274785, 2276057, 2274780, 2274781, 2274782, 2274786, 2274787, 2274788, 2274789, 2274790    
Bug Blocks: 2274791    

Description Marco Benatto 2024-04-12 19:20:16 UTC
A specially crafted argument to the idna.encode() function could consume significant resources. This may lead to a denial of service.

Reference:
https://github.com/kjd/idna/security/advisories/GHSA-jjg7-2v4v-x38h

Comment 1 Marco Benatto 2024-04-12 19:23:46 UTC
Created mingw-python-idna tracking bugs for this issue:

Affects: fedora-all [bug 2274782]


Created python-idna tracking bugs for this issue:

Affects: fedora-all [bug 2274780]


Created python-idna-ssl tracking bugs for this issue:

Affects: epel-8 [bug 2274783]
Affects: fedora-all [bug 2274781]

Comment 3 Lumír Balhar 2024-04-15 14:35:16 UTC
I don't know how to reproduce the issue. The idna changelog does not mention it, but the fix seems to be this commit: https://github.com/kjd/idna/commit/5beb28b9dd77912c0dd656d8b0fdba3eb80222e7

I'm able to create something like this:

```
import idna

zwnj = '\u200c'   # ZERO WIDTH NON-JOINER
latin = '\u0061'  # LATIN SMALL LETTER A

idna.encode(latin * 10 + zwnj)
```

With that input, the first for loop in the valid_contextj function runs 10 times (v3.6) instead of just once (v3.7). However, I'm not able to prepare an input where a significant difference between 3.6 and 3.7 would be visible when it comes to consumed resources or processing time.
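To make the loop-count difference concrete, here is a simplified model of a backward scan with and without an early exit. It is not the library's code: the function name `backward_scan_steps`, the `JOINING` table and the `early_exit` flag are all hypothetical, and the joining-type rules are reduced to the minimum needed to show the effect.

```python
# Simplified model only; NOT the idna implementation.
# JOINING is a hypothetical stand-in for a joining-type table:
# 'T' = transparent, 'L'/'D' = joining; absent characters have no joining type.
JOINING = {"\u0628": "D"}  # ARABIC LETTER BEH is dual-joining in Unicode


def backward_scan_steps(label: str, pos: int, early_exit: bool) -> int:
    """Count loop iterations when scanning left from label[pos]."""
    steps = 0
    for i in range(pos - 1, -1, -1):
        steps += 1
        joining_type = JOINING.get(label[i])
        if joining_type == "T":
            continue  # transparent: keep looking further left
        if joining_type in ("L", "D"):
            break  # found a joining character: stop
        if early_exit:
            break  # any other character ends the search
    return steps


label = "a" * 10 + "\u200c"
pos = len(label) - 1
print(backward_scan_steps(label, pos, early_exit=False))  # 10
print(backward_scan_steps(label, pos, early_exit=True))   # 1
```

In this model the scan without the early exit walks the whole label, so its cost grows linearly with the number of characters preceding the ZWNJ.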

So far, I'm only able to reproduce and verify the issue using Python's cProfile module. The following line:

idna.encode(latin * 1000 + zwnj)

Produces the following cProfile output for idna 3.6:

$ python3 -m cProfile -s ncalls poc.py | head

         14231 function calls (14181 primitive calls) in 0.005 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     5007    0.000    0.000    0.000    0.000 {built-in method builtins.ord}
     2003    0.000    0.000    0.000    0.000 intranges.py:35(_decode_range)
1096/1095    0.000    0.000    0.000    0.000 {built-in method builtins.len}
     1024    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
     1002    0.001    0.000    0.001    0.000 intranges.py:39(intranges_contain)

and for idna 3.7:

$ python3 -m cProfile -s ncalls poc.py | head

         9337 function calls (9284 primitive calls) in 0.018 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2003    0.000    0.000    0.000    0.000 intranges.py:35(_decode_range)
1096/1095    0.000    0.000    0.000    0.000 {built-in method builtins.len}
     1011    0.000    0.000    0.000    0.000 {built-in method builtins.ord}
     1002    0.001    0.000    0.001    0.000 intranges.py:39(intranges_contain)
     1002    0.000    0.000    0.000    0.000 intranges.py:32(_encode_range)

Note the difference in total function calls (14231 with idna 3.6 versus 9337 with idna 3.7), caused mostly by the 5007 calls to the ord function in the vulnerable version, compared with 1011 in the fixed one.
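The same comparison can be made without reading the text report, by collecting the call total programmatically. The sketch below is a minimal example using the standard cProfile and pstats modules; the helper name `count_calls` and the stand-in workload `ord_each` are hypothetical and only there to keep the example runnable without idna installed.

```python
import cProfile
import pstats


def count_calls(func, *args):
    """Run func(*args) under cProfile and return the total number of calls recorded."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    return pstats.Stats(profiler).total_calls


# Hypothetical stand-in workload: one ord() call per character.
def ord_each(text):
    for ch in text:
        ord(ch)


small = count_calls(ord_each, "a" * 10)
large = count_calls(ord_each, "a" * 1000)
print(small, large)
```

With idna installed, something like `count_calls(idna.encode, 'a' * 1000 + '\u200c')` could be run once in an environment with 3.6 and once with 3.7 to compare the totals directly.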

Comment 4 Lumír Balhar 2024-04-18 08:36:40 UTC
A reproducer has been provided in my issue: https://github.com/kjd/idna/issues/175