A flaw was found in the Boost Regex library in versions before 1.66.0. An integer overflow can occur during the calculation of max_state_count in the regex library. This could result in a denial of service or possibly have other unspecified impact.
Created boost tracking bugs for this issue:
Affects: fedora-all [bug 1564253]
Created boost148 tracking bugs for this issue:
Affects: epel-all [bug 1564254]
Created mingw-boost tracking bugs for this issue:
Affects: fedora-all [bug 1564556]
Affects: epel-7 [bug 1564554]
This flaw does not seem to result in a denial of service based on initial analysis. The integer overflow does occur, but the result of the overflow is later bounded by checks.
The PoC provided is interesting, as it does appear to be a "denial of service" if you run it on a 64-bit machine. However, the code in question inside estimate_max_state_count does not actually run in this case. The high memory usage and the eventual abort when a later memory allocation fails are present both before and after the patch in 1.66.0. Triggering a denial of service with an arbitrarily long regular expression does not seem novel unless Boost makes guarantees or promises about this that I am unaware of. If that is an issue, it is separate from this one and is still reproducible upstream.
On a 32-bit system, the aforementioned denial of service does not appear to occur and we can more easily trigger the actual overflow.
If we dump out the states/max_state_count variables in estimate_max_state_count in versions of the Boost regular expression engine before and after the patch on a 32-bit system, we see the following (unfixed version first, then the fixed version):
Starting states: 47000
States after multiplication: -2085967296
Ret6 max_state_count: 100225
Starting states: 47000
Ret1 max_state_count: 100000000
Thus, in the unfixed version, we see the overflow. Later in the function a check fails and max_state_count ends up bounded at 100225 instead of the true maximum state count, which should be dramatically larger.
In the fixed version, it bounds the max_state_count to 100000000 before the overflow occurs.
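The arithmetic behind the dumped values can be checked with a minimal sketch. It is not the Boost source: the helper names are hypothetical, a 32-bit signed integer stands in for the 32-bit system's state counter, and the division-based guard is only an assumption about what a check made before the multiplication might look like. The starting value 47000 and the cap 100000000 are taken from the output above.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: reinterpret the low 32 bits of a product as a signed
// two's-complement value, using 64-bit arithmetic throughout.
constexpr std::int64_t to_signed_32(std::uint64_t low) {
    return low >= 0x80000000ULL
               ? static_cast<std::int64_t>(low) - 0x100000000LL
               : static_cast<std::int64_t>(low);
}

// What a wrapping 32-bit multiply would leave behind for states * states.
// The work is done in 64 bits so this sketch never performs a signed
// overflow itself (which would be undefined behaviour in C++).
constexpr std::int64_t wrapped_square_32(std::int32_t states) {
    return to_signed_32((static_cast<std::uint64_t>(states) *
                         static_cast<std::uint64_t>(states)) & 0xFFFFFFFFULL);
}

// Assumed form of a guard: detect the overflow before multiplying.
// Requires states >= 1.
constexpr bool square_would_overflow_32(std::int32_t states) {
    return std::numeric_limits<std::int32_t>::max() / states < states;
}

// Cap value seen in the fixed version's output above.
constexpr std::int64_t kStateCap = 100000000;

// Guarded variant: return the cap instead of a wrapped product.
constexpr std::int64_t bounded_square_32(std::int32_t states) {
    return square_would_overflow_32(states)
               ? kStateCap
               : static_cast<std::int64_t>(states) * states;
}

// 47000 * 47000 = 2209000000, which exceeds INT32_MAX (2147483647) and wraps.
static_assert(wrapped_square_32(47000) == -2085967296LL, "matches unfixed dump");
static_assert(square_would_overflow_32(47000), "guard fires for 47000");
static_assert(!square_would_overflow_32(46340), "46340^2 still fits in 32 bits");
static_assert(bounded_square_32(47000) == 100000000, "matches fixed dump");
```

The wrapped value reproduces the "States after multiplication" line, and the guarded variant returns the cap without ever producing the negative intermediate.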
It is unclear what a lower max_state_count would result in -- my guess is a failure to match against the regular expression. It does not appear to result in any form of memory corruption based on a few simple checks.
So, this may be an issue if you allow users to inject arbitrary regular expressions and are expecting correct output, but you almost certainly have other problems if that is the case.
I'd be interested to hear if anyone else (perhaps with more Boost development or regular expression knowledge) has a different opinion on the outcome of this or has had success replicating this on a 64-bit system.
(In reply to Scott Gayou from comment #13)
> So, this may be an issue if you allow users to inject arbitrary regular
> expressions and are expecting correct output, but you almost certainly have
> other problems if that is the case.