Created attachment 996033 [details] Suggested patch Description of problem: Calling XML_Parse seeds the rand() pseudo random generator, causing it to generate non-random and predictable numbers. Version-Release number of selected component (if applicable): expat-2.1.0-11.fc23 How reproducible: Always Steps to Reproduce: 1. cat expat_test.c #include <stdlib.h> #include <stdio.h> #include <expat.h> int main(void) { XML_Parser parser; int i; for(i = 0; i < 20; i++) { parser = XML_ParserCreate("UTF-8"); XML_Parse(parser, "", 0, 1); XML_ParserFree(parser); printf("%d\n", rand()); } return 0; } 2. gcc -o expat_test expat_test.c 3. ./expat_test Actual results: The same number is printed 20 times Expected results: 20 different numbers are printed Additional info: Reported upstream (https://sourceforge.net/p/expat/bugs/519/), but without response for 12 months.
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle. Changing version to '23'. (As we did not run this process for some time, it could affect also pre-Fedora 23 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23
Thanks for the report - you described the repro case as "Suggested patch", did you have a patch?
Created attachment 1122079 [details] Suggested patch Sorry, That was the wrong file attached. The patch simply disables the seeding of the rand() function - seeding by the current time is causing the random numbers generated to be very predictable anyway.
Hi! I'm unsure if not calling srand anymore could break existing applications. Just so you know, there is a ticket upstream at https://sourceforge.net/p/expat/bugs/499/ . Best, Sebastian
The call to srand basically means that the output of rand if totally predictable - I can't imagine that any application will rely on that. An application that relies on the rng being seeded really should do that itself and not rely on a XML parser doing it.
Expat is calling srand (from generate_hash_secret_salt) by itself iff the code using Expat (a) never called XML_SetHashSalt on that parser or (b) passed salt 0 to XML_SetHashSalt. So the way Expat is calling srand as of now can affect code that * misunderstood salt value 0 (because it was undocumented before Expat 2.1.1) and/or * does not call XML_SetHashSalt because the code was not updated yet to do so or the arrival of XML_SetHashSalt bypassed the Expat user's radar For those, (1) not calling srand could be bad since the protection provided by random would dissolve but (2) calling srand (with enough entropy) would break repeatability if the app is used in a way where that's needed. To me that means: removing the call hurts and leaving it in hurts, too. What we can do is (1) call srand with more entropy as in [1] and (2) patch users of Expat to call XML_SetHashSalt with a non-zero value. If we wanted to only call srand if it hasn't been called before, we could use a hack like this: #include <stdio.h> #include <stdbool.h> bool has_srand_been_called_before() { if (rand() != 1804289383) return true; if (rand() != 846930886) return true; if (rand() != 1681692777) return true; return false; } void main() { if (! has_srand_been_called_before()) { printf("srand(1) detected, call to srand() needed.\n"); srand(17); /* except with more entropy */ } printf("rand() -> %d\n", rand()); } However these magic numbers may be different on other architectures, so configure would need to verify that assumption and an arch specific patch can be applied where configure detects a mismatch in the numbers produced by srand(1). I guess we don't want to go that road? :) [1] http://www.openwall.com/lists/oss-security/2012/04/05/2
Sebastian - great to see a new expat release! In the upstream bug link Marcus suggests using the tv_usec part from gettimeofday() - any reason that has to be fed through a PRNG or could it be used as the hash seed directly? If it needs to go through a PRNG it seems much better to use one of the re-entrant libc functions rather than srand. (I don't see any need at all to worry about apps assuming expat calls srand, that's not a documented part of the API, plus expat did *not* do that for most of its lifetime!)
Hi, very interesting idea! First, let me collect what we seem to want from a fixed version: (1) It should work well for code using Expat as it was used years ago, i.e. be compatible to API semantics of Expat 1.95.x or so and still be secure. (2) It should not call srand at all, at least not, if it has been called before. Because that would interfere with the stream of randomness in other parts of the code using Expat. (3) The salt should receive a secure amount of entropy. Right now, the salt is derived as entropy bits from time -> srand -> first value in rand chain -> salt . The idea now is to replace that by entropy bits from time -> salt . The need to use more entropy bits out of the clock left aside, the salt would no longer be all over the place but within a certain range. I have a feeling that something like entropy bits from time -> SHA256 -> salt would be more secure (since it would be all over the place again) but I cannot stop thinking that time is always going to be too little entropy, for something that may be attacking from the same local machine with no network delay in between as fast as possible. Maybe we need a way to read sizeof(int) good random bytes for all supported platforms and use that for the salt when no explicit value is given? For Linux that could be reading one byte from /dev/random, the rest from /dev/urandom. A chroot without /dev mounted may need some extra consideration. If the attacker can guess the time down to one second of precision, with microseconds from gettimeofday: do we have 10^6 ~= 2^20 = 20 bits entropy? Is that realistic? Is it enough? Best, Sebastian
(In reply to Sebastian Pipping from comment #8) > For Linux that could be reading one byte from /dev/random, the rest from > /dev/urandom. A chroot without /dev mounted may need some extra > consideration. Please do not mention using /dev/random at all, that is almost never a good idea. Also I'd suggest looking at the getrandom() syscall on platforms where it is available or where its equivalent is available.
Created attachment 1137086 [details] getrandom in a loop for review Thanks for pointing me to getrandom. Is my understanding correct, that I would have to call it in a loop like the sample code attached? Also, how do I fix getrandom_loop.c:(.text+0x53): undefined reference to `getrandom' collect2: error: ld returned 1 exit status I have Linux 4.4 headers around.
Created attachment 1137093 [details] Call srand with more entropy Since using good random on all supported platforms is not quick to implement, maybe we should split the fix in half: (1) A temporary fix that improves the situation and (2) a long term one, that gets high quality entropy in the way that works best on platform X. For (1) I'm attaching a patch for review. The two ideas are: * we get more entropy out of the clock than previously, and * any entropy passed to previous calls to srand is maintained. I'm looking forward to feedback.
Created attachment 1137426 [details] different proposal We had a bit of a discussion internally about this. I've attached this based on your patch, it removes the srand usage entirely, there doesn't seem to be a good justification for using that, and it avoids the global state/non-re-entrant function use entirely.
Hi! This is going in the right direction for a short to mid term fix. Its is import to me that we prevent continuing with zero or close to zero entropy -- which failure to query the time and a local attacker seem to result in -- without signaling that error to the layer above clearly. Applying that wish to the patch you proposed, I see that time(..) is used if gettimeofday(..) failed. Studying their man pages, my impression is both of these functions are not going to fail unless something is very out of hand not just in our process. Now if both of them would never fail, they deserve equal handling in my eyes: Either we assume they never fail and add an assert -- if you avoid assert on purpose please explain why -- or we assume they may fail, then time(..) can fail too and needs error handling as well (which is lacking right now). Especially: if one way to query time failed, why should the other work in that very scenario? So what I would like to see in the patch is that we pick from: a) Use gettimeofday(..) only, add assert on success b) Use gettimeofday(..) with time(..) as fallback, add assert on success of time(..). c) Pass the error upwards and make the whole upcoming parsing process fail if we cannot extract entropy from the clock. Please let me know what you think. Best, Sebastian
PS: I only really notice now how little entropy time(..) really is. So maybe drop (b) from my candidates again :)
Good argument. I'm definitely fine with (a).
Created attachment 1137839 [details] Resolve call to srand, use more entropy Excellent. Please review the new candidate for a patch, ideally the text I put in the commit message, too. Thanks!
Looks good to me, including the comment. Thanks a lot!
Oh, will that break that Windows build? IIRC #include <sys/*.h> won't work on Windows.
The patch doesn't work on Windows yet, yes. I am working with someone on adding Windows support based on functions similar to gettimeofday but supported on Windows.
Since keyword "Security" has been added: Does this deserve/have a CVE?
Hi, Patch in attachment #1137839 [details] is using a large constant which is not going to be valid for ABIs where long are 32 bits (32 bits systems and Windows 64 bits). Anyway, I think the value from struct timeval.tv_usec can be used asis as it will be hashed with others items in CHAR_HASH(). By the way, may be you could xor tv_usec with struct XML_ParserStruct address to reduce the likeliness of having two parsers with the same salt. Regards.
Hello Yann, I appreciate your input! > Patch in attachment #1137839 [details] is using a large constant which is not > going to be valid for ABIs where long are 32 bits (32 bits systems and Windows > 64 bits). I suppose on 32bit the constant would be cut down to 2**61-1 & 0xffffffff = 4294967295 = 3 * 5 * 17 * 257 * 65537 = not a prime? How about we check sizeof(long) == 4 at runtime and use 2^31-1 then, 2^61-1 else? Both are Mersenne primes. > By the way, may be you could xor tv_usec with struct XML_ParserStruct address > to reduce the likeliness of having two parsers with the same salt. Sounds like a great idea to gain a few free extra bits of cross-platform entropy.
PS: The latest two commits on https://sourceforge.net/p/expat/code_git/ci/resolve-srand/log/ no implement just that.
At least on i386 and amd64, GCC turns that code if (sizeof(unsigned long) == 4) { return entropy * 2147483647; } else { return entropy * 2305843009213693951; } to (roughly manually synthesized from assembler): if (sizeof(unsigned long) == 4) return (entropy << 31) - entropy; else return (entropy << 61) - entropy; I don't think this implements a valuable PRNG (look at the (lack of) avalanche effect). Anyway, as I said earlier, I don't think you need to implement a PRNG of your own here: it's probably easier to use (gather_time_entropy() ^ getpid() ^ (unsigned long)parser) directly. The value is used as seed to initialize hash computation for one given parser. I think an attacker being able to find the value by extracting some bits of salt by looking at output of some expat's public functions (for example, the order of keys in a dictionary) won't be able to use the value further, attacker won't be able to use this knowledge to construct a new XML file that triggers hash collisions, as the salt for the new parser will likely be different. By the way, if an attacker is able to observe hash output, then the hash function has to be hardened. Regards.
Btw. you can also use BSD style random() which allows to get previous state via setstate() and later restore that previous PRNG state. It's supported in glibc/muslc on Linux/BSD/Mac, but sadly it's not in Windows libc. Regards.
(In reply to Yann Droneaud from comment #24) > At least on i386 and amd64, GCC turns that code > [..] > to (roughly manually synthesized from assembler): > > if (sizeof(unsigned long) == 4) > return (entropy << 31) - entropy; > else > return (entropy << 61) - entropy; > > I don't think this implements a valuable PRNG (look at the (lack of) > avalanche effect). Agreed. Your translated assembly seems to still do what I was aiming at though: bijective transformation from monotonic to non-monotonic. Play code: #include <stdlib.h> #include <stdio.h> static unsigned long process(unsigned long i) { if (sizeof(unsigned long) == 4) return (i << 31) - i; else return (i << 61) - i; } int main(int argc, char ** argv) { unsigned long i = 1; unsigned long cur; unsigned long prev = 0; const long times = atol(argv[1]) + 1; for (; i < times; i++) { cur = process(i); printf("%lu, %i\n", cur, (cur > prev) ? 1 : -1); prev = cur; } return 0; } And invocation: # gcc yann.c && ./a.out 30 (In reply to marqin.pl+rh from comment #25) > Btw. you can also use BSD style random() which allows to get previous state > via setstate() and later restore that previous PRNG state. It's supported in > glibc/muslc on Linux/BSD/Mac, but sadly it's not in Windows libc. I think with random(3) we have the issue of how/when to call srandom again and would mess with what the application may do with random(3) in the same thread else. We would need a function getstate so we can restore previous state transparently after, I guess?
(In reply to Sebastian Pipping from comment #26) > (In reply to Yann Droneaud from comment #24) > > At least on i386 and amd64, GCC turns that code > > [..] > > to (roughly manually synthesized from assembler): > > > > if (sizeof(unsigned long) == 4) > > return (entropy << 31) - entropy; > > else > > return (entropy << 61) - entropy; > > > > I don't think this implements a valuable PRNG (look at the (lack of) > > avalanche effect). > > Agreed. Your translated assembly seems to still do what I was aiming at > though: bijective transformation from monotonic to non-monotonic. Play code: > > #include <stdlib.h> > #include <stdio.h> > > static unsigned long process(unsigned long i) { > if (sizeof(unsigned long) == 4) > return (i << 31) - i; > else > return (i << 61) - i; > } > > int main(int argc, char ** argv) { > unsigned long i = 1; > unsigned long cur; > unsigned long prev = 0; > const long times = atol(argv[1]) + 1; > for (; i < times; i++) { > cur = process(i); > printf("%lu, %i\n", cur, (cur > prev) ? 1 : -1); > prev = cur; > } > return 0; > } > > And invocation: > > # gcc yann.c && ./a.out 30 > > That doesn't look so good when formatted differently. --- yann.c 2016-03-28 21:52:34.278518801 +0200 +++ yann.c 2016-03-28 21:51:14.088592915 +0200 @@ -15,7 +15,9 @@ const long times = atol(argv[1]) + 1; for (; i < times; i++) { cur = process(i); - printf("%lu, %i\n", cur, (cur > prev) ? 1 : -1); + printf("%0*lx, %i\n", + (int)sizeof(unsigned long) * 2, + cur, (cur > prev) ? 1 : -1); prev = cur; } return 0; $ ./a.out 20 1fffffffffffffff, 1 3ffffffffffffffe, 1 5ffffffffffffffd, 1 7ffffffffffffffc, 1 9ffffffffffffffb, 1 bffffffffffffffa, 1 dffffffffffffff9, 1 fffffffffffffff8, 1 1ffffffffffffff7, -1 3ffffffffffffff6, 1 5ffffffffffffff5, 1 7ffffffffffffff4, 1 9ffffffffffffff3, 1 bffffffffffffff2, 1 dffffffffffffff1, 1 fffffffffffffff0, 1 1fffffffffffffef, -1 3fffffffffffffee, 1 5fffffffffffffed, 1 7fffffffffffffec, 1 9fffffffffffffeb, 1 bfffffffffffffea, 1 dfffffffffffffe9, 1 ffffffffffffffe8, 1 1fffffffffffffe7, -1 3fffffffffffffe6, 1 5fffffffffffffe5, 1 7fffffffffffffe4, 1 9fffffffffffffe3, 1 bfffffffffffffe2, 1 With the following patch that formats the values as a binary stream --- yann.c 2016-03-28 21:52:34.278518801 +0200 +++ yann.c 2016-03-28 22:00:27.815893604 +0200 @@ -8,15 +8,11 @@ return (i << 61) - i; } -int main(int argc, char ** argv) { - unsigned long i = 1; - unsigned long cur; - unsigned long prev = 0; - const long times = atol(argv[1]) + 1; - for (; i < times; i++) { - cur = process(i); - printf("%lu, %i\n", cur, (cur > prev) ? 1 : -1); - prev = cur; +int main(void) { + unsigned long i; + for (i = 0;; i++) { + unsigned long hash = process(i); + fwrite(&hash, sizeof(hash), 1, stdout); } return 0; } You can fed the output to dieharder -g 200 -a: $ ./a.out | dieharder -g 200 -a #=================================================================== # dieharder version 3.31.1 Copyright 2003 Robert G. Brown #=================================================================== rng_name |rands/second| Seed | stdin_input_raw| 3.71e+07 |1571883871| #=================================================================== test_name |ntup| tsamples |psamples| p-value |Assessment #=================================================================== diehard_birthdays| 0| 100| 100|0.00000000| FAILED diehard_operm5| 0| 1000000| 100|0.00000000| FAILED diehard_rank_32x32| 0| 40000| 100|0.00000000| FAILED diehard_rank_6x8| 0| 100000| 100|0.00000000| FAILED diehard_bitstream| 0| 2097152| 100|0.00000000| FAILED diehard_opso| 0| 2097152| 100|0.00000000| FAILED diehard_oqso| 0| 2097152| 100|0.00000000| FAILED diehard_dna| 0| 2097152| 100|0.00000000| FAILED diehard_count_1s_str| 0| 256000| 100|0.00000000| FAILED diehard_count_1s_byt| 0| 256000| 100|0.00000000| FAILED diehard_parking_lot| 0| 12000| 100|0.00000000| FAILED diehard_2dsphere| 2| 8000| 100|0.00000000| FAILED diehard_3dsphere| 3| 4000| 100|0.00000000| FAILED diehard_squeeze| 0| 100000| 100|0.00000000| FAILED diehard_sums| 0| 100| 100|0.00000000| FAILED diehard_runs| 0| 100000| 100|0.00000000| FAILED diehard_runs| 0| 100000| 100|0.00000000| FAILED diehard_craps| 0| 200000| 100|0.00000000| FAILED diehard_craps| 0| 200000| 100|0.00000000| FAILED ... <CTRL-C> Regards.
Interesting! I wouldn't have expected it to be that bad (and also not /dev/urandom performing so well, just tried). I have gotten a few of these to pass using a different number with more balance distribution of 1s and 0s, so far for 64 bit only. For better randomness (that we do not need to initialize ourselves): For Windows XP and later we may have CryptGenRandom, arc4random_buf on BSD, getrandom on Linux. Any ideas how to fix the link issue with comment #10 ?
What is the sense of testing an entropy generator against dieharder? It is not a proper PRNG and hence it is very likely to fail.
Created attachment 1142489 [details] gettimeofday of sys/time.h replacement for MSVC
attachment 1142489 [details] is a substitude of header sys/time.h providing gettineofday for MSVC .
Thanks. I should have mentioned we have Windows support on Git master by now, already.
(In reply to tc from comment #29) > What is the sense of testing an entropy generator against dieharder? It is > not a proper PRNG and hence it is very likely to fail. Since generate_hash_secret_salt is called for each parser potentially, it is used like an RNG by Expat effectively, to my understanding.
Well, then dieharder fails on the old and the nwe implementation. I actually like the idea of adding the low-level functions gettimeofday and _getpid on MSVC (instead of the used Win32-API functions.
(In reply to tc from comment #34) > Well, then dieharder fails on the old and the nwe implementation. Yes. Still, the old one had more side-effects and used less entropy. > I actually like the idea of adding the low-level functions gettimeofday and > _getpid on MSVC (instead of the used Win32-API functions. I see the beauty in that but I'm not sure if it's the best road. Maybe let's take that off-ticket.
(In reply to Sebastian Pipping from comment #20) > Since keyword "Security" has been added: Does this deserve/have a CVE? Update: CVE-2012-6702 has been assigned to this issue.
Commit: http://pkgs.fedoraproject.org/gitweb/?p=expat.git;a=commitdiff;h=27e66613e8e6f712d405820667e77a4cb08f9946
Package: expat-2.1.1-2.fc25 Build: https://koji.fedoraproject.org/koji/buildinfo?buildID=773512
Package: expat-2.1.1-2.fc24 Build: https://koji.fedoraproject.org/koji/buildinfo?buildID=773516
expat-2.1.1-2.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-7c6e7a9265
Package: expat-2.1.1-2.fc23 Build: https://koji.fedoraproject.org/koji/buildinfo?buildID=773519
expat-2.1.1-2.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-60889583ab
Package: expat-2.1.1-2.fc22 Build: https://koji.fedoraproject.org/koji/buildinfo?buildID=773523
expat-2.1.1-2.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-0fd6ca526a
expat-2.1.1-2.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0fd6ca526a
expat-2.1.1-2.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-60889583ab
expat-2.1.1-2.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-7c6e7a9265
expat-2.1.1-2.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
expat-2.1.1-2.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
expat-2.1.1-2.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.