Python in FC4 is compiled with UCS4 Unicode strings instead of the upstream default of UCS2. This changes the exported ABI and makes Fedora's libpython incompatible with other distributions. That's a serious problem: it means nobody who wishes to distribute binaries on Linux can realistically embed Python or use Python C modules in their app!
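For anyone trying to reproduce this, one way to tell which variant a given interpreter was built with is to look at `sys.maxunicode`. The snippet below is only a sketch: the helper name `unicode_build_width` is made up for illustration, and it assumes a narrow (UCS2) build reports 0xFFFF while a wide (UCS4) build reports 0x10FFFF.

```python
import sys


def unicode_build_width(maxunicode=sys.maxunicode):
    """Return 'UCS2' for a narrow build and 'UCS4' for a wide build.

    Hypothetical helper: it only interprets sys.maxunicode, which is
    0xFFFF on narrow builds and 0x10FFFF on wide builds.
    """
    return "UCS2" if maxunicode == 0xFFFF else "UCS4"


if __name__ == "__main__":
    # Note: interpreters from Python 3.3 on always report 0x10FFFF,
    # so this distinction only matters for older builds.
    print(unicode_build_width())
```

A C extension compiled against one variant will not load in an interpreter built with the other, so running this check on the target system before installing a binary module can save some confusion.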
Red Hat has been shipping Python with UCS4 support since Red Hat Linux 9, if my memory serves me right. Doing so gives us access to large character sets, and it also uncovered a number of bugs that have since been fixed upstream. I don't think we are in a position to go back to UCS2 at this point, since we perceive the future to be UCS4. The way people solve the ABI incompatibility problem is by shipping packages for specific distributions (which is probably the right thing to do in general).
If the future is UCS4, why isn't the upstream Python package providing that by default? Have these patches been pushed there? Distribution-specific packages for all third-party software embedding Python are a no-go.
I believe the answer is the amount of memory you are willing to use. Python stores a Unicode string as an array of fixed-width code units. This has the advantage of simplifying a lot of the string operations, but it doubles or quadruples the memory requirements if what you use is mostly ASCII. The alternative would have been to use UTF8 for the internal representation of Unicode strings, which saves memory when you use only ASCII (or mostly ASCII with very few non-ASCII characters), but then string operations would be slower, since computing the length of a string would require examining each byte to see whether it belongs to a single-byte or a multi-byte character. UCS2 uses 2 bytes for each character, UCS4 uses 4 bytes. Unicode defines more than 65535 characters, so a 2-byte representation is not enough (although, if I understand correctly, characters outside of UCS2 are not that frequently used). Moving from UCS2 to UCS4 (or the other way around) is a rather major undertaking; we decided to make that move some time ago, and we are trying to preserve the ABI within Red Hat's products and avoid the exact type of problem you describe _within_ Fedora/RHEL. There is no one-size-fits-all, unfortunately: people who don't care about characters outside of UCS2 would probably want the extra memory back; OTOH, some people want the ability to represent those characters. That being said: which distros are still shipping UCS2? As far as I know, SuSE has shipped UCS4 since 9, and Debian ships UCS4. Mandriva 2006 seems to still be UCS2.
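To put rough numbers on the memory trade-off described above, here is a small sketch. It is not how the interpreter stores strings internally; it only approximates the payload size of each representation by encoding to UTF-16, UTF-32 and UTF-8, and the helper names `storage_bytes` and `utf8_length` are made up for illustration. It also shows why a length computation over UTF8 has to scan the whole buffer.

```python
def storage_bytes(text):
    """Approximate payload size of `text` under each representation.

    UTF-16 stands in for a UCS2 build (characters outside the 16-bit
    range take two code units there), UTF-32 for a UCS4 build.
    """
    return {
        "UCS2": len(text.encode("utf-16-le")),
        "UCS4": len(text.encode("utf-32-le")),
        "UTF8": len(text.encode("utf-8")),
    }


def utf8_length(data):
    """Count characters in UTF-8 bytes by scanning every byte.

    Continuation bytes have the bit pattern 10xxxxxx; every other byte
    starts a new character, so the scan is linear in the buffer size.
    """
    return sum(1 for b in bytearray(data) if (b & 0xC0) != 0x80)


if __name__ == "__main__":
    ascii_text = u"hello"
    # Includes U+00E9 and U+1D11E, a character outside the 16-bit range.
    mixed_text = u"h\u00e9llo \U0001D11E"

    print(storage_bytes(ascii_text))
    print(storage_bytes(mixed_text))
    print(utf8_length(mixed_text.encode("utf-8")))
```

For the pure-ASCII string, the 2-byte and 4-byte forms take two and four times the space of UTF8, which is the cost being weighed against simpler, constant-time indexing and length.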