Description of problem: Since the update to the new - 23 version of glibc (and the newer versions of Evolution from the 23rd Aug), Evolution fails to work. It starts but quickly begins to claim 90% of the processor time. The software has not crashed, it's just sitting there. Version-Release number of selected component (if applicable): $ rpm -q evolution glibc evolution-2.7.92-4.fc6 glibc-2.4.90-23 glibc-2.4.90-23 How reproducible: Always Steps to Reproduce: 1. Start evolution from either the command line or from the desktop icon 2. 3. Actual results: Evolution takes upto 90% of the processing time Running a debugger through gives the following 1) Starting evolution hangs in an endless loop of mmap(NULL, 68719480832, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, - 1, 0) = -1 ENOMEM (Cannot allocate memory) mmap(NULL, 68719480832, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, - 1, 0) = -1 ENOMEM (Cannot allocate memory) mmap(NULL, 68719480832, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, - 1, 0) = -1 ENOMEM (Cannot allocate memory) This comes just after it tries to read any ~/.evolution/mail/local/ *.cmeta-file Deleting all the .cmeta-files will at least let evolution start. 2) After starting up evolution (deleting the .cmeta-files) it is still unusable. It is using 90% cpu, but now there is no apparent reason for the hang. After abount 30 minutes of waiting for any folder to show up, I just had to kill it to be able to use the machine at all. An strace of the process now onlys shows poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN, revents=POLLIN}, {fd=9, events=POLLIN|POLLPRI}, {fd=11, events=POLLIN| POLLPRI}, {fd=27, events=POLLIN}, {fd=19, events=POLLIN}, {fd=23, events=POLLIN}], 7, 12952) = 1 ioctl(4, FIONREAD, [32]) = 0 read(4, "\241 U7z\0\300\2\17\1\0\0\26\1\0\0\254\r\0\0z\0\300\2\0"..., 32) = 32 write(4, "\31\0\v\0M\0\0\0\0\0\30\0! f\0M\0\0\0\17\1\0\0\26\1\0\0"..., 44) = 44 ioctl(4, FIONREAD, [0]) = 0 ioctl(4, FIONREAD, [0]) = 0 poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN, revents=POLLIN}, {fd=9, events=POLLIN|POLLPRI}, {fd=11, events=POLLIN| POLLPRI}, {fd=27, events=POLLIN}, {fd=19, events=POLLIN}, {fd=23, events=POLLIN}], 7, 7924) = 1 ioctl(4, FIONREAD, [32]) = 0 read(4, "\241 V7z\0\300\2\17\1\0\0\26\1\0\0\255\r\0\0z\0\300\2\0"..., 32) = 32 write(4, "\31\0\v\0M\0\0\0\0\0\30\0! f\0M\0\0\0\17\1\0\0\26\1\0\0"..., 44) = 44 ioctl(4, FIONREAD, [0]) = 0 ioctl(4, FIONREAD, [0]) = 0 With a new poll about every 2 or 3 seconds. These may be unrelated. Expected results: Evolution should give it's usual high standard open source goodness as a usenet and email client. Additional info: The problem is only seen on x86_64
There seem to be several bugs. One is that evolution calls malloc (0x1000000030). camel_object_get_type has (%rax is 1): lea -20(%rax), %edi ! Note, 32-bit result shl $0x4, %rdi add $0x160, %rdi and passes that as argument to g_try_malloc. This corresponds to: 426 guint32 i, count, version; ... 459 /* we batch up the properties and set them in one go */ 460 if (!(argv = g_try_malloc (sizeof (*argv) + (count - CAMEL_ARGV_MAX) * sizeof (argv->argv[0])))) 461 return -1; which is clearly 64-bit unclean code. Either you want to allocate at least sizeof (*argv), then it should be (MIN (count, CAMEL_ARGV_MAX) - CAMEL_ARGV_MAX) * sizeof (argv->argv[0]) or something similar, or it wants to allocate sometimes less than sizeof (*argv), in that case it could use something like offsetof (CamelArgV, argv) + count * CAMEL_ARGV_MAX. Or use signed count. Another one is glibc behavior when malloc is called with this and there isn't enough memory, will look at that now.
Seems to be happy with the glibc update on 26th Aug.
Okay, it's not crashed and burned. Closing the bug
This shouldn't be closed until evolution side is fixed. Perhaps it doesn't crash or hang if it the malloc fails, but the 64-bit unclean code is still causing an allocation which will very likely not succeed ever.
Matt, did you investigate the question in comment #1 ? Do we want to allocate at most (but sometimes less than) CAMEL_ARGV_MAX, or what is the goal here ? It appears to me that we should maybe change the array in CamelArgV to be flexible, and do away with CAMEL_ARGV_MAX altogether ? Then we would simply do g_try_malloc (sizeof(CamelArgV) + count * sizeof(CamelArg))
Wow, what a mess. Why couldn't they just use GValueArray for this stuff? I don't see how CAMEL_ARGV_MAX is actually limiting anything as far as memory allocation goes. It looks like the following expression is intended to come out negative: (count - CAMEL_ARGV_MAX) * sizeof (argv->argv[0]) But, for example, if count == (CAMEL_ARGV_MAX + 1) it works out to: (1) * sizeof (argv->argv[0]) which is positive. So the allocation limit is bypassed. I spotted two other places in camel-object.c that use this expression.
I forgot to clarify that the intent of the expression seems to be to allocate just enough memory to hold a given number of CamelArgs. I like Matthias' idea of using a flexible array, but I think I'll save that for post-FC6 (eventually I'd like to rewrite the entire Camel vararg mechanism). For now I'll take a more conservative approach that actually enforces the limit: count = MIN (count, CAMEL_ARGV_MAX); argv = g_try_malloc (sizeof (CamelArgV) - (CAMEL_ARGV_MAX - count) * sizeof (CamelArg));
Created attachment 137194 [details] Patch to clean up 64-bit unclean code
Fixed in evolution-data-server-1.8.0-11.fc6
Retest with evolution-2.8.0-6.fc6, evolution-data-server-1.8.0-11.fc6 and glibc-2.4.90-36 by the following steps: 1. Launch evolution via command or Email icon on panel 2. Normally use it, such as setup account,new message, send/receive mails, etc. 3. Observe CPU history and other system resouces. Test result: There was no abnormity showed as evolution can be used normally and system resouces load was low.