Bug 203915

Summary: 64-bit unclean code in evolution
Product: [Fedora] Fedora Reporter: Paul F. Johnson <paul>
Component: evolutionAssignee: Matthew Barnes <mbarnes>
Status: CLOSED RAWHIDE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 6CC: desktop-bugs, gnomeuser, jakub, redhat
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: evolution-data-server-1.8.0-11.fc6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-09-27 14:37:51 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 150224    
Attachments:
Description Flags
Patch to clean up 64-bit unclean code none

Description Paul F. Johnson 2006-08-24 13:58:37 UTC
Description of problem:
Since the update to the new -
23 version of glibc (and the 
newer versions of Evolution 
from the 23rd Aug), Evolution 
fails to work. It starts but 
quickly begins to claim 90% 
of the processor time.

The software has not crashed, 
it's just sitting there.

Version-Release number of 
selected component (if 
applicable):
$ rpm -q evolution glibc
evolution-2.7.92-4.fc6
glibc-2.4.90-23
glibc-2.4.90-23

How reproducible:
Always

Steps to Reproduce:
1. Start evolution from 
either the command line or 
from the desktop icon
2.
3.
  
Actual results:
Evolution takes upto 90% of 
the processing time

Running a debugger through 
gives the following

1) Starting evolution hangs 
in an endless loop of

mmap(NULL, 68719480832, 
PROT_READ|PROT_WRITE,  
MAP_PRIVATE|MAP_ANONYMOUS, -
1, 0) = -1
ENOMEM (Cannot allocate 
memory)
mmap(NULL, 68719480832, 
PROT_READ|PROT_WRITE,  
MAP_PRIVATE|MAP_ANONYMOUS, -
1, 0) = -1
ENOMEM (Cannot allocate 
memory)
mmap(NULL, 68719480832, 
PROT_READ|PROT_WRITE,  
MAP_PRIVATE|MAP_ANONYMOUS, -
1, 0) = -1
ENOMEM (Cannot allocate 
memory)

This comes just after it 
tries to read any  
~/.evolution/mail/local/
*.cmeta-file

Deleting all the .cmeta-files 
will at least let evolution 
start.


2) After starting up 
evolution (deleting the
 .cmeta-files) it is still  
unusable.
It is using 90% cpu, but now 
there is no apparent reason 
for the hang.
After abount 30 minutes of 
waiting for any folder to 
show up, I just  
had to kill it to be
able to use the machine at 
all.

An strace of the process now 
onlys shows

poll([{fd=5, events=POLLIN}, 
{fd=4, events=POLLIN, 
revents=POLLIN}, {fd=9,
events=POLLIN|POLLPRI}, 
{fd=11, events=POLLIN|
POLLPRI}, {fd=27,  
events=POLLIN}, {fd=19,
events=POLLIN}, {fd=23, 
events=POLLIN}], 7, 12952) = 1
ioctl(4, FIONREAD, 
[32])                = 0
read(4, "\241 
U7z\0\300\2\17\1\0\0\26\1\0\0\254\r\0\0z\0\300\2\0"...,
 32) = 32
write(4, 
"\31\0\v\0M\0\0\0\0\0\30\0! 
f\0M\0\0\0\17\1\0\0\26\1\0\0"...,
  
44) = 44
ioctl(4, FIONREAD, 
[0])                 = 0
ioctl(4, FIONREAD, 
[0])                 = 0
poll([{fd=5, events=POLLIN}, 
{fd=4, events=POLLIN, 
revents=POLLIN}, {fd=9,
events=POLLIN|POLLPRI}, 
{fd=11, events=POLLIN|
POLLPRI}, {fd=27,  
events=POLLIN}, {fd=19,
events=POLLIN}, {fd=23, 
events=POLLIN}], 7, 7924) = 1
ioctl(4, FIONREAD, 
[32])                = 0
read(4, "\241 
V7z\0\300\2\17\1\0\0\26\1\0\0\255\r\0\0z\0\300\2\0"...,
 32) = 32
write(4, 
"\31\0\v\0M\0\0\0\0\0\30\0! 
f\0M\0\0\0\17\1\0\0\26\1\0\0"...,
  
44) = 44
ioctl(4, FIONREAD, 
[0])                 = 0
ioctl(4, FIONREAD, 
[0])                 = 0

With a new poll about every 2 
or 3 seconds.

These may be unrelated.

Expected results:
Evolution should give it's 
usual high standard open 
source goodness as a usenet 
and email client.

Additional info:
The problem is only seen on 
x86_64

Comment 1 Jakub Jelinek 2006-08-24 15:59:52 UTC
There seem to be several bugs.
One is that evolution calls malloc (0x1000000030).
camel_object_get_type has (%rax is 1):
lea -20(%rax), %edi    ! Note, 32-bit result
shl $0x4, %rdi
add $0x160, %rdi
and passes that as argument to g_try_malloc.
This corresponds to:
426             guint32 i, count, version;
...
459                     /* we batch up the properties and set them in one go */
460                     if (!(argv = g_try_malloc (sizeof (*argv) + (count -
CAMEL_ARGV_MAX) * sizeof (argv->argv[0]))))
461                             return -1;

which is clearly 64-bit unclean code.  Either you want to allocate at least
sizeof (*argv), then it should be (MIN (count, CAMEL_ARGV_MAX) - CAMEL_ARGV_MAX)
* sizeof (argv->argv[0]) or something similar, or it wants to allocate sometimes
less than sizeof (*argv), in that case it could use something like
offsetof (CamelArgV, argv) + count * CAMEL_ARGV_MAX.
Or use signed count.

Another one is glibc behavior when malloc is called with this and there isn't
enough memory, will look at that now.

Comment 2 Paul F. Johnson 2006-08-26 22:09:53 UTC
Seems to be happy with the glibc update on 26th Aug.

Comment 3 Paul F. Johnson 2006-08-31 01:10:26 UTC
Okay, it's not crashed and burned. Closing the bug

Comment 4 Jakub Jelinek 2006-09-12 14:13:24 UTC
This shouldn't be closed until evolution side is fixed.  Perhaps it doesn't
crash or hang if it the malloc fails, but the 64-bit unclean code is still
causing an allocation which will very likely not succeed ever.

Comment 5 Matthias Clasen 2006-09-26 04:21:55 UTC
Matt, did you investigate the question in comment #1 ?
Do we want to allocate at most (but sometimes less than) CAMEL_ARGV_MAX,
or what is the goal here ?

It appears to me that we should maybe change the array in CamelArgV to be
flexible, and do away with CAMEL_ARGV_MAX altogether ? Then we would simply do

g_try_malloc (sizeof(CamelArgV) + count * sizeof(CamelArg))

Comment 6 Matthew Barnes 2006-09-26 19:45:12 UTC
Wow, what a mess.  Why couldn't they just use GValueArray for this stuff?

I don't see how CAMEL_ARGV_MAX is actually limiting anything as far as memory
allocation goes.  It looks like the following expression is intended to come out
negative:

   (count - CAMEL_ARGV_MAX) * sizeof (argv->argv[0])

But, for example, if count == (CAMEL_ARGV_MAX + 1) it works out to:

   (1) * sizeof (argv->argv[0])

which is positive.  So the allocation limit is bypassed.

I spotted two other places in camel-object.c that use this expression.

Comment 7 Matthew Barnes 2006-09-26 20:09:20 UTC
I forgot to clarify that the intent of the expression seems to be to allocate
just enough memory to hold a given number of CamelArgs.

I like Matthias' idea of using a flexible array, but I think I'll save that for
post-FC6 (eventually I'd like to rewrite the entire Camel vararg mechanism).

For now I'll take a more conservative approach that actually enforces the limit:

   count = MIN (count, CAMEL_ARGV_MAX);
   argv = g_try_malloc (sizeof (CamelArgV) -
           (CAMEL_ARGV_MAX - count) * sizeof (CamelArg));

Comment 8 Matthew Barnes 2006-09-27 02:17:40 UTC
Created attachment 137194 [details]
Patch to clean up 64-bit unclean code

Comment 9 Matthew Barnes 2006-09-27 14:37:51 UTC
Fixed in evolution-data-server-1.8.0-11.fc6

Comment 10 Nicole Dai 2006-10-13 06:09:18 UTC
Retest with evolution-2.8.0-6.fc6, evolution-data-server-1.8.0-11.fc6 and
glibc-2.4.90-36 by the following steps:
1. Launch evolution via command or Email icon on panel
2. Normally use it, such as setup account,new message, send/receive mails, etc.
3. Observe CPU history and other system resouces.

Test result: There was no abnormity showed as evolution can be used normally and
system resouces load was low.