203915 – 64-bit unclean code in evolution

Bug 203915 - 64-bit unclean code in evolution

Summary: 64-bit unclean code in evolution

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	evolution
Sub Component:
Version:	6
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Assignee:	Matthew Barnes
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	FC6Blocker
TreeView+	depends on / blocked

Reported:	2006-08-24 13:58 UTC by Paul F. Johnson
Modified:	2007-11-30 22:11 UTC (History)
CC List:	4 users (show)
Fixed In Version:	evolution-data-server-1.8.0-11.fc6
Clone Of:
Environment:
Last Closed:	2006-09-27 14:37:51 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
Patch to clean up 64-bit unclean code (1.19 KB, patch) 2006-09-27 02:17 UTC, Matthew Barnes	no flags	Details \| Diff
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
GNOME Bugzilla	331099	0	None	None	None	Never

Description Paul F. Johnson 2006-08-24 13:58:37 UTC

Description of problem:
Since the update to the new -
23 version of glibc (and the 
newer versions of Evolution 
from the 23rd Aug), Evolution 
fails to work. It starts but 
quickly begins to claim 90% 
of the processor time.

The software has not crashed, 
it's just sitting there.

Version-Release number of 
selected component (if 
applicable):
$ rpm -q evolution glibc
evolution-2.7.92-4.fc6
glibc-2.4.90-23
glibc-2.4.90-23

How reproducible:
Always

Steps to Reproduce:
1. Start evolution from 
either the command line or 
from the desktop icon
2.
3.
  
Actual results:
Evolution takes upto 90% of 
the processing time

Running a debugger through 
gives the following

1) Starting evolution hangs 
in an endless loop of

mmap(NULL, 68719480832, 
PROT_READ|PROT_WRITE,  
MAP_PRIVATE|MAP_ANONYMOUS, -
1, 0) = -1
ENOMEM (Cannot allocate 
memory)
mmap(NULL, 68719480832, 
PROT_READ|PROT_WRITE,  
MAP_PRIVATE|MAP_ANONYMOUS, -
1, 0) = -1
ENOMEM (Cannot allocate 
memory)
mmap(NULL, 68719480832, 
PROT_READ|PROT_WRITE,  
MAP_PRIVATE|MAP_ANONYMOUS, -
1, 0) = -1
ENOMEM (Cannot allocate 
memory)

This comes just after it 
tries to read any  
~/.evolution/mail/local/
*.cmeta-file

Deleting all the .cmeta-files 
will at least let evolution 
start.


2) After starting up 
evolution (deleting the
 .cmeta-files) it is still  
unusable.
It is using 90% cpu, but now 
there is no apparent reason 
for the hang.
After abount 30 minutes of 
waiting for any folder to 
show up, I just  
had to kill it to be
able to use the machine at 
all.

An strace of the process now 
onlys shows

poll([{fd=5, events=POLLIN}, 
{fd=4, events=POLLIN, 
revents=POLLIN}, {fd=9,
events=POLLIN|POLLPRI}, 
{fd=11, events=POLLIN|
POLLPRI}, {fd=27,  
events=POLLIN}, {fd=19,
events=POLLIN}, {fd=23, 
events=POLLIN}], 7, 12952) = 1
ioctl(4, FIONREAD, 
[32])                = 0
read(4, "\241 
U7z\0\300\2\17\1\0\0\26\1\0\0\254\r\0\0z\0\300\2\0"...,
 32) = 32
write(4, 
"\31\0\v\0M\0\0\0\0\0\30\0! 
f\0M\0\0\0\17\1\0\0\26\1\0\0"...,
  
44) = 44
ioctl(4, FIONREAD, 
[0])                 = 0
ioctl(4, FIONREAD, 
[0])                 = 0
poll([{fd=5, events=POLLIN}, 
{fd=4, events=POLLIN, 
revents=POLLIN}, {fd=9,
events=POLLIN|POLLPRI}, 
{fd=11, events=POLLIN|
POLLPRI}, {fd=27,  
events=POLLIN}, {fd=19,
events=POLLIN}, {fd=23, 
events=POLLIN}], 7, 7924) = 1
ioctl(4, FIONREAD, 
[32])                = 0
read(4, "\241 
V7z\0\300\2\17\1\0\0\26\1\0\0\255\r\0\0z\0\300\2\0"...,
 32) = 32
write(4, 
"\31\0\v\0M\0\0\0\0\0\30\0! 
f\0M\0\0\0\17\1\0\0\26\1\0\0"...,
  
44) = 44
ioctl(4, FIONREAD, 
[0])                 = 0
ioctl(4, FIONREAD, 
[0])                 = 0

With a new poll about every 2 
or 3 seconds.

These may be unrelated.

Expected results:
Evolution should give it's 
usual high standard open 
source goodness as a usenet 
and email client.

Additional info:
The problem is only seen on 
x86_64

Comment 1 Jakub Jelinek 2006-08-24 15:59:52 UTC

There seem to be several bugs.
One is that evolution calls malloc (0x1000000030).
camel_object_get_type has (%rax is 1):
lea -20(%rax), %edi    ! Note, 32-bit result
shl $0x4, %rdi
add $0x160, %rdi
and passes that as argument to g_try_malloc.
This corresponds to:
426             guint32 i, count, version;
...
459                     /* we batch up the properties and set them in one go */
460                     if (!(argv = g_try_malloc (sizeof (*argv) + (count -
CAMEL_ARGV_MAX) * sizeof (argv->argv[0]))))
461                             return -1;

which is clearly 64-bit unclean code.  Either you want to allocate at least
sizeof (*argv), then it should be (MIN (count, CAMEL_ARGV_MAX) - CAMEL_ARGV_MAX)
* sizeof (argv->argv[0]) or something similar, or it wants to allocate sometimes
less than sizeof (*argv), in that case it could use something like
offsetof (CamelArgV, argv) + count * CAMEL_ARGV_MAX.
Or use signed count.

Another one is glibc behavior when malloc is called with this and there isn't
enough memory, will look at that now.

Comment 2 Paul F. Johnson 2006-08-26 22:09:53 UTC

Seems to be happy with the glibc update on 26th Aug.

Comment 3 Paul F. Johnson 2006-08-31 01:10:26 UTC

Okay, it's not crashed and burned. Closing the bug

Comment 4 Jakub Jelinek 2006-09-12 14:13:24 UTC

This shouldn't be closed until evolution side is fixed.  Perhaps it doesn't
crash or hang if it the malloc fails, but the 64-bit unclean code is still
causing an allocation which will very likely not succeed ever.

Comment 5 Matthias Clasen 2006-09-26 04:21:55 UTC

Matt, did you investigate the question in comment #1 ?
Do we want to allocate at most (but sometimes less than) CAMEL_ARGV_MAX,
or what is the goal here ?

It appears to me that we should maybe change the array in CamelArgV to be
flexible, and do away with CAMEL_ARGV_MAX altogether ? Then we would simply do

g_try_malloc (sizeof(CamelArgV) + count * sizeof(CamelArg))

Comment 6 Matthew Barnes 2006-09-26 19:45:12 UTC

Wow, what a mess.  Why couldn't they just use GValueArray for this stuff?

I don't see how CAMEL_ARGV_MAX is actually limiting anything as far as memory
allocation goes.  It looks like the following expression is intended to come out
negative:

   (count - CAMEL_ARGV_MAX) * sizeof (argv->argv[0])

But, for example, if count == (CAMEL_ARGV_MAX + 1) it works out to:

   (1) * sizeof (argv->argv[0])

which is positive.  So the allocation limit is bypassed.

I spotted two other places in camel-object.c that use this expression.

Comment 7 Matthew Barnes 2006-09-26 20:09:20 UTC

I forgot to clarify that the intent of the expression seems to be to allocate
just enough memory to hold a given number of CamelArgs.

I like Matthias' idea of using a flexible array, but I think I'll save that for
post-FC6 (eventually I'd like to rewrite the entire Camel vararg mechanism).

For now I'll take a more conservative approach that actually enforces the limit:

   count = MIN (count, CAMEL_ARGV_MAX);
   argv = g_try_malloc (sizeof (CamelArgV) -
           (CAMEL_ARGV_MAX - count) * sizeof (CamelArg));

Comment 8 Matthew Barnes 2006-09-27 02:17:40 UTC

Created attachment 137194 [details]
Patch to clean up 64-bit unclean code

Comment 9 Matthew Barnes 2006-09-27 14:37:51 UTC

Fixed in evolution-data-server-1.8.0-11.fc6

Comment 10 Nicole Dai 2006-10-13 06:09:18 UTC

Retest with evolution-2.8.0-6.fc6, evolution-data-server-1.8.0-11.fc6 and
glibc-2.4.90-36 by the following steps:
1. Launch evolution via command or Email icon on panel
2. Normally use it, such as setup account,new message, send/receive mails, etc.
3. Observe CPU history and other system resouces.

Test result: There was no abnormity showed as evolution can be used normally and
system resouces load was low.

Note You need to log in before you can comment on or make changes to this bug.