Bug 132108 - non-i386 canna LE causes htt_server SIGABRT, heavy CPU usage
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: im-sdk
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Akira TAGOH
QA Contact:
URL:
Whiteboard:
Duplicates: 133765
Depends On:
Blocks: FC3Target IIIMF
Reported: 2004-09-08 21:19 UTC by Zack Cerza
Modified: 2007-11-30 22:10 UTC (History)

Fixed In Version: 12.0.1-7.svn1891
Clone Of:
Environment:
Last Closed: 2004-10-04 17:30:20 UTC
Type: ---
Embargoed:


Attachments
snipped 'top' output (557 bytes, text/plain)
2004-09-08 21:21 UTC, Zack Cerza

Description Zack Cerza 2004-09-08 21:19:12 UTC
Description of problem:
htt uses an average of 35% of my CPU cycles on a dual 1.8GHz Opteron,
when the CPU should be almost entirely idle. 

I've been told that im-sdk's purpose is related to aiding the input of
Indic (and possibly other non-latin) character sets, but to my
knowledge I'm not using it.

Version-Release number of selected component (if applicable):
iiimf-server-12.0.1-3.svn1891

How reproducible:
Always

Comment 1 Zack Cerza 2004-09-08 21:21:31 UTC
Created attachment 103611
snipped 'top' output

Pay close attention to the uptime, and htt's 'TIME+' column.

Comment 2 Akira TAGOH 2004-09-09 05:42:52 UTC
Did you see any messages related to htt in /var/log/messages?

Comment 3 Zack Cerza 2004-09-09 15:06:40 UTC
This is all I find: 

[root log]# date
Thu Sep  9 11:07:47 EDT 2004
[root log]# egrep \(iii\|htt[.\ ]\) messages
Sep  7 12:46:54 tallest iiim: htt shutdown succeeded
Sep  7 12:46:54 tallest su(pam_unix)[32593]: session opened for user
htt by (uid=0)tallest
Sep  7 12:46:55 tallest iiim: htt startup succeeded
Sep  7 17:28:39 tallest su(pam_unix)[2946]: session opened for user
htt by (uid=0)
Sep  7 17:28:39 tallest iiim: htt startup succeeded


Comment 4 Warren Togami 2004-09-11 11:39:00 UTC
This does seem to be a 64bit specific problem.  htt_server fails to
start when loading the canna LE.  Here is "htt_server -d" output and
backtrace when it reaches that failure.

LE(CannaLE) is loading.
    Path=/usr/lib64/im/leif/
    version=1.2
    locale=
    need_thread_lock=true
    langs=ja,
object for CannaLE
    object_type      = 131
    object id        = 32770
    object size      = 0
    rev. domain name = com.OpenI18N.leif
    path             = ./locale/ja/CannaLE/aux.so
    scope            = CannaLE
    signature        =
    basepath         =
    encoding         =
Internal error "Invalid Object Type.": IMBasicObject.cpp (255)

Program received signal SIGABRT, Aborted.
[Switching to Thread 182895402720 (LWP 6268)]
0x0000003acd12dda1 in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000003acd12dda1 in raise () from /lib64/tls/libc.so.6
#1  0x0000003acd12f5be in abort () from /lib64/tls/libc.so.6
#2  0x0000000000443ad6 in convert_od_type (ot=1920169263) at
IMBasicObject.cpp:255
#3  0x0000000000444005 in IMObjectWithDesc (this=0x58ee40,
desc=@0x583428) at IMBasicObject.cpp:263
#4  0x000000000041f6d9 in LEBase::add_imobjectdesc (this=0x582c80,
pol=0x583428) at LE.cpp:77
#5  0x000000000041fe62 in LEBase::loadif (this=0x582c80) at LE.cpp:145
#6  0x00000000004208ab in LEBase (this=0x582c80, x_dirname=@0x5826f8,
x_filename=@0x5826f0) at LE.cpp:280
#7  0x0000000000418936 in LEMgr::listup_LEs (this=0x5836d0) at
LEMgr.cpp:20
#8  0x000000000041a3f5 in LEMgr (this=0x5836d0, x_lepath=0x5804f8
"/usr/lib64/im/leif", xml=@0x5825d0) at LEMgr.cpp:330
#9  0x0000000000405e76 in IMSvr::config_le (this=0x7fbffff720,
lepath=0x5804f8 "/usr/lib64/im/leif", xml=@0x5825d0)
    at IMSvr.cpp:28
#10 0x000000000040951b in IMSvrCfg::config_le (this=0x7fbffff820,
pimsvr=0x7fbffff720,
    lepath=0x5804f8 "/usr/lib64/im/leif", xml=@0x5825d0) at
IMSvrCfg.hh:115
#11 0x000000000040874e in IMSvrArg::configure (this=0x7fbffff820,
pimsvr=0x7fbffff720) at IMSvrArg.cpp:189
#12 0x00000000004060ec in IMSvr::start (this=0x7fbffff720) at IMSvr.cpp:78
#13 0x0000000000405a99 in main (argc=2, argv=0x7fbffff998) at main.cpp:44
#14 0x0000003acd11befa in __libc_start_main () from /lib64/tls/libc.so.6
#15 0x00000000004058ba in _start ()
#16 0x0000007fbffff988 in ?? ()
#17 0x000000000000001c in ?? ()
Previous frame inner to this frame (corrupt stack?)


The excessive CPU usage that Zack describes occurs because htt loops
infinitely, restarting the crashing child as in the strace output
below.  htt should GIVE UP after a set number of tries and log the
failure rather than loop.

--- SIGUSR1 (User defined signal 1) @ 0 (0) ---
rt_sigaction(SIGUSR1, {SIG_DFL}, {SIG_IGN}, 8) = 0
clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x2a9556b810) = 8420
wait4(8420, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGABRT}], 0, NULL) = 8420
--- SIGCHLD (Child exited) @ 0 (0) ---
rt_sigaction(SIGUSR1, {SIG_IGN}, {SIG_DFL}, 8) = 0
kill(0, SIGUSR1)   
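
A minimal sketch of the "give up after a set number of tries" behaviour
suggested above. It is not htt's actual code and does not reproduce
htt's SIGUSR1 handling; supervise(), crashing_child(), healthy_child()
and the retry limit are all hypothetical names chosen for illustration.
The parent forks a child, waits for it, and stops retrying once the
limit is reached instead of looping forever.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child entry point; htt's real child runs the server. */
typedef void (*child_fn)(void);

/* Stand-in for a child that dies the way htt_server did (SIGABRT). */
void crashing_child(void) { abort(); }

/* Stand-in for a child that starts and exits cleanly. */
void healthy_child(void) { }

/*
 * Run child() in a forked process, retrying at most max_tries times.
 * Returns 0 as soon as one child exits with status 0, or -1 once the
 * retry budget is exhausted (after logging each failure to stderr).
 */
int supervise(child_fn child, int max_tries)
{
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return -1;
        }
        if (pid == 0) {
            child();
            _exit(0);
        }

        int status;
        if (waitpid(pid, &status, 0) < 0) {
            perror("waitpid");
            return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return 0;

        if (WIFSIGNALED(status))
            fprintf(stderr, "attempt %d of %d: child killed by signal %d\n",
                    attempt, max_tries, WTERMSIG(status));
        else
            fprintf(stderr, "attempt %d of %d: child failed\n",
                    attempt, max_tries);
    }

    fprintf(stderr, "giving up after %d failed attempts\n", max_tries);
    return -1;
}
```

With these stand-ins, supervise(crashing_child, 3) logs three failed
attempts and returns -1, while supervise(healthy_child, 3) returns 0
after the first child exits.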


I spent a few hours investigating possible causes for this 64bit
problem and found nothing obvious.  I compared canna LE to
/leif/sun_le_asia/th_TH/leif/le.c to see what canna is doing
differently, because iiimf-le-sun-thai similarly adds an object.
The only real difference was in memory allocation for the IMObject.
If I am understanding the code correctly, it should only need "1"
sizeof allocated, but Sun arbitrarily allocates "2" here.

objects = (IMObjectDescriptorStruct *) calloc(2,
sizeof(IMObjectDescriptorStruct));

The below patch prevents the SIGABRT and infinite looping.  This I
think is NOT A FIX, but rather an ugly workaround copied from Sun's
Thai LE.  It may be worth it to find the real cause of this problem. 
Note that I have not done any runtime testing of actually using IIIMF
on x86_64 yet.  I suspect there is other 64bit breakage elsewhere.

--- im-sdk-r12_0_1-svn1891/leif/canna/CannaLE.c.orig    2004-09-11 00:56:27.859055785 -1000
+++ im-sdk-r12_0_1-svn1891/leif/canna/CannaLE.c 2004-09-11 00:56:40.633274812 -1000
@@ -300,7 +300,7 @@
 init_objects()
 {
     IMObjectDescriptorStruct *l;
-    objects = (IMObjectDescriptorStruct *) calloc(1, sizeof (IMObjectDescriptorStruct));
+    objects = (IMObjectDescriptorStruct *) calloc(2, sizeof (IMObjectDescriptorStruct));

     l = objects;


Comment 5 Warren Togami 2004-09-12 03:00:45 UTC
==17527==  Address 0x1B9A0810 is 0 bytes after a block of size 56 alloc'd
==17527==    at 0x1B90340D: calloc (vg_replace_malloc.c:176)
==17527==    by 0x1B933B77: init_objects (CannaLE.c:303)
==17527==    by 0x1B93657B: if_GetIfInfo (CannaLE.c:1746)
==17527==    by 0x8081F67: (within /usr/sbin/htt_server)

valgrind on i386 reports this problem at the original line of code.
So this is somehow related to the memory allocation problem that
causes trouble on x86_64.

CannaLE.c (line 303):
objects = (IMObjectDescriptorStruct *) calloc(1, sizeof
(IMObjectDescriptorStruct));

Comment 6 Warren Togami 2004-09-12 05:23:06 UTC
MALLOC_CHECK_=3 htt_server -d

If you run this in rawhide, it dies in the same place on i386.

Comment 7 Colin Charles 2004-09-12 11:02:53 UTC
Please try this on 32-bit systems too. My fresh rawhide install (9/11)
on ppc meant that htt_server ate up lots of cpu time. /etc/init.d/iiim
stop solves this for me (because I don't need to use it)

Comment 8 Warren Togami 2004-09-12 12:06:20 UTC
Seems this happens on non-i386 archs, also exposed by MALLOC_CHECK_=3
on i386.  Similar finding in Bug #132396

Comment 9 Warren Togami 2004-09-12 12:38:26 UTC
o_O...

ppc64 kernel with ppc userspace iiimf-* behaves identically to i386,
contrary to Colin's report in comment #7.  Very odd.


Comment 10 Akira TAGOH 2004-09-13 06:22:40 UTC
OK, Warren, your patch is correct. The IMObjectDescriptorStruct array
must be terminated by a NULL entry, and calloc(2, ...) provides it.
I'll apply your patch in the next build. Thanks.
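
A minimal sketch of why the extra element matters, assuming the consumer
walks the array until it reaches a zeroed entry. ObjectDesc is a
hypothetical two-field stand-in for IMObjectDescriptorStruct, and the
choice of leid as the sentinel field is an assumption, not taken from
the im-sdk source. With calloc(1, ...) there is no terminator, so the
walk reads past the end of the block, which is consistent with the
valgrind report of an access "0 bytes after a block of size 56".

```c
#include <stdlib.h>

/* Hypothetical stand-in for IMObjectDescriptorStruct. */
typedef struct {
    const char *leid;   /* assumed sentinel: NULL marks the last entry */
    int         type;
} ObjectDesc;

/*
 * Allocate room for n real entries plus one zeroed terminator.
 * calloc zero-fills the block; on the platforms discussed here an
 * all-zero pointer reads as NULL, so the final entry ends the list.
 */
ObjectDesc *alloc_descs(size_t n)
{
    return calloc(n + 1, sizeof(ObjectDesc));
}

/* Count entries by walking until the terminator, as a consumer would. */
size_t count_descs(const ObjectDesc *list)
{
    size_t n = 0;
    while (list[n].leid != NULL)
        n++;
    return n;
}
```

For one descriptor, alloc_descs(1) allocates two elements, matching the
calloc(2, ...) in the patch; the second element stays zeroed and stops
the walk in count_descs().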

Comment 11 Akira TAGOH 2004-09-28 04:34:19 UTC
*** Bug 133765 has been marked as a duplicate of this bug. ***

Comment 12 Akira TAGOH 2004-09-28 04:40:43 UTC
This problem should be fixed in 12.0.1-7.svn1891. However, there are
other bugs that must be fixed to get it working on x86-64. Please
check Bug#132940, Bug#132941, Bug#132950.

Comment 13 Akira TAGOH 2004-09-30 16:13:19 UTC
In 12.0.1-10.svn1943, htt_server itself should work even on 64bit
architectures. Please let me know if you still find a problem.

Comment 14 Zack Cerza 2004-10-04 17:30:20 UTC
It looks fixed, thanks :)

