Bug 189294 - gnome-pilot hangs due to invalid memory release / corrupted heap.
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: gnome-pilot
Version: 5
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Matthew Barnes
Keywords: Patch, Reopened
Duplicates: 189627
Reported: 2006-04-18 16:46 EDT by Gilboa Davara
Modified: 2007-11-30 17:11 EST

Doc Type: Bug Fix
Last Closed: 2007-10-02 13:34:18 EDT

Attachments
gdb log (17.78 KB, text/plain)
2006-04-18 16:49 EDT, Gilboa Davara
trace log (17.50 KB, text/plain)
2006-04-18 16:50 EDT, Gilboa Davara
Patch to the 'gb-309130-attach-48413-backup-conduit-valgrind-fixes.patch' patch in gnome-pilot-2.0.13-7.fc5.5 (415 bytes, patch)
2006-04-19 06:26 EDT, Matt Davey
0.11.8-12.4.fc5 gdb crash log (13.46 KB, text/plain)
2006-05-07 05:33 EDT, Gilboa Davara
patch to orbit_daemon_glue.c to avoid accessing freed memory (1.39 KB, patch)
2006-05-08 10:05 EDT, Matt Davey
add a brief sleep after a sync, to avoid re-attaching to the ttyUSB device (387 bytes, patch)
2006-08-10 05:51 EDT, Matt Davey
gpilotd trace (10.36 KB, text/plain)
2007-01-03 05:30 EST, Gilboa Davara

Description Gilboa Davara 2006-04-18 16:46:25 EDT
Description of problem:
When I try to sync evolution with my Palm (Tungsten T3), gpilotd hangs and the
Palm device times out.
I'm no GTK guru (and I couldn't find the debuginfo RPMs, so I'm guessing
here), but it seems that gpilotd is getting SIGABRT due to an invalid free
or a corrupted heap inside orbed_notify_connect.
To take udev out of the picture, I created static ttyUSB device nodes
in /etc/dev.

Version-Release number of selected component (if applicable):
gnome-pilot-2.0.13-7.fc5.5.x86_64
gnome-pilot-conduits-2.0.13-3.FC5.3.x86_64
gnome-pilot-devel-2.0.13-7.fc5.5.x86_64
pilot-link-devel-0.11.8-12.2.fc5.x86_64
pilot-link-0.11.8-12.2.fc5.x86_64

How reproducible:
Almost always. (I did manage to sync my palm device once or twice)

Steps to Reproduce:
1. Configure the palm device (USB, 115kbps) from inside evolution.
2. Press the sync button.
  
Actual results:
gpilotd hangs, palm device times out.

Expected results:
Full sync with evolution.

Additional info:
Comment 1 Gilboa Davara 2006-04-18 16:49:05 EDT
Created attachment 127946 [details]
gdb log
Comment 2 Gilboa Davara 2006-04-18 16:50:49 EDT
Created attachment 127947 [details]
trace log
Comment 3 Matt Davey 2006-04-19 06:26:03 EDT
Created attachment 127978 [details]
Patch to the 'gb-309130-attach-48413-backup-conduit-valgrind-fixes.patch' patch in gnome-pilot-2.0.13-7.fc5.5

I can verify this bug.
The problem appears to be in the
'gb-309130-attach-48413-backup-conduit-valgrind-fixes.patch'
patch included in the gnome-pilot RPM.

This patch causes pi_file_close(NULL) to be called, causing the error
reported above.  This patch was developed for pilot-link 0.12.0, in which
pi_file_close detects a null argument and returns -1 without attempting
to free anything.  On 0.11.8, we hit trouble.

I'm attaching a patch to that patch, which fixes the problem for me.
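
For readers following along, the difference described above can be sketched in C. This is an illustration only, not the attached patch and not pilot-link's code: `struct demo_file`, `demo_file_open`, `demo_file_close` and `close_if_open` are hypothetical names. The first close function mimics the behaviour described for 0.12.0 (a NULL argument returns -1 without freeing anything), and the wrapper shows a caller-side guard for a library whose close function does not check for NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a library file handle; not pilot-link's real type. */
struct demo_file {
    char *buffer;
};

/* Allocate a handle that owns a small buffer; returns NULL on failure. */
struct demo_file *demo_file_open(size_t size)
{
    struct demo_file *f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    f->buffer = malloc(size ? size : 1);
    if (f->buffer == NULL) {
        free(f);
        return NULL;
    }
    return f;
}

/* Close that tolerates NULL: report -1 instead of dereferencing it. */
int demo_file_close(struct demo_file *f)
{
    if (f == NULL)
        return -1;
    free(f->buffer);
    free(f);
    return 0;
}

/* Caller-side guard, for a library whose close does NOT check for NULL. */
int close_if_open(struct demo_file **fp)
{
    int rc;

    assert(fp != NULL);
    if (*fp == NULL)
        return 0;           /* nothing was opened, so nothing to close */
    rc = demo_file_close(*fp);
    *fp = NULL;             /* prevents a second close of the same handle */
    return rc;
}
```

Clearing the caller's pointer after the close also makes a repeated call harmless, which is one way to keep error paths from freeing the same handle twice.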
Comment 4 Gilboa Davara 2006-04-19 09:54:08 EDT
Wow.... That was fast. Thanks!

I'd suggest you post something about this fix in -devel (and/or testing) to get
the ball rolling.

Thanks again,
Gilboa
Comment 5 Ngo Than 2006-04-19 17:10:43 EDT
I have added the patch in 2.0.13-7.fc5.6, which will be available in fc5-updates
soon. Matt, thanks for the patch.
Comment 6 Matt Davey 2006-04-20 05:08:27 EDT
duplicate of #189206.
Comment 7 Frank Ch. Eigler 2006-04-21 14:18:32 EDT
Should we keep this bug open until the update actually appears?  (It's not on
fc5-updates yet.)
Comment 8 Frank Ch. Eigler 2006-04-21 16:36:00 EDT
I just received 2.0.13-7.fc5.6 from fc5-updates.  It appears to have the same
bug in another place too.  Syncing succeeds only once in many attempts.
At times, an strace on the gpilotd process shows a glibc error message
very similar to the one identified above.

Several syncs in a row, with the control panel closed, appear to trigger a
failure like this with high probability.

write(2, "gpilotd-Message: Client appears "..., 54) = 54
write(2, "\n(gnome-pilot:26049): gpilotd-WA"..., 114) = 114
write(2, "gpilotd-Message: removing monito"..., 57) = 57
open("/dev/tty", O_RDWR|O_NONBLOCK|O_NOCTTY) = -1 ENXIO (No such device or address)
writev(2, [{"*** glibc detected *** ", 23}, {"/usr/libexec/gpilotd", 20}, {": ", 2}, {"free(): invalid pointer", 23}, {": 0x", 4}, {"0860edc0", 8}, {" ***\n", 5}], 7) = 85
open("/etc/ld.so.cache", O_RDONLY)      = 36
fstat64(36, {st_mode=S_IFREG|0644, st_size=133374, ...}) = 0
mmap2(NULL, 133374, PROT_READ, MAP_PRIVATE, 36, 0) = 0x1fd7e000
close(36)                               = 0
open("/lib/libgcc_s.so.1", O_RDONLY)    = 36
read(36, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320g\r"..., 512) = 512
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x1fb7e000
munmap(0x1fb7e000, 532480)              = 0
munmap(0x1fd00000, 516096)              = 0
mprotect(0x1fc00000, 135168, PROT_READ|PROT_WRITE) = 0
fstat64(36, {st_mode=S_IFREG|0755, st_size=46684, ...}) = 0
mmap2(0x480d5000, 48356, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 36, 0) = 0x480d5000
mmap2(0x480e0000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 36, 0xa) = 0x480e0000
close(36)                               = 0
munmap(0x1fd7e000, 133374)              = 0
futex(0x480179d4, FUTEX_WAKE, 2147483647) = 0
futex(0x480e0bc4, FUTEX_WAKE, 2147483647) = 0
write(2, "======= Backtrace: =========\n", 29) = 29
writev(2, [{"/lib/libc.so.6", 14}, {"[0x", 3}, {"47f49f18", 8}, {"]\n", 2}], 4) = 27
writev(2, [{"/lib/libc.so.6", 14}, {"(", 1}, {"__libc_free", 11}, {"+0x", 3}, {"79", 2}, {")", 1}, {"[0x", 3}, {"47f4d41d", 8}, {"]\n", 2}], 9) = 45
writev(2, [{"/usr/lib/libglib-2.0.so.0", 25}, {"(", 1}, {"g_free", 6}, {"+0x", 3}, {"31", 2}, {")", 1}, {"[0x", 3}, {"99a5a1", 6}, {"]\n", 2}], 9) = 49
writev(2, [{"/usr/libexec/gpilotd", 20}, {"[0x", 3}, {"8052a0e", 7}, {"]\n", 2}], 4) = 32
writev(2, [{"/usr/libexec/gpilotd", 20}, {"[0x", 3}, {"8052b11", 7}, {"]\n", 2}], 4) = 32
writev(2, [{"/usr/libexec/gpilotd", 20}, {"[0x", 3}, {"8053c70", 7}, {"]\n", 2}], 4) = 32
writev(2, [{"/usr/libexec/gpilotd", 20}, {"(", 1}, {"orbed_notify_connect", 20}, {"+0x", 3}, {"b3", 2}, {")", 1}, {"[0x", 3}, {"8054073", 7}, {"]\n", 2}], 9) = 59
writev(2, [{"/usr/libexec/gpilotd", 20}, {"[0x", 3}, {"804dd5f", 7}, {"]\n", 2}], 4) = 32
writev(2, [{"/usr/libexec/gpilotd", 20}, {"[0x", 3}, {"804e708", 7}, {"]\n", 2}], 4) = 32
writev(2, [{"/usr/lib/libglib-2.0.so.0", 25}, {"[0x", 3}, {"993876", 6}, {"]\n", 2}], 4) = 36
writev(2, [{"/usr/lib/libglib-2.0.so.0", 25}, {"(", 1}, {"g_main_context_dispatch", 23}, {"+0x", 3}, {"16d", 3}, {")", 1}, {"[0x", 3}, {"99315d", 6}, {"]\n", 2}], 9) = 67
writev(2, [{"/usr/lib/libglib-2.0.so.0", 25}, {"[0x", 3}, {"9963ef", 6}, {"]\n", 2}], 4) = 36
writev(2, [{"/usr/lib/libglib-2.0.so.0", 25}, {"(", 1}, {"g_main_context_iteration", 24}, {"+0x", 3}, {"65", 2}, {")", 1}, {"[0x", 3}, {"996955", 6}, {"]\n", 2}], 9) = 67
writev(2, [{"/usr/libexec/gpilotd", 20}, {"(", 1}, {"main", 4}, {"+0x", 3}, {"314", 3}, {")", 1}, {"[0x", 3}, {"804d214", 7}, {"]\n", 2}], 9) = 44
writev(2, [{"/lib/libc.so.6", 14}, {"(", 1}, {"__libc_start_main", 17}, {"+0x", 3}, {"dc", 2}, {")", 1}, {"[0x", 3}, {"47efb7e4", 8}, {"]\n", 2}], 9) = 51
writev(2, [{"/usr/libexec/gpilotd", 20}, {"[0x", 3}, {"804cd91", 7}, {"]\n", 2}], 4) = 32
Comment 9 Gilboa Davara 2006-05-07 05:10:46 EDT
I second the above.
Sync still broken.... Sigh.
Comment 10 Gilboa Davara 2006-05-07 05:33:47 EDT
Created attachment 128704 [details]
0.11.8-12.4.fc5 gdb crash log
Comment 11 Matt Davey 2006-05-08 03:41:45 EDT
Hmm.

Without a backtrace with full debugging symbols it's hard to identify the culprit.

Gilboa/Frank, can you add the following info:
What does gpilotd seem to be doing when it crashes? (you might want to start
gpilotd from a terminal window, to see messages).
Does it crash on every sync?
Does it crash if you disable all conduits?  If not, can you find which conduit
causes trouble?

Thanks
Comment 12 Gilboa Davara 2006-05-08 04:48:56 EDT
Matt,

Like before, gpilotd rarely completes the sync; it just hangs there until long
after the pilot times out.
When I started it under gdb, it first got SIGABRT due to an invalid memory
release (check my gdb log, above). When I continue (cont), it just hangs there
waiting on a mutex (__lll_mutex_lock_wait) within the signal handler called by
raise.
I didn't have time to really dive into the code... but it seems like a normal
invalid release to me, rather like the first bug I found.

G.
Comment 13 Gilboa Davara 2006-05-08 04:51:24 EDT
Oh... forgot to add: like before it happens during the initial "identifying
user" phase; long before any conduit is executed.
Comment 14 Frank Ch. Eigler 2006-05-08 08:37:58 EDT
In my case, it sometimes does manage to sync fully; at other times, it crashes
when it's almost done; at other times, it dies early during the "identifying
user" stage.  Here is one backtrace of the latter type:

*** glibc detected *** /usr/libexec/gpilotd: free(): invalid pointer: 0x09923e08 ***
======= Backtrace: =========
/lib/libc.so.6[0x47f49f18]
/lib/libc.so.6(__libc_free+0x79)[0x47f4d41d]
/usr/lib/libglib-2.0.so.0(g_free+0x31)[0x99a5a1]
/usr/libexec/gpilotd[0x8052a0e]
/usr/libexec/gpilotd[0x8052b11]
/usr/libexec/gpilotd[0x8053c70]
/usr/libexec/gpilotd(orbed_notify_connect+0xb3)[0x8054073]
/usr/libexec/gpilotd[0x804dd5f]
/usr/libexec/gpilotd[0x804e708]
/usr/lib/libglib-2.0.so.0[0x993876]
/usr/lib/libglib-2.0.so.0(g_main_context_dispatch+0x16d)[0x99315d]
/usr/lib/libglib-2.0.so.0[0x9963ef]
/usr/lib/libglib-2.0.so.0(g_main_context_iteration+0x65)[0x996955]
/usr/libexec/gpilotd(main+0x314)[0x804d214]
/lib/libc.so.6(__libc_start_main+0xdc)[0x47efb7e4]
/usr/libexec/gpilotd[0x804cd91]
======= Memory map: ========
[...]

#3  0x47f42a1b in __libc_message (do_abort=2,
    fmt=0x47fffd34 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#4  0x47f49f18 in _int_free (av=0x48016120, mem=0x9923e08) at malloc.c:5616
#5  0x47f4d41d in *__GI___libc_free (mem=0x9923e08) at malloc.c:3447
#6  0x0099a5a1 in g_free () from /usr/lib/libglib-2.0.so.0
#7  0x08052a0e in monitor_off_remover (list=0x9a36d88, client_id=Variable
"client_id" is not available.
)
    at orbit_daemon_glue.c:678
#8  0x08052b11 in monitor_off_helper (pilot=0x98c4268 "ique",
    client_id=0x990f1e8
"IOR:010000001b00000049444c3a474e4f4d452f50696c6f742f436c69656e743a312e300000030000000054424f580000000101020005000000554e4958000000000a0000006c6f63616c686f73740000002e0000002f7661722f746d702f6f72626974"...)
    at orbit_daemon_glue.c:702
#9  0x08053c70 in purge_ior_foreach (pilot=0x98c4268 "ique", list=0x9923f40)
    at orbit_daemon_glue.c:1404
#10 0x08054073 in orbed_notify_connect (pilot_id=0x98c4268 "ique", user_info=
      {passwordLength = 0, username = "Frank Ch. Eigler\000���\035��G
a\001HHW\222\t\200��\000\200��\000�O\001H@�\222\tȯ\221\t���\000\002\000\000\000��\237\000HW\222\t\200��\000H�����\231\000HW\222\t�O\001H
a\001H�\232\222\t\214��\000,�\222\tx���h��\000HW\222\t\000\000\000\000�O\001H
a\001H", password = "@�\222\t����
\227�G\214��\000,�\222\t\200��\000ȱ��\026��\000IW\222\th�\005\b
\227�G!)�\000\214��\000$�\222\t
\227�Gh��\000�\232\222\t\004\000\000\000\000\000\000\000��\237\000H\000\000\000\000\000\000\000
a\001H\214��\000@\000\000\000\020\000\000\000���G���\000,�\222\t\200��\000\001\000\000\000���",
userID = 21461, viewerID = 0, lastSyncPC = 2312385601, successfulSyncDate =
1147092098, lastSyncDate = 1147092098}) at orbit_daemon_glue.c:1539
#11 0x0804dd5f in sync_device (device=0x991aab0, context=0x98f0fb0)
    at gpilotd.c:618
#12 0x0804e708 in visor_devices_timeout (data=0x98f0fb0) at gpilotd.c:918
#13 0x00993876 in g_source_get_current_time () from /usr/lib/libglib-2.0.so.0
Comment 15 Matt Davey 2006-05-08 10:05:53 EDT
Created attachment 128744 [details]
patch to orbit_daemon_glue.c to avoid accessing freed memory

Thanks guys.
I may have found a culprit, but if so I've no idea why this is biting you now,
as it looks like it has been in the code for a long long time.  Hence, I could
be completely wrong I guess :)

If you are rpm-adept, please try adding the attached patch to the list of
patches applied by the SRPM and rebuilding.

Let us know how you get on.
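
The text of the attached patch is not reproduced in this thread, so the following is only a general sketch of the class of bug named in its title (touching list memory after it has been freed while removing entries), not the actual change to orbit_daemon_glue.c. It uses a hypothetical hand-rolled list (`struct node`, `list_prepend`, `list_remove_matching`, `list_length`) instead of GLib, so that it compiles on its own. The point it illustrates: unlink a node and read its `next` pointer before freeing it, and never look at the node again afterwards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node; the real daemon uses GLib lists. */
struct node {
    char *client_id;
    struct node *next;
};

/* Prepend a copy of id; returns the new head (old head if allocation fails). */
struct node *list_prepend(struct node *head, const char *id)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->client_id = malloc(strlen(id) + 1);
    if (n->client_id == NULL) {
        free(n);
        return head;
    }
    strcpy(n->client_id, id);
    n->next = head;
    return n;
}

/* Remove every node whose client_id equals id; returns the new head. */
struct node *list_remove_matching(struct node *head, const char *id)
{
    struct node **link = &head;

    assert(id != NULL);
    while (*link != NULL) {
        struct node *cur = *link;
        if (strcmp(cur->client_id, id) == 0) {
            *link = cur->next;      /* unlink first: next is read before the free */
            free(cur->client_id);
            free(cur);              /* cur is not dereferenced after this point */
        } else {
            link = &cur->next;
        }
    }
    return head;
}

/* Count the nodes in the list. */
size_t list_length(const struct node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

The buggy variant of this loop typically frees `cur` and then advances with `cur = cur->next`, which reads freed memory and can hand glibc a stale pointer on a later free.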
Comment 16 Matt Davey 2006-05-09 05:17:35 EDT
Update: I have now been able to test the above patch with an fc5 system.  It
does appear to fix the crasher.  I'll send a mail to the fedora packager.

This bug should be closed, IMO.
Comment 17 Gilboa Davara 2006-05-09 06:02:49 EDT
Hello Matt,

I've patched the SRPM and built it, and it seems to fix the crash. Thanks.
Sadly enough, evolution-data-server itself is now causing grief (crashed a
couple of times during sync.)
I'll report it once the official gnome-pilot RPM(s) are released.

Thanks!
Gilboa
Comment 18 Matthew Barnes 2006-08-09 13:13:25 EDT
Matt, can you confirm that this bug is fixed in the latest Fedora-Updates
packages?  If so I'll close this.

gnome-pilot-2.0.13-7.fc5.6
gnome-pilot-conduits-2.0.13-3.FC5.3
pilot-link-0.11.8-12.4.fc5
Comment 19 Matt Davey 2006-08-10 05:51:49 EDT
Created attachment 133922 [details]
add a brief sleep after a sync, to avoid re-attaching to the ttyUSB device

The patch (128744) from comment 15 does not seem to be present in the current
package and so I can still trigger the crash from comment 14.

I'm also attaching another patch from CVS that I'd recommend applying.  It aims
to prevent gnome-pilot from re-attaching to the ttyUSB device immediately after
finishing a sync.   Otherwise you end up unable to sync a second time.
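
As a rough illustration of the idea (the CVS patch itself is not quoted here, and its delay length and placement are unknown), a post-sync settle delay could look like the sketch below. All names (`settle_delay_ms`, `sync_then_settle`, `demo_sync`) are hypothetical, and the delay value is left to the caller.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleep for ms milliseconds, resuming if a signal interrupts the sleep.
 * Returns 0 on success, -1 on any error other than EINTR. */
int settle_delay_ms(long ms)
{
    struct timespec req, rem;

    assert(ms >= 0);
    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;              /* continue with the remaining time */
    }
    return 0;
}

/* Hypothetical sync callback type: returns 0 on success. */
typedef int (*sync_fn)(void *ctx);

/* Run one sync, then pause so the device can drop off the bus before
 * the caller polls for devices again. Returns the sync result, or -1
 * if the delay itself failed. */
int sync_then_settle(sync_fn do_sync, void *ctx, long settle_ms)
{
    int rc = do_sync(ctx);

    if (settle_delay_ms(settle_ms) != 0)
        return -1;
    return rc;
}

/* Hypothetical stand-in for a real sync: just counts invocations. */
int demo_sync(void *ctx)
{
    int *calls = ctx;
    ++*calls;
    return 0;
}
```

A blocking sleep is the simplest form of this; a daemon built around a GLib main loop might prefer to skip device polling for a short interval instead, so the loop stays responsive.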
Comment 20 Matthew Barnes 2006-08-10 15:01:52 EDT
Applied both patches and submitted an updated package to Rawhide.
Please report back whether this fixes the problem.

gnome-pilot-2.0.13-16
Comment 21 Matt Davey 2006-08-11 07:59:25 EDT
Seems fine.  I had to recompile from the src rpm on fc5, due to the rtld
dependency in the rawhide i386:

bob $ sudo rpm -Uvh ~/gnome-pilot-2.0.13-16.i386.rpm
~/gnome-pilot-devel-2.0.13-16.i386.rpm
error: Failed dependencies:
        rtld(GNU_HASH) is needed by gnome-pilot-2.0.13-16.i386

Regards,

Matt
Comment 22 Gilboa Davara 2006-09-03 10:48:21 EDT
Matt,

Did the patched RPMs find their way into updates/updates-testing?
I'm using the self-built 2.0.13-7.fc5.7, while the latest in updates is
gnome-pilot-2.0.13-7.fc5.6, which, AFAIK, still suffers from heap corruption.

- Gilboa
Comment 23 Matt Davey 2006-09-04 10:22:00 EDT
Which Matt? :)
Looks to me like Matthew Barnes' update only got as far as:
fedora/linux/core/development/
Comment 24 Gilboa Davara 2006-09-04 11:27:14 EDT
... You ;)

We -really- need to get your patch into updates/testing before FC5 goes into
legacy mode :(
Comment 25 Ingo Schaefer 2006-10-10 04:39:32 EDT
I have not found the update in testing, so I recompiled the SRPM from
development, and now sync works fine, even on the second and later attempts.

Thanks!

Regards,
Ingo
Comment 26 Gilboa Davara 2006-10-10 05:29:09 EDT
Sigh.

I'll test FC6/devel... Hopefully this bug will be fully fixed in FC6.

- Gilboa
Comment 27 Matthew Barnes 2006-12-31 12:29:22 EST
*** Bug 189627 has been marked as a duplicate of this bug. ***
Comment 28 Matthew Barnes 2007-01-02 22:03:24 EST
Is it safe to assume this is fixed now in Fedora Core 6?
Comment 29 Gilboa Davara 2007-01-03 05:03:37 EST
I can't really tell.
USB gnome-pilot support is completely broken at this point. (Pilot-link does work)
(Luckily for us, the same update that broke the USB sync fixed the net/BT sync,
so we are even ;))

- Gilboa
Comment 30 Matt Davey 2007-01-03 05:11:20 EST
Gilboa, is that with gnome-pilot-2.0.15-1.fc6 ?

I don't have access to a usb/fc6 setup at the moment, but I could test this
evening.  I would be dismayed if usb sync was broken in gnome-pilot with the
2.0.15 release (especially if it is working with pilot-link).

That said, I haven't been testing with usb and fc6 for the last while.
Comment 31 Gilboa Davara 2007-01-03 05:30:33 EST
Created attachment 144689 [details]
gpilotd trace

$ rpm -qa | grep gnome-pilot | sort
gnome-pilot-2.0.15-1.fc6.i386
gnome-pilot-2.0.15-1.fc6.x86_64
gnome-pilot-conduits-2.0.15-1.fc6.x86_64
gnome-pilot-devel-2.0.15-1.fc6.i386
gnome-pilot-devel-2.0.15-1.fc6.x86_64
Comment 32 Matt Davey 2007-01-03 05:59:58 EST
Took a look at the trace.  It seems odd that gp is configured to connect
to '/etc/dev/pilot'.

If you are using udev rules to set up a dev/pilot symlink, then this could be
the problem.  I have experienced cases where udev does not reliably choose the
right ttyUSB device to link to /dev/pilot.  If this is the case, you might have
luck by configuring gp to connect to either ttyUSB0 or ttyUSB1 explicitly.

Aside: in pilot-link 0.12.x, it is possible to configure libusb syncing, which
bypasses the usbserial/visor/ttyUSB path altogether, which is nice.
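
One quick way to see which device a pilot symlink currently points at is to read the link target directly. The helper below is a small sketch using POSIX `readlink`; the function name `link_target` is hypothetical, and the paths you pass to it (for example /dev/pilot) depend on your own setup.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Copy the target of the symlink at path into out, NUL-terminated.
 * Returns 0 on success, -1 if path is not a symlink, cannot be read,
 * or the target may not fit in the buffer. */
int link_target(const char *path, char *out, size_t outlen)
{
    ssize_t n;

    assert(path != NULL && out != NULL && outlen > 1);
    n = readlink(path, out, outlen - 1);
    if (n < 0)
        return -1;
    if ((size_t)n >= outlen - 1)
        return -1;              /* readlink truncates silently; treat as too long */
    out[n] = '\0';              /* readlink does not add the terminator */
    return 0;
}
```

If the reported target is not the ttyUSB node that the device actually attaches to, pointing gp at the explicit device node, as suggested above, sidesteps the symlink entirely.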
Comment 33 Gilboa Davara 2007-01-03 06:16:30 EST
My bad.
I got a bit tired of having pilot-link/gnome-pilot time out due to slow node
creation by udev. (Even with recent FC6 udev builds)
As a result, I tend to create static device nodes under /etc/dev and link
/etc/dev/pilot to /etc/dev/ttyUSB1.

AFAIK FC7 will include pl 0.12.x.

- Gilboa
Comment 34 Gilboa Davara 2007-01-03 06:19:05 EST
%s/include pl/included in/g
Comment 35 Matthew Barnes 2007-01-03 07:30:08 EST
(In reply to comment #33)
> AFAIK FC7 will include pl 0.12.x.

Correct.  It's already in Rawhide.
Comment 36 Matěj Cepl 2007-08-31 11:23:39 EDT
The distribution against which this bug was reported is no longer supported,
could you please reproduce this with the updated version of the currently
supported distribution (Fedora Core 6, or Fedora 7, or Rawhide)? If this issue
turns out to still be reproducible, please let us know in this bug report.  If
after a month's time we have not heard back from you, we will have to close this
bug as INSUFFICIENT_DATA.

Setting status to NEEDINFO, and awaiting information from the reporter.

Thanks in advance.
Comment 37 Matthew Barnes 2007-10-02 13:34:18 EDT
Closing as INSUFFICIENT_DATA.
