Bug 712888 - pilot-link never ends sync
Summary: pilot-link never ends sync
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: pilot-link
Version: 15
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: Peter Schiffer
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-06-13 13:45 UTC by Patrick C. F. Ernzer
Modified: 2012-06-01 16:57 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-01 16:57:41 UTC
Type: ---


Attachments (Terms of Use)
backtrace requested in Comment 11 (8.45 KB, text/plain)
2011-06-21 11:24 UTC, Patrick C. F. Ernzer
no flags Details

Description Patrick C. F. Ernzer 2011-06-13 13:45:02 UTC
Description of problem:
If I sync my Palm Tungsten T5 with jpilot-1.8.0-6.fc15.x86_64 it syncs just fine, Palm reports sync is done, but jpilot believes it is still running.

Opening bug against pilot-link as that is used in backend AFAIk and pilot-xfer -l also does it's job fine but then remains waiting, needing to be killed

Version-Release number of selected component (if applicable):
jpilot-1.8.0-6.fc15.x86_64
pilot-link-0.12.5-5.fc15.x86_64
pilot-link-libs-0.12.5-5.fc15.x86_64

How reproducible:
always

Steps to Reproduce:
in jpilot:
1. open jpilot
2. click sync button
3. wait for sync to finish (Palm returns to  HotSync initiate screen)
4. click sync button again in jpilot

in pilot-xfer
1. pilot-xfer -l
  
Actual results:
J-Pilot: sync process already in progress (process ID = 1653
)J-Pilot: press the hotsync button on the cradle
      or stop the sync by typing "kill 1653" at the command line

If at this point I kill 1653, then I can click the sync button in jpilot again and do one sync (which again never returns after finishing)

in pilot-xfer: List complete. 503 files ... and then it just sits there until I ctrl-C

Expected results:
sync ends cleanly, like it did in F14 and previous

Additional info:
last few lines of strace pilot-xfer -l
utex(0x37dda424a4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 161, {1307972215, 909142000}, ffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x37dda42460, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x37dda424a4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 163, {1307972215, 909279000}, ffffffff) = 0
futex(0x37dda42460, FUTEX_WAKE_PRIVATE, 1) = 0
write(1, "\n", 1
)                       = 1
write(1, "   List complete. 503 files foun"..., 36   List complete. 503 files found.

) = 36
ioctl(7, USBDEVFS_SUBMITURB or USBDEVFS_SUBMITURB32, 0x744680) = 0
timerfd_settime(6, TFD_TIMER_ABSTIME, {it_interval={0, 0}, it_value={423381, 242109000}}, NULL) = 0
futex(0x7445f0, FUTEX_WAKE_PRIVATE, 1)  = 1
futex(0x731f64, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 97, {1307972245, 909868693}, ffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x731f38, FUTEX_WAKE_PRIVATE, 1)  = 0
ioctl(7, USBDEVFS_SUBMITURB or USBDEVFS_SUBMITURB32, 0x744680) = 0
timerfd_settime(6, TFD_TIMER_ABSTIME, {it_interval={0, 0}, it_value={423381, 242524000}}, NULL) = 0
futex(0x731f64, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 99, {1307972245, 910203096}, ffffffff) = 0
futex(0x731f38, FUTEX_WAKE_PRIVATE, 1)  = 0
futex(0x37dda424a4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 165, {1307972215, 910375000}, ffffffff) = 0
futex(0x37dda42460, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x37dda424a4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 167, {1307972215, 916553000}, ffffffff) = 0
futex(0x37dda42460, FUTEX_WAKE_PRIVATE, 1) = 0
ioctl(7, USBDEVFS_SUBMITURB or USBDEVFS_SUBMITURB32, 0x7445d0) = 0
timerfd_settime(6, TFD_TIMER_ABSTIME, {it_interval={0, 0}, it_value={423381, 249280000}}, NULL) = 0
futex(0x731f64, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 101, {1307972245, 916962139}, ffffffff) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x731f38, FUTEX_WAKE_PRIVATE, 1)  = 0
ioctl(7, USBDEVFS_SUBMITURB or USBDEVFS_SUBMITURB32, 0x7445d0) = 0
timerfd_settime(6, TFD_TIMER_ABSTIME, {it_interval={0, 0}, it_value={423381, 249616000}}, NULL) = 0
futex(0x744540, FUTEX_WAKE_PRIVATE, 1)  = 1
futex(0x37dda424a4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 169, {1307972215, 917376000}, ffffffff) = 0
futex(0x37dda42460, FUTEX_WAKE_PRIVATE, 1) = 0
open("/etc/ld.so.cache", O_RDONLY)      = 8
fstat(8, {st_mode=S_IFREG|0644, st_size=110021, ...}) = 0
mmap(NULL, 110021, PROT_READ, MAP_PRIVATE, 8, 0) = 0x7fe47c4e1000
close(8)                                = 0
open("/lib64/libgcc_s.so.1", O_RDONLY)  = 8
read(8, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240(\0\3247\0\0\0"..., 832) = 832
fstat(8, {st_mode=S_IFREG|0755, st_size=89544, ...}) = 0
mmap(0x37d4000000, 2182488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 8, 0) = 0x37d4000000
mprotect(0x37d4015000, 2093056, PROT_NONE) = 0
mmap(0x37d4214000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 8, 0x14000) = 0x37d4214000
close(8)                                = 0
munmap(0x7fe47c4e1000, 110021)          = 0
tgkill(6743, 6767, SIGRTMIN)            = 0
ioctl(7, SNDRV_CTL_IOCTL_ELEM_LIST or USBDEVFS_RELEASEINTERFACE, 0x7fff55c7704c) = 0
write(5, "\1", 1)                       = 1
futex(0x731f08, FUTEX_WAIT_PRIVATE, 2, NULL^C <unfinished ...>


do say if you want a full strace or any other test done on this end.

While this is not a bad bug, I can kill the process and syncing does work, it is rather annoying.

Comment 1 Alex Lancaster 2011-06-13 18:05:09 UTC
So this doesn't happen in F14 with the exact same version of upstream pilot-link?  Odd.  Perhaps something has changed in the libusb interaction with the system that causes it to behave differently.

Comment 2 Patrick C. F. Ernzer 2011-06-14 12:49:23 UTC
(In reply to comment #1)
> So this doesn't happen in F14

Indeed. F14 is fine.

> with the exact same version of upstream pilot-link? 

That I can not vouch for, when both affected systems were running F14 x86_64 they would usually be fully updated, but I do not recall which was the version of pilot-link in F14 (either GA or just before F15 was released)

> Odd.  Perhaps something has changed in the libusb interaction with
> the system that causes it to behave differently.

Could well be.

What traces or logfiles would useful. I'll be happy to provide all that can help but I am not sure where to look for useful info to track this down.

Comment 3 Patrick C. F. Ernzer 2011-06-15 09:28:18 UTC
(In reply to comment #2)
> That I can not vouch for, when both affected systems were running F14 x86_64
> they would usually be fully updated, but I do not recall which was the version
> of pilot-link in F14 (either GA or just before F15 was released)

Booted a box (third affected) into old F14, pilot-xfer -l works and ends as expected but on F15 it has the same problem of never finishing.

F14:
libusb1-1.0.8-1.fc14.x86_64
libusb-0.1.12-23.fc14.x86_64
pilot-link-0.12.5-2.fc14.x86_64

F15:
libusb-0.1.3-7.fc15.x86_64
pilot-link-0.12.5-5.fc15.x86_64
pilot-link-libs-0.12.5-5.fc15.x86_64
libusb1-1.0.8-7.fc15.x86_64

Comment 4 Alex Lancaster 2011-06-15 19:28:59 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > That I can not vouch for, when both affected systems were running F14 x86_64
> > they would usually be fully updated, but I do not recall which was the version
> > of pilot-link in F14 (either GA or just before F15 was released)
> 
> Booted a box (third affected) into old F14, pilot-xfer -l works and ends as
> expected but on F15 it has the same problem of never finishing.
> 
> F14:
> libusb1-1.0.8-1.fc14.x86_64
> libusb-0.1.12-23.fc14.x86_64
> pilot-link-0.12.5-2.fc14.x86_64
> 
> F15:
> libusb-0.1.3-7.fc15.x86_64
> pilot-link-0.12.5-5.fc15.x86_64
> pilot-link-libs-0.12.5-5.fc15.x86_64
> libusb1-1.0.8-7.fc15.x86_64

Hmm, odd that the NVR on libusb  F15 < F14, i.e.

libusb-0.1.3-7.fc15.x86_64 < libusb-0.1.12-23.fc14.x86_64

Is that right?  Was an Epoch tag bumped reverting the update, or is this a broken upgrade path from F14 to F15?

Comment 5 Patrick C. F. Ernzer 2011-06-15 21:47:21 UTC
(In reply to comment #4)
> Hmm, odd that the NVR on libusb  F15 < F14, i.e.
> 
> libusb-0.1.3-7.fc15.x86_64 < libusb-0.1.12-23.fc14.x86_64
> 
> Is that right? Was an Epoch tag bumped reverting the update,

I'm not sure, but of I read http://koji.fedoraproject.org/koji/packageinfo?buildOrder=-nvr&packageID=782&tagOrder=name&tagStart=0#buildlist correctly, these two NVR are latest for the respective Fedora versions.

> or is this a broken upgrade path from F14 to F15?

I installed F15 into a separate logical volume for /

Comment 6 Alex Lancaster 2011-06-15 23:39:23 UTC
Looking at the spec:

https://pkgs.fedoraproject.org/gitweb/?p=libusb.git;a=blob;f=libusb.spec;h=973cb51cf38096b4fefff2c8e82d67f1de509d0d;hb=master

it looks there have been some internal changes in libusb and a compat- package introduced.  I don't understand what was done, but an Epoch tag was clearly added meaning so that 0.1.3 was backgraded from 0.1.12.  Perhaps somewhere in the changes to libusb might cause this bug.  Cc'ing the libusb maintainer and requesting NEEDINFO.

Comment 7 Jan Vcelak 2011-06-17 10:00:15 UTC
Hi Alex.

libusb was replaced by libusb-compat, which works as a wrapper above libusb1. The upstream threw away old libusb support and suggests this solution. The epoch was raised to reflect versioning of libusb-compat.

There are just a few differences between libusb and libusb-compat, described in README file:
http://git.libusb.org/?p=libusb-compat-0.1.git;a=blob;f=README

Is this bug related to some of these differences?

Comment 8 Alex Lancaster 2011-06-17 19:16:12 UTC
(In reply to comment #7)
> Hi Alex.
> 
> libusb was replaced by libusb-compat, which works as a wrapper above libusb1.
> The upstream threw away old libusb support and suggests this solution. The
> epoch was raised to reflect versioning of libusb-compat.
> 
> There are just a few differences between libusb and libusb-compat, described in
> README file:
> http://git.libusb.org/?p=libusb-compat-0.1.git;a=blob;f=README
> 
> Is this bug related to some of these differences?

I don't know enough about the innards of pilot-link, but change #4:

 Some libusb-0.1 users may have implemented I/O cancellation by running transfers in their own threads and simply killing the thread when they don't want to do the transfer any more. 

sounds like it might be related to this bug (which is about pilot-link not exiting upon a sync).

Comment 9 Patrick C. F. Ernzer 2011-06-18 12:37:45 UTC
addition; if the Palm has an error during sync, pilot-link does return (i.e. it does need killing before next run).

If you make a scratch build for F15 x86_64 against libusb-compat, I'll be happy to test it and report back.

Comment 10 Jan Vcelak 2011-06-21 08:44:54 UTC
pilot-link-0.12.5-5.fc15.x86_64 is already compiled against libusb-compat

http://kojipkgs.fedoraproject.org/packages/pilot-link/0.12.5/5.fc15/data/logs/x86_64/root.log

Comment 11 Jan Vcelak 2011-06-21 09:01:10 UTC
(In reply to comment #8)
> > Is this bug related to some of these differences?
> 
> I don't know enough about the innards of pilot-link, but change #4:
> 
>  Some libusb-0.1 users may have implemented I/O cancellation by running
> transfers in their own threads and simply killing the thread when they don't
> want to do the transfer any more. 
> 
> sounds like it might be related to this bug (which is about pilot-link not
> exiting upon a sync).

If this is true, you might also read: "This is bad programming practice for obvious reasons, and this lack of functionality was one of the primary drivers for libusb-1.0 development." And it is quite clear, that this cannot be fixed.

But I'm not really sure that this is the problem. Please, can we get at least a backtrace?

# debuginfo-install <program>
$ run pilot-link or whatever is needed to get into the "wrong situation"
$ gdb -p <pid-of-the-program>
(gbd) bt full

Comment 12 Patrick C. F. Ernzer 2011-06-21 11:24:28 UTC
Created attachment 505800 [details]
backtrace requested in Comment 11

Comment 13 Fedora Admin XMLRPC Client 2011-07-12 11:48:37 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 14 Alex Lancaster 2012-04-16 22:38:36 UTC
Is this fixed now?  If not as I point out in bug 449968 comment 36, this may never be fixed with the effective obsolescence of the old Palm platform.

Comment 15 Patrick C. F. Ernzer 2012-05-03 15:36:25 UTC
I still see this happening with pilot-link-0.12.5-7.fc16.x86_64

and yes, I agree with bug 449968 comment 36, this is old hardware. So, as far as I am personally concerned, you can close this WONTFIX if the backtrace in comment #12 did not help you with this bug. It's annoying to have to click cancel sync in jpilot, but hardly the end of the world.

Comment 16 Peter Schiffer 2012-06-01 16:57:41 UTC
Hi Patrick,

we weren't able to trace down this problem, so I am closing this bug as WONTFIX.

peter


Note You need to log in before you can comment on or make changes to this bug.