Bug 1003193 - libusbx-1.0.16-3 makes scanimage hang on Canon CanoScan LiDE 210
Summary: libusbx-1.0.16-3 makes scanimage hang on Canon CanoScan LiDE 210
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: sane-backends
Version: 19
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Hans de Goede
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 1003686 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-08-31 17:19 UTC by John Heidemann
Modified: 2014-02-14 09:39 UTC (History)
10 users (show)

Fixed In Version: sane-backends-1.0.23-18.fc18
Clone Of:
Environment:
Last Closed: 2013-09-12 02:01:43 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
output of scanimage with LIBUSB_DEBUG=4, for libusbx-1.0.15 (3.46 MB, text/plain)
2013-09-02 19:33 UTC, John Heidemann
no flags Details
output of scanimage with LIBUSB_DEBUG=4, for libusbx-1.0.16 (3.54 MB, text/plain)
2013-09-02 19:33 UTC, John Heidemann
no flags Details

Description John Heidemann 2013-08-31 17:19:33 UTC
Description of problem:
After upgrading to libusbx-1.0.16-3,
scanimage hangs for the Canon CanoScan LiDE 210.
It worked before (with 1.0.15)


Version-Release number of selected component (if applicable):
libusbx-1.0.16-3.fc19.x86_64
sane-backends-1.0.23-13.fc19.x86_64


How reproducible:
Every time.

Steps to Reproduce:
1.scanimage -d genesys:libusb:001:004 --mode lineart --resolution 300 -l 0 -t 0 -x 216 -y 279
2.
3.

Actual results:

output from scanimage is
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [libusb_close] internal signalling write failed, closing anyway

but no image.  The then hangs. Control-c terminates it.

Expected results:
scanned image on stdout.


Additional info:
Looks like upstream ticket at
http://www.libusb.org/ticket/131
which points to
http://sourceforge.net/mailarchive/message.php?msg_id=30803244

another report at 
http://sane.10972.n7.nabble.com/scanimage-hanging-td17670.html

Rolling back to 
libusbx-1.0.15-2.fc19.x86_64.rpm 
allows scanimage to produce output, but then it segfaults.

Comment 1 Hans de Goede 2013-09-02 14:51:01 UTC
Hi,

(In reply to John Heidemann from comment #0)
> Description of problem:
> After upgrading to libusbx-1.0.16-3,
> scanimage hangs for the Canon CanoScan LiDE 210.
> It worked before (with 1.0.15)

Thanks for the bugreport.

> Version-Release number of selected component (if applicable):
> libusbx-1.0.16-3.fc19.x86_64
> sane-backends-1.0.23-13.fc19.x86_64
> 
> 
> How reproducible:
> Every time.
> 
> Steps to Reproduce:
> 1.scanimage -d genesys:libusb:001:004 --mode lineart --resolution 300 -l 0
> -t 0 -x 216 -y 279
> 2.
> 3.
> 
> Actual results:
> 
> output from scanimage is
> libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
> libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
> libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)

Errno 9 means bad filedescriptor.

> libusbx: warning [libusb_close] internal signalling write failed, closing
> anyway

And this likely also is a bad filedescriptor error. This means that either libusb's internal structures (where these file descriptors are stored) are getting corrupted, or the kernels fd table for the process is getting corrupted. Unless you're using some custom kernel stuff, my first guess would be a userspace problem.

> but no image.  The then hangs. Control-c terminates it.
> 
> Expected results:
> scanned image on stdout.
> 
> 
> Additional info:
> Looks like upstream ticket at
> http://www.libusb.org/ticket/131

Ah that one.

> which points to
> http://sourceforge.net/mailarchive/message.php?msg_id=30803244

Yes I posted the comment pointing there, but that was in error. Later analysis of that problem is done here:
https://github.com/libusbx/libusbx/issues/25

Which points to (and this has been confirmed by testers) a libhid bug.

So that problem is not a libusb bug, and the issue the sourceforge mailarchive link points to has long been fixed.

> another report at 
> http://sane.10972.n7.nabble.com/scanimage-hanging-td17670.html

Yes that one looks similar, but it is with a fujitsu scanner rather then with a canon scanner, so it would be interesting to find out what the 

> Rolling back to 
> libusbx-1.0.15-2.fc19.x86_64.rpm 
> allows scanimage to produce output, but then it segfaults.

Which indicates that there still is a problem, so it seems that there is some memory corruption going on, likely with its root outside of libusb! And with 1.0.15 this happens to corrupts things in such a way to crash on exit, where as with 1.0.16 things go wrong earlier, still the real underlying problem is the memory corruption, re-assigning to sane-backends for now.

Can you try moving /usr/lib64/sane/libsane-gphoto2* to somewhere else, and see if that helps ?

Also it could be useful to let abrt file a bugreport with a detailed backtrace of the segfault when using libusbx-1.0.15

Regards,

Hans

Comment 2 John Heidemann 2013-09-02 18:10:10 UTC
I'm running a stock kernel (3.10.3-300.fc19.x86_64)

abrt backtrace with libusbx-1.0.15 filed as #1003686

Comment 3 Hans de Goede 2013-09-02 18:19:44 UTC
*** Bug 1003686 has been marked as a duplicate of this bug. ***

Comment 4 Hans de Goede 2013-09-02 18:21:54 UTC
Hi,

(In reply to John Heidemann from comment #2)
> I'm running a stock kernel (3.10.3-300.fc19.x86_64)
> 
> abrt backtrace with libusbx-1.0.15 filed as #1003686

Thanks, unfortunately that just shows more memory corruption of the libusb_context struct, no hints in what is causing the corruption :|

Did you try moving /usr/lib64/sane/libsane-gphoto2* to somewhere else ? libgphoto creates its own libusb_context, and maybe that is homehow messing with things ?

Comment 5 John Heidemann 2013-09-02 18:28:57 UTC
Sorry wrt the abrt not being helpful.  Corruption is hard.

I have no libsane-gphoto2* on my system (either in /usr/lib64/sane, or anywhere else, at least according to locate).  I do have a /usr/lib64/libgphoto2* if that's what you mean.

Comment 6 Hans de Goede 2013-09-02 18:34:25 UTC
(In reply to John Heidemann from comment #5)
> Sorry wrt the abrt not being helpful.  Corruption is hard.
> 
> I have no libsane-gphoto2* on my system (either in /usr/lib64/sane, or
> anywhere else, at least according to locate).  I do have a
> /usr/lib64/libgphoto2* if that's what you mean.

Ah, this is part of a separate package called sane-backends-drivers-cameras which you then likely don't have installed, so that is a dead end too.

Comment 7 Hans de Goede 2013-09-02 18:45:16 UTC
Ok, so one last thing which may help, can you try adding LIBUSB_DEBUG=4 to the front of the cmdline, so that you get:
LIBUSB_DEBUG=4 scanimage ...

Then redirect stderr to a log-file (add " 2> log" at the end of the cmdline) and attach the logfile here. If possible please do 2 test runs one with libusb-1.0.15 and one with 1.0.16.

Thanks,

Hans

Comment 8 John Heidemann 2013-09-02 19:33:05 UTC
Created attachment 792962 [details]
output of scanimage with LIBUSB_DEBUG=4, for libusbx-1.0.15

Comment 9 John Heidemann 2013-09-02 19:33:45 UTC
Created attachment 792963 [details]
output of scanimage with LIBUSB_DEBUG=4, for libusbx-1.0.16

Comment 10 John Heidemann 2013-09-02 19:45:56 UTC
In the test run of scanimage with libusbx-1.0.16, I terminate the program by pressing control-C twice.   (Otherwise it hangs.)  1.0.15 goes straight to segfault, much cleaner :-)

Comment 11 hasuf 2013-09-03 07:20:53 UTC
I'm getting the same

libusbx: warning [libusb_close] internal signalling write failed, closing anyway

error from an 

   device `epson2:libusb:002:008' is a Epson Perfection1240 flatbed scanner

and Fujitsu ScanSnap scanner

   Bus 003 Device 038: ID 04c5:10fe Fujitsu, Ltd S500

Comment 12 Hans de Goede 2013-09-04 11:34:08 UTC
Thanks for the log files, these make pretty clear what is going on:

[ 9.641307] [00006300] libusbx: debug [libusb_exit] 
[ 9.641312] [00006300] libusbx: debug [libusb_exit] destroying default context
[ 9.641317] [00006300] libusbx: warning [libusb_exit] application left some devices open
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [libusb_close] internal signalling write failed, closing anyway
scanimage: received signal 2
scanimage: trying to stop scanner
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
scanimage: received signal 2
scanimage: aborting

Note the "warning [libusb_exit] application left some devices open" and then the errors which stem from trying to use libusb after calling libusb_exit. This also explains the badfd errors, as the fds in question have been closed at this time by libusb_exit.

So this clearly is a sane bug.

If you want to debug this further you can try rebuilding sane-backends with a patch commenting out the
libusb_exit call in sanei/sanei_usb.c, that will likely work around your problem, and would confirm my suspicion that the problem is that some code in sane is causing libusb_exit to be called too soon.

Comment 13 John Heidemann 2013-09-04 16:31:01 UTC
Thanks wrt comment #12.

But there appears to be no call to libusb_exit in the sane-backends source code, except for one call in 

./tools/sane-find-scanner.c:	libusb_exit (sfs_usb_ctx);

(This is sane-backends-1.0.23 from the F19 SRPM; no updates as of today.)

There is a libusb_close in sanei_usb_init(), and later in sanei_usb_open(),
as well as several calls to libusb_free_*.

Can you be more specific what you think would remove the symptom (even if leaking data structures)?

Comment 14 Hans de Goede 2013-09-04 19:04:07 UTC
(In reply to John Heidemann from comment #13)
> Thanks wrt comment #12.
> 
> But there appears to be no call to libusb_exit in the sane-backends source
> code, except for one call in 
> 
> ./tools/sane-find-scanner.c:	libusb_exit (sfs_usb_ctx);
> 
> (This is sane-backends-1.0.23 from the F19 SRPM; no updates as of today.)

Ah, yes you're right, the libusb_exit call is not there in sane-backends-1.0.23, where as it is there in sane-backends git master. My bad.

So I've done some more research, and the problem is a combination of sane-backends still (unintentionally) using the old libusb-0.1.5, which is a compat wrapper around libusb-1.0, and libusb-0.1.5 calling libusb_exit from atexit + scanimage using an atexit handler to do its own cleanup.

So what I believe is happening is libusb-compat-0.1.5's atexit handler runs first calling libusb_exit, followed by scanimage's atexit handler which tries to talk to the device, and then things go boom.

Can you try downgrading libusb to 0.1.4, ie the build from here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=380886

?  That should fix the issue. In the mean time I'll start a sane-backends build which really switches it over to libusb-1.0 (The spec was changed to do this a while back, but we need to explicitly tell ./configure we want libusb-1.0 for this to work), which should also fix this, and is the right thing to do moving forward.

Comment 15 Hans de Goede 2013-09-04 19:22:16 UTC
Ok, so here is (will be once finished) an F-19 build of sane-backends, which properly uses libusb-1.0 directly, and should fix this bug: http://koji.fedoraproject.org/koji/taskinfo?taskID=5895922

Please do first try downgrading to libusb-0.1.4, and add a comment if that fixes things, then upgrade libusb again, check things are broken again, and check if this build fixes things.

I want to know if reverting libusb to 0.1.4 fixes things, so that I can advocate removing the atexit calling of libusb_exit from 0.1.5 with upstream libusb.

Comment 16 John Heidemann 2013-09-04 20:17:41 UTC
yes, downgrading to libusb-0.1.4 results in crash-free operation.
That seems like a viable work-around.

Comment 17 Jim Haynes 2013-09-05 21:00:56 UTC
I seem to have encountered the same bug.  

Scanner is genesys:libusb:001:003' is a Canon LiDE 35/40/50 flatbed scanner

On trying to scan a document it either segfaults or just hangs and says
it is trying to stop the scanner when given a ctrl-C.  downgrading to
libusb-0.1.4 makes it work correctly.

Comment 18 Fedora Update System 2013-09-06 10:45:10 UTC
sane-backends-1.0.23-17.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/sane-backends-1.0.23-17.fc20

Comment 19 Fedora Update System 2013-09-06 10:45:49 UTC
sane-backends-1.0.23-17.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/sane-backends-1.0.23-17.fc19

Comment 20 Hans de Goede 2013-09-06 10:55:37 UTC
Hi All,

Thanks for testing with libusb-0.1.4, as mentioned before I believe the best way to fix this is to simply compile sane-backens with libusb-1.0 rather then against the old libusb-0.1 . The 2 updates I've just pushed do just that. Please upgrade libusb back to the 0.1.5 version, and test these sane-backends update.

If the sane-backends update fix things for you please leave some positive karma (requires a FAS account) here:
https://admin.fedoraproject.org/updates/sane-backends-1.0.23-17.fc19
https://admin.fedoraproject.org/updates/sane-backends-1.0.23-17.fc20

Regards,

Hans

Comment 21 John Heidemann 2013-09-06 14:58:10 UTC
Yes, sane-backends-1.0.23 (with -drivers-scanners and -libs)
works without segfault or hang
on a system with libusb-0.1.5.

Thanks for the fix!

(I assume you'll mark as closed when those packages show up in updates?)

Comment 22 Fedora Update System 2013-09-06 17:24:31 UTC
Package sane-backends-1.0.23-17.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing sane-backends-1.0.23-17.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-15942/sane-backends-1.0.23-17.fc20
then log in and leave karma (feedback).

Comment 23 Fedora Update System 2013-09-09 10:06:00 UTC
sane-backends-1.0.23-18.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/sane-backends-1.0.23-18.fc18

Comment 24 Fedora Update System 2013-09-09 10:06:19 UTC
sane-backends-1.0.23-18.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/sane-backends-1.0.23-18.fc19

Comment 25 Fedora Update System 2013-09-09 10:06:33 UTC
sane-backends-1.0.23-18.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/sane-backends-1.0.23-18.fc20

Comment 26 Fedora Update System 2013-09-12 02:01:43 UTC
sane-backends-1.0.23-18.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 27 Fedora Update System 2013-09-23 00:25:45 UTC
sane-backends-1.0.23-18.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 28 Fedora Update System 2013-09-26 06:05:22 UTC
sane-backends-1.0.23-18.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 29 Radim Janku 2013-12-04 23:57:08 UTC
>sane-backends-1.0.23-18.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Fedora 19, sane-backends-1.0.24-7.fc19 - problem still persist:
scanimage -d epkowa: -p --resolution 200 --mode Color > "temp.pnm"
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
libusbx: warning [libusb_close] internal signalling write failed, closing anyway

Comment 30 Hans de Goede 2013-12-05 08:25:24 UTC
Hi,

(In reply to Radim Janku from comment #29)
> >sane-backends-1.0.23-18.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.
> 
> Fedora 19, sane-backends-1.0.24-7.fc19 - problem still persist:
> scanimage -d epkowa: -p --resolution 200 --mode Color > "temp.pnm"
> libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
> libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
> libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
> libusbx: warning [libusb_close] internal signalling write failed, closing
> anyway

The Fedora rpms don't include the epkowa driver, see:
https://bugzilla.redhat.com/show_bug.cgi?id=437308

So you likely got that from a third part source ? And as such it is likely still using libusb-0.1 rather then libusb-1.0 .

You can try downgrading libusb-compat to fix this, download the old version here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=380886

Then do:
rpm -Uvh --oldpackage libusb-0.1.4-1.fc19.*.rpm

That should fix this. Please let us know it this helps, if it does I'll start working on a fix for libusb-0.1.5 for this.

Regards,

Hans

Comment 31 Radim Janku 2013-12-05 10:47:23 UTC
(In reply to Hans de Goede from comment #30)
> Hi,
> 
> (In reply to Radim Janku from comment #29)
> > >sane-backends-1.0.23-18.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.
> > 
> > Fedora 19, sane-backends-1.0.24-7.fc19 - problem still persist:
> > scanimage -d epkowa: -p --resolution 200 --mode Color > "temp.pnm"
> > libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
> > libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
> > libusbx: warning [add_to_flying_list] failed to arm first timerfd (errno 9)
> > libusbx: warning [libusb_close] internal signalling write failed, closing
> > anyway
> 
> The Fedora rpms don't include the epkowa driver, see:
> https://bugzilla.redhat.com/show_bug.cgi?id=437308
> 
> So you likely got that from a third part source ? And as such it is likely
> still using libusb-0.1 rather then libusb-1.0 .
> 
> You can try downgrading libusb-compat to fix this, download the old version
> here:
> http://koji.fedoraproject.org/koji/buildinfo?buildID=380886
> 
> Then do:
> rpm -Uvh --oldpackage libusb-0.1.4-1.fc19.*.rpm
> 
> That should fix this. Please let us know it this helps, if it does I'll
> start working on a fix for libusb-0.1.5 for this.
> 
> Regards,
> 
> Hans


epkowa driver:
esci-interpreter-gt-f720.0.1.1-2
Plugin for the GT-f720/S620 and Perfection V30/V300 Photo Epson scanners
http://avasys.jp/english/linux_e/
This plugin converts between the native protocol of the supported
devices and the ESC/I protocol.  The latter protocol is supported
by the SANE 'epkowa' backend, part of Image Scan!, and some other
SANE backends.  However, only the 'epkowa' backend provides the
hooks needed to use this plugin.
You only need this plugin if you have one of the following models:
 - Epson GT-F720
 - Epson GT-S620
 - Epson Perfection V30 (note: my scanner)
 - Epson Perfection V300 PHOTO
*************

Downgrade libusb-0.1.4-1.fc19.*.rpm resolve the problem. Before it is necessary remove also libusb-devel, if installed. 

Thanks for your help.

Radim

Comment 32 Hans de Goede 2013-12-05 11:07:53 UTC
(In reply to Radim Janku from comment #31)
> Downgrade libusb-0.1.4-1.fc19.*.rpm resolve the problem. Before it is
> necessary remove also libusb-devel, if installed. 

Thanks for testing I've filed a new bug to fix the misbehaviour in libusb which ultimately is the cause of all this trouble, see bug 1038566.

Comment 33 Bob Whitinger 2014-02-13 02:37:37 UTC
Just to document that this issue seems to still be active.  I experienced this problem in Fedora 20 with an Epson V500 scanner that was working under F19 but failed with "libusbx: warning [libusb_close] internal signalling write failed, closing anyway" after moving to F20.

Downgrading from libusb-0.1.5-2.fc20 to libusb-0.1.4-1.fc19 fixed the problem for me.  Fortunately the F19 rpm appears to be binary compatible in the F20 environment.

Comment 34 Hans de Goede 2014-02-14 09:39:16 UTC
Hi,

(In reply to Bob Whitinger from comment #33)
> Just to document that this issue seems to still be active.  I experienced
> this problem in Fedora 20 with an Epson V500 scanner that was working under
> F19 but failed with "libusbx: warning [libusb_close] internal signalling
> write failed, closing anyway" after moving to F20.
> 
> Downgrading from libusb-0.1.5-2.fc20 to libusb-0.1.4-1.fc19 fixed the
> problem for me.  Fortunately the F19 rpm appears to be binary compatible in
> the F20 environment.

Yes, this is my bad, I've not gotten around to fixing libusb for this, thanks for the reminder. A fix for this is now on its way to updates-testing. Note that this is a separate bug which is being tracked in bug 1038566, so any notifications about the updated packages will be going there instead of to this bug.

Regards,

Hans


Note You need to log in before you can comment on or make changes to this bug.