Bug 437500 - hald-addon-input eats 100% CPU with ThinkPad fingerprint reader
Summary: hald-addon-input eats 100% CPU with ThinkPad fingerprint reader
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: hal
Version: rawhide
Hardware: All
OS: Linux
low
low
Target Milestone: ---
Assignee: David Zeuthen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F9Target
TreeView+ depends on / blocked
 
Reported: 2008-03-14 16:26 UTC by Zdenek Kabelac
Modified: 2013-03-06 03:55 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-04-30 02:02:13 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Zdenek Kabelac 2008-03-14 16:26:31 UTC
Description of problem:
Today I've notice that during hald-addon-input suddenly started to eat 100%CPU

I've just grabbed some strace of process to show what the loop was doing:


poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN},
{fd=8, events=POLLIN}, {fd=9, events=POLLIN,
 revents=POLLNVAL}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 7, -1) = 1
poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN},
{fd=8, events=POLLIN}, {fd=9, events=POLLIN,
 revents=POLLNVAL}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 7, -1) = 1
...poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN},
{fd=8, events=POLLIN}, {fd=9, events=POLLIN,
 revents=POLLNVAL}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN,
revents=POLLIN}], 7, -1) = 2
read(4,
"\217\243\332G\0\0\0\0=l\10\0\0\0\0\0\4\0\4\0\34\0\0\0\217\243\332G\0\0\0\0"...,
1024) = 72
read(4, 0x60e4f0, 1024)                 = -1 EAGAIN (Resource temporarily
unavailable)
poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN},
{fd=8, events=POLLIN}, {fd=9, events=POLLIN,
 revents=POLLNVAL}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 7, -1) = 1
.....



Version-Release number of selected component (if applicable):
1344688 hal-0.5.11-0.git20080304.4.fc9.x86_64

How reproducible:
don't know :(

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Zdenek Kabelac 2008-03-20 21:41:40 UTC
Happened to me again - now I've attached with gdb (unfortunately withoug
debuginfo this time :(

#0  0x00000035336dc30f in poll () from /lib64/libc.so.6
#1  0x0000003b9c83aaf8 in ?? () from /lib64/libglib-2.0.so.0
#2  0x0000003b9c83b18d in g_main_loop_run () from /lib64/libglib-2.0.so.0
#3  0x0000000000401d75 in g_str_equal ()
#4  0x000000353361e47a in __libc_start_main () from /lib64/libc.so.6
#5  0x0000000000401b29 in g_str_equal ()
#6  0x00007fffdd6b5368 in ?? ()
#7  0x000000000000001c in ?? ()
#8  0x0000000000000001 in ?? ()
#9  0x00007fffdd6b5e55 in ?? ()
#10 0x0000000000000000 in ?? ()


Hopefully I'll be better prepared...
(When I've detached gdb - the HAL daemon stopped its looping)
So I could not check again with debugs

Will try next time...


Comment 2 Zdenek Kabelac 2008-03-21 21:43:16 UTC
Ok - got the place with debuginfo - 
(gdb) bt
#0  0x00000035336dc30f in poll () from /lib64/libc.so.6
#1  0x0000003b9c83aaf8 in g_main_context_iterate (context=<value optimized out>,
block=<value optimized out>, 
    dispatch=<value optimized out>, self=<value optimized out>) at gmain.c:2945
#2  0x0000003b9c83b18d in IA__g_main_loop_run (loop=<value optimized out>) at
gmain.c:2844
#3  0x0000000000401d75 in main (argc=<value optimized out>, argv=<value
optimized out>) at addon-input.c:487

Thought the places was probably already obvious - so just some variables must be
set in weird way.
I guess it should be easy to see when this loop appears - most probably returned
descriptor is probably not being read for some reason ?

strace:

poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN},
{fd=9, events=POLLIN, revents=POLLNVAL}, {fd=9, events=POLLIN,
revents=POLLNVAL}, {fd=5, events=POLLIN}, {fd=-1}, {fd=7, events=POLLIN}], 8,
-1) = 2
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN},
{fd=9, events=POLLIN, revents=POLLNVAL}, {fd=9, events=POLLIN,
revents=POLLNVAL}, {fd=5, events=POLLIN}, {fd=-1}, {fd=7, events=POLLIN}], 8,
-1) = 2
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN},
{fd=9, events=POLLIN, revents=POLLNVAL}, {fd=9, events=POLLIN,
revents=POLLNVAL}, {fd=5, events=POLLIN}, {fd=-1}, {fd=7, events=POLLIN}], 8,
-1) = 2
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN},
{fd=9, events=POLLIN, revents=POLLNVAL}, {fd=9, events=POLLIN,
revents=POLLNVAL}, {fd=5, events=POLLIN}, {fd=-1}, {fd=7, events=POLLIN,
revents=POLLIN}], 8, -1) = 3
read(7,
"i*\344G\0\0\0\0\245\337\3\0\0\0\0\0\4\0\4\0\35\0\0\0i*\344G\0\0\0\0"..., 1024) = 72
read(7, 0x60dfb0, 1024)                 = -1 EAGAIN (Resource temporarily
unavailable)
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN},
{fd=9, events=POLLIN, revents=POLLNVAL}, {fd=9, events=POLLIN,
revents=POLLNVAL}, {fd=5, events=POLLIN}, {fd=-1}, {fd=7, events=POLLIN}], 8,
-1) = 2
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN},
{fd=9, events=POLLIN, revents=POLLNVAL}, {fd=9, events=POLLIN,
revents=POLLNVAL}, {fd=5, events=POLLIN}, {fd=-1}, {fd=7, events=POLLIN}], 8,
-1) = 2
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN},
{fd=9, events=POLLIN, revents=POLLNVAL}, {fd=9, events=POLLIN,
revents=POLLNVAL}, {fd=5, events=POLLIN}, {fd=-1}, {fd=7, events=POLLIN}], 8,
-1) = 2


ltrace:g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0,
0x7fff703dfde8) = 1
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 1
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 3
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 1
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 1
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 1
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 3
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 1
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 1
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 1
g_io_channel_read_chars(0x60d720, 0x60d690, 24, 0x7fff703dfde0, 0x7fff703dfde8) = 3


Are you able to get something from this - or do you need some more help ?


Comment 3 Zdenek Kabelac 2008-04-04 18:00:41 UTC
Problem still persists with the version hal-0.5.11-0.2.rc2.fc9.x86_64
in case it would be interesting...


Comment 4 Jeremy Fitzhardinge 2008-04-12 05:21:50 UTC
I'm seeing this too, in hal-0.5.11-0.5.rc2.fc9.i386.  The strace is a continious
series of:

poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN},
{fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=11, events=POLLIN,
revents=POLLNVAL}, {fd=3, events=POLLIN}, {fd=10, events=POLLIN}, {fd=9,
events=POLLIN}], 9, -1) = 1

Note that fd=11 is getting POLLNVAL; its not an open fd for the process (it
doesn't appear in /proc/<pid>/fd for hald-addon-input).

Comment 5 Jeremy Fitzhardinge 2008-04-12 05:29:55 UTC
I wonder if its caused by thinkfinger, the thinkpad fingerprint reader?  I'm
using thinkfinger configured to allow me to login with gdm.  When it does so, it
creates a virtual keyboard device, which is then removed once the login is complete.

As an experiment, I ran "su" while hald-addon-input was spinning.  The new input
device appeared, and hald-addon-input opened it as fd 11, which stopped the poll
spinning.  When I swiped my finger and the thinkfinger device was removed,
hald-addon-input dealt with the device removal properly and is no longer polling
on fd 11 and is not spinning.

I wonder, however, if there's a race or something which makes it fail to deal
devices being removed, and it ends up spinning on their now-closed fds?

Comment 6 Zdenek Kabelac 2008-04-14 09:30:14 UTC
Hmmm - I'm also using thinkpad fingerprint reader - and I also often usually
notice this problem after login via gdm after suspend  (and usually in this case
for some reason gdb doens't even show me availability of reader - thus only
password is displayd - not sure if this combines together.

But I'm sure that in at least one case the hal started to loop without gdm
interaction, but maybe I've been logging on console - thus reader might be
involved (usually there is slight delay when I notice that 1 cpu is permanently
busy).

I'll keep an eye on this and instead of restarting haldaemon I'll try to play
with thinkfinger device.

Comment 7 Zdenek Kabelac 2008-04-15 09:11:19 UTC
I can confirm that as soon as the next thinfinger reading appears via the login
the spinnig of hald-addon-input stops.


Comment 8 Matthias Clasen 2008-04-17 18:54:43 UTC
Bastien, you have been playing with thinkfinger earlier, any insights ?

Comment 9 Bastien Nocera 2008-04-17 19:32:38 UTC
If by earlier you mean last year :)

It's probably caused by thinkfinger's uinput crap (they're using uinput to feed
newlines into the auth when you've swiped a finger, so it's like the user
pressing "Enter" after having typed in their password).

This is the code:
http://thinkfinger.svn.sourceforge.net/viewvc/thinkfinger/pam/pam_thinkfinger-uinput.c?view=markup

Comment 10 Bastien Nocera 2008-04-21 09:26:41 UTC
This should fix it:
http://thread.gmane.org/gmane.comp.freedesktop.hal/11326

Comment 11 Christopher Aillon 2008-04-29 20:45:06 UTC
The link in that thread is wrong.

http://gitweb.freedesktop.org/?p=hal.git;a=commitdiff;h=38d16647bf66f2e26a882ccd9bb5a0273278338d

is the diff that fixes it.  David, can you please add this to the F9 package?

Comment 12 David Zeuthen 2008-04-30 02:02:13 UTC
Fixed in 0.5.11-0.7.rc2.fc9 and 0.5.11-0.7.rc2.fc10.


Note You need to log in before you can comment on or make changes to this bug.