Created attachment 460829 [details] strace output from elinks Description of problem: elinks startup is very slow if pcscd isn't running. At startup elinks looks for /var/run/pcscd.comm If it doesn't exist, it waits and looks again, and again... Version-Release number of selected component (if applicable): elinks-0.12-0.23.pre5.fc14.x86_64 How reproducible: Every time. Steps to Reproduce: 1. service pcscd stop 2. elinks 3. Actual results: 10 second delay looking for /var/run/pcscd.comm Expected results: Instant startup Additional info: "touch /var/run/pcscd.comm" cures the problem
Thanks for reporting the bug! I can confirm it. It ends up hanging in pcsc-lite-libs on call of NSS_Init(): #0 __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:82 #1 SYS_USleep (iTimeVal=<value optimized out>) at sys_unix.c:69 #2 SCardEstablishContext (dwScope=<value optimized out>, ...) at winscard_clnt.c:411 #3 ckyCardContext_establish (ctx=0x9c0970, scope=0) at cky_card.c:464 #4 CKYCardContext_Create (scope=0) at cky_card.c:488 #5 SlotList::SlotList (this=0x9c08f0, log_=<value optimized out>) at slot.cpp:69 #6 C_Initialize (pInitArgs=<value optimized out>) at coolkey.cpp:253 #7 secmod_ModuleInit (mod=0x9b52b0, reload=0x7fffffffd910, ...) at pk11load.c:252 #8 secmod_LoadPKCS11Module (mod=0x9b52b0, oldModule=0x7fffffffd910) at pk11load.c:492 #9 SECMOD_LoadModule (modulespec=0x9ae730 "library=libcoolkeypk11.so ...) at pk11pars.c:1108 #10 SECMOD_LoadModule (modulespec=0x9aa2f0 "name=\"NSS Internal Module\" ...) at pk11pars.c:1143 #11 nss_InitModules (configdir=0x39d580b9a2 "/etc/pki/nssdb", ...) at nssinit.c:461 #12 nss_Init (configdir=0x39d580b9a2 "/etc/pki/nssdb", ...) at nssinit.c:620 #13 NSS_Init (configdir=<value optimized out>) at nssinit.c:714 #14 SSL_library_init () at ssl.c:1258 #15 init_openssl (module=<value optimized out>) at ssl.c:52 #16 init_module (module=0x73b2a0) at module.c:116 #17 init_module (module=0x73b160) at module.c:120 #18 init_modules (modules=0x739fa0) at module.c:173 #19 init () at main.c:184 #20 select_loop (init=0x46f870 <init>) at select.c:222 #21 main (argc=<value optimized out>, argv=<value optimized out>) at main.c:353 (gdb) info line SCardEstablishContext Line 388 of "winscard_clnt.c" starts at address 0x7ffff18e1f70 <SCardEstablishContext> and ends at 0x7ffff18e1f72 <SCardEstablishContext+2>. (gdb) info sym SCardEstablishContext SCardEstablishContext in section .text of /usr/lib64/libpcsclite.so.1 $ rpm -qf /usr/lib64/libpcsclite.so.1 pcsc-lite-libs-1.6.4-1.fc14.x86_64
running within or without MUTT elinks hangs for me. strace shows: nanosleep({0, 100000000}, NULL) = 0 stat("/var/run/pcscd.comm", 0x7ffff40acf60) = -1 ENOENT (No such file or directory) nanosleep({0, 100000000}, NULL) = 0 stat("/var/run/pcscd.comm", 0x7ffff40acf60) = -1 ENOENT (No such file or directory) nanosleep({0, 100000000}, NULL) = 0 ....... over and over ......
A workaround is to just create that file: # touch /var/run/pcscd.comm ... but I have no idea what the file is supposed to be, neither what it is useful for. Bob, any idea? OpenSSL-powered elinks does not seem to suffer from that flaw...
It's part of the pcscd subsystem. Is pcsc-lite running? I think it's trying to start the pcsc-lite subsystem. In the old days it just failed if pcsc-lite was not running. You could probably do the following: set pcsc-lite to start from services automatically. or remove libcoolkey from /etc/pki/nssdb bob
(In reply to comment #4) > It's part of the pcscd subsystem. Is pcsc-lite running? The reproducer is just two commands (see comment #0). I guess that answers your question. But elinks needs to be started immediately, no matter if pcsc-lite is running or not. > You could probably do the following: > > set pcsc-lite to start from services automatically. > or > remove libcoolkey from /etc/pki/nssdb That means?
Looking at the backtrace from comment #1 I tend to think that performing such operations on call of NSS_Init() is really bad idea. Do you realize that one needs to initialize NSS, in order to be able to just perform some DES encryption using an internal slot? This bug hits also elinks' users that do not use SSL at all...
okay, confirmed this worked I did the following: yum install pcsc-lite /sbin/chkconfig --level 3 pcscd on /etc/init.d/pcscd start elinks google.com yah! I think I dropped pcsc when my machine was having h/w problems and thought it may be causing it.
Have the same problem after upgrading from Fedora 13 to 14. Don't really understand why "pcscd" service needs to be running for elinks to start without delay. Have never seen that with any other application before. Is this a bug or a feature? ;-)
(In reply to comment #8) > Is this a bug or a feature? ;-) The delay on _startup_ of elinks is definitely a bug. It needs to be fixed.
The delay is caused by the libpcsclite userspace library which is trying to start the pcscd daemon on demand. I think the fix which makes sense for F14 is to just disable the on-demand startup code. I'll come up with a patch to do that and submit an update shortly. Assigning the bug to myself.
pcsc-lite-1.6.4-2.fc14 has been submitted as an update for Fedora 14. https://admin.fedoraproject.org/updates/pcsc-lite-1.6.4-2.fc14
pcsc-lite-1.6.4-2.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update pcsc-lite'. You can provide feedback for this update here: https://admin.fedoraproject.org/updates/pcsc-lite-1.6.4-2.fc14
Sigh, we have a couple of things interacting here. 1. pcsc-lite got the automatic restart functionality so that pcscd didn't need to be started by the system at startup. 2. coolkey's installation always looks for pcscd, which always triggers the pcscd restart. So the if we remove the pcsc-lite automatic restart functionality, then we should turn on pcscd by default. A better solution is to not install coolkey by default, then there won't be a trigger to start pcsc-lite. I'm worried that users of other smart cards may have regressions if we just blindly disable the startup code. (Of course if these complaints don't materialize, then this patch is clearly the simplest). bob
I don't think this is something to worry about: the automatic startup functionality never worked in Fedora, as the users don't have enough permissions to run pcscd. pcscd needs to be suid root for that the userspace library to be able to start it, but in Fedora we never installed pcscd as suid root. So I'm disabling here something that could not have possibly worked for anybody else than the root user. With pcsc 1.6.6 / ccid 1.4.1, however, I think we'll be able to get proper hotplug working through udev and can get pcscd to automatically start up when USB readers are plugged in. This also means getting rid of the libhal dep.
Bob, as I've already said above, NSS_Init() is really not the right place to start system services. It has to be called on startup of many applications that mostly can't gain anything from that services. Why don't you try to employ another part of the NSS API for that purpose? NSS_Init() needs to be fast.
> I don't think this is something to worry about: the automatic startup > functionality never worked in Fedora, as the users don't have enough > permissions to run pcscd. OK, then I don't have as much problem with it. > With pcsc 1.6.6 / ccid 1.4.1, however, I think we'll be able to get proper > hotplug working through udev and can get pcscd to automatically start up when > USB readers are plugged in. In general, that would be preferable. I think we may have some coolkey work to handle the case where your smart card has the usb reader built in (not a rare case). bob
pcsc-lite-1.6.4-2.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.
I think I fixed the problem in SVN revision 6016 http://lists.alioth.debian.org/pipermail/pcsclite-cvs-commit/2011-October/005567.html The idea is to not wait if pcscd is not installed. By default the waiting time was 5 seconds not 10. PS: it would have been fine to report this bug upstream so I can fix it much sooner.
(In reply to comment #18) > The idea is to not wait if pcscd is not installed. By default the waiting time > was 5 seconds not 10. Does it handle the case when pcscd is installed but not started and the user has insufficient permissions to start it? > PS: it would have been fine to report this bug upstream so I can fix it much > sooner. I completely agree on this. This is the guideline that Fedora packagers _should_ follow: http://fedoraproject.org/wiki/Staying_close_to_upstream_projects
(In reply to comment #19) > Does it handle the case when pcscd is installed but not started and the user > has insufficient permissions to start it? It should be the case now after revisions 6018 and 6019. http://lists.alioth.debian.org/pipermail/pcsclite-cvs-commit/2011-October/005569.html http://lists.alioth.debian.org/pipermail/pcsclite-cvs-commit/2011-October/005570.html
Thanks for the fixes Ludovic! Unfortunately, I cannot confirm they resolve the issue as reported in comment #0 -- if pcsc is compiled without --disable-autostart, it still hangs with the same backtrace. If pcsc is compiled with --disable-autostart, elinks starts immediately, no matter if r6015 or r6019 is used. Additionally, the new process created by pcsc calls nearly close() on all descriptors up to 1023, while vast majority of them ends up with EBADF. There is still some badness going on...
Created attachment 527013 [details] strace of elinks' startup with upstream pcsc r6019
(In reply to comment #21) > Thanks for the fixes Ludovic! Unfortunately, I cannot confirm they resolve the > issue as reported in comment #0 -- if pcsc is compiled without > --disable-autostart, it still hangs with the same backtrace. If pcsc is > compiled with --disable-autostart, elinks starts immediately, no matter if > r6015 or r6019 is used. You need to use r6016 (not r6015), 6018 and 6019. You can use pcsc-lite-1.7.4-svn-6019.tar.bz2 from http://ludovic.rousseau.free.fr/softwares/pcsc-lite/ In your trace the problem is that /usr/local/sbin/pcscd does not exist (the daemon is not installed). This is solved in r6016. > Additionally, the new process created by pcsc calls nearly close() on all > descriptors up to 1023, while vast majority of them ends up with EBADF. There > is still some badness going on... I know. I don't think this is harmfull. Do you know a portable/POSIX way to get the list of opened file descriptors?
(In reply to comment #23) > You need to use r6016 (not r6015), 6018 and 6019. I wanted to compare the original version (r6015 is the unfixed one) with the new version (r6019 is the latest). > You can use pcsc-lite-1.7.4-svn-6019.tar.bz2 from > http://ludovic.rousseau.free.fr/softwares/pcsc-lite/ Thanks. I have no problems with using SVN directly. > In your trace the problem is that /usr/local/sbin/pcscd does not exist (the > daemon is not installed). The daemon is installed to /usr/sbin/pcscd > This is solved in r6016. So why am I observing elinks hanging up on startup with r6016? > I know. I don't think this is harmfull. If this is intentional (I have never seen it before), then the problem must be elsewhere. > Do you know a portable/POSIX way to get the list of opened file descriptors? I do not know the code. Is fcntl/FD_CLOEXEC what you are looking for?
(In reply to comment #24) > (In reply to comment #23) > > You need to use r6016 (not r6015), 6018 and 6019. > > I wanted to compare the original version (r6015 is the unfixed one) with the > new version (r6019 is the latest). It looks like your trace in comment 22 does not include the fix of libpcsclite from revision 6016. Can you regenerate a trace using the the latest pcsc-lite revision? > So why am I observing elinks hanging up on startup with r6016? Does it hang or just wait for 5 seconds before continuing? > > Do you know a portable/POSIX way to get the list of opened file descriptors? > > I do not know the code. Is fcntl/FD_CLOEXEC what you are looking for? I want to close file descriptors I did not open myself. I do not want the pcscd daemon to inherit from file descriptors opened by the application. Starting pcscd from the libpcsclite has problems like this one. Using systemd instead should solve them.
Created attachment 527018 [details] strace of elinks' startup with upstream pcsc r6020 (In reply to comment #25) > It looks like your trace in comment 22 does not include the fix of libpcsclite > from revision 6016. Do I need to rebuild pcscd for the fix to take effect? > Can you regenerate a trace using the the latest pcsc-lite revision? Attached. > Does it hang or just wait for 5 seconds before continuing? It waits 10 seconds before continuing.
Thanks for the fixes, but the code paths involved aren't used at all in Fedora packages. The whole forking system daemon from a user space library concept is flawed. One of the many problems is that it doesn't play nice with SELinux at all. The proper way to do on-demand startup is to tell some system daemon to start pcscd, instead of using fork() to spawn a new process from libpcsclite. See e.g. https://bugzilla.redhat.com/show_bug.cgi?id=639373 "SELinux is preventing /usr/lib/nspluginwrapper/npviewer.bin "execute" access on /usr/sbin/pcscd." One way to fix it would be to use DBUS bus activation. Another way would be using systemd socket activation. I have implemented the on-demand startup using systemd in upstream pcsc-lite and this is what Fedora packages starting from F16+ are using. http://lists.alioth.debian.org/pipermail/pcsclite-cvs-commit/2011-September/005480.html http://lists.alioth.debian.org/pipermail/pcsclite-cvs-commit/2011-September/005482.html http://lists.alioth.debian.org/pipermail/pcsclite-cvs-commit/2011-September/005484.html Kamil, are you actually having any problems with Fedora packages or are you just interested in upstream pcsc-lite development?
(In reply to comment #27) > Kamil, are you actually having any problems with Fedora packages or are you > just interested in upstream pcsc-lite development? No, there is no problem in Fedora any more since pcsc is configured with --disable-autostart. I was just checking if the current upstream fixes took effect.
(In reply to comment #26) > Created attachment 527018 [details] > strace of elinks' startup with upstream pcsc r6020 > > (In reply to comment #25) > > It looks like your trace in comment 22 does not include the fix of libpcsclite > > from revision 6016. > > Do I need to rebuild pcscd for the fix to take effect? No. Only the libpcsclite part is involved. But you configured with --prefix=/usr instead of /usr/local as in the previous trace. And /usr/sbin/pcscd do exist. So the problem is not the same as in the previous trace. So in your new configuration you have to rebuild and install the new pcscd. Can you try again: - with not pcscd installed on your system - with the latest pcscd installed and tell me. If you still have the problem generate a trace. > > Does it hang or just wait for 5 seconds before continuing? > > It waits 10 seconds before continuing. The coolkey PKCS#11 (used my elinks) may be doing 2 calls to SCardEstablishContext() so you have to pay the 5 seconds delay two times. Thanks for your efforts (even if the bug is not present in Fedora).
(In reply to comment #29) > No. Only the libpcsclite part is involved. But you configured with > --prefix=/usr instead of /usr/local as in the previous trace. And > /usr/sbin/pcscd do exist. So the problem is not the same as in the previous > trace. My bad. I was running it against pcscd from the Fedora package. However, the same happens with pcscd from usptream r6018. > So in your new configuration you have to rebuild and install the new pcscd. Indeed. Installing pcscd from r6019 to /usr/sbin/pcscd finally fixed it. > The coolkey PKCS#11 (used my elinks) may be doing 2 calls to > SCardEstablishContext() so you have to pay the 5 seconds delay two times. Your are right. Actually the coolkey module itself is being initialized twice per one call of NSS_Init(). > Thanks for your efforts (even if the bug is not present in Fedora). The long-term goal is to make elinks over NSS runnable on other distros, too. But that is another story :-)
Good. So the 10 seconds delay is now fixed upstream. Right?
Yes. At least for the case I was testing. That is, pcscd is installed but not running and elinks being started by a regular user.