Bug 1392016

Summary: idling calibre busy with select polling with short timeouts
Product: [Fedora] Fedora Reporter: Frank Ch. Eigler <fche>
Component: calibreAssignee: Kevin Fenzi <kevin>
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 25CC: chkr, frankly3d, helio, kevin, nushio, zbyszek
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-12 10:41:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Frank Ch. Eigler 2016-11-04 15:15:13 UTC
Run calibre-2.60.0-2.fc24.x86_64
Minimize it, do not interact with it on network etc.
strace -f it.
Observe active polling loops with very short timeouts:

[pid 26246] <... select resumed> )      = 0 (Timeout)
[pid 26246] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 26242] <... select resumed> )      = 0 (Timeout)
[pid 26242] select(0, NULL, NULL, NULL, {0, 35727} <unfinished ...>
[pid 26245] <... select resumed> )      = 0 (Timeout)
[pid 26245] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 26247] <... select resumed> )      = 0 (Timeout)
[pid 26247] select(0, NULL, NULL, NULL, {0, 100000} <unfinished ...>
[pid 26242] <... select resumed> )      = 0 (Timeout)
[pid 26242] select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
[pid 26242] select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
[pid 26242] select(0, NULL, NULL, NULL, {0, 4000} <unfinished ...>
[pid 26282] <... select resumed> )      = 0 (Timeout)
[pid 26282] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 26242] <... select resumed> )      = 0 (Timeout)
[pid 26242] select(0, NULL, NULL, NULL, {0, 8000} <unfinished ...>
[pid 26246] <... select resumed> )      = 0 (Timeout)
[pid 26246] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 26242] <... select resumed> )      = 0 (Timeout)
[pid 26242] select(0, NULL, NULL, NULL, {0, 16000} <unfinished ...>
[pid 26245] <... select resumed> )      = 0 (Timeout)
[pid 26245] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 26242] <... select resumed> )      = 0 (Timeout)
[pid 26242] select(0, NULL, NULL, NULL, {0, 32000} <unfinished ...>

... so the process sucks up processor time for no apparent reason.

Comment 1 Kevin Fenzi 2016-11-04 16:25:21 UTC
Can you catch it eariler and see what exactly it's trying to poll here?

Comment 2 Frank Ch. Eigler 2016-11-04 16:38:34 UTC
Did some random gdb backtraces, and it looks like it's all over the place; multiple threads doing multiple things.   The calibre server functionality is part of it, but even with it turned off, we get other threads doing 

[pid 23766] <... select resumed> )      = 0 (Timeout)
[pid 23766] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 23763] <... select resumed> )      = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 8000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 32000} <unfinished ...>
[pid 23767] <... select resumed> )      = 0 (Timeout)
[pid 23767] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 23766] <... select resumed> )      = 0 (Timeout)
[pid 23766] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 23763] <... select resumed> )      = 0 (Timeout)


calibre uses several other downstream python libraries for communications & such, and a brief search didn't immediately identify a single calibre-side knob for general purpose polling intervals.  For example, that 50000-us poll is done by :

(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib64/python2.7/threading.py", line 362, in wait
    _sleep(delay)
  File "/usr/lib64/python2.7/Queue.py", line 177, in get
    self.not_empty.wait(remaining)
  File "/usr/lib64/calibre/calibre/utils/ipc/server.py", line 246, in run
    job = self.add_jobs_queue.get(True, 0.2)
  File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 777, in __bootstrap
    self.__bootstrap_inner()

so that 0.2 could be way too small.  Another thread:

(gdb) thread 9
[Switching to thread 9 (Thread 0x2b240702f700 (LWP 23767))]
#0  0x00002b23ca5d31c3 in select () at ../sysdeps/unix/syscall-template.S:84
84	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib64/python2.7/threading.py", line 362, in wait
    _sleep(delay)
  File "/usr/lib64/python2.7/threading.py", line 617, in wait
    self.__cond.wait(timeout, balancing)
  File "/usr/lib64/calibre/calibre/db/backup.py", line 44, in wait
    if self.stop_running.wait(interval):
  File "/usr/lib64/calibre/calibre/db/backup.py", line 50, in run
    self.wait(self.interval)
  File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 777, in __bootstrap
    self.__bootstrap_inner()


... etc., so kind of all over.  Just would be nice to whack one mole at a time until the strace quiets down.

Comment 3 Kevin Fenzi 2016-11-06 02:34:00 UTC
It might be worth bringing this up upstream ( https://bugs.launchpad.net/calibre ) as he might know what areas might be causing this.

Comment 4 Fedora End Of Life 2017-07-25 23:46:12 UTC
This message is a reminder that Fedora 24 is nearing its end of life.
Approximately 2 (two) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 24. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '24'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 24 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged  change the 'version' to a later Fedora
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 5 Fedora End Of Life 2017-08-08 19:08:35 UTC
Fedora 24 changed to end-of-life (EOL) status on 2017-08-08. Fedora 24 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 6 Fedora End Of Life 2017-11-16 19:10:52 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged  change the 'version' to a later Fedora
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 7 Fedora End Of Life 2017-12-12 10:41:02 UTC
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.