| Summary: | idling calibre busy with select polling with short timeouts | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Frank Ch. Eigler <fche> |
| Component: | calibre | Assignee: | Kevin Fenzi <kevin> |
| Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 25 | CC: | chkr, frankly3d, helio, kevin, nushio, zbyszek |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-12-12 10:41:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Can you catch it eariler and see what exactly it's trying to poll here? Did some random gdb backtraces, and it looks like it's all over the place; multiple threads doing multiple things. The calibre server functionality is part of it, but even with it turned off, we get other threads doing
[pid 23766] <... select resumed> ) = 0 (Timeout)
[pid 23766] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 23763] <... select resumed> ) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 8000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
[pid 23763] select(0, NULL, NULL, NULL, {0, 32000} <unfinished ...>
[pid 23767] <... select resumed> ) = 0 (Timeout)
[pid 23767] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 23766] <... select resumed> ) = 0 (Timeout)
[pid 23766] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 23763] <... select resumed> ) = 0 (Timeout)
calibre uses several other downstream python libraries for communications & such, and a brief search didn't immediately identify a single calibre-side knob for general purpose polling intervals. For example, that 50000-us poll is done by :
(gdb) py-bt
Traceback (most recent call first):
File "/usr/lib64/python2.7/threading.py", line 362, in wait
_sleep(delay)
File "/usr/lib64/python2.7/Queue.py", line 177, in get
self.not_empty.wait(remaining)
File "/usr/lib64/calibre/calibre/utils/ipc/server.py", line 246, in run
job = self.add_jobs_queue.get(True, 0.2)
File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
self.run()
File "/usr/lib64/python2.7/threading.py", line 777, in __bootstrap
self.__bootstrap_inner()
so that 0.2 could be way too small. Another thread:
(gdb) thread 9
[Switching to thread 9 (Thread 0x2b240702f700 (LWP 23767))]
#0 0x00002b23ca5d31c3 in select () at ../sysdeps/unix/syscall-template.S:84
84 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) py-bt
Traceback (most recent call first):
File "/usr/lib64/python2.7/threading.py", line 362, in wait
_sleep(delay)
File "/usr/lib64/python2.7/threading.py", line 617, in wait
self.__cond.wait(timeout, balancing)
File "/usr/lib64/calibre/calibre/db/backup.py", line 44, in wait
if self.stop_running.wait(interval):
File "/usr/lib64/calibre/calibre/db/backup.py", line 50, in run
self.wait(self.interval)
File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
self.run()
File "/usr/lib64/python2.7/threading.py", line 777, in __bootstrap
self.__bootstrap_inner()
... etc., so kind of all over. Just would be nice to whack one mole at a time until the strace quiets down.
It might be worth bringing this up upstream ( https://bugs.launchpad.net/calibre ) as he might know what areas might be causing this. This message is a reminder that Fedora 24 is nearing its end of life. Approximately 2 (two) weeks from now Fedora will stop maintaining and issuing updates for Fedora 24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '24'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 24 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. Fedora 24 changed to end-of-life (EOL) status on 2017-08-08. Fedora 24 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed. This message is a reminder that Fedora 25 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '25'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 25 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed. |
Run calibre-2.60.0-2.fc24.x86_64 Minimize it, do not interact with it on network etc. strace -f it. Observe active polling loops with very short timeouts: [pid 26246] <... select resumed> ) = 0 (Timeout) [pid 26246] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...> [pid 26242] <... select resumed> ) = 0 (Timeout) [pid 26242] select(0, NULL, NULL, NULL, {0, 35727} <unfinished ...> [pid 26245] <... select resumed> ) = 0 (Timeout) [pid 26245] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...> [pid 26247] <... select resumed> ) = 0 (Timeout) [pid 26247] select(0, NULL, NULL, NULL, {0, 100000} <unfinished ...> [pid 26242] <... select resumed> ) = 0 (Timeout) [pid 26242] select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout) [pid 26242] select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout) [pid 26242] select(0, NULL, NULL, NULL, {0, 4000} <unfinished ...> [pid 26282] <... select resumed> ) = 0 (Timeout) [pid 26282] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...> [pid 26242] <... select resumed> ) = 0 (Timeout) [pid 26242] select(0, NULL, NULL, NULL, {0, 8000} <unfinished ...> [pid 26246] <... select resumed> ) = 0 (Timeout) [pid 26246] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...> [pid 26242] <... select resumed> ) = 0 (Timeout) [pid 26242] select(0, NULL, NULL, NULL, {0, 16000} <unfinished ...> [pid 26245] <... select resumed> ) = 0 (Timeout) [pid 26245] select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...> [pid 26242] <... select resumed> ) = 0 (Timeout) [pid 26242] select(0, NULL, NULL, NULL, {0, 32000} <unfinished ...> ... so the process sucks up processor time for no apparent reason.