| Summary: | 'fedpkg build' high CPU consumption | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Jaroslav Škarvada <jskarvad> | ||||
| Component: | rpkg | Assignee: | Dennis Gilmore <dennis> | ||||
| Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
| Severity: | unspecified | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 23 | CC: | bochecha, cqi, dennis, lsedlar, pbabinca, s | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-08-22 08:33:48 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
|
||||||
|
Description
Jaroslav Škarvada
2016-03-22 17:35:51 UTC
(In reply to Jaroslav Škarvada from comment #0) > Why does it use aggressive polling? Typo, the right question: Why? Does it use aggressive polling? Can you try running the build with koji directly to see if it has the same symptoms? koji build f24 --scratch $(fedpkg giturl) When watching tasks, it polls Koji for the status once per second. As far as I know, there is no other way to get the same information. I guess all we could do here is change the interval. I guess the code is coming from the rpkg package which, regarding the watch method, seems to be just copy&paste of the code from the koji client. Koji client itself seems to consume a bit less CPU, but it still seems too much for me. It seems koji uses 5 seconds polling interval, but rpkg uses 1 second polling interval. I think it shouldn't matter much, 1 second is pretty long, but I think the following is the problem: http://fpaste.org/344262/ Notice the number of select calls in less than 100 milliseconds - this is what's utilizing the CPU. Increasing the sleep delay to 5 seconds didn't help much:
18:28:53.190394 select(4, [3], [], [], {43200, 0}) = 1 (in [3], left {43199, 585421})
18:28:53.605273 read(3, "\27\3\3\1,", 5) = 5
18:28:53.605495 read(3, "e\210\v8\3101+yQR\234p\343[\366\311\213\276b\227\217\226V1z\356\371\34/\16r\7"..., 300) = 300
18:28:53.605877 select(4, [3], [], [], {43200, 0}) = 1 (in [3], left {43199, 999994})
18:28:53.606229 select(4, [3], [], [], {43200, 0}) = 1 (in [3], left {43199, 999995})
.....
18:28:53.630567 select(4, [3], [], [], {43200, 0}) = 1 (in [3], left {43199, 598114})
18:28:54.032746 read(3, "\27\3\3\1+", 5) = 5
18:28:54.032944 read(3, "e\210\v8\3101+{\274\177\251%Ze8\347l\255\201\231\34]\267\242\33-\360\276].4\352"..., 299) = 299
18:28:54.033262 select(4, [3], [], [], {43200, 0}) = 1 (in [3], left {43199, 999994})
18:28:54.033516 select(4, [3], [], [], {43200, 0}) = 1 (in [3], left {43199, 999996})
...
You can see it never sleeps longer than 0.4 seconds.
The following hack of _watch_koji_tasks seems to improve the situation: - time.sleep(1) + clk = time.time() + while time.time() < clk + 5: + time.sleep(5) Or maybe better hack:
clk = time.time()
clkn = clk
while clkn < clk + 5:
time.sleep(clk + 5 - clkn)
clkn = time.time()
I retried on another machine and it seems that just: time.sleep(5) should be enough. (In reply to Jaroslav Škarvada from comment #7) > I retried on another machine and it seems that just: > time.sleep(5) > should be enough. and consistent with koji cli Created attachment 1139677 [details]
Proposed fix
Lower CPU utilization by increasing the sleep interval and make it consistent with koji-cli defaults.
There may be more to optimize (e.g. the number of select calls), but this should be a good start. I just tried running a scratch-build (didn't have anything to build), and fedpkg barely uses any CPU here. Those select calls you mention, do they happen during the "time.sleep(1)" ? Or do they happen in between sleeps, when fedpkg is actually doing something? (In reply to Mathieu Bridon from comment #11) > I just tried running a scratch-build (didn't have anything to build), and > fedpkg barely uses any CPU here. > > Those select calls you mention, do they happen during the "time.sleep(1)" ? > Or do they happen in between sleeps, when fedpkg is actually doing something? It seems it happen between sleeps. By looking on the strace, it's full of: 17:08:55.278088 select(4, [3], [], [], {43200, 0}) = 1 (in [3], left {43199, 999994}) 17:08:55.278412 select(4, [3], [], [], {43200, 0}) = 1 (in [3], left {43199, 999994}) i.e. waiting for one FD #3, waiting for input, waiting for 43200 seconds, returned after 43199.999994 seconds. Is the timing so critical? Cannot it wait longer and read the FD in batches? As of fedpkg-1.24-3.fc24.noarch, this does not happen. Thanks for your effort to dig into this issue. To close this bug at this moment. Feel free to reopen if this performance issue happens again. |