Bug 2036631 - Large Copr repositories time out on `copr list-packages`
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Copr
Classification: Community
Component: backend
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Copr Team
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-03 13:08 UTC by Karolina Surma
Modified: 2022-01-03 13:35 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2022-01-03 13:24:30 UTC
Embargoed:


Description Karolina Surma 2022-01-03 13:08:47 UTC
Description of problem:
For large Copr repositories, it becomes impossible to use CLI/API requests to process the results because the server times out.
In my case, a repository with 20000 builds still worked fine, but since it grew to 50000, I can't query its packages anymore.

What I want to achieve is to understand which builds failed to produce an SRPM and which failed during the RPM build.
I can request the Copr repository monitor, but I won't get the information I'm interested in. I can work around it by querying each failed build individually (`copr-cli get-package --name foo`) to understand where it failed, but this was much more straightforward with `copr-cli list-packages`.

It would be better if I could set an extended timeout value on the CLI when I suspect my query will require more time.


Version-Release number of selected component (if applicable):
current copr.fedorainfracloud.org (I'm sorry, I'm unable to find any version information on the website)
I have installed: copr version 1.97

How reproducible:
Always

Steps to Reproduce:
1. Run: copr-cli --debug list-packages --with-latest-build @copr/PyPI


Actual results:
$ copr-cli --debug list-packages --with-latest-build @copr/PyPI
#  Debug log enabled  #
[13:19:50] {/usr/lib/python3.10/site-packages/urllib3/connectionpool.py:971} DEBUG - Starting new HTTPS connection (1): copr.fedorainfracloud.org:443
[13:19:51] {/usr/lib/python3.10/site-packages/urllib3/connectionpool.py:452} DEBUG - https://copr.fedorainfracloud.org:443 "GET /api_3/package/list?ownername=%40copr&projectname=PyPI&with_latest_build=True&with_latest_succeeded_build=False HTTP/1.1" 308 520
[13:20:51] {/usr/lib/python3.10/site-packages/urllib3/connectionpool.py:452} DEBUG - https://copr.fedorainfracloud.org:443 "GET /api_3/package/list/?ownername=%40copr&projectname=PyPI&with_latest_build=True&with_latest_succeeded_build=False HTTP/1.1" 504 247

Something went wrong:
Error: Request is not in JSON format, there is probably a bug in the API code.

Server response:
----------------


504 Gateway Timeout

Gateway Timeout
The gateway did not receive a timely response
from the upstream server or application.


Expected results:
JSON with the packages is fetched.

Comment 1 Pavel Raiskup 2022-01-03 13:24:30 UTC
Thanks for reporting this!

This is tracked here: https://pagure.io/copr/copr/issue/757
I will close this to avoid duplication.

> copr-cli --debug list-packages --with-latest-build @copr/PyPI

FTR, it should be possible to use the optimized "monitor" call:
$ time copr monitor --output-format text-row @copr/PyPI --fields "name, chroot, build_id, state, url_build_log, url_backend_log, url_build" > /tmp/full.json
real    0m23.367s
user    0m4.181s
sys     0m1.167s

> What I want to achieve, is understanding which builds failed to produce SRPM and which failed during the RPM build.

Check whether the `url_build_log` field is None. We could create a separate field for this (upstream issue preferred).
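
The `url_build_log` check could be sketched roughly as follows. This is only an illustration, not part of copr-cli. It assumes the monitor records have already been loaded into a list of dicts keyed by the field names used above, that failed builds carry the state "failed", and that a missing `url_build_log` means the build never got as far as the RPM build. The function name and the sample records are hypothetical.

```python
def split_failed_builds(records):
    """Split failed monitor records by the stage at which they failed.

    Assumption: url_build_log is None when the build produced no RPM
    build log, i.e. it failed before the RPM build (at the SRPM stage).
    """
    srpm_failures = []
    rpm_failures = []
    for record in records:
        if record.get("state") != "failed":
            continue  # only failed builds are of interest here
        if record.get("url_build_log") is None:
            srpm_failures.append(record)
        else:
            rpm_failures.append(record)
    return srpm_failures, rpm_failures


if __name__ == "__main__":
    # Hypothetical sample records; real data would come from the monitor call.
    sample = [
        {"name": "python-foo", "chroot": "fedora-rawhide-x86_64",
         "state": "failed", "url_build_log": None},
        {"name": "python-bar", "chroot": "fedora-rawhide-x86_64",
         "state": "failed",
         "url_build_log": "https://example.invalid/builder-live.log"},
        {"name": "python-baz", "chroot": "fedora-rawhide-x86_64",
         "state": "succeeded",
         "url_build_log": "https://example.invalid/builder-live.log"},
    ]
    srpm, rpm = split_failed_builds(sample)
    print("failed before RPM build:", [r["name"] for r in srpm])
    print("failed during RPM build:", [r["name"] for r in rpm])
```

If the records come from a text output format instead, the missing value may arrive as the string "None" rather than a real None, so the comparison would need adjusting.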

Comment 2 Karolina Surma 2022-01-03 13:35:31 UTC
Thank you for the prompt response and for providing the alternative; I will check this out.

