Bug 1275422

Summary: Bugzilla.getproducts() method throws 502 Server Error: Proxy Error for url: https://bugzilla.redhat.com/xmlrpc.cgi
Product: [Community] Bugzilla
Reporter: Spam Kicha <spamkicha>
Component: Bugzilla General
Assignee: PnT DevOps Devs <hss-ied-bugs>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.4
CC: crobinso, dzickus, jmcdonal, jskarvad, mtahir, qgong, wwoods
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Mac OS
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-05 01:58:53 UTC
Type: Bug

Description Spam Kicha 2015-10-26 21:11:39 UTC
Description of problem:
I'm using python-bugzilla to access bugzilla.redhat.com, and I get an HTTP 502 server error when I call the Bugzilla.getproducts() method.

How reproducible:
very consistent

Steps to Reproduce:
Here is the Python REPL output:

Python 2.7.10 (v2.7.10:15c95b7d81dc, May 23 2015, 09:33:12)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import bugzilla
>>> print bugzilla.version
1.2.2
>>> b = bugzilla.Bugzilla(url="https://bugzilla.redhat.com")
>>> print b.getbug(2)
#2      CLOSED     - Preston Brown - broken links of default index.html
>>> b.getproducts()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bugzilla/base.py", line 766, in getproducts
    self._products = self._getproducts(**kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bugzilla/base.py", line 900, in _getproducts
    r = self._getproductinfo(product_ids['ids'], **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bugzilla/base.py", line 895, in _getproductinfo
    ret = self._proxy.Product.get(kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xmlrpclib.py", line 1240, in __call__
    return self.__send(self.__name, args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bugzilla/base.py", line 168, in _ServerProxy__request
    ret = ServerProxy._ServerProxy__request(self, methodname, params)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xmlrpclib.py", line 1599, in __request
    verbose=self.__verbose
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bugzilla/base.py", line 259, in request
    return self._request_helper(url, request_body)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bugzilla/base.py", line 236, in _request_helper
    response.raise_for_status()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/models.py", line 837, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 502 Server Error: Proxy Error for url: https://bugzilla.redhat.com/xmlrpc.cgi

Comment 1 Cole Robinson 2015-10-28 13:52:53 UTC
This isn't anything specific to python-bugzilla: the bare Product.get() call hits the proxy error (in effect, a timeout) because it is trying to return a ton of data, namely info on every product and every component.

However, if you tell Product.get to return only product names, for example, it takes under 3 seconds:

$ cat test.py 
import bugzilla

bzapi = bugzilla.Bugzilla("bugzilla.redhat.com")
bzapi.getproducts(include_fields=["name"])

$ time python test.py 

real	0m2.390s
user	0m0.148s
sys	0m0.039s

So I'm moving this back to bugzilla.redhat.com, but I don't know if there's really any 'fix' here.

Comment 2 Jason McDonald 2015-10-29 05:24:45 UTC
In my opinion, this is not a bug in BZ per se, but rather an over-ambitious use of the API.

I am not currently aware of any valid use-cases that require fetching all product metadata for all products in Red Hat Bugzilla in one go. If you really do need to get data for all products, I would be interested to hear more about your use-case.

Note that approximately one-third of the products in the database are retired, and it's rare that anybody needs to fetch data about those.

In addition, some of the active products contain very large lists of components, and I believe that these are the main reason for the call timing out. Fedora has ~18,000 components. Each RHEL version has a few thousand.

If you only need to know about specific products, Bugzilla will respond faster if you only request data for those products.  Likewise, if you don't need all the fields, Bugzilla will respond faster if you only ask for the fields you need.
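
As a minimal sketch of that advice, the argument struct for Product.get can be limited to named products and a few fields. The helper name build_product_get_params is hypothetical (it is not part of python-bugzilla), and the product and field names are only illustrative; the "names" and "include_fields" keys are the ones described in the Product.get documentation.

```python
# Sketch only: build_product_get_params is a hypothetical helper, not part
# of python-bugzilla. It assembles the argument struct for Product.get.
def build_product_get_params(names=None, include_fields=None):
    """Return a Product.get argument dict restricted to the given
    product names and fields; arguments left as None are omitted."""
    params = {}
    if names:
        params["names"] = list(names)
    if include_fields:
        params["include_fields"] = list(include_fields)
    return params


params = build_product_get_params(
    names=["Fedora"],               # only the product you care about
    include_fields=["id", "name"],  # skip the large component lists
)

# With network access, the struct could then be sent over XML-RPC, e.g.
# (Python 3 standard library; Python 2 would use xmlrpclib.ServerProxy):
#   import xmlrpc.client
#   proxy = xmlrpc.client.ServerProxy("https://bugzilla.redhat.com/xmlrpc.cgi")
#   result = proxy.Product.get(params)
```

The network call is left commented out so the sketch can be read and run without touching the server.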

In case you haven't seen it, there is some documentation for Product.get and related API calls at https://bugzilla.redhat.com/docs/en/html/api/Bugzilla/WebService/Product.html.  This includes information about how to limit the fields you ask for.

Comment 3 Spam Kicha 2015-11-05 00:44:26 UTC
Hi, sorry, this may be my ignorance. I didn't expect bugzilla.getproducts() to return a massive amount of data. I was actually trying to refine my query to reduce the load on the server and wanted to get a listing of the products tracked in Bugzilla.

Comment 4 Cole Robinson 2015-11-05 01:58:53 UTC
The simplest way to get just the list of product names is:

  bzapi.getproducts(include_fields=["name"])
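
A rough sketch of turning that result into a plain list of names, assuming the call returns a list of dicts that each carry a "name" key. The helper product_names is hypothetical, and the sample data below is made up to stand in for a real server response.

```python
def product_names(products):
    """Return the sorted product names from a list of product dicts,
    skipping any entry that has no "name" key."""
    return sorted(p["name"] for p in products if "name" in p)


# Made-up sample standing in for bzapi.getproducts(include_fields=["name"]).
sample = [
    {"name": "Fedora"},
    {"name": "Bugzilla"},
    {"name": "Red Hat Enterprise Linux 7"},
]
names = product_names(sample)
```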