Bug 1220362
| Summary: | 502 proxy error when calling bugzilla.add_edit_component() from python-bugzilla | | |
|---|---|---|---|
| Product: | [Community] Bugzilla | Reporter: | Pierre-YvesChibon <pingou> |
| Component: | WebService | Assignee: | PnT DevOps Devs <hss-ied-bugs> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | tools-bugs <tools-bugs> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.4 | CC: | jmcdonal, kevin, qgong |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-08-19 01:53:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Pierre-YvesChibon
2015-05-11 12:06:54 UTC
I've started another run of the script to see if I can generate the error again, but regardless of the 502 error, the script is now running more than 6 times slower than it used to :(

The Bugzilla master database server has been experiencing some severe load spikes recently. Investigation is ongoing. I believe that mkeir added some mod_qos config to throttle the rate of requests from your script in an attempt to determine whether that was the source of the load spikes. The high load and the throttling will both be contributing to the increase in run-time for your script.

At present, the evidence I can see in the logs suggests that the pkgdb-sync-bugzilla script is not the primary source of the load spikes we are seeing, but that the script can be improved. The script seems to make an excessive number of Component.update RPC calls, and there was a dramatic increase in the quantity of those on May 10. For the three days prior to that, pkgdb02.phx2.fedoraproject.org hit Bugzilla about 25,000 times per day. On May 10, it hit Bugzilla about 200,000 times.

From inspection of logging data, it appears that the script is rewriting all component ownership information on every run, rather than just what has changed since the previous run. For example, the script called Component.update 608 times for the MySQL component on May 10. Looking at the attached log extract, it appears that there may be some other problems for that component, as the script is making separate calls for "MySQL" and "mysql", with different ownership details for each. Also, some calls include a component description and some do not. In any case, one can see that the assignee and default_cc for the component do not change from one call to the next, so there's no need for those calls to be made.

Ideally, the script should only sync changes to the data and not rewrite data that Bugzilla already has up-to-date.
Doing that would result in much shorter run-time for the script as well as reducing the load on Bugzilla.

Finally, note that it is possible to update multiple components in a single call to Component.update, as described in https://bugzilla.redhat.com/docs/en/html/api/extensions/RedHat/lib/WebService/Component.html#Update_Components. Updating multiple components in a single call is considerably faster than updating the same number of components in separate calls, and again will help to reduce the load on Bugzilla.

I agree that this script could be optimized, but it's still taking it 6 times longer than it used to :) At the time I ran the tests, and earlier when I was running into these 502 errors, I was speaking with mkeir on IRC, who told me that the mod_qos config had been disabled. Together with him we figured out that we had several instances of our cron job running, which is what was producing the traffic you were seeing. I think these several instances of the script were, at least partly, due to the script taking much longer and thus the cron starting another copy while the first was still running. We are now using a lock to ensure this situation does not occur again in the future.

As for the MySQL vs mysql example you give, pkgdb has both components:
https://admin.fedoraproject.org/pkgdb/package/mysql/
https://admin.fedoraproject.org/pkgdb/package/MySQL/
Both are retired but they still exist, and I guess they probably also exist in Bugzilla's component list for Fedora.

We have done some work on pkgdb to list all the packages that are retired on all active branches, with the idea that they could be removed from Bugzilla:
https://admin.fedoraproject.org/pkgdb/api/#list_packages_retired
but this might deserve a bug report of its own.

I will see if we can make the script smarter and have it update multiple components at once.
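The two suggestions made above (send only what has changed, and put several components in one call) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the pkgdb-sync-bugzilla code: `ComponentState`, `changed_components`, `batched`, `sync`, and the `send_batch` callback are hypothetical names introduced for illustration, and the actual request shape for Component.update should be taken from the API documentation linked above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterator, List


@dataclass(frozen=True)
class ComponentState:
    """Ownership data for one component (hypothetical shape)."""
    assignee: str
    default_cc: FrozenSet[str] = frozenset()


def changed_components(
    desired: Dict[str, ComponentState],
    current: Dict[str, ComponentState],
) -> Dict[str, ComponentState]:
    """Keep only components whose desired state differs from the state
    Bugzilla is already known to have, or that are not known at all."""
    return {
        name: state
        for name, state in desired.items()
        if current.get(name) != state
    }


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def sync(
    desired: Dict[str, ComponentState],
    current: Dict[str, ComponentState],
    send_batch: Callable[[Dict[str, ComponentState]], None],
    batch_size: int = 50,
) -> int:
    """Send only the changed components, several per call.

    `send_batch` is a placeholder for whatever issues the real RPC.
    Returns the number of calls made.
    """
    changes = changed_components(desired, current)
    calls = 0
    for names in batched(sorted(changes), batch_size):
        send_batch({name: changes[name] for name in names})
        calls += 1
    return calls
```

With this shape, a run in which nothing has changed makes no calls at all, and the default batch size of 50 is an arbitrary illustrative value rather than a documented limit.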
(In reply to Pierre-YvesChibon from comment #4)
> I agree that this script could be optimized, but it's still taking it 6
> times longer than it used to :)

The increased run-time was at least partly due to another user overloading the system with large numbers of concurrent search queries. We're working with that user to improve their application so that it uses Bugzilla more fairly and we have also asked them to use staging systems for testing rather than the live system.

> As for the MySQL vs mysql example you give, pkgdb has both components:
> https://admin.fedoraproject.org/pkgdb/package/mysql/
> https://admin.fedoraproject.org/pkgdb/package/MySQL/
> Both are retired but they still exist, and I guess they probably also exist
> in Bugzilla's component list for Fedora.

This might be a problem when Bugzilla changes backend databases from MySQL to PostgreSQL in a few months' time. Pg is case-insensitive by default, so I would expect that it might match both of these for a Component.update call instead of matching only the correct one. For retired components that's probably not a problem, but if there are other instances where the names of live components differ only in case, Component.update is likely to misbehave.

> We have done some work on pkgdb to list all the packages that are retired on
> all active branches, with the idea that they could be removed from Bugzilla:
> https://admin.fedoraproject.org/pkgdb/api/#list_packages_retired
> but this might deserve a bug report of its own.

Note that a component that is retired can't be deleted if it is referenced by one or more bugs. Instead, the component is marked as "Not Enabled for Bugs", which prevents any new bugs being filed against the component and prevents any existing bugs being moved to that component.

> I will see if we can make the script smarter and have it update multiple
> components at once.

That would be appreciated.
Feel free to ping the Bugzilla team if you have any further questions about the API.

I think the issues that led to the reported problem have been resolved. The user who triggered the original load spike has been moved to a test instance of Bugzilla and recent logging data shows that the number of hits from pkgdb02.phx2.fedoraproject.org fell from 633866 in March to 54241 in July, with a corresponding fall in the number of calls to Component.update. Please file new bugs for any future problems with accessing Red Hat Bugzilla.
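The concern raised earlier in this thread, that component names differing only in case (such as "MySQL" and "mysql") might be matched together, could be checked ahead of time with something like the following. This is only a sketch: `case_collisions` is a hypothetical helper, and it assumes the list of component names has already been fetched from pkgdb or Bugzilla by other means.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def case_collisions(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names that become identical once case is ignored.

    Only groups with more than one distinct spelling are returned,
    keyed by the case-folded name.
    """
    groups = defaultdict(set)
    for name in names:
        groups[name.casefold()].add(name)
    return {
        key: sorted(spellings)
        for key, spellings in groups.items()
        if len(spellings) > 1
    }
```

Running it over the full component list for a product would show which names need attention; exact duplicates are collapsed, so a name listed twice with the same spelling is not reported.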