Red Hat Bugzilla – Bug 1017534
Bugzilla Rules Engine (aka Bugbot 2)
Last modified: 2015-09-28 00:52:34 EDT
This bug exists to track discussion of the future design of Bugbot. Please feel free to raise issues and make constructive suggestions here....
The current implementation of Bugbot generally satisfies the needs for which it was designed, but still has a few recurring issues:
1. Bugbot currently runs separately from the Bugzilla system and polls Bugzilla at high frequency, typically accounting for 15-20% of the hits on the Bugzilla web servers each month.
2. Changes to Bugbot rules generally require the involvement of the Bugbot maintainer for review, testing and deployment.
3. Misconfigured rules can "run away" and modify a large number of bugs before the misconfiguration can be detected and corrected. Correction may need to be done manually. This has various flow-on effects including adding spurious noise to the bug histories and causing large numbers of notification emails to be distributed.
4. Conflicting rules can repeatedly flip bugs back and forth between two states.
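To make issue 4 concrete, here is a minimal sketch of how flip-flopping could be detected. It assumes the rule runner can see each value a rule writes to a field; the class name `FlipFlopDetector` and its `record` method are hypothetical and not part of Bugbot or Bugzilla.

```python
from collections import defaultdict, deque


class FlipFlopDetector:
    """Flags a (bug, field) pair whose value keeps alternating between two states.

    Hypothetical helper for illustration only.
    """

    def __init__(self, window=4):
        # Number of most recent values remembered per (bug_id, field).
        self.window = window
        self.history = defaultdict(lambda: deque(maxlen=window))

    def record(self, bug_id, field, new_value):
        """Record a change; return True if the recent values look like A, B, A, B."""
        values = self.history[(bug_id, field)]
        values.append(new_value)
        if len(values) < self.window:
            return False
        recent = list(values)
        # Every value must equal the one two steps earlier, and the two
        # alternating values must differ from each other.
        alternating = all(recent[i] == recent[i - 2] for i in range(2, len(recent)))
        return alternating and recent[0] != recent[1]
```

A runner could stop applying rules to a bug once `record` returns True and flag the two rules involved for review.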
Going forward, we should investigate whether it is desirable and feasible to move to a different architecture that would make it easier to address some or all of the above problems.
For reducing the load on Bugzilla, possibilities include, but are not limited to:
* listening to a message bus for bug changes and creation instead of polling Bugzilla,
* running as part of the Bugzilla system (e.g. via the job queue) and triggering directly on bug changes and creation.
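Both possibilities replace polling with reacting to individual change events. A minimal sketch of that idea follows, with Python's standard `queue.Queue` standing in for the message bus or job queue. The event shape (`{"bug_id": ...}`), the `(matches, action)` rule pairs, and the `fetch_bug` and `apply_change` callables are all assumptions made for illustration.

```python
import queue


def process_events(events, rules, fetch_bug, apply_change):
    """Drain the event queue, evaluating rules only for bugs that actually changed.

    events:       queue.Queue of dicts like {"bug_id": 123} (assumed event shape)
    rules:        list of (matches, action) callables (hypothetical rule format)
    fetch_bug:    callable returning the bug for a given id
    apply_change: callable that writes a change back to the bug
    """
    handled = 0
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            break  # nothing left to do; no polling of the whole bug database
        bug = fetch_bug(event["bug_id"])
        for matches, action in rules:
            if matches(bug):
                apply_change(bug, action(bug))
        handled += 1
    return handled
```

With this shape, the work done is proportional to the number of bug changes rather than to the polling frequency.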
Further suggestions about the future of Bugbot are welcome.
Other issues with the current system:
1) Rule writing can introduce many errors. At least two Bugbot changes this week (bug 1016772 and bug 1017444) contained mistakes that caused unexpected errors.
2) It can take a long time from when a Bugbot bug is filed to when it is implemented. This has sped up in recent times, since we aren't reliant on HSS Engineering Operations to push the changes live (this now happens automatically every hour).
3) A single change can spew out many e-mails. Just look at the Publican bugs on the hss-ied-bugs-list.
4) There is no emergency kill switch for bugbot.
5) Because bugbot works on all bugs that match a pattern, it is possible to modify a lot of bugs at once. If that is an error, it can take a while to undo.
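Points 4 and 5 suggest two safeguards: a kill switch and a cap on how many bugs one rule run may touch. The following is a minimal sketch only; `RuleRunner`, `RuleHeldForReview`, and the default limit of 50 are hypothetical and not taken from any existing Bugbot code.

```python
class RuleHeldForReview(Exception):
    """Raised when a rule matches more bugs than the configured cap allows."""


class RuleRunner:
    """Hypothetical runner with an emergency kill switch and a per-run cap."""

    def __init__(self, max_changes_per_run=50):
        self.enabled = True
        self.max_changes_per_run = max_changes_per_run

    def kill(self):
        """Emergency stop: no further bugs are modified after this call."""
        self.enabled = False

    def run(self, rule_name, matching_bug_ids, apply_rule):
        """Apply a rule to the matching bugs and return the ids actually changed."""
        if not self.enabled:
            return []
        if len(matching_bug_ids) > self.max_changes_per_run:
            # Refuse to touch anything; a human should confirm a change this large.
            raise RuleHeldForReview(
                f"{rule_name} matches {len(matching_bug_ids)} bugs "
                f"(limit {self.max_changes_per_run})"
            )
        changed = []
        for bug_id in matching_bug_ids:
            if not self.enabled:  # kill switch is re-checked between bugs
                break
            apply_rule(bug_id)
            changed.append(bug_id)
        return changed
```

Holding an oversized run for review, rather than applying part of it, avoids leaving a rule half-applied across a large set of bugs.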
My preferred solution is to replace it with a 'rules engine' that runs as an extension in Bugzilla, using the job queue. This fixes most of the problems that have been mentioned above.
The one thing that this approach cannot do is change bugs historically. There are two possible options around this:
a) Once we have implemented the ability to change flags in the 'Change multiple bugs at once' page, we can make it the responsibility of EPM to update the bugs.
b) A method to manually add bug numbers into the rule engine queue (which will probably be a link at the bottom of buglist.cgi for those that have the appropriate permission).
I'm in favour of having both options available. Updating 1,000 bugs via the 'Change multiple bugs at once' option is always going to lead to a proxy timeout.
Having said that, this is not a small change. At a guess, it would be at least one month of development (160 hours), including the time to convert from the existing bug bot. Once live, I would want to change a few rules over at a time rather than do it all at once.
Another important capability that comes to mind is the ability to log the changes that are made and reverse them automatically should it turn out that a rule modifies bugs that it wasn't intended to touch.
(In reply to Jason McDonald from comment #3)
> Another important capability that comes to mind is the ability to log the
> changes that are made and reverse them automatically should it turn out that
> a rule modifies bugs that it wasn't intended to touch.
I was thinking about this last night. We can log Bugbot activity in the bugs_activity table (added = rule was run). If we decide to implement the ability to mass reverse changes, then the removed column can be used to record that the rule's change was undone.
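As a rough illustration of logging and reversing rule changes, here is a minimal in-memory sketch. It does not use the real bugs_activity schema: the log is a plain list of dicts whose `removed` and `added` keys simply borrow the column names mentioned above, and `apply_and_log` and `revert_rule` are hypothetical helpers.

```python
def apply_and_log(bugs, log, rule_name, bug_id, field, new_value):
    """Apply a rule's change to one bug and record the old and new values."""
    old_value = bugs[bug_id].get(field)
    bugs[bug_id][field] = new_value
    log.append({
        "rule": rule_name,
        "bug_id": bug_id,
        "field": field,
        "removed": old_value,  # value before the rule ran
        "added": new_value,    # value the rule wrote
    })


def revert_rule(bugs, log, rule_name):
    """Undo every logged change made by one rule, newest first.

    A bug is skipped if its field no longer holds the value the rule wrote,
    so that later edits made by people are not overwritten.
    """
    reverted, skipped = [], []
    for entry in reversed(log):
        if entry["rule"] != rule_name:
            continue
        bug = bugs[entry["bug_id"]]
        if bug.get(entry["field"]) == entry["added"]:
            bug[entry["field"]] = entry["removed"]
            reverted.append(entry["bug_id"])
        else:
            skipped.append(entry["bug_id"])
    return reverted, skipped
```

The skipped list gives whoever triggers the reversal a short set of bugs to check by hand.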
I'm going to spend the next two days speccing out my ideas for what Bugbot 2 will look like. If anyone has ideas, now is the time to contribute.
I have written most of the document today. I hope to finish it off tomorrow, and then let everyone review it. My rough estimate is that this is five weeks of work.
I've finished writing up my design document. Jason McDonald is going to look over it, and then I'll let EIP have a look. After that, I'll send an e-mail to the Bugzilla lists and EPM for their say.
I've estimated this to be approximately five weeks' work.
I've reviewed the design document (with which I am very impressed) and Simon has addressed my feedback. Now waiting on feedback from EIP.
This change is now live. If there are any issues, do not reopen this bug.
Instead, you should create a new bug and reference this bug.