Bug 1017534 - (bre, bugbot2) Bugzilla Rules Engine (aka Bugbot 2)
Status: CLOSED CURRENTRELEASE
Product: Bugzilla
Classification: Community
Component: Internal Tools
4.4
Unspecified Unspecified
Priority: high Severity: high
Assigned To: Simon Green
Matt Tyson
Depends On: 956199 1088120
Blocks: 1080284 1076328 1080672 1080673 1080748 1081769 1083000 1083010 1083013 1085569 1086449 1119090 1119092 1266813
Reported: 2013-10-10 02:35 EDT by Jason McDonald
Modified: 2015-09-28 00:52 EDT

See Also:
Fixed In Version: 4.4.4019
Doc Type: Bug Fix
Last Closed: 2014-05-06 20:58:19 EDT
Type: Bug


Description Jason McDonald 2013-10-10 02:35:01 EDT
This bug exists to track discussion of the future design of Bugbot.  Please feel free to raise issues and make constructive suggestions here.

The current implementation of Bugbot generally satisfies the needs for which it was designed, but still has a few recurring issues:

1. Bugbot currently runs separately from the Bugzilla system and polls Bugzilla at high frequency, typically accounting for 15-20% of the hits on the Bugzilla web servers each month.

2. Changes to Bugbot rules generally require the involvement of the Bugbot maintainer for review, testing and deployment.

3. Misconfigured rules can "run away" and modify a large number of bugs before the misconfiguration can be detected and corrected. Correction may need to be done manually.  This has various flow-on effects including adding spurious noise to the bug histories and causing large numbers of notification emails to be distributed.

4. Conflicting rules can repeatedly flip bugs back and forth between two states.
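
To make issues 3 and 4 concrete, here is a minimal sketch of the kind of guard that could sit in front of rule actions. It is not how Bugbot works today; the class name RuleGuard, the per-rule cap and the "A, B, A, then B again" flip test are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class RuleGuard:
    """Hypothetical safety check run before a rule modifies a bug."""

    def __init__(self, max_changes_per_rule=50):
        # Assumed limit: how many changes one rule may make per run.
        self.max_changes_per_rule = max_changes_per_rule
        self.changes_by_rule = defaultdict(int)
        # Last three values written per (bug, field), used to spot flips.
        self.history = defaultdict(lambda: deque(maxlen=3))

    def allow(self, rule_name, bug_id, field, new_value):
        """Return True if the change may proceed, False if it looks unsafe."""
        # Issue 3: a rule that has hit its cap may be running away.
        if self.changes_by_rule[rule_name] >= self.max_changes_per_rule:
            return False
        # Issue 4: values went A, B, A and a rule now wants B again.
        recent = self.history[(bug_id, field)]
        if (len(recent) == 3
                and recent[0] == recent[2]
                and recent[1] == new_value
                and recent[1] != recent[2]):
            return False
        self.changes_by_rule[rule_name] += 1
        recent.append(new_value)
        return True
```

A refused change would presumably be reported to the rule owner rather than silently dropped, but that part is left out of the sketch.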


Going forward, we should investigate whether it is desirable and feasible to move to a different architecture that would make it easier to address some or all of the above problems.

For reducing the load on Bugzilla, possibilities include, but are not limited to:
* listening to a message bus for bug changes and creation instead of polling Bugzilla,
* running as part of the Bugzilla system (e.g. via the job queue) and triggering directly on bug changes and creation.
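
As a rough sketch of the second option, the following shows rules being triggered by change events pulled from a queue instead of by polling. It uses Python's standard queue module as a stand-in for the Bugzilla job queue; the RulesEngine class, the (name, matches, action) rule shape and the bug dictionaries are hypothetical.

```python
import queue


class RulesEngine:
    """Hypothetical event-driven engine; a stand-in, not Bugzilla code."""

    def __init__(self, rules):
        # Each rule is a (name, matches, action) tuple of a string
        # and two callables that take a bug dictionary.
        self.rules = rules
        self.events = queue.Queue()

    def on_bug_changed(self, bug):
        """Called once per bug creation or change; no polling involved."""
        self.events.put(bug)

    def process_pending(self):
        """Run every matching rule against each queued bug."""
        applied = []
        while True:
            try:
                bug = self.events.get_nowait()
            except queue.Empty:
                return applied
            for name, matches, action in self.rules:
                if matches(bug):
                    action(bug)
                    applied.append((bug["id"], name))
```

With this shape the engine only does work when a bug actually changes, which is the property that would remove the polling load described in issue 1.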

Further suggestions about the future of Bugbot are welcome.
Comment 1 Simon Green 2013-10-10 03:23:13 EDT
Other issues with the current system:

1) Rule writing is error-prone. At least two bugbot changes this week (bug 1016772 and bug 1017444) contained mistakes that caused unexpected errors.

2) It can take a long time from when a Bugbot bug is filed to when it is implemented. This has sped up in recent times, since we aren't reliant on HSS Engineering Operations to push the changes live (this now happens automatically hourly).

3) A single change can spew out many e-mails. Just look at publician bugs in the hss-ied-bugs-list.

4) There is no emergency kill switch for bugbot.

5) Because bugbot works on all bugs that match a pattern, it is possible to modify a lot of bugs at once. If that is an error, it can take a while to undo.
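
For point 4, a kill switch could be as simple as a flag that is checked before every bug is touched. The sketch below is only an illustration under that assumption; KillSwitch and run_rule are hypothetical names, and a real deployment would more likely keep the flag in a Bugzilla parameter or the database than in process memory.

```python
import threading


class KillSwitch:
    """Hypothetical emergency stop shared by all rule runs."""

    def __init__(self):
        self._stopped = threading.Event()

    def engage(self):
        self._stopped.set()

    def release(self):
        self._stopped.clear()

    def engaged(self):
        return self._stopped.is_set()


def run_rule(action, bugs, switch):
    """Apply action to each bug, stopping as soon as the switch is engaged."""
    processed = []
    for bug in bugs:
        if switch.engaged():
            break  # leave the remaining bugs untouched
        action(bug)
        processed.append(bug["id"])
    return processed
```

Checking the flag per bug rather than per run also limits the damage described in point 5, since a bad rule stops partway through its matches.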


My preferred solution is to replace it with a 'rules engine' that runs as an extension in Bugzilla, using the job queue. This fixes most of the problems that have been mentioned above.

The one thing that this approach cannot do is change bugs historically. There are two possible options around this:

a) Once we have implemented the ability to change flags in the 'Change multiple bugs at once' page, we can make it the responsibility of EPM to update the bugs.

b) A method to manually add bug numbers into the rule engine queue (which will probably be a link at the bottom of buglist.cgi for those that have the appropriate permission).

I'm in favour of having both options available. Updating 1,000 bugs via the 'Change multiple bugs at once' option is always going to lead to a proxy timeout.

Having said that, this is not a small change. At a guess, it would be at least one month of development (160 hours), including the time to convert from the existing bug bot. Once live, I would want to change a few rules over at a time rather than do it all at once.
Comment 3 Jason McDonald 2013-10-13 22:26:50 EDT
Another important capability that comes to mind is the ability to log the changes that are made and reverse them automatically should it turn out that a rule modifies bugs that it wasn't intended to touch.
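
A minimal sketch of that capability, assuming each rule change is recorded with its old and new value so it can be undone later. ChangeLog, its tuple layout and the bug dictionaries are hypothetical and do not reflect Bugzilla's actual schema.

```python
class ChangeLog:
    """Hypothetical record of rule-made changes that supports reversal."""

    def __init__(self):
        # Entries are (rule_name, bug_id, field, old_value, new_value).
        self.entries = []

    def apply(self, rule_name, bug, field, new_value):
        """Record the old value, then make the change."""
        self.entries.append(
            (rule_name, bug["id"], field, bug.get(field), new_value))
        bug[field] = new_value

    def revert_rule(self, rule_name, bugs_by_id):
        """Undo one rule's changes, newest first; return how many were undone."""
        reverted = 0
        remaining = []
        for entry in reversed(self.entries):
            name, bug_id, field, old_value, new_value = entry
            bug = bugs_by_id[bug_id]
            # Only undo if nobody has changed the field since the rule ran.
            if name == rule_name and bug.get(field) == new_value:
                bug[field] = old_value
                reverted += 1
            else:
                remaining.append(entry)
        self.entries = list(reversed(remaining))
        return reverted
```

Skipping fields that were edited after the rule ran is a design choice in this sketch: it avoids overwriting a later manual correction, at the cost of leaving those entries in the log for someone to review.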
Comment 4 Simon Green 2013-11-29 01:35:14 EST
(In reply to Jason McDonald from comment #3)
> Another important capability that comes to mind is the ability to log the
> changes that are made and reverse them automatically should it turn out that
> a rule modifies bugs that it wasn't intended to touch.

I was thinking about this last night. We can log bugbot activity in the bugs_activity table (added = rule was run). If we decide to implement the ability to mass reverse changes, then the removed column can be used to remove it.

I'm going to spend the next two days speccing out my ideas for what bugbot2 will look like. If anyone has ideas, now is the time to contribute.
Comment 5 Simon Green 2013-12-02 02:02:10 EST
I have written most of the document today. I hope to finish it off tomorrow, and then let everyone review it. My rough estimate is that this is five weeks of work.
Comment 6 Simon Green 2013-12-02 22:35:59 EST
I've finished writing up my design document. Jason McDonald is going to look over it, and then I'll let EIP have a look. After that, I'll send an e-mail to the Bugzilla lists and EPM for their say.

I've estimated this to be approximately five weeks of work.

  -- simon
Comment 7 Jason McDonald 2013-12-03 21:42:21 EST
I've reviewed the design document (with which I am very impressed) and Simon has addressed my feedback.  Now waiting on feedback from EIP.
Comment 61 Simon Green 2014-05-06 20:58:19 EDT
This change is now live. If there are any issues, do not reopen this bug.
Instead, you should create a new bug and reference this bug.

  -- simon
