Description of problem:
In UrlResource.grabLastMod() and UrlResource.grabStream(), a URLConnection is used but no timeout is set (setConnectTimeout/setReadTimeout). If Guvnor is unresponsive, the Scanner thread blocks indefinitely.

Steps to Reproduce:
1. Start BRMS
2. Start a client with KnowledgeAgent
3. Set a breakpoint in PackageDeploymentServlet.doGet() in Guvnor
===
    protected void doGet(final HttpServletRequest req,
                         final HttpServletResponse res) throws ServletException,
                                                       IOException {
        doAuthorizedAction(req, res, new Command() {
            public void execute() throws Exception {
HERE!==>        PackageDeploymentURIHelper helper = new PackageDeploymentURIHelper(req.getRequestURI());

                log.info("PackageName: " + helper.getPackageName());
                log.info("PackageVersion: " + helper.getVersion());
                log.info("PackageIsLatest: " + helper.isLatest());
                log.info("PackageIsSource: " + helper.isSource());
===

Actual results:
A client scanner thread gets stuck at UrlResource.grabStream().

Expected results:
Users can configure a timeout value (e.g. via a system property). On timeout, an exception is thrown and logged on the client side.
This can cause a side effect. When a scanner thread gets stuck at UrlResource.grabStream(), it holds the lock on KnowledgeAgentImpl.registeredResources, so KnowledgeAgentImpl.getKnowledgeBase() called from another thread can get stuck as well.

KnowledgeAgentImpl:
====
    public void applyChangeSet(ChangeSet changeSet) {
        synchronized ( this.registeredResources ) {
            this.eventSupport.fireBeforeChangeSetApplied( changeSet );

            this.listener.info( "KnowledgeAgent applying ChangeSet" );

            ChangeSetState changeSetState = new ChangeSetState();
            changeSetState.scanDirectories = this.scanDirectories;
            // incremental build is inverse of newInstance
            changeSetState.incrementalBuild = !(this.newInstance);

            // Process the new ChangeSet
            processChangeSet( changeSet, changeSetState );
            // Rebuild or do an update to the KnowledgeBase
            buildKnowledgeBase( changeSetState );
            // Rebuild the resource mapping
            //buildResourceMapping();

            this.eventSupport.fireAfterChangeSetApplied( changeSet );
        }
    }
...
    public KnowledgeBase getKnowledgeBase() {
        synchronized ( this.registeredResources ) {
            return this.kbase;
        }
    }
====
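To illustrate the contention described above, here is a minimal standalone sketch. It does not use Drools at all: LockContentionDemo and its methods are hypothetical stand-ins that only mimic the locking pattern of applyChangeSet() and getKnowledgeBase(), with a latch standing in for the blocked network read.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for KnowledgeAgentImpl; only the locking pattern is modelled.
public class LockContentionDemo {
    private final Object registeredResources = new Object();
    private final String kbase = "kbase-v1";

    // Mimics applyChangeSet(): the monitor is held while the "I/O" blocks.
    void applyChangeSet(CountDownLatch lockTaken, CountDownLatch ioDone)
            throws InterruptedException {
        synchronized (registeredResources) {
            lockTaken.countDown();
            ioDone.await(); // simulates grabStream() waiting on an unresponsive server
        }
    }

    // Mimics getKnowledgeBase(): needs the same monitor just to return a field.
    String getKnowledgeBase() {
        synchronized (registeredResources) {
            return kbase;
        }
    }

    // Returns true if the reader stayed blocked while the scanner held the lock
    // and completed once the simulated I/O finished.
    static boolean readerBlocksWhileScannerStuck() throws InterruptedException {
        LockContentionDemo agent = new LockContentionDemo();
        CountDownLatch lockTaken = new CountDownLatch(1);
        CountDownLatch ioDone = new CountDownLatch(1);
        CountDownLatch readerDone = new CountDownLatch(1);

        Thread scanner = new Thread(() -> {
            try {
                agent.applyChangeSet(lockTaken, ioDone);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread reader = new Thread(() -> {
            agent.getKnowledgeBase();
            readerDone.countDown();
        });

        scanner.start();
        lockTaken.await();                 // scanner now owns the monitor
        reader.start();
        boolean finishedEarly = readerDone.await(200, TimeUnit.MILLISECONDS);
        ioDone.countDown();                // the "server" finally responds
        boolean finishedAfter = readerDone.await(5, TimeUnit.SECONDS);
        scanner.join();
        reader.join();
        return !finishedEarly && finishedAfter;
    }
}
```

In the real agent there is no latch to release, so without a timeout the reader would stay blocked for as long as the scanner's read does.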
In the worst case of a network issue, the TCP socket may be left open without ever receiving a FIN or RST. The Scanner thread would then stay blocked indefinitely, even after the network has recovered.
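The effect of a read timeout on such a silent peer can be sketched as follows. This is an illustration only, not the Drools code: SilentPeerDemo and readTimesOut() are hypothetical names, and the "unresponsive server" is approximated by a local ServerSocket that is bound but never accepts or answers.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical demo class; approximates a peer that never sends a response.
public class SilentPeerDemo {

    // Returns true if the read from the silent peer is aborted by the timeout.
    static boolean readTimesOut(int timeoutMs) throws IOException {
        // Bound and listening, but accept() is never called: the OS completes
        // the TCP handshake from the backlog and no data ever comes back.
        try (ServerSocket silent = new ServerSocket(0)) {
            URL url = new URL("http://127.0.0.1:" + silent.getLocalPort() + "/");
            URLConnection con = url.openConnection(Proxy.NO_PROXY);
            con.setConnectTimeout(timeoutMs);
            con.setReadTimeout(timeoutMs);
            try (InputStream in = con.getInputStream()) {
                in.read();
                return false; // a response arrived, which this setup never sends
            } catch (SocketTimeoutException expected) {
                return true;  // the read was abandoned after timeoutMs
            }
        }
    }
}
```

Without the two setter calls both timeouts default to 0, which URLConnection interprets as "wait forever", so the same read would never return.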
Fixed by https://github.com/droolsjbpm/drools/commit/4506dc93e

I added a system property named "drools.resource.urltimeout", through which you can specify a timeout in milliseconds. If it is not specified, the default timeout is 10 seconds.
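For readers who want a picture of what such a property-driven timeout might look like, here is a rough sketch. Only the property name and the 10-second default come from the comment above; the class UrlTimeoutSketch, its helpers, and the fallback behaviour for invalid values are assumptions made for illustration and may differ from the actual commit.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical sketch; not the code from the referenced commit.
public class UrlTimeoutSketch {
    // Property name and default value as stated in the fix comment.
    static final String TIMEOUT_PROPERTY = "drools.resource.urltimeout";
    static final int DEFAULT_TIMEOUT_MS = 10_000;

    // Assumption: missing, malformed, or negative values fall back to the default.
    static int resolveTimeoutMillis(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_TIMEOUT_MS;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value >= 0 ? value : DEFAULT_TIMEOUT_MS;
        } catch (NumberFormatException e) {
            return DEFAULT_TIMEOUT_MS;
        }
    }

    // Opens a connection object with both timeouts applied; nothing is sent yet.
    static URLConnection openWithTimeout(URL url) throws IOException {
        int timeout = resolveTimeoutMillis(System.getProperty(TIMEOUT_PROPERTY));
        URLConnection con = url.openConnection();
        con.setConnectTimeout(timeout); // limit on establishing the connection
        con.setReadTimeout(timeout);    // limit on each blocking read
        return con;
    }
}
```

On the client side the property would typically be passed as a JVM argument, for example -Ddrools.resource.urltimeout=5000. Note that in this sketch a value of 0 is passed through unchanged, which URLConnection treats as an infinite timeout.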