Bug 1035338

Summary: Timeout for UrlConnection in UrlResource
Product: [JBoss] JBoss Enterprise BRMS Platform 5
Reporter: Toshiya Kobayashi <tkobayas>
Component: BRE (Expert, Fusion)
Assignee: Mario Fusco <mfusco>
Status: VERIFIED ---
QA Contact: Lukáš Petrovický <lpetrovi>
Severity: high
Docs Contact:
Priority: unspecified
Version: BRMS 5.3.1
CC: nwallace
Target Milestone: GA
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1022758

Description Toshiya Kobayashi 2013-11-27 15:17:18 UTC
Description of problem:

In UrlResource.grabLastMod() and UrlResource.grabStream(), a URLConnection is opened but no timeout is set (neither setConnectTimeout nor setReadTimeout is called). As a result, the Scanner thread can block indefinitely if Guvnor is unresponsive.
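
For illustration, here is a minimal sketch of what setting both timeouts on a java.net.URLConnection looks like. It is not the UrlResource code; the class name, method name, and URL used here are hypothetical. By default both timeouts are 0, which the JDK treats as "wait forever".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper, not part of Drools: opens a connection with explicit timeouts.
final class UrlTimeoutSketch {

    static URLConnection openWithTimeouts(String url, int connectTimeoutMs, int readTimeoutMs) {
        try {
            // openConnection() only creates the connection object; no network I/O happens yet.
            URLConnection conn = URI.create(url).toURL().openConnection();
            // Without these two calls, connect and read can block with no time limit.
            conn.setConnectTimeout(connectTimeoutMs);
            conn.setReadTimeout(readTimeoutMs);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the timeouts set, a later connect or read that exceeds them fails with java.net.SocketTimeoutException instead of blocking the calling thread.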


Steps to Reproduce:
1. Start BRMS
2. Start a client with KnowledgeAgent
3. Set a breakpoint in PackageDeploymentServlet.doGet() in Guvnor

===
    protected void doGet(final HttpServletRequest req,
                         final HttpServletResponse res) throws ServletException,
            IOException {

        doAuthorizedAction(req, res, new Command() {
            public void execute() throws Exception {
   HERE!==>     PackageDeploymentURIHelper helper = new PackageDeploymentURIHelper(req.getRequestURI());

                log.info("PackageName: " + helper.getPackageName());
                log.info("PackageVersion: " + helper.getVersion());
                log.info("PackageIsLatest: " + helper.isLatest());
                log.info("PackageIsSource: " + helper.isSource());
===

Actual results:

A client scanner thread gets stuck at UrlResource.grabStream()

Expected results:

Users can configure a timeout value (e.g. via a system property). When the timeout expires, an exception is thrown and logged on the client side.

Comment 1 Toshiya Kobayashi 2013-11-27 15:23:16 UTC
This can have a side effect. While a scanner thread is stuck in UrlResource.grabStream(), it holds the lock on KnowledgeAgentImpl.registeredResources, so a call to KnowledgeAgentImpl.getKnowledgeBase() from another thread can get stuck as well.

KnowledgeAgentImpl:
====
    public void applyChangeSet(ChangeSet changeSet) {
        synchronized ( this.registeredResources ) {
            this.eventSupport.fireBeforeChangeSetApplied( changeSet );

            this.listener.info( "KnowledgeAgent applying ChangeSet" );

            ChangeSetState changeSetState = new ChangeSetState();
            changeSetState.scanDirectories = this.scanDirectories;
            // incremental build is inverse of newInstance
            changeSetState.incrementalBuild = !(this.newInstance);

            // Process the new ChangeSet
            processChangeSet( changeSet,
                              changeSetState );
            // Rebuild or do an update to the KnowledgeBase
            buildKnowledgeBase( changeSetState );
            // Rebuild the resource mapping
            //buildResourceMapping();

            this.eventSupport.fireAfterChangeSetApplied( changeSet );
        }
    }
...
    public KnowledgeBase getKnowledgeBase() {
        synchronized ( this.registeredResources ) {
            return this.kbase;
        }
    }
====
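
The locking effect can be reproduced in isolation. The sketch below is a simplified stand-in, not the KnowledgeAgentImpl code: the class is hypothetical, and a CountDownLatch simulates the read that never returns. It shows that a thread calling a synchronized getter goes into the BLOCKED state while another thread holds the same monitor.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the agent: one shared monitor guards both methods.
final class LockStallSketch {
    private final Object registeredResources = new Object();
    private final CountDownLatch stalledRead = new CountDownLatch(1);
    private final CountDownLatch lockHeld = new CountDownLatch(1);

    // Plays the scanner thread: takes the lock, then "reads" without ever timing out.
    void applyChangeSet() throws InterruptedException {
        synchronized (registeredResources) {
            lockHeld.countDown();
            stalledRead.await(); // simulates a read with no timeout; the monitor stays held
        }
    }

    // Plays the application thread: needs the same lock just to return a field.
    String getKnowledgeBase() {
        synchronized (registeredResources) {
            return "kbase";
        }
    }

    // Returns true if the reader was observed BLOCKED while the scanner was stalled.
    static boolean readerBlocksWhileScannerStalls() {
        LockStallSketch demo = new LockStallSketch();
        Thread scanner = new Thread(() -> {
            try {
                demo.applyChangeSet();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread reader = new Thread(demo::getKnowledgeBase);
        boolean blocked = false;
        try {
            scanner.start();
            demo.lockHeld.await();          // wait until the scanner owns the monitor
            reader.start();
            for (int i = 0; i < 200 && !blocked; i++) {
                blocked = reader.getState() == Thread.State.BLOCKED;
                Thread.sleep(10);
            }
            demo.stalledRead.countDown();   // let the simulated read finish
            scanner.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return blocked;
    }
}
```

In the real scenario nothing plays the role of the final countDown(), so the reader stays blocked for as long as the scanner's read does.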

Comment 2 Toshiya Kobayashi 2013-11-29 04:41:47 UTC
In the worst case of a network issue, the TCP socket might be left open without ever receiving a FIN or RST. The Scanner thread would then stay blocked indefinitely, even after the network recovers.

Comment 3 Mario Fusco 2014-01-15 16:56:23 UTC
Fixed by https://github.com/droolsjbpm/drools/commit/4506dc93e

I added a system property named "drools.resource.urltimeout" through which you can specify a timeout in milliseconds. If it is not specified, the default timeout is 10 seconds.
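
As a rough sketch of how such a property can be resolved (this is not the code from the commit; the class and method names are hypothetical, and falling back to the default on an invalid or negative value is an assumption):

```java
// Hypothetical helper, not the Drools implementation.
final class UrlTimeoutConfigSketch {
    static final String PROPERTY = "drools.resource.urltimeout";
    static final int DEFAULT_TIMEOUT_MS = 10_000; // 10 seconds, per the comment above

    static int resolveTimeoutMs() {
        String raw = System.getProperty(PROPERTY);
        if (raw == null) {
            return DEFAULT_TIMEOUT_MS;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            // Assumption: negative values are rejected, since URLConnection
            // timeouts must be non-negative.
            return value >= 0 ? value : DEFAULT_TIMEOUT_MS;
        } catch (NumberFormatException e) {
            // Assumption: an unparsable value falls back to the default.
            return DEFAULT_TIMEOUT_MS;
        }
    }
}
```

On the client side the property would be passed at JVM startup, for example -Ddrools.resource.urltimeout=30000 for a 30 second timeout.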