Bug 817890

Summary: jbossas-7 configure hook is expensive
Product: OKD Reporter: Rob Millner <rmillner>
Component: Containers Assignee: Dan Mace <dmace>
Status: CLOSED NOTABUG QA Contact: libra bugs <libra-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.x CC: dmace, mfisher, mmcgrath
Target Milestone: --- Keywords: FutureFeature, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-10-10 18:30:58 UTC Type: Bug

Description Rob Millner 2012-05-01 17:41:52 UTC
The Jenkins benchmark job was run on stage build #185.

The rhc-mcollective-log-profile script was run on the resulting 24 mcollective log files and produced a profile of the calls, ranked by total wall-clock run time in milliseconds.

As can be seen from the results, the jbossas-7 configure hook takes almost 19s on average to run and was the most expensive call.

Examine the jbossas-7 configure hook and see if there's anything which can be done to speed it up.

tot msec:  calls:    avg  :       action        
--------: ------: --------: --------------------
 4940021:    266:    18571: jbossas-7 configure
 3626211:   3620:     1001: stickshift-node connector-execute
 3357679:    197:    17044: embedded/haproxy-1.4 deconfigure
 1941056:    229:     8476: jbossas-7 deconfigure
 1814393:    198:     9163: embedded/haproxy-1.4 configure
 1432663:   2036:      703: stickshift-node authorized-ssh-key-add
 1364407:    703:     1940: stickshift-node app-create
 1179695:    624:     1890: stickshift-node app-destroy
  653076:     60:    10884: ruby-1.8 deconfigure
  516860:    812:      636: stickshift-node broker-auth-key-add
  431055:     60:     7184: ruby-1.8 configure
  405231:    160:     2532: php-5.3 configure
  380163:    246:     1545: jbossas-7 expose-port
  353920:    120:     2949: php-5.3 deconfigure
  268603:     56:     4796: nodejs-0.6 deconfigure
  213069:     60:     3551: python-2.6 deconfigure
  198251:     60:     3304: perl-5.10 deconfigure
  182229:     56:     3254: nodejs-0.6 configure
  134233:     20:     6711: jenkins-1.4 configure
  127571:     60:     2126: python-2.6 configure
  121638:     60:     2027: perl-5.10 configure
  104503:     20:     5225: jenkins-1.4 deconfigure
   77762:    140:      555: php-5.3 expose-port
   63007:    120:      525: stickshift-node env-var-remove
   43063:     60:      717: stickshift-node env-var-add
   38130:    266:      143: jbossas-7 preconfigure
   29905:     20:     1495: diy-0.1 configure
   29132:     36:      809: nodejs-0.6 expose-port
   28374:     40:      709: perl-5.10 expose-port
   25431:     40:      635: python-2.6 expose-port
   21002:     40:      525: stickshift-node authorized-ssh-key-remove
   20815:    160:      130: php-5.3 preconfigure
   19005:     40:      475: ruby-1.8 expose-port
   15775:     20:      788: diy-0.1 deconfigure
    9681:     20:      484: jenkins-1.4 preconfigure
    8922:     56:      159: nodejs-0.6 preconfigure
    8470:     60:      141: python-2.6 preconfigure
    7963:     60:      132: perl-5.10 preconfigure
    7866:     60:      131: ruby-1.8 preconfigure
    3510:      2:     1755: stickshift-node cartridge-list
    2333:     20:      116: diy-0.1 preconfigure
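
For illustration, a minimal sketch of how a profile like the one above can be aggregated. It is not the rhc-mcollective-log-profile script itself: the mcollective log format is not shown in this report, so the sketch assumes the log lines have already been parsed into (action, msec) pairs. The names `profile` and `format_rows` are hypothetical. The average uses integer division, which appears to match the rounding in the table.

```python
from collections import defaultdict


def profile(samples):
    """Aggregate (action, msec) samples into (total_msec, calls, avg_msec, action) rows."""
    totals = defaultdict(lambda: [0, 0])  # action -> [total_msec, calls]
    for action, msec in samples:
        totals[action][0] += msec
        totals[action][1] += 1
    rows = [
        (tot, calls, tot // calls, action)  # integer average, truncated
        for action, (tot, calls) in totals.items()
    ]
    rows.sort(reverse=True)  # largest total first
    return rows


def format_rows(rows):
    """Render rows in the same column layout as the table above."""
    lines = [
        "tot msec:  calls:    avg  :       action",
        "--------: ------: --------: --------------------",
    ]
    for tot, calls, avg, action in rows:
        lines.append(f"{tot:>8}: {calls:>6}: {avg:>8}: {action}")
    return "\n".join(lines)
```

Sorting by total rather than by average keeps frequently called but cheap actions (such as connector-execute) visible next to rare but slow ones.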

Comment 1 Bill DeCoste 2012-05-04 15:18:38 UTC
This is almost entirely the JBoss start time. The configure hook waits for JBoss HTTP to become available, and that wait accounts for the vast majority of the time spent in the hook. There isn't much we can do about this. The CPU limit is the critical factor: if we remove it, JBoss starts significantly faster.
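
To make the waiting pattern concrete, here is a minimal sketch of a poll-until-HTTP-answers loop. It is not the actual hook code: the function names, the timeout and interval defaults, and the choice to treat any HTTP response (including an error status) as "available" are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def _http_probe(url):
    """Return True if the server at url answers at all (assumption: any status counts)."""
    try:
        with urllib.request.urlopen(url, timeout=5):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, just with an error status
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...


def wait_for_http(url, timeout_s=120.0, interval_s=1.0, probe=_http_probe):
    """Poll url until probe succeeds; return seconds waited, or None on timeout."""
    start = time.monotonic()
    while True:
        if probe(url):
            return time.monotonic() - start
        # Give up if another sleep would push us past the deadline.
        if time.monotonic() - start + interval_s > timeout_s:
            return None
        time.sleep(interval_s)
```

With a loop like this, the measured hook time is dominated by how long the server takes to start answering, which is why a CPU limit that slows startup shows up directly in the profile.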

Comment 2 Rob Millner 2012-05-04 17:07:59 UTC
Passing to Mike for follow-up on the cgroups CPU quota.  The numbers may need to be adjusted or it may just cost too much.

Comment 3 Rob Millner 2012-08-08 18:51:22 UTC
A new benchmark run has started.  It should run until tomorrow and then some analysis is required to process the results.

https://ci.dev.openshift.redhat.com/jenkins/job/libra_benchmark/25/

Also, the other Jenkins jobs and build scripts have changed significantly since this task was last run, so the benchmark job may need to be fixed and re-run.

Comment 4 Rob Millner 2012-08-09 01:51:23 UTC
The previous job failed.  The bug has been fixed and a new benchmark run started:
https://ci.dev.openshift.redhat.com/jenkins/job/libra_benchmark/26/

Comment 5 Rob Millner 2012-08-13 17:19:39 UTC
Came across another issue.  Our external DNS provider has become unstable enough that it's proving impossible to get a good benchmark over the 8 hours it takes to run.

I'll see what can be done to work around this issue, but I suspect it's not going to be done in this sprint.

Comment 6 Mike McGrath 2012-08-29 18:37:13 UTC
At present /etc/resolv.conf references localhost as a valid resolver. It shouldn't; that entry should be removed before we can test whether there are additional external issues.
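
A quick way to check a host for this condition is sketched below. It only assumes the standard `nameserver <address>` line format of resolv.conf; the function name `loopback_nameservers` is hypothetical, and the check is an illustration rather than anything used in the benchmark setup.

```python
import ipaddress


def loopback_nameservers(resolv_conf_text):
    """Return the nameserver addresses in resolv.conf text that point at loopback."""
    found = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            try:
                addr = ipaddress.ip_address(parts[1])
            except ValueError:
                continue  # not a plain IP address; skip it
            if addr.is_loopback:
                found.append(parts[1])
    return found
```

On a node this could be run as `loopback_nameservers(open("/etc/resolv.conf").read())`; an empty list means no loopback resolver is configured.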

Comment 7 Mike McGrath 2012-10-10 18:30:58 UTC
Bug Triage - We believe this to be mostly fixed now.