Bug 802902 (JOPR-429)

Summary: jboss-cache-v3-plugin: Shows invalid "unavailable" state for cache services that are not currently deployed/used in EAP
Product: [JBoss] JBoss Operations Network
Reporter: Larry O'Leary <loleary>
Component: Plugin -- Other
Assignee: RHQ Project Maintainer <rhq-maint>
Status: NEW ---
QA Contact: Mike Foley <mfoley>
Severity: high
Docs Contact:
Priority: urgent
Version: JON 3.2
CC: fbrychta, hrupp, jshaughn, loleary
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugzilla.redhat.com/show_bug.cgi?id=803776
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment: JON 2.3 with EAP plug-in pack; Sun JVM 1.6.0_16; RHEL5 (kernel: 2.6.18-128.el5); jopr-jboss-cache-v3-plugin-2.3.0.GA.jar; EAP5 production install/configuration
Last Closed:
Type: Enhancement
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---

Description Larry O'Leary 2012-03-13 13:34:44 EDT
Issue was originally identified as https://issues.jboss.org/browse/JOPR-429


When JON first discovers JBoss Cache 3 services, everything appears fine. At some later point, however, all the discovered cache services show an unavailable state. This is because EAP only deploys the caches that are needed or requested by a deployed application. For example, standard-session-cache is only deployed when a web application that uses the session cache is deployed and started.

So, if a user deploys an application to EAP that requires a cache, the cache will be deployed and started. Later, if the application no longer requires the cache or is un-deployed, the cache is no longer deployed, and the user will continue to see this false unavailable state for it in JON.

To demonstrate:

   1. Make a copy of production configuration as jon-jbcache3-status-issue
      cd ${JBOSS_HOME}/server
      cp -R production jon-jbcache3-status-issue
      cd jon-jbcache3-status-issue/conf/props
      sed -i.orig 's/^# admin=admin/admin=admin/' jmx-console-users.properties
      cd ../../..

      # Start up the EAP5 instance using the service binding manager
      cd ${JBOSS_HOME}/server/jon-jbcache3-status-issue/log
      rm boot.log cluster.log server.log
      cd ${JBOSS_HOME}/bin
      ./run.sh -c jon-jbcache3-status-issue -Djboss.partition.name=MyPartition -Djboss.platform.mbeanserver -Djboss.service.binding.set=ports-01 -b &
      sleep 10
      # EAP instance should be at http://localhost:8180

      # Start up RHQ-Server
      cd ${JON_HOME}/logs
      rm boot.log rhq-server-log4j.log
      cd ${JON_HOME}/bin
      ./rhq-server.sh start
      sleep 10
      # JON Server should be at http://localhost:7080

      # Start up RHQ-Agent
      cd ${JON_AGENT_HOME}/logs
      rm agent.log
      cd ${JON_AGENT_HOME}/bin
      export RHQ_AGENT_DEBUG=1
      ./rhq-agent-wrapper.sh start
      sleep 10

      * From JON, import the newly deployed EAP instance
      * Once imported and available, expand its JBoss Cache resource
      Notice ha-partition and MyPartition-HAPartitionCache are the only caches listed
      * Expand ha-partition
      Notice all cache services show as available
      * Expand MyPartition-HAPartitionCache
      Notice all cache services show as available

      Deploy the sample counter.war web application:

      cd /tmp
      curl -O http://community.jboss.org/servlet/JiveServlet/download/11823-10-5589/counter.zip
      rm -r counter_DIR
      unzip counter.zip -d counter_DIR
      cp counter_DIR/counter/dist/counter.war ${JBOSS_HOME}/server/jon-jbcache3-status-issue/deploy

      * Wait for EAP to pick up the new deployment
      * Wait for JON to discover and inventory the new WAR
      * Once inventoried and available, expand the JBoss Cache resource
      Notice ha-partition, MyPartition-HAPartitionCache, MyPartition-SessionCache, and standard-session-cache are listed
      * Select each of the four cache services
      Notice all cache services show as available
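
The fixed "sleep 10" calls in the steps above may be too short on a slow machine. As an optional alternative, a small retry helper could poll until a check succeeds. This is only a sketch: wait_for is a hypothetical function name, not part of EAP or JON, and the attempt counts and delays are arbitrary.

```shell
# Hypothetical helper (not part of EAP or JON): retry a command until it
# succeeds or the attempts run out. At least one attempt is always made.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while :; do
        # Run the check quietly; stop as soon as it succeeds
        "$@" >/dev/null 2>&1 && return 0
        i=$((i + 1))
        # Give up once the allowed number of attempts has been used
        [ "$i" -ge "$attempts" ] && return 1
        sleep "$delay"
    done
}
```

For example, "wait_for 30 2 curl -sf -o /dev/null http://localhost:8180/" could replace the sleep after starting EAP, assuming the instance answers HTTP requests at that address once it is up.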

You can also see these cache services via JMX in EAP:
cd ${JBOSS_HOME}/bin
./twiddle.sh -s localhost:1199 -u admin -p admin query 'jboss.cache:*'
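
To see exactly which cache MBeans go away, the output of the query above can be saved before and after removing the application and then compared. The helper below is a minimal sketch of that comparison. The name missing_mbeans and the snapshot file names are hypothetical, and it assumes each snapshot holds one MBean object name per line.

```shell
# Hypothetical helper (not part of EAP or JON): print the lines that appear
# in the "before" snapshot but not in the "after" snapshot.
# $1: file with one MBean object name per line, taken before the change
# $2: file with one MBean object name per line, taken after the change
missing_mbeans() {
    before_sorted=$(mktemp) || return 1
    after_sorted=$(mktemp) || { rm -f "$before_sorted"; return 1; }
    # comm needs sorted input; -u also drops duplicate lines
    sort -u "$1" > "$before_sorted"
    sort -u "$2" > "$after_sorted"
    # -23 suppresses lines unique to the second file and lines common to both
    comm -23 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}
```

A possible use is to redirect the twiddle query into /tmp/caches-before.txt while counter.war is deployed, redirect it into /tmp/caches-after.txt after the application is removed, and then run "missing_mbeans /tmp/caches-before.txt /tmp/caches-after.txt".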

    * Un-inventory and remove the counter.war web application (Inventory tab of EAP instance)

At this point (once JON reflects the change in availability), some of the cache services for standard-session-cache and MyPartition-SessionCache will show as unavailable.

If you restart the EAP instance, all the cache services for these two cache resources will show as unavailable. This is because no deployed application needs these cache services any longer, so EAP does not deploy them.

If the counter.war application is re-deployed, the caches will show as available again.

It is understood that "unavailable" means that something is no longer available, but in this situation it gives the user a false impression of a failure.
Comment 2 Mike Foley 2012-03-19 12:16:38 EDT
re-evaluate post 3.1 (per triage asantos, loleary, ccrouch, mfoley)
Comment 3 Heiko W. Rupp 2013-09-10 03:34:58 EDT
We may solve that with the DISABLED availability state
Comment 4 Larry O'Leary 2013-09-10 10:28:09 EDT
I'm not sure DISABLED will help here. Services are deployed as needed, so if no application is actively using the session cache, for example, the session cache appears to _disappear_.

From what I understand, _DISABLED_ not only disables availability checking for the resource but also metric collection? I wouldn't want that. Instead, I would want metrics collected for the resource whenever it is available. I would also like to know whether the resource is active or not active, and whether it was not deployed due to a deployment error (i.e. it will never become available).
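
The three states asked for above could be told apart roughly as follows. This is only a sketch to make the request concrete: the function name, its two yes/no inputs, and the state labels are all hypothetical and do not correspond to any existing RHQ/JON availability type or plug-in API.

```shell
# Hypothetical sketch, not RHQ/JON code.
# $1: "yes" if the cache MBean is currently registered in JMX, else "no"
# $2: "yes" if a deployment error was recorded for the cache, else "no"
classify_cache_state() {
    registered=$1
    deploy_error=$2
    if [ "$registered" = "yes" ]; then
        echo "active"          # deployed and in use: metrics can be collected
    elif [ "$deploy_error" = "yes" ]; then
        echo "deploy-failed"   # will not become available on its own
    else
        echo "not-active"      # not currently needed by any application
    fi
}
```

With this split, only "deploy-failed" would need to be surfaced to the user as a problem, while "not-active" would describe a cache that is simply not in use.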
Comment 13 Larry O'Leary 2014-10-15 14:48:29 EDT
Setting target to 3.4 as this cannot be addressed in 3.3.

It also is not yet clear how this can or will be fixed. Please note that there is no 3.4 release planned at this time and this is only being targeted for consideration in a 3.4 release if one comes into existence.
Comment 16 Larry O'Leary 2015-09-24 10:10:52 EDT
Removing the target on this BZ. This needs to be reviewed by product management as a feature/design change. This will allow for proper planning and triage.