If RHQ_PLUGINS has a plugin but that plugin's jar is not on the server filesystem, the agent goes into a loop trying to get the plugin. Unless you put the plugin jar on the server filesystem, the agent will loop infinitely. Fixing the parent issue will implicitly fix this issue, since plugins are never on file system in the first place.
"since plugins are never on file system in the first place" - this is actually not true. In fact, we will ensure the file system has all plugins listed in rhq_plugins at startup.
The parent issue won't implicitly fix this because the server will still stream from the file system when serving plugins to the agent. This issue still needs to be addressed.
Actually, now that I think about it, this may be OK as it is today. The new agent deployer code will ensure that the file in rhq-plugins is up to date - if another server deployed it, the content will be in the database - the next scan on all other servers will detect a new plugin in the database and that new plugin will be pulled down to each server's filesystem. Therefore, once all scans complete, all servers have the file - thus all agents will get the same, updated plugin. The UI now has a way to force a server to perform a scan so you do not have to wait for a scan cycle to start. In the future, we can even add a service resource with a "scan" operation so you can request all servers to run an agent plugin scan immediately so they can all pick up the latest plugins at the same time. To avoid the possibility of server A updating a plugin to the database and an agent requesting that plugin from server B before server B had a chance to scan and find the new plugin, users should put all servers in maintenance mode (MM), then force them all to scan, then bring them back to normal mode. Because agent plugin updates go through the RHQ comm layer, the agents will be asked to "pause" while the servers are in MM. Once the servers go back to NORMAL, the agents can again begin to download the plugins.
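To make the scan behavior described above concrete, here is a minimal sketch, not the actual RHQ scanner. It assumes the database can be modeled as a map from plugin file name to jar content, and it only handles files that are missing (the real scan also has to deal with out-of-date files). The class name PluginScanSketch and its scan method are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a server-side agent plugin scan; not RHQ's real scanner. */
public class PluginScanSketch {

    /**
     * Writes every plugin recorded in the "database" (modeled here as a map of
     * file name to jar content) that is missing from the plugin directory.
     *
     * @return the names of the files that were pulled down, in map iteration order
     */
    public static List<String> scan(Map<String, byte[]> pluginsInDatabase, Path pluginDir) {
        List<String> pulledDown = new ArrayList<>();
        try {
            // Make sure the plugin directory itself exists before writing into it.
            Files.createDirectories(pluginDir);
            for (Map.Entry<String, byte[]> entry : pluginsInDatabase.entrySet()) {
                Path target = pluginDir.resolve(entry.getKey());
                // Only missing files are written; files already present are left alone.
                if (!Files.exists(target)) {
                    Files.write(target, entry.getValue());
                    pulledDown.add(entry.getKey());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pulledDown;
    }
}
```

Running scan twice against the same directory writes the missing jars on the first pass and nothing on the second, which is the property that lets every server converge on the same set of plugin files once its scan completes.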
I think with the changes made as part of the parent issue, this issue should not be a problem.
Oops... reading the parent issue, now I see what this issue will need to fix:
1) server A starts up
2) someone deploys a new plugin P to another server in the cloud
3) an agent asks server A for updated plugins
4) server A will see plugin P in the database but server A does not yet have the plugin P on the filesystem
If the server sees it is missing a plugin on the file system, it should immediately request an agent plugin scan, at which point the new plugin will be downloaded from the DB. The agent can then be given the new plugin.
To test this, it's simple:
1) configure server for a very long scan period in rhq-server.properties so it doesn't interfere with the test
2) start server
3) delete an agent plugin file from rhq-plugins
4) start a clean agent that has no plugins
5) the agent will get an error because the server doesn't have the plugin file anymore
This issue should fix this such that when the agent asks the server, if the server is missing the plugin file, it should immediately perform an agent plugin scan, at which point the plugin file will get written and can be streamed to the agent.
The code to do this immediate scan is something like:
    // Make sure file_to_stream exists - it's possible another server deployed this
    // but our server hasn't had a chance to download it from the database yet.
    // If this file does not exist, we need to immediately perform an agent scan
    // which will pull down the plugin file from the database.
    if (!file_to_stream.exists()) {
        log.debug("Agent is asking for a plugin that isn't on file system [" + file_to_stream + "] - performing plugin scan");
        LookupUtil.getAgentPluginURLDeploymentScanner().scan();
    }
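The snippet above can be exercised outside the server with a small self-contained sketch. This is only an illustration under assumptions: the class PluginFileLocatorSketch and the method ensureOnFilesystem are hypothetical, and a plain Runnable stands in for whatever LookupUtil.getAgentPluginURLDeploymentScanner() returns. It also adds one step the snippet does not show, failing loudly if the file is still missing after the scan.

```java
import java.io.File;

/** Hypothetical sketch of the on-demand scan; names are illustrative, not RHQ API. */
public class PluginFileLocatorSketch {

    /**
     * Returns the plugin file, running one scan first if it is not on the file system.
     *
     * @param fileToStream the plugin file the agent asked for
     * @param scan         stand-in for the agent plugin scan (assumption: it writes
     *                     any plugin found in the database to the file system)
     */
    public static File ensureOnFilesystem(File fileToStream, Runnable scan) {
        // Another server may have deployed this plugin while ours has not yet
        // pulled it down from the database, so scan immediately instead of
        // waiting for the next scheduled scan cycle.
        if (!fileToStream.exists()) {
            scan.run();
        }
        // Assumption beyond the original snippet: if the scan did not produce the
        // file, report it rather than let the agent keep asking.
        if (!fileToStream.exists()) {
            throw new IllegalStateException(
                "Plugin file still missing after scan [" + fileToStream + "]");
        }
        return fileToStream;
    }
}
```

With this shape the scan runs only when the file is absent, so a request for a plugin that is already on the file system costs nothing extra.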
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1360