+++ This bug was initially created as a clone of Bug #867609 +++

Description of problem:
The embedded cartridge list from "rhc app cartridge list" doesn't show a new cartridge after adding a cartridge rpm on a node.

Version-Release number of selected component (if applicable):
node:
stickshift-abstract-0.16.1-1.el6_3.noarch
rubygem-stickshift-common-0.15.2-1.el6_3.noarch
stickshift-mcollective-agent-0.3.1-1.el6_3.noarch
rubygem-stickshift-node-0.16.2-1.el6_3.noarch
broker:
rubygem-stickshift-controller-0.16.1-1.el6_3.noarch
stickshift-broker-0.6.8-1.el6_3a.noarch

How reproducible:

Steps to Reproduce:
1. Install a node with a minimal set of cartridges, e.g. php-5.3 and cron-1.4
2. Run "rhc app cartridge list" on a client, which should return only cron-1.4
3. yum install another cartridge on the node, e.g. cartridge-mysql-5.1
4. Run "rhc app cartridge list" on a client

I ran into this in a single broker, single node installation.

Actual results:
The cartridge list in step 4 only shows cron-1.4. ss-cartridge-list on the node shows php-5.3, cron-1.4, and mysql-5.1.

Expected results:
List of supported embedded cartridges:

Obtaining list of cartridges (please excuse the delay)...
cron-1.4, mysql-5.1

Additional info:
In my case, this turned out to be a caching problem on the broker. I manually cleared the cache via a rails dev console on the broker, and the rhc cartridge list immediately added mysql-5.1. Is it possible to expire the broker's cache when a new cartridge is registered?

--- Additional comment from dmcphers on 2012-10-17 17:09:46 EDT ---

This is currently as designed. You have to clear the cache to get new data picked up immediately. We could provide a command to run that would clear the cache, but that doesn't seem much different from manually clearing the cache. Is your complaint really that it wasn't obvious you needed to clear the cache?

--- Additional comment from john on 2012-10-17 18:10:15 EDT ---

Near term, a note in the docs about the need to flush cache on the broker would be great. Long term, it seems like the cache should expire or something should trigger a flush when a new cartridge is available. There are probably other state change conditions that should trigger a flush, as well.

After installing the mysql cartridge, I restarted httpd on the broker, restarted services on the node, removed and reinstalled the mysql cartridge a few times, rebooted the node, rebooted the broker, rebooted my client, and ended up leaving the entire thing overnight. It still didn't show up the next morning. The command "rhc app cartridge add -a testphpapp -c mysql-5.1" failed with an invalid type error.

I guess requiring a manual cache flush isn't any more deus ex machina than most apps require. It would just be nice for the software to self manage.

--- Additional comment from xjia on 2012-10-22 03:26:08 EDT ---

I have one node and one broker, and no node packages are installed on the broker. First, I installed all the cartridge packages except jbosseap and jbossews. Then I created one php application. Third, I installed the jbosseap and jbossews cartridges on the node. I set "config.action_controller.perform_caching = false" in the file "/var/www/openshift/broker/config/environments/development.rb", then restarted the openshift-broker service. Running "rhc setup" then shows: "Connection to server timed out. It is possible the operation finished without being able to report success. Use 'rhc domain show' or 'rhc app status' to check the status of your applications"

Version-Release number of selected component (if applicable):
http://buildvm-devops.usersys.redhat.com/puddle/build/OpenShiftEnterprise/Beta/2012-10-19.4/

How reproducible:
Always

Steps to Reproduce:
1. First install all kinds of cartridge except the jbosseap and jbossews cartridges
2. Create one php cartridge successfully
3. Install the jbosseap and jbossews cartridges on the node
4. Set "config.action_controller.perform_caching = false" in the file "/var/www/openshift/broker/config/environments/development.rb"
5. Restart the openshift-broker service
6. Execute "rhc setup"
7. Set "config.action_controller.perform_caching = true" in the file "/var/www/openshift/broker/config/environments/development.rb", and restart the openshift-broker service
8. Execute "rhc setup"

Actual results:
6. Shows the message "Connection to server timed out. It is possible the operation finished without being able to report success. Use 'rhc domain show' or 'rhc app status' to check the status of your applications."
8. The jbosseap and jbossews cartridges are not in the list.

Expected results:
6. Should succeed, and the cartridge list should contain jbossews and jbosseap
8. The cartridge list should contain jbossews and jbosseap

Additional info:
Maybe I misunderstand something about it; I hope you can help me resolve this problem.
The log in development.log
<--snip-->
Started GET "/broker/rest/cartridges" for 10.66.65.143 at Mon Oct 22 02:48:17 -0400 2012
Processing by CartridgesController#index as JSON
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4bf45ba0>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4bf64ed8>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4bf893a0>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4bfa5758>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4bfc3460>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4bfdc848>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4bffb608>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4c0200c0>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4c0363c0>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4c052a48>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
DEBUG: find_one_impl: current_server: node0.example.com
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7fee4c069360>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"cartridge-list", :cartridge=>"openshift-origin-node", :args=>{"--with-descriptors"=>true, "--porcelain"=>true}}, @id, {'identity' => @id})
CURRENT SERVER: node0.example.com
<--snip-->

--- Additional comment from dmcphers on 2012-10-25 13:18:46 EDT ---

xjia,

It sounds like your rhc tools aren't pointing to the right server. Either way, I don't want to track anything in this bug other than a request for auto clearing the cache. Regarding your problem, can you run the command with -d to see what server it's hitting, and check the apache access/error logs to see if the request is even getting through? If you are pointing to the right location and the logs don't tell you what's wrong, please open a new bug for your issue.

Thanks,
Dan

--- Additional comment from jialiu on 2012-10-31 03:15:57 EDT ---

[root@broker broker]# pwd
/var/www/openshift/broker
[root@broker broker]# bundle exec rake tmp:clear

Then the cartridge list cache will be cleared. This functionality is also necessary for OSE. I think this should be added to the openshift-broker service so that the rails cache is cleared automatically.

This issue also exists in the OSE environment after retesting against the 2012-10-30.7 puddle. I discussed this issue with bleanhar on IRC; this bug should be fixed in OSE 1.0, so I cloned it here and set the priority to "high".
Unfortunately I don't think we're going to get to this before 1.0. We'll likely ship it shortly after in an Errata. We'll be sure to make the change upstream as well. I'm going to lower the priority for now.
We're still working out how we're going to solve this for our next release.
We're creating an R&D story for this.
Clearing the cache in response to stale list of cartridges is mentioned in the Admin guide at
https://access.redhat.com/knowledge/docs/en-US/OpenShift_Enterprise/1.0/html-single/Administration_Guide/index.html#sect-OpenShift_Enterprise-Administration_Guide-Clearing_Broker_Application_Cache
... which is admittedly rather obscure.

Some possible ways to partially address this problem soon:
1. Add something to the deployment and troubleshooting guides about this.
2. Have "service openshift-broker" clear the cache whenever started.
3. Have oo-accept-systems and/or oo-diagnostics check for cartridge list differing between cache and mcollective and notify to clear cache.

In the long term, the problem is that this is a cache, and by design can go stale. It's mainly a usability issue at install time, so it's particularly painful during evaluations, so we would like to address it. The R&D story is to explore ways to address it more permanently, e.g. lower the cache timeout, don't cache this, node notify broker at cart installation time, use memcached instead of files for caching, etc.

Creating US3201 for the long-term story.
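
Two of the long-term options listed above (a shorter cache timeout, and a flush triggered when a node installs a cartridge) can be illustrated with a small sketch. This is not the broker's code: the class name CartridgeListCache, the ttl value, the injectable clock, and the invalidate! method are all hypothetical, and the fetcher block only stands in for the real query to the nodes.

```ruby
# Hypothetical sketch: a cartridge-list cache that expires on its own and can
# also be flushed explicitly. None of these names come from the broker code.
class CartridgeListCache
  # ttl:     seconds a cached list stays valid (assumed value, tune as needed)
  # clock:   callable returning the current time in seconds (injectable for tests)
  # fetcher: block that asks the nodes for the live cartridge list
  def initialize(ttl: 300, clock: -> { Time.now.to_f }, &fetcher)
    raise ArgumentError, "a fetcher block is required" unless fetcher
    @ttl = ttl
    @clock = clock
    @fetcher = fetcher
    @list = nil
    @fetched_at = nil
  end

  # Return the cached list, refetching when it is missing or older than ttl.
  def cartridges
    if @list.nil? || (@clock.call - @fetched_at) >= @ttl
      @list = @fetcher.call
      @fetched_at = @clock.call
    end
    @list
  end

  # Drop the cached list, e.g. when a node reports a newly installed cartridge.
  def invalidate!
    @list = nil
    @fetched_at = nil
  end
end

# Usage with a stand-in for the node query.
installed = ["cron-1.4"]
cache = CartridgeListCache.new(ttl: 300) { installed.dup }

cache.cartridges          # first call fetches from the "node"
installed << "mysql-5.1"  # a cartridge is installed afterwards
stale = cache.cartridges  # still served from the cache, so mysql-5.1 is missing
cache.invalidate!         # a "cartridge installed" notification flushes the cache
fresh = cache.cartridges  # refetched, now includes mysql-5.1
puts "stale: #{stale.join(', ')}"
puts "fresh: #{fresh.join(', ')}"
```

The timeout bounds how long a stale list can be served even if no notification arrives, while the explicit flush covers the install-time case described in this bug without waiting for the timeout.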
oo-diagnostics and oo-accept-systems now test for this and recommend a fix. Technical solutions may develop upstream or per US3201.

Deployment manual already has
https://access.redhat.com/knowledge/docs/en-US/OpenShift_Enterprise/1/html-single/Deployment_Guide/index.html#sect-OpenShift_Enterprise-Deployment_Guide-Continuing_Node_Installation_for_Enterprise-Cartridge_Availability

Converting this to a docs bug as all that is left is to mention this in troubleshooting manual.
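
For readers unfamiliar with this kind of check, the comparison between the cached cartridge list and the list reported by the nodes might look roughly like the sketch below. It is only an illustration of the idea, not the logic in oo-diagnostics or oo-accept-systems: the helper names cartridge_cache_report and cache_stale? are hypothetical, and both lists are passed in as plain arrays of cartridge names rather than read from the cache or mcollective.

```ruby
# Hypothetical helper: compare the cartridge names the broker has cached with
# the names the nodes currently report, and describe any mismatch.
def cartridge_cache_report(cached, live)
  {
    missing_from_cache: (live - cached).uniq.sort,  # installed but not yet listed
    stale_in_cache: (cached - live).uniq.sort       # listed but no longer installed
  }
end

# True when either side of the comparison found a difference.
def cache_stale?(report)
  report.values.any? { |names| !names.empty? }
end

# Example using the cartridges from the original report.
report = cartridge_cache_report(["cron-1.4"], ["cron-1.4", "mysql-5.1"])
if cache_stale?(report)
  puts "Broker cartridge cache differs from the nodes."
  puts "Missing from cache: #{report[:missing_from_cache].join(', ')}"
  puts "Clear the broker cache to pick up the change."
end
```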
Recommended fix is live in the OSE Troubleshooting Guide as of Revision 1.1-1:
https://access.redhat.com/knowledge/docs/en-US/OpenShift_Enterprise/1/html/Troubleshooting_Guide/sect-OpenShift_Enterprise-Troubleshooting_Guide-Outdated_Cartridge_List.html

Per Luke in comment #7, "Technical solutions may develop upstream or per US3201," so closing this out CURRENTRELEASE as the docs update was the last remaining item.