Bug 986080

Summary: Cartridge cache broken while upgrading to 2.0.30 in PROD
Product: OpenShift Online
Reporter: Thomas Wiest <twiest>
Component: Containers
Assignee: Jhon Honce <jhonce>
Status: CLOSED DUPLICATE
QA Contact: libra bugs <libra-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 2.x
CC: dmcphers
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-26 18:59:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Thomas Wiest 2013-07-18 22:41:58 UTC
Description of problem:
During the normal process of updating to 2.0.30 in PROD, we started getting these errors in the mcollective log on the ex-nodes.

Error:
I, [2013-07-18T15:01:48.235593 #3413]  INFO -- : runner.rb:23:in `block in initialize' Reloading all agents after receiving USR1 signal
W, [2013-07-18T15:01:48.250785 #3413]  WARN -- : agent.rb:234:in `metadata' openshift.rb:23:in `<class:Openshift>': setting meta data in agents have been deprecated, DDL files are now being used for this information.
W, [2013-07-18T15:01:48.251170 #3413]  WARN -- : agents.rb:95:in `rescue in activate_agent?' Agent activation check for openshift failed: NameError: uninitialized constant OpenShift::Runtime::CartridgeRepository
I, [2013-07-18T15:01:49.212931 #3413]  INFO -- : runner.rb:23:in `block in initialize' Reloading all agents after receiving USR1 signal

Restarting mcollective fixed this problem.

This problem invalidated the broker's cartridge cache, which prevented users from creating apps.
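
To spot this condition on other nodes, something like the following could scan an mcollective log for agent activation or load failures. It is only a rough sketch: the regex is inferred from the log lines quoted in this report, the function name and marker strings are made up for illustration, and it assumes each log entry sits on a single line in the actual file.

```python
import re

# Line format inferred from the excerpt in this report, e.g.
# W, [2013-07-18T15:01:48.251170 #3413]  WARN -- : agents.rb:95:in ...
LOG_LINE = re.compile(
    r"^(?P<sev>[A-Z]), \[(?P<ts>\S+) #(?P<pid>\d+)\]\s+"
    r"(?P<level>[A-Z]+) -- : (?P<msg>.*)$"
)

# Message fragments taken from this report (assumption: other failure
# modes may be worded differently and would need their own markers).
FAILURE_MARKERS = ("Agent activation check", "Failed to load")


def find_agent_failures(lines):
    """Return (timestamp, level, message) for WARN/ERROR lines matching a marker."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match is None or match.group("level") not in ("WARN", "ERROR"):
            continue
        message = match.group("msg")
        if any(marker in message for marker in FAILURE_MARKERS):
            hits.append((match.group("ts"), match.group("level"), message))
    return hits
```

It accepts any iterable of lines, so an open file handle for the node's mcollective log can be passed in directly. The deprecation warning about agent metadata is deliberately not matched, since it does not stop the agent from loading.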


Version-Release number of selected component (if applicable):
rhc-node-1.11.7-1.el6oso.x86_64


How reproducible:
Unknown, but we didn't see this in INT or STG.


Steps to Reproduce:
1. Unknown


Actual results:
The REST API was broken.


Expected results:
We should have been able to upgrade without downtime.

Comment 1 Thomas Wiest 2013-07-18 23:16:42 UTC
Dan McPherson asked me to add this line from mcollective.log to this bug:

E, [2013-07-18T18:45:45.282900 #18584] ERROR -- : pluginmanager.rb:171:in `rescue in loadclass' Failed to load MCollective::Agent::Openshift: cannot load such file -- openshift-origin-node/plugins/unix_user_observer

Comment 2 Dan McPherson 2013-07-18 23:41:33 UTC
The fundamental problem is that updating our source reloads our mco agent before the puppet changes have run, so any config the agent needs in order to run isn't available yet.  The solution is to not reload the agent as part of the RPM install.

Comment 3 Jhon Honce 2013-07-26 18:59:25 UTC

*** This bug has been marked as a duplicate of bug 985514 ***