Bug 815932

Summary: Can't read stickshift-proxy.cfg
Product: OKD
Reporter: Kenny Woodson <kwoodson>
Component: Containers
Assignee: Rob Millner <rmillner>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 2.x
CC: mfisher, twiest, xtian
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-04-27 20:45:52 UTC
Type: Bug

Description Kenny Woodson 2012-04-24 20:29:21 UTC
Description of problem:

I stumbled upon these messages in our broker log when debugging something else:

DEBUG: Mon Apr 23 23:32:19 -0400 2012 Sending to Apptegic:application: user='mmcgrath+nagios' app_uuid='957178c0d2a547a0b363d647c64bb54b' action='deconfigure'
DEBUG: Mon Apr 23 23:32:19 -0400 2012 Done sending to Apptegic
DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7f59ae255148>
DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"deconfigure", :args=>"'chkexsrv4' 'openshiftnagios' '957178c0d2a547a0b363d647c64bb54b'", :cartridge=>"php-5.3"}, @id, {'identity' => @id})
DEBUG: [#<MCollective::RPC::Result:0x7f59ae181078 @agent="libra", @results={:statuscode=>0, :data=>{:exitcode=>0, :output=>"md5sum: /var/lib/stickshift/.stickshift-proxy.d/stickshift-proxy.cfg: No such file or directory\nsed: can't read /var/lib/stickshift/.stickshift-proxy.d/stickshift-proxy.cfg: No such file or directory\nError: Failed at 39111 delete\nError: Bad proxy configuration.\nWaiting for stop to finish\n[Mon Apr 23 23:32:21 2012] [warn] NameVirtualHost *:443 has no VirtualHosts\n[Mon Apr 23 23:32:21 2012] [warn] NameVirtualHost *:80 has no VirtualHosts\n"}, :sender=>"ex-std-node42.prod.rhcloud.com", :statusmsg=>"OK"}, @action="cartridge_do">]
MongoDataStore.save(Application, mmcgrath+nagios, chkexsrv4, #hidden)


Version-Release number of selected component (if applicable):

rhc-broker-0.90.25-1.el6_2.noarch

How reproducible:

I viewed this in our log file in multiple places and on multiple servers.

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Thomas Wiest 2012-04-24 20:34:41 UTC
This seems to only happen on new ex-nodes. On the old ex-nodes, the file is already there.

When I try to start stickshift-proxy, I get this error:

[ ~]# /etc/init.d/stickshift-proxy start
sed: can't read /var/lib/stickshift/.stickshift-proxy.d/stickshift-proxy.cfg: No such file or directory
[ALERT] 114/163155 (26545) : Could not open configuration file /var/lib/stickshift/.stickshift-proxy.d/stickshift-proxy.cfg : No such file or directory
Errors in configuration file, check with stickshift-proxy check.
[ ~]#


This file has never been in our configuration management tools. I also don't think it should be, since it seems to be updated by stickshift-proxy itself when it adds new nodes.

It seems to me that this file used to be laid down by an RPM but no longer is (older ex-nodes have it and new ex-nodes don't).

Comment 2 Rob Millner 2012-04-24 20:52:09 UTC
This file is installed by the rhc-node RPM as /etc/stickshift/stickshift-proxy.cfg. The package's %post should then copy it into place in /var/lib/stickshift/.stickshift-proxy.d if there is not already a copy there.

Comment 3 Rob Millner 2012-04-24 21:00:30 UTC
The file exists on an instance of devenv_1739 that's been sync'd to now.

The file exists on a fresh instance of devenv_1739 with no updates.

Comment 4 Rob Millner 2012-04-24 21:12:36 UTC
It seems like either %post for rhc-node or a previous migration script failed.  Here's the workaround, copied from the rhc-node %post:

mkdir -p /var/lib/stickshift/.stickshift-proxy.d
/sbin/chkconfig --add stickshift-proxy || :
if ! [ -f /var/lib/stickshift/.stickshift-proxy.d/stickshift-proxy.cfg ]; then
   cp /etc/stickshift/stickshift-proxy.cfg \
      /var/lib/stickshift/.stickshift-proxy.d/stickshift-proxy.cfg
   restorecon /var/lib/stickshift/.stickshift-proxy.d/stickshift-proxy.cfg
fi
/sbin/restorecon /var/lib/stickshift/.stickshift-proxy.d/
/sbin/service stickshift-proxy condrestart

Comment 5 Rob Millner 2012-04-24 21:24:17 UTC
Verified the file exists in stage_174 which has the same broker version.

Are the ex nodes being launched derived from stage_174 or are they from an earlier build?

If they are derived from stage_174, would you be able to determine whether there were errors installing the rhc-node package?

Thanks!

Comment 6 Thomas Wiest 2012-04-24 22:42:15 UTC
Ex-nodes are not launched from devenv AMIs; we have our own AMI.

Comment 7 Thomas Wiest 2012-04-24 22:59:36 UTC
Using the workaround from comment 4, I've manually fixed the problem.

Note: only these 5 nodes were affected: 35, 37, 40, 41, 42

Comment 8 Rob Millner 2012-04-24 23:40:13 UTC
Workaround provided.  It doesn't seem like there's enough info to debug this further.

If possible, try to capture debug info from the RPM installation next time you have to spin up more nodes and we'll see if there's a failure in %post.
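
As a rough sketch of what could be collected after the fact, the function below saves the package's scriptlets and a verify pass to a log file. `capture_rpm_debug`, the `RPM` override variable, and the default log path are hypothetical names for illustration; only `rpm -q --scripts` and `rpm -V` are standard rpm options. It inspects an already installed package, so it cannot show what happened at install time; for that, running the install itself with rpm's `-vv` debug output and saving the result would be the closer fit.

```shell
# Hypothetical helper: record the scriptlets and verify status of a package.
# Assumes rpm is on PATH; set RPM to substitute another command (e.g. for testing).
capture_rpm_debug() {
    pkg="${1:-rhc-node}"
    log="${2:-/tmp/${pkg}-rpm-debug.log}"
    rpm_cmd="${RPM:-rpm}"
    {
        echo "== scriptlets for $pkg =="
        # Shows %pre/%post/etc. so the %post copy step can be reviewed.
        "$rpm_cmd" -q --scripts "$pkg" || echo "query failed (exit $?)"
        echo "== verify $pkg =="
        # A non-zero exit from -V means differences or errors were reported.
        "$rpm_cmd" -V "$pkg" || echo "verify reported differences or errors (exit $?)"
    } > "$log" 2>&1
    # Print the log location so the caller can attach it to the bug.
    echo "$log"
}
```

Calling `capture_rpm_debug rhc-node` on a freshly launched node writes the log and prints its path.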

Passing to Q/E to validate that the proxy configuration file exists on dev and stg.  You can test either by checking for the file directly or by creating a scalable app and verifying that exposing its ports worked.
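
The direct check could be scripted roughly as follows. This is a minimal sketch: `check_proxy_cfg` is a hypothetical helper name, not something shipped with stickshift or rhc-node, and the default path is the one from the error messages in this bug.

```shell
# Hypothetical helper: report whether the proxy config file is in place.
# Succeeds only if the file exists and is non-empty.
check_proxy_cfg() {
    cfg="${1:-/var/lib/stickshift/.stickshift-proxy.d/stickshift-proxy.cfg}"
    if [ -s "$cfg" ]; then
        echo "OK: $cfg exists"
        return 0
    fi
    echo "MISSING: $cfg" >&2
    return 1
}
```

Running `check_proxy_cfg` with no arguments on a node checks the default location; a non-zero exit status indicates the node needs the workaround from comment 4.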

Comment 9 Xiaoli Tian 2012-04-25 06:41:42 UTC
Checked this on devenv_1741; the file exists:
# ls -lh /var/lib/stickshift/.stickshift-proxy.d/
total 4.0K
-rw-r-----. 1 root root 2.0K Apr 25 02:29 stickshift-proxy.cfg


[root@ip-10-118-31-171 ~]# /etc/init.d/stickshift-proxy restart
Stopping stickshift-proxy:                                 [  OK  ]
Starting stickshift-proxy:                                 [  OK  ]