Bug 851417 - scalable app local gear down after server upgrade
Summary: scalable app local gear down after server upgrade
Keywords:
Status: CLOSED DUPLICATE of bug 852598
Alias: None
Product: OKD
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Assignee: Ram Ranganathan
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-08-24 07:08 UTC by Meng Bo
Modified: 2015-05-14 22:58 UTC (History)
2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-09-06 01:22:03 UTC
Target Upstream Version:
Embargoed:


Attachments
nodejs_scale_event (8.16 KB, text/x-log)
2012-08-24 07:08 UTC, Meng Bo
php_scale_event (8.16 KB, text/x-log)
2012-08-24 07:09 UTC, Meng Bo
error_log_for_python (4.16 MB, text/plain)
2012-08-28 02:20 UTC, Meng Bo
node_log_for_nodejs (15.55 KB, text/x-log)
2012-08-28 02:22 UTC, Meng Bo

Description Meng Bo 2012-08-24 07:08:17 UTC
Description of problem:
An old scalable app exists on the server. After the server upgrade and migration, check the app's /haproxy-status/ page: the local gear shows a down status.

Version-Release number of selected component (if applicable):
stage_2.0.16
devenv_2069

How reproducible:
always

Steps to Reproduce:
1. Create scalable apps on the old instance
2. Do upgrade and migrate
3. Check the scalable app /haproxy-status/ page
  
Actual results:
The local gear has a down status.

Expected results:
The local gear should not show a down status.

Additional info:
For php/perl/ruby1.8/ruby1.9/jbossas/jbosseap apps, the local gear can be woken up after a start or restart action.
For python and nodejs apps, the local gear still has a down status after a start or restart action.
The scale_event logs for both conditions are attached for debugging.

All the apps can be scaled up with 'haproxy_ctld -u'.

Comment 1 Meng Bo 2012-08-24 07:08:53 UTC
Created attachment 606771 [details]
nodejs_scale_event

Comment 2 Meng Bo 2012-08-24 07:09:27 UTC
Created attachment 606772 [details]
php_scale_event

Comment 3 Meng Bo 2012-08-24 07:34:41 UTC
Set the severity to low, since the app home page can be accessed and the app can be scaled up with 'haproxy_ctld -u'.
There is no adverse effect for end users.

Comment 4 Ram Ranganathan 2012-08-24 20:27:37 UTC
@Meng, can you try stopping + starting the app and see if that fixes it?
Thanks.

Comment 5 Meng Bo 2012-08-27 01:47:59 UTC
Hi Ram,

Last Friday, when I tried stop-start or restart on the scalable apps, the local gears of all the other apps came back up except python-2.6 and nodejs-0.6.
But after checking again this morning, the local gears of the scalable jbossas and jbosseap apps cannot be woken up either.

Comment 6 Ram Ranganathan 2012-08-27 22:25:39 UTC
Hi Meng,
    Hmm, that might be because the apps got idled? Can you do an
    rhc app stop + start and see if that works -- that should bring all the gears
    back up. If not, could you please attach the logs from jboss/node/python (apache).
    Let's see if there's anything there.

    On the idle/restore front, there's a user story to idle/restore haproxy + gears --
    US2772.

    Thanks,

Ram//

Comment 7 Meng Bo 2012-08-28 02:16:14 UTC
Hi Ram,

I checked the apps this morning, and the jbossas/jbosseap local gears are up now.
But for python and nodejs, the local gear is still down and cannot be woken up by rhc app stop and rhc app start.


The apps url are:
http://py1s-qgongstg.stg.rhcloud.com/
http://no1s-qgongstg.stg.rhcloud.com/

======================
[bmeng@localhost openshift_testdir]$ rhc app stop -a py1s 

RESULT:
Success

[bmeng@localhost openshift_testdir]$ rhc app start -a py1s 

RESULT:
Success

[bmeng@localhost openshift_testdir]$ rhc app status -a py1s 

RESULT:
Total Accesses: 6
Total kBytes: 30
CPULoad: .272727
Uptime: 11
ReqPerSec: .545455
BytesPerSec: 2792.73
BytesPerReq: 5120
BusyWorkers: 1
IdleWorkers: 0
Scoreboard: W....
Total Accesses: 7
Total kBytes: 4
CPULoad: .266667
Uptime: 15
ReqPerSec: .466667
BytesPerSec: 273.067
BytesPerReq: 585.143
BusyWorkers: 1
IdleWorkers: 0
Scoreboard: W....

[bmeng@localhost openshift_testdir]$ rhc app stop -a no1s 

RESULT:
Success

[bmeng@localhost openshift_testdir]$ rhc app start -a no1s 

RESULT:
Success

[bmeng@localhost openshift_testdir]$ rhc app status -a no1s 

RESULT:
Application '00d5bb518c' is running
Application 'no1s' is either stopped or inaccessible
Running Processes:
UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
1773     21117     1  0  4110  1304   1 22:04 ?        00:00:00 /usr/sbin/haproxy -f /var/lib/stickshift/1ee8addf6868472dacf5c1c9553d901b//haproxy-1.4/conf/haproxy.cfg
1773     21168     1  0 17492 20020   0 22:04 ?        00:00:00 haproxy_ctld                                                                                            
1773     21698 21673  0  2306  1184   1 22:04 ?        00:00:00 /bin/bash -e /var/lib/stickshift/1ee8addf6868472dacf5c1c9553d901b/no1s/no1s_ctl.sh status
1773     21771 21698  1  2373  1560   0 22:04 ?        00:00:00 /bin/bash -e /usr/libexec/stickshift/cartridges/embedded/haproxy-1.4/info/bin/app_ctl.sh status
1773     22029 21771  0  3342  1016   0 22:04 ?        00:00:00 ps -FCvx -U 1773


Attached are the node.log for the nodejs app and the error_log for the python app.

Comment 8 Meng Bo 2012-08-28 02:20:24 UTC
Created attachment 607381 [details]
error_log_for_python

Comment 9 Meng Bo 2012-08-28 02:22:00 UTC
Created attachment 607382 [details]
node_log_for_nodejs

Comment 10 Ram Ranganathan 2012-09-06 01:22:03 UTC
The error here is related to this bug:  
     https://bugzilla.redhat.com/show_bug.cgi?id=852598


Fixed in new apps -- the workaround for old apps is to remove SIGPIPE from the list of signals on which the node process terminates, in server.js of the sample app.
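For illustration, a minimal sketch of what that workaround looks like. This is NOT the actual OpenShift sample server.js; the variable name and the exact signal list are assumptions. The idea is that the sample app registered a terminator on a list of signals that included SIGPIPE, so a broken pipe (e.g. from the load balancer closing a connection) killed the node process; the fix is to leave SIGPIPE out of that list:

```javascript
// Hypothetical signal setup for server.js (names/list are assumptions).
// Register a terminator only on signals that should stop the app;
// 'SIGPIPE' is deliberately omitted so a closed client socket
// does not terminate the server.
var terminationSignals = ['SIGHUP', 'SIGINT', 'SIGQUIT', 'SIGTERM'];

terminationSignals.forEach(function (sig) {
  process.on(sig, function () {
    console.log('Received %s - terminating app ...', sig);
    process.exit(1);
  });
});
```

With SIGPIPE left at its default disposition, Node simply surfaces write errors on the broken socket instead of exiting.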

*** This bug has been marked as a duplicate of bug 852598 ***

