1077353 – multiple nodejs processes running in a gear

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	OpenShift Online
Classification:	Red Hat
Component:	Image
Sub Component:
Version:	2.x
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	---
Assignee:	Ben Parees
QA Contact:	libra bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1116817

Reported:	2014-03-17 19:27 UTC by Andy Grimm
Modified:	2016-11-08 03:47 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1116817
Environment:
Last Closed:	2014-10-10 00:46:58 UTC
Target Upstream Version:
Embargoed:

Description Andy Grimm 2014-03-17 19:27:24 UTC

Description of problem:

I saw three cases today where a gear had multiple nodejs supervisor processes running.  As a result, the second supervisor's child process kept dying because it could not bind to port 8080.  The supervisor kept restarting it, consuming the gear's entire CPU quota.
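
The bind failure itself is easy to reproduce outside a gear. The following is a minimal sketch, assuming a Linux host and Python's standard `socket` module; the function name `second_bind_fails` is hypothetical, and an OS-chosen port stands in for the port 8080 mentioned above.

```python
import errno
import socket


def second_bind_fails() -> bool:
    """Return True if a second socket cannot bind to a port already in use."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 lets the OS pick a free port (stand-in for 8080 in the report).
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # A second process doing this is what the duplicate child hits.
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print(second_bind_fails())
```

A supervisor that restarts its child immediately after each such failure will loop for as long as the first instance holds the port.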

Version-Release number of selected component (if applicable):

openshift-origin-cartridge-nodejs-1.22.4-1.el6oso.noarch

Comment 1 Michal Fojtik 2014-03-17 20:25:24 UTC

Andy: Do you have more details? Do the apps use hot_deploy?

Comment 2 Michal Fojtik 2014-04-11 10:40:26 UTC

Andy, ping? ;-)

Comment 4 Andy Grimm 2014-04-16 19:40:59 UTC

It looks like two of the apps where I'm currently seeing this got unidled twice concurrently.  It's not clear what happened with the third; it was started at 19:40:44 and restarted at 19:41:49.  Maybe the first set of processes didn't die?  

The upcoming fix for BZ 1061926 may fix at least two of these three occurrences.

Comment 5 Ben Parees 2014-06-27 20:29:46 UTC

It looks like this could happen if someone removed the pid file and then issued a restart (the cart logic will just start another instance if the pidfile is not found).

A number of our carts share this logic, but nodejs may be the only one that auto-restarts due to the bind failure.

I will look into making the "is started" checking more robust.
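
To illustrate the failure mode described in this comment, here is a minimal sketch of a pidfile-only check. It is not the cartridge's actual code: `naive_is_started`, `pid_is_alive`, and `demo` are hypothetical names, and the current Python process stands in for the supervisor. It assumes a POSIX system.

```python
import os
import tempfile


def pid_is_alive(pid: int) -> bool:
    """Check whether a process exists (POSIX: signal 0 sends nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def naive_is_started(pidfile: str) -> bool:
    """Hypothetical pidfile-only check: a missing file means 'not started'."""
    if not os.path.exists(pidfile):
        return False
    with open(pidfile) as fh:
        text = fh.read().strip()
    return text.isdigit() and pid_is_alive(int(text))


def demo() -> tuple:
    """Show the check flipping to False once the pid file is deleted."""
    with tempfile.TemporaryDirectory() as tmp:
        pidfile = os.path.join(tmp, "cartridge.pid")
        with open(pidfile, "w") as fh:
            fh.write(str(os.getpid()))
        before = naive_is_started(pidfile)
        os.remove(pidfile)
        after = naive_is_started(pidfile)
    # The process is still alive even though the check now says otherwise.
    return before, after, pid_is_alive(os.getpid())
```

Once the check reports "not started" for a live process, a restart launches a second instance alongside the first.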

Comment 6 Ben Parees 2014-07-01 14:48:35 UTC

Adding logic to recreate the pid file if it does not exist, prior to checking if the process is started.

https://github.com/openshift/origin-server/pull/5562
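
The approach described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the pull request: `ensure_pidfile`, `is_started`, and the `find_running_pid` callback are hypothetical, and how the real cartridge locates the running process is not shown here. It assumes a POSIX system.

```python
import os
import tempfile
from typing import Callable, Optional


def pid_is_alive(pid: int) -> bool:
    """Check whether a process exists (POSIX: signal 0 sends nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True


def ensure_pidfile(pidfile: str,
                   find_running_pid: Callable[[], Optional[int]]) -> None:
    """Recreate the pid file from a live process when the file is missing."""
    if os.path.exists(pidfile):
        return
    pid = find_running_pid()  # hypothetical lookup of the supervisor's PID
    if pid is not None:
        with open(pidfile, "w") as fh:
            fh.write(f"{pid}\n")


def is_started(pidfile: str,
               find_running_pid: Callable[[], Optional[int]]) -> bool:
    """Recreate the pid file if needed, then run the usual pidfile check."""
    ensure_pidfile(pidfile, find_running_pid)
    if not os.path.exists(pidfile):
        return False
    with open(pidfile) as fh:
        text = fh.read().strip()
    return text.isdigit() and pid_is_alive(int(text))


def demo() -> tuple:
    """Missing pid file + live process -> started; no process -> not started."""
    with tempfile.TemporaryDirectory() as tmp:
        pidfile = os.path.join(tmp, "cartridge.pid")
        live = is_started(pidfile, os.getpid)  # this process plays supervisor
        recreated = os.path.exists(pidfile)
        other = os.path.join(tmp, "other.pid")
        none_running = is_started(other, lambda: None)
    return live, recreated, none_running
```

With the pid file restored before the check, a restart can stop the existing instance instead of starting a second one next to it.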

Comment 7 Wenjing Zheng 2014-07-02 06:53:04 UTC

Verified on devenv_4932; multiple nodejs processes no longer appear, as shown below:

1. Create a nodejs-0.10 app
2. SSH into the gear, delete the cartridge.pid file under $OPENSHIFT_NODEJS_PID_DIR and check the processes:
[n10-d.dev.rhcloud.com 53b3df4040b38ce446000001]\> ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
1000      7138     1  0 06:30 ?        00:00:00 node /opt/rh/nodejs010/root/usr
1000      7139     1  0 06:30 ?        00:00:00 /usr/bin/logshifter -tag nodejs
1000      7158  7138  0 06:30 ?        00:00:00 node server.js
1000      9028  9015  0 06:34 ?        00:00:00 sshd: 53b3df4040b38ce446000001@
1000      9029  9028  1 06:34 pts/2    00:00:00 /bin/bash --init-file /usr/bin/
1000      9252  9029  0 06:34 pts/2    00:00:00 ps -ef
3. Restart the gear and re-check the processes:
[n10-d.dev.rhcloud.com 53b3df4040b38ce446000001]\> ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
1000     11887     1  0 06:39 ?        00:00:00 node /opt/rh/nodejs010/root/usr/bin/supervisor
1000     11888     1  0 06:39 ?        00:00:00 /usr/bin/logshifter -tag nodejs
1000     11914 11887  0 06:39 ?        00:00:00 node server.js
1000     12017 12004  0 06:39 ?        00:00:00 sshd: 53b3df4040b38ce446000001@pts/2
1000     12018 12017  3 06:39 pts/2    00:00:00 /bin/bash --init-file /usr/bin/rhcsh -i
1000     12230 12018  0 06:39 pts/2    00:00:00 ps -ef
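
The check in step 3 could also be scripted. The sketch below is one possible way, assuming `ps -ef` output in the column layout shown above; `count_supervisors` is a hypothetical helper, and the sample text is copied from the step 3 output.

```python
PS_OUTPUT = """\
UID        PID  PPID  C STIME TTY          TIME CMD
1000     11887     1  0 06:39 ?        00:00:00 node /opt/rh/nodejs010/root/usr/bin/supervisor
1000     11888     1  0 06:39 ?        00:00:00 /usr/bin/logshifter -tag nodejs
1000     11914 11887  0 06:39 ?        00:00:00 node server.js
"""


def count_supervisors(ps_output: str) -> int:
    """Count rows whose CMD column mentions the nodejs supervisor script."""
    count = 0
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # Split into at most 8 fields so CMD (the 8th) keeps its spaces.
        fields = line.split(None, 7)
        if len(fields) == 8 and "bin/supervisor" in fields[7]:
            count += 1
    return count


if __name__ == "__main__":
    print(count_supervisors(PS_OUTPUT))
```

A count greater than one after a restart would indicate the duplicate-supervisor condition from the original report.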
