Description of problem: node.js supervisor provides several functions. One of these functions is that it watches files for changes. The nodejs is configured to watch a very broad set of files (presumably to avoid the need for restarts). There seems to be a performance issue with this, though, and it's also conceivable that a gear will exceed fs.inotify.max_user_watches, which means it won't perform its job as expected. I would suggest that we bring back the user's ability to simply not use supervisor, but barring that, we should at least give them a way to control which files are watched. See also: http://stackoverflow.com/questions/5394620/node-js-how-does-fs-watchfile-work https://bugzilla.redhat.com/show_bug.cgi?id=1010461 Version-Release number of selected component (if applicable): openshift-origin-cartridge-nodejs-1.15.7-1.el6oso.noarch
If this is too much work for bug, please transfer to a Trello card
https://github.com/openshift/origin-server/pull/3700 allows tuning of the pool interval is that sufficient?
I will leave it to users impacted by these two bugs to say whether resolving 1010461 reduces the severity of this one, but I think this issue still needs to be addressed separately.
Fork AMI created
Test on fork_ami_nodejs_supervisor_939, met below error when git push nodejs-0.10 app. It is ok with nodejs-0.6 app. root@openshift-ubuntu:~/test/n10a# git add .; git commit -ama3; git push [master b377c03] a3 Counting objects: 3, done. Compressing objects: 100% (2/2), done. Writing objects: 100% (2/2), 244 bytes, done. Total 2 (delta 1), reused 0 (delta 0) remote: Stopping NodeJS cartridge remote: Warning: Application 'n10a' unable to stop. Use force-stop to kill. remote: An error occurred executing 'gear prereceive' (exit code: 141) remote: Error message: Failed to execute: 'control stop' for /var/lib/openshift/528b4258ea6bd992e4000045/nodejs remote: remote: For more details about the problem, try running the command again with the '--trace' option. To ssh://528b4258ea6bd992e4000045.rhcloud.com/~/git/n10a.git/ ! [remote rejected] master -> master (pre-receive hook declined) error: failed to push some refs to 'ssh://528b4258ea6bd992e4000045.rhcloud.com/~/git/n10a.git/' BTW, Could you help to give some steps about how to verify this bug? Thanks:)
I've made a small code change and started a new fork ami job for this issue. I wasn't able to reproduce the issue w/ nodejs-0.10. As for how to test this issue, basically what this change amounts to is that until the hot_deploy marker is added, nodejs will run without supervisor. When the marker is added, the node process is stopped and supervisor started. Subsequent pushes will not require a restart until the marker is removed. When the marker is removed, the supervisor process is stopped and nodejs restarted without supervisor.
Hi, Paul Morie "When the marker is added, the node process is stopped and supervisor started." Does this mean after hot_deploy marker added, the node server.js process will stop, and just the supervisor works in app? But when I test the issue on fork_ami_nodejs_supervisor_941, seems after added hot_deploy marker and push, check the app process, both supervisor and node server.js are running. [n6-dyy.dev.rhcloud.com 528c532b663e9eea80000001]\> ps -ef UID PID PPID C STIME TTY TIME CMD 1001 29467 29453 0 01:52 ? 00:00:00 sshd: 528c532b663e9eea80000001@pts/0 1001 29474 29467 0 01:52 pts/0 00:00:00 /bin/bash --init-file /usr/bin/rhcsh -i 1001 30322 1 0 01:53 ? 00:00:00 node /usr/bin/supervisor -e node|js|coffee -p 1000 -- server.js 1001 31144 30322 0 01:55 ? 00:00:00 node server.js 1001 31581 29474 0 01:55 pts/0 00:00:00 ps -ef
Very good question. I noticed the same thing, but if you check the pid of the node process before you add the hot_deploy marker and after, you will notice the pid changes. Supervisor and node run as separate processes so to have them both running is expected.
thanks Paul Morie Test on fork_ami_nodejs_supervisor_941 1. After nodejs-0.6/1.0 app created, nodejs will run without supervisor. 2. Add hot_deploy marker, supervisor started, App won't restart during deploying. Although both the supervisor and nodejs running when check the process in app, but nodejs PID changed after every deploying finished. 3. Remove hot_deploy marker, there is no supervisor running and app can restart normally. Move bug to verified.
Commit pushed to master at https://github.com/openshift/origin-server https://github.com/openshift/origin-server/commit/83c292ffbb4d225523b0181b4ccec2cf03f27af2 Fix bug 1010464: only use supervisor is hot_deploy is enabled